Tutorial :How to tractably solve the assignment optimisation task



Question:

I'm working on a script that takes the elements from companies and pairs them up with the elements of people. The goal is to optimize the pairings such that the sum of all pair values is maximized (the value of each individual pairing is precomputed and stored in the dictionary ctrPairs).

They're all paired in a 1:1, each company has only one person and each person belongs to only one company, and the number of companies is equal to the number of people. I used a top-down approach with a memoization table (memDict) to avoid recomputing areas that have already been solved.

I believe that I could vastly improve the speed of what's going on here but I'm not really sure how. Areas I'm worried about are marked with #slow?, any advice would be appreciated (the script works for inputs of lists n<15 but it gets incredibly slow for n > ~15)

def getMaxCTR(companies, people):      if(memDict.has_key((companies,people))):          return memDict[(companies,people)] #here's where we return the memoized version if it exists      if(not len(companies) or not len(people)):          return 0        maxCTR = None      remainingCompanies = companies[1:len(companies)] #slow?        for p in people:          remainingPeople = list(people) #slow?          remainingPeople.remove(p) #slow?          ctr = ctrPairs[(companies[0],p)] + getMaxCTR(remainingCompanies,tuple(remainingPeople)) #recurse          if(ctr > maxCTR):              maxCTR = ctr      memDict[(companies,people)] = maxCTR      return maxCTR  


Solution:1

To all those who wonder about the use of learning theory, this question is a good illustration. The right question is not about a "fast way to bounce between lists and tuples in python" â€" the reason for the slowness is something deeper.

What you're trying to solve here is known as the assignment problem: given two lists of n elements each and n×n values (the value of each pair), how to assign them so that the total "value" is maximized (or equivalently, minimized). There are several algorithms for this, such as the Hungarian algorithm (Python implementation), or you could solve it using more general min-cost flow algorithms, or even cast it as a linear program and use an LP solver. Most of these would have a running time of O(n3).

What your algorithm above does is to try each possible way of pairing them. (The memoisation only helps to avoid recomputing answers for pairs of subsets, but you're still looking at all pairs of subsets.) This approach is at least Ω(n222n). For n=16, n3 is 4096 and n222n is 1099511627776. There are constant factors in each algorithm of course, but see the difference? :-) (The approach in the question is still better than the naive O(n!), which would be much worse.) Use one of the O(n^3) algorithms, and I predict it should run in time for up to n=10000 or so, instead of just up to n=15.

"Premature optimization is the root of all evil", as Knuth said, but so is delayed/overdue optimization: you should first carefully consider an appropriate algorithm before implementing it, not pick a bad one and then wonder what parts of it are slow. :-) Even badly implementing a good algorithm in Python would be orders of magnitude faster than fixing all the "slow?" parts of the code above (e.g., by rewriting in C).


Solution:2

i see two issues here:

  1. efficiency: you're recreating the same remainingPeople sublists for each company. it would be better to create all the remainingPeople and all the remainingCompanies once and then do all the combinations.

  2. memoization: you're using tuples instead of lists to use them as dict keys for memoization; but tuple identity is order-sensitive. IOW: (1,2) != (2,1) you better use sets and frozensets for this: frozenset((1,2)) == frozenset((2,1))


Solution:3

This line:

remainingCompanies = companies[1:len(companies)]

Can be replaced with this line:

remainingCompanies = companies[1:]  

For a very slight speed increase. That's the only improvement I see.


Solution:4

If you want to get a copy of a tuple as a list you can do mylist = list(mytuple)


Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Previous
Next Post »