Tutorial: Sort a multidimensional list by a variable number of keys


I've read this post, but it hasn't ended up working for me.

Edit: the functionality I'm describing is just like the sorting function in Excel... if that makes it any clearer

Here's my situation: I have a tab-delimited text document. There are about 125,000 lines and 6 columns per line (columns are separated by a tab character). I've split the document into a two-dimensional list.

I am trying to write a generic function to sort two-dimensional lists. Basically I would like to have a function where I can pass the big list, and the key of one or more columns I would like to sort the big list by. Obviously, I would like the first key passed to be the primary sorting point, then the second key, etc.

Still confuzzled?

Here's an example of what I would like to do.

Joel    18  Orange  1
Anna    17  Blue    2
Ryan    18  Green   3
Luke    16  Blue    1
Katy    13  Pink    5
Tyler   22  Blue    6
Bob     22  Blue    10
Garrett 24  Red     7
Ryan    18  Green   8
Leland  18  Yellow  9

Say I passed this list to my magical function, like so:

sortByColumn(bigList, 0)

Anna    17  Blue    2
Bob     22  Blue    10
Garrett 24  Red     7
Joel    18  Orange  1
Katy    13  Pink    5
Leland  18  Yellow  9
Luke    16  Blue    1
Ryan    18  Green   3
Ryan    18  Green   8
Tyler   22  Blue    6


sortByColumn(bigList, 2, 3)

Luke    16  Blue    1
Anna    17  Blue    2
Tyler   22  Blue    6
Bob     22  Blue    10
Ryan    18  Green   3
Ryan    18  Green   8
Joel    18  Orange  1
Katy    13  Pink    5
Garrett 24  Red     7
Leland  18  Yellow  9

Any clues?


import operator

def sortByColumn(bigList, *args):
    # itemgetter(*args) builds a key that returns the requested columns
    bigList.sort(key=operator.itemgetter(*args))  # sorts the list in place


Calling sortByColumn(bigList, 2, 3) then sorts by columns 2 and 3.
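As a minimal sketch of that call, assuming the sample rows from the question have already been split into lists and the numeric columns converted to ints (the `rows` name is only illustrative):

```python
import operator

def sortByColumn(bigList, *args):
    # itemgetter(2, 3) builds a key that returns (row[2], row[3]),
    # so ties in column 2 are broken by column 3.
    bigList.sort(key=operator.itemgetter(*args))

# Sample rows from the question; the age and the last column are ints.
rows = [
    ["Joel", 18, "Orange", 1],
    ["Anna", 17, "Blue", 2],
    ["Ryan", 18, "Green", 3],
    ["Luke", 16, "Blue", 1],
    ["Katy", 13, "Pink", 5],
    ["Tyler", 22, "Blue", 6],
    ["Bob", 22, "Blue", 10],
    ["Garrett", 24, "Red", 7],
    ["Ryan", 18, "Green", 8],
    ["Leland", 18, "Yellow", 9],
]

sortByColumn(rows, 2, 3)
for row in rows:
    print(row)
```

With a single index, itemgetter returns the bare value rather than a one-element tuple, but the sort behaves the same way.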



The key idea here (pun intended) is to use a key function that returns a tuple. Below, the key function is lambda x: tuple(x[idx] for idx in args). Here x is an element of aList, that is, a row of data. The key returns a tuple of values, not just one value, so sort() orders rows by the first element of the tuple, then breaks ties with the second, and so on. Note the explicit tuple() call: a bare parenthesized (x[idx] for idx in args) is a generator expression, which does not compare element by element. See http://wiki.python.org/moin/HowTo/Sorting#Sortingbykeys

#!/usr/bin/env python
import csv

def sortByColumn(aList, *args):
    # The key is a tuple of the requested columns; the list is sorted in place.
    aList.sort(key=lambda x: tuple(x[idx] for idx in args))
    return aList

filename = 'file.txt'

def convert_ints(astr):
    # Convert numeric strings to int so they sort numerically.
    try:
        return int(astr)
    except ValueError:
        return astr

with open(filename, 'r') as f:
    biglist = [[convert_ints(elt) for elt in line]
               for line in csv.reader(f, delimiter='\t')]

for row in sortByColumn(biglist, 0):
    print(row)

for row in sortByColumn(biglist, 2, 3):
    print(row)
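To see why the key needs to be a real tuple, here is a small sketch on a few made-up rows. The `rows` data and the `tuple_key` helper are illustrative names, not part of the original answer. Python compares tuples element by element, which is what produces the primary/secondary ordering.

```python
def tuple_key(*cols):
    # Hypothetical helper: builds a key function that returns a tuple
    # of the requested columns for a given row.
    return lambda row: tuple(row[idx] for idx in cols)

rows = [
    ["Ryan", 18, "Green", 8],
    ["Anna", 17, "Blue", 2],
    ["Ryan", 18, "Green", 3],
    ["Luke", 16, "Blue", 1],
]

# Tuples compare left to right: the first differing element decides.
assert ("Blue", 2) < ("Green", 1)
assert ("Green", 3) < ("Green", 8)

# Sort by name (column 0), breaking ties with the last column (column 3).
by_name_then_last = sorted(rows, key=tuple_key(0, 3))
for row in by_name_then_last:
    print(row)
```

The two "Ryan" rows tie on column 0, so column 3 decides their order. A key that returns a generator expression instead of a tuple would not give this behavior; in Python 3 the comparison raises a TypeError.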


Make sure you have converted the numbers to ints; otherwise they will sort alphabetically rather than numerically.
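A quick sketch of the difference, using a few made-up values and a convert_ints helper like the one shown earlier in this thread:

```python
def convert_ints(astr):
    # Return an int when the text is a whole number, else the original string.
    try:
        return int(astr)
    except ValueError:
        return astr

as_text = ["10", "9", "2", "1"]
as_numbers = [convert_ints(s) for s in as_text]

print(sorted(as_text))     # strings are compared character by character
print(sorted(as_numbers))  # ints are compared by numeric value
```

As strings, "10" sorts before "2" because "1" comes before "2"; as ints, 10 sorts last.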

# Sort the list in place
def sortByColumn(A, *args):
    import operator
    A.sort(key=operator.itemgetter(*args))
    return A


# Leave the original list alone and return a new sorted one
def sortByColumn(A, *args):
    import operator
    return sorted(A, key=operator.itemgetter(*args))
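The difference between the two variants can be sketched side by side. The names sort_in_place and sorted_copy are hypothetical, chosen only so both versions can coexist in one script, and the data is made up:

```python
import operator

def sort_in_place(A, *args):
    # Mutates A; list.sort() itself returns None, so A is returned explicitly.
    A.sort(key=operator.itemgetter(*args))
    return A

def sorted_copy(A, *args):
    # Leaves A untouched and returns a new list.
    return sorted(A, key=operator.itemgetter(*args))

data = [["Joel", 18], ["Anna", 17], ["Luke", 16]]

new_list = sorted_copy(data, 1)
print(data[0])      # the original order is unchanged at this point
print(new_list[0])  # first row of the sorted copy

same = sort_in_place(data, 1)
print(same is data)  # the in-place version hands back the same list object
```

With around 125,000 rows either approach should be workable; the copying version mainly matters when the original order is still needed afterwards.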
