Tutorial :Sorting a string that could contain either a time or distance


I have implemented a sorting algorithm for a custom string that represents either time or distance data for track & field events. Below is the format

'10:03.00 - Either ten minutes and three seconds or 10 feet, three inches

The result of the sort is that for field events, the longest throw or jump would be the first element while for running events, the fastest time would be first. Below is the code I am currently using for field events. I didn't post the running_event_sort since it is the same logic with the greater than/less than swapped. While it works, it just seems overly complex and needs to be refactored. I am open to suggestions. Any help would be great.

event_participants.sort!{ |a, b| Participant.field_event_sort(a, b) }    class Participant  def self.field_event_sort(a, b)    a_parts = a.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)    b_parts = b.time_distance.scan(/'([\d]*):([\d]*).([\d]*)/)      if(a_parts.empty? || b_parts.empty?)      0    elsif a_parts[0][0] == b_parts[0][0]      if a_parts[0][1] == b_parts[0][1]        if a_parts[0][2] > b_parts[0][2]          -1        elsif a_parts[0][2] < b_parts[0][2]          1        else          0        end      elsif a_parts[0][1] > b_parts[0][1]        -1      else        1      end      elsif a_parts[0][0] > b_parts[0][0]       -1    else      1    end  end  end  


This is a situation where #sort_by could simplify your code enormously:

event_participants = event_participants.sort_by do |s|      if s =~ /'(\d+):(\d+)\.(\d+)/          [ $1, $2, $3 ].map { |digits| digits.to_i }       else          []      end  end.reverse  

Here, I parse the relevant times into an array of integers, and use those as a sorting key for the data. Array comparisons are done entry by entry, with the first being the most significant, so this works well.

One thing you don't do is convert the digits to integers, which you most likely want to do. Otherwise, you'll have issues with "100" < "2" #=> true. This is why I added the #map step.

Also, in your regex, the square brackets around \d are unnecessary, though you do want to escape the period so it doesn't match all characters.

One way the code I gave doesn't match the code you gave is in the situation where a line doesn't contain any distances. Your code will compare them as equal to surrounding lines (which may get you into trouble if the sorting algorithm assumes equality is transitive. That is a == b, b == c implies a ==c, which is not the case for your code : for example a = "'10:00.1", b = "frog", c="'9:99:9").

#sort_by sorts in ascending order, so the call to #reverse will change it into descending order. #sort_by also has the advantage of only parsing out the comparison values once, whereas your algorithm will have to parse each line for every comparison.


Instead of implementing the sort like this, maybe you should have a TrackTime and FieldDistance models. They don't necessarily need to be persisted - the Participant model could create them from time_distance when it is loaded.

You're probably going to want to be able to get the difference between two values, validate values as well sort values in the future. The model would make it easy to add these features. Also it would make unit testing a lot easier.

I'd also separate time and distance into two separate fields. Having dual purpose columns in the database only causes pain down the line in my experience.


I don't know ruby but here's some c-like pseudo code that refactors this a bit.

/// In c, I would probably shorten this with the ? operator.  int compareIntegers(a, b) {      int result = 0;      if (a < b) {          result = -1;      } else if (a > b) {          result = 1;      }      return result;  }    int compareValues(a, b) {      int result = 0;      if (!/* check for empty*/) {          int majorA = /* part before first colon */          int majorB = /* part before first colon */          int minorA = /* part after first colon */          int minorB = /* part after first colon */            /// In c, I would probably shorten this with the ? operator.          result = compareIntegers(majorA, majorB);          if (result == 0) {              result = compareIntegers(minorA, minorB);          }      }      return result;  }  


Your routine looks fine but you could just remove the ''', ':' and '.' and treat the result as a numeric string. In other words the 10' 5" would become 1005 and 10' 4" would be 1004. 1005 is clearly more than 1004.

Since the higer order elements are on the left, it will sort naturally. This also works with time for the same reasons.


I agree that converting to integers will make is simpler. Also note that for integers

if a > b    1  elsif a < b    -1  else     0  

can be simplified to a<=>b. To get the reverse use -(a <=> b).


In this scenario:

Since you know you are working with feet, inches, and (whatever your third unit of measure is), why not just create a total sum of the two values you are comparing?

So after these two lines:

a_parts = a.time_distance.scan(/'([\d]):([\d]).([\d])/) b_parts = b.time_distance.scan(/'([\d]):([\d]).([\d])/)

Generate the total distance for a_parts and b_parts:

totalDistanceA = a_parts[0][0].to_i * 12 + a_parts[0][1].to_i + b_parts[0][2].to_i * (whatever your third unit of measure factor against the size of an inch)

totalDistanceB = b_parts[0][0].to_i * 12 + b_parts[0][1].to_i + b_parts[0][2].to_i * (whatever your third unit of measure factor against the size of an inch)

Then return the comparison of these two values:

totalDistanceA <=> totalDistanceB

Note that you should keep the validation you are already making that checks if a_parts and b_parts are empty or not:

a_parts.empty? || b_parts.empty?

For doing the sorting by time scenario, do the exact same thing except with different factors (for example, 60 seconds to a min).


Why not do

a_val = a_parts[0][0].to_i * 10000 + a_parts[0][1].to_i * 100 + a_parts[0][2].to_i  b_val = b_parts[0][0].to_i * 10000 + b_parts[0][1].to_i * 100 + b_parts[0][2].to_i  a_val <=> b_val  

The numbers won't make sense to subtract, etc but they should sort ok.

You may want to check [1] and [2] are always two digits in the regexp.

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »