Tutorial :merging array items in ruby


Given an array of arrays [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]

What is the simplest way to merge the array items that contain members that are shared by any two or more arrays items. For example the above should be [["A", "B", "C", "D","E", "F"], ["G"]] since "B" and "C" are shared by the first and second array items.

Here are some more test cases.

[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]  => [["A", "B", "C", "D", "E", "F", "G"]]    [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]  => [["A", "B", "C", "D", "E", "F"], ["G", "H,"]]  


Edit: Martin DeMello code was fixed.

When running Martin DeMello code (the accepted answer) I get:

[["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]] =>  [["B", "C", "E", "F", "A", "D", "G"], ["A", "B", "C", "D"], ["F", "G"]]  and  [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]] =>  [["B", "C", "E", "F", "A", "D"], ["A", "B", "C", "D"], ["G", "H"], ["G", "H"]]  

which does not seem to meet your spec.

Here is my approach using a few of his ideas:

a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]  b = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]    def reduce(array)    h = Hash.new {|h,k| h[k] = []}    array.each_with_index do |x, i|       x.each do |j|        h[j] << i        if h[j].size > 1          # merge the two sub arrays          array[h[j][0]].replace((array[h[j][0]] | array[h[j][1]]).sort)          array.delete_at(h[j][1])          return reduce(array)          # recurse until nothing needs to be merged        end      end    end    array  end    puts reduce(a).to_s #[["A", "B", "C", "D", "E", "F", "G"]]  puts reduce(b).to_s #[["A", "B", "C", "D", "E", "F"], ["G", "H"]]  


Here is my quick version which can be optimized I am sure :)

# array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]  # array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]  array = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]    array.collect! do |e|    t = e    e.each do |f|      array.each do |a|        if a.index(f)          t = t | a        end      end    end    e = t.sort  end    p array.uniq  


Different algorithm, with a merge-as-you-go approach rather than taking two passes over the array (vaguely influenced by the union-find algorithm). Thanks for a fun problem :)

A = [["A", "G"],["B", "C", "E", "F"], ["A", "B", "C", "D"], ["B"], ["H", "I"]]  H = {}  B = (0...(A.length)).to_a    def merge(i,j)    A[j].each do |e|      if H[e] and H[e] != j        merge(i, H[e])      else        H[e] = i      end    end      A[i] |= A[j]    B[j] = i  end    A.each_with_index do |x, i|     min = A.length    x.each do |j|       if H[j]        merge(H[j], i)      else        H[j] = i      end    end  end    out = B.sort.uniq.map {|i| A[i]}  p out  


Not the simplest ,may be the longest :)

l = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"]]  puts l.flatten.inject([[],[]])  {|r,e| if l.inject(0) {|c,a| if a.include?(e) then c+1 else c end} >= 2 then r[0] << e ; r[0].uniq! else r[1] << e end ; r}.inspect  #[["B", "C"], ["E", "F", "A", "D", "G"]]  


 l = [["B", "C", "E", "F"], ["A", "B","C", "D"], ["G"]]    p l.inject([]){|r,e|       r.select{|i|i&e!=[]}==[]&&(r+=[e])||(r=r.map{|i|(i&e)!=nil&&(i|e).sort||i})   }  

im not sure about your cond.


The simplest way to do it would be to take the powerset of an array (a set containing every possible combination of elements of the array), throw out any of the resulting sets if they don't have a common element, flatten the remaining sets and discard subsets and duplicates.

Or at least it would be if Ruby had proper Set support. Actually doing this in Ruby is horribly inefficient and an awful kludge:

power_set = array.inject([[]]){|c,y|r=[];c.each{|i|r<<i;r<<i+[y]};r}.reject{|x| x.empty?}  collected_powerset = power_set.collect{|subset| subset.flatten.uniq.sort unless    subset.inject(subset.last){|acc,a| acc & a}.empty?}.uniq.compact    collected_powerset.reject{|x| collected_powerset.any?{|c| (c & x) == x && x.length < c.length}}  

Power set operation comes from here.


Straightforward rather than clever. It's destructive of the original array. The basic idea is:

  • go down the list of arrays, noting which array each element appears in
  • for every entry in this index list that shows an element in more than one array, merge all those arrays into the lowest-indexed array
  • when merging two arrays, replace the lower-indexed array with the merged result, and the higher-indexed array with a pointer to the lower-indexed array.

It's "algorithmically cheaper" than intersecting every pair of arrays, though the actual running speed will depend on what ruby hands over to the C layer.

a = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]    h = Hash.new {|h,k| h[k] = []}  a.each_with_index {|x, i| x.each {|j| h[j] << i}}  b = (0...(a.length)).to_a  h.each_value do |x|     x = x.sort_by {|i| b[i]}    if x.length > 1      x[1..-1].each do |i|         b[i] = [b[i], b[x[0]]].min        a[b[i]] |= a[i]      end     end  end    a = b.sort.uniq.map {|i| a[i]}  


def merge_intersecting(input, result=[])    head = input.first    tail = input[1..-1]    return result if tail.empty?    intersection = tail.select { |arr| !(head & arr).empty? }    unless intersection.empty?      merged = head | intersection.flatten      result << merged.sort    end    merge_intersecting(tail, result)  end      require 'minitest/spec'  require 'minitest/autorun'    describe "" do    it "merges input array" do      input  = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["F", "G"]]      output = [["A", "B", "C", "D", "E", "F", "G"]]      merge_intersecting(input).must_equal output    end    it "merges input array" do      input  = [["B", "C", "E", "F"], ["A", "B", "C", "D"], ["G"], ["G", "H"]]      output = [["A", "B", "C", "D", "E", "F"], ["G", "H"]]      merge_intersecting(input).must_equal output    end  end  

Note:If u also have question or solution just comment us below or mail us on toontricks1994@gmail.com
Next Post »