Partitioning Sets in Ruby

During hacking on various tasks, I needed to partition a set of elements quite a few times. I have attacked the problem with different homegrown implementations, mostly involving select-ing every element belonging into the same basket in turn. Fortunately I run across divide recently, which does exactly this… No more wheel reinvention! Let’s see a concrete example.

I have an input file like this:

a 53 2 3
b 8 62 1 23
a 9 0 31
b 4 45 4 16 7
b 1 23
c 3 42 2 31 4 6
a 1 3 22
a 7 83 1 23 3
b 1 14 4 15 16 2
c 5 16 2 34

The goal is to sum up all the numbers in rows beginning with the same character (e.g. to sum up all the numbers that are in a row beginning with ‘a’). The result should look like:

[{"a"=>241}, {"b"=>246}, {"c"=>145}]

This is an ideal task for divide! Let’s see one possible solution for the problem:

require 'set'

input = Set.new open('input.txt').readlines.map{|e| e.chomp}
groups = input.divide {|x,y| x.map[0][0] == y.map[0][0] }
#build the array of hashes
p groups.map.inject([]) {|a,g|
   #build the hashes for the number sequences with same letters
    a << g.map.inject(Hash.new(0)) {|h,v|
    #for every sequence, sum the numbers it contains
    h[v[0..0]] += v[2..-1].split(' ').inject(0) {|c,x|
      c+=x.to_i; c}; h
  }; a
}

The output is:

[{"a"=>241}, {"b"=>246}, {"c"=>145}]

Great - it works! Now let's take a look into the code...

The 3rd line loads the lines into a set like this:




The real thing happens on line 4. After it's execution, groups looks like:


, , }>

As you can see, the set is correctly partitioned now - with almost no effort! We did not even need to require an external library...

The rest of the code is out of the scope of this article (everybody is always complaining about the long articles here, so I am trying to keep them short) - and anyway, the remaining snippet is just a bunch of calls to inject. If inject does not feel too natural to you, don't worry - it took me months until I got used to it, and some people (despite of the fact that they fully understand and are able to use it) never reach after it - I guess it's a matter of taste...'

5 thoughts on “Partitioning Sets in Ruby

  1. Ben,

    theoretically that could happen – there is absolutely no guarantee that two rows will be different.
    Well… it seems even divide is not perfect (it could do the partitioning into a Hash, where the value set would be the number of occurrences of the sets or something) – or maybe this is possible somehow?
    Going to check it…

  2. So why complicate it with divide?

    Why not:

    #

    h = {}
    open(‘input.txt’).eachline{|l|
    h[l[0..0]] +=
    l[2..-1].split(‘ ‘).inject(0) {|c,x| c+=x.to
    i; c}}
    }
    p h.to_a.map{|k,v| {k => v}

    #

    ?

    It would also remove the same-line weakness.

  3. Well, the original idea was to show how Set.divide works… Admiring my just found shiny new hammer, I guess this problem pretty much looked like a big nail :-)

    Thanks for the solution, Aur.

  4. Come on, you can do better than that :)

    There is a method Set.classify, related to Set.divide, which gets you closer to your goal:

    require ‘set’
    input = Set.new open(‘input.txt’).readlines.map{|line| line.chomp.split}
    groups = input.classify {|e| e.shift}
    result = groups.map {|k,v| {k => v.toa.flatten.inject(0) {|sum, x| sum+x.toi}}}

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>