Sometimes less is more

Update: A lot of people were disappointed that 10.minutes.ago etc. is not working in pure Ruby. Well, after executing the line require ‘active_support’ it does – I think this is a fairly small thing to do to enable these powerful features.

Every guide published on favorable writing principles emphasizes the power of brief and concise style. This is especially true in the case of technical texts, and in my opinion, in the case of well-designed programming languages as well.

Note the word well-designed. I did not say in the case of any (programming) language, since that would just not be true: conciseness can come at the cost of readability. (If you ever tried to read kanji, you know what I am talking about ;-) . However, I am claiming that in the case of a really well-designed programming language, succinctness helps readability, reduces bloat and leads to easier and faster understanding of the code. In my experience, the amount of boilerplate code to write is decreasing proportionally with the terseness of the programming language, ultimately leading to a coding style where you (nearly) don’t need to write boilerplate at all.

I will demonstrate this on a few Java vs. Ruby code examples. However, this is NOT a Ruby-bashing-Java article, but a few examples of idioms and interesting constructs; C++ vs Haskell or Lisp could serve equally well (sometimes even better), but since I am currently working with Java and Ruby on a daily basis, it is easier for me to use them.

If you are a pro Ruby and/or Java programmer, and/or you think the article is too long for you, please jump to the “Random Code Snippets” section.

Possibly the most straightforward reason why Ruby code is more readable even in shorter form is that really everything is an object [1] in Ruby-land. For example in Java, primitives need wrapper classes to ‘become’ objects., while in Ruby they are first class objects on their own. This makes constructs like

10.times { print "ho" }  #=> "hohohohohohohohohoho"

or (will output the same string)

print "ho" * 10 #=> "hohohohohohohohohoho"

possible.

There are a handful of other reasons which make Ruby more readable and elegant, but before I get bogged down in the explanation too much, let’s see the examples!

Whetting your appetite

In the first part I will describe some basic constructs which would make the life of any Java developer much easier. These techniques are neat, but they are not using any really sophisticated stuff yet: I will try to take a look at those in the next bigger section.

The empty program

Java:

class Test
{
    public static void main(String args[]) {}
}

Ruby:

   

I did not forget the Ruby snippet; You can not see anything there because actually a Ruby program doing nothing is exactly 0 characters long.
On the other hand, the Java version is slightly longer. I is kind of weird to explain to a newcomer what do ‘class’, ‘public’, ‘static’, ‘void’, ‘String’, the [] operator and several braces here and there mean, and why are they needed if the program does literally nothing

Fun with numbers

Note:For some of the next examples you will need to use Rails Active Support.

Java:

if ( 1 % 2 == 1 ) System.err.println("Odd!") #=> Odd!

Ruby:

if 11.odd? print "Odd!" #=> Odd!

Does not the first example make more sense (even for a non-programmer)?. I believe it does. More of this type:

Java:

102 * 1024 * 1024 + 24 * 1024 + 10 #=> 106979338

Ruby:

102.megabytes + 24.kilobytes + 10.bytes #=> 106979338

OK, maybe this is an unfair comparison since Java does not have (?) those functions. However, the point is that even if it had,
the best I could come up with would look like:

Util.megaBytes(102) + Util.kiloBytes(24) + Util.bytes(10) #=> 106979338

Which is far from the elegance and readability of the Ruby example.

In the next example we will assume that we have a Java function similar to ordinalize in Ruby.


Java:

System.err.println("Currently in the" + Util.ordinalize(2) + "trimester");

Ruby:

 print "Currently in the #{2.ordinalize} trimester"    #=> "Currently in the 2nd trimester"

In this example we can observe variable interpolation: anything wrapped in #{} inside double quotes gets evaluated
and substituted in the string, providing a more readable form without a lot of + + Java constructs (which is cool mainly if you have more variables inside the double quotes).

Dates

In my opinion, handling dates and times is a great PITA in Java, especially if you are implementing some complex code.

Java:

System.out.println("Running time: " + (3600 + 15 * 60 + 10) + "seconds");

Ruby:

puts "Running time: #{1.hour + 15.minutes + 10.seconds} seconds"

Java:

new Date(new Date().getTime() - 20 * 60 * 1000)

Ruby:

20.minutes.ago

Java:

Date d1 = new GregorianCalendar(2006,9,6,11,00).getTime();
Date d2 = new Date(d1.getTime() - (20 * 60 * 1000));

Ruby:

20.minutes.until("2006-10-9 11:00:00".to_time)

I think you do not have to be biased towards Ruby at all to admit which code makes more sense instantly…

I have recently found a very cool way of parsing dates in Ruby: using Chronic. However, I would not like to present it here since it is not a feature of the language, ‘just’ a nifty natural-language date parser [2].

A little bit more advanced stuff

Classes

Java:

Class Circle
  private Coordinate center, float radius;

  public void setCenter(Coordinate center)
  {
    this.center = center;
  }

  public Coordinate getCenter()
  {
    return center;
  }

  public void setRadius(float radius)
  {
    this.radius = radius;
  }

  public Coordinate getRadius()
  {
    return radius;
  }
end;

Ruby:

class Circle
  attr_accessor :center, :radius
end

Believe it or not, the two code snippets are absolutely equal; The getter and setter methods in Ruby code are generated automatically, so not only you do not have to write them, but they are not even there to clutter the code.

I have seen argumentation from Java guys that stuff like this (i.e. the public static void main … thing, getters/setters and other boilerplate code) can be generated with any decent GUI like Eclipse (or by tools like XDoclet etc) is a non-issue. Well, as for their generation, let us say this is true. But for the readability of code it is absolutely not!

For example. take getters/setters: Every variable in Java ads 8 more lines of code (not counting the lines between the function declarations) compared to the Ruby :attr_accessor idiom. That is, a simple class definition having 10 fields in Java will have 80+ lines of code compared to 1 lines of the same code in Ruby. For me, this definitely means a big difference.

Arrays (and other containers)

This section was inspired by a blog entry by Steve Yegge.

Arrays are interesting citizens of Java: They are not really objects in the “classical” sense , so they have very limited functionality compared to first-class Java objects. On the other hand, they are offering a huge advantage over the other container classes: they can be easily initialized.


Java:

String languages[] = new String[] {"Java", "Ruby", "Python", "Perl"};

instead of

List<String> languages = new LinkedList<String>();
languages.add("Java");
languages.add("Ruby");
languages.add("Python");
languages.add("Perl");

which is kind of lame when you quickly need to hack up some testing data.

However, they have also some serious problems: you have to define the number of the elements upon construction time, like so:


Java:

String someOtherLanguages<String>[] = new String[15];

which sometimes really cripples their functionality. [3]

How does this work in Ruby? Let’s see on three different examples (All three code snippets provide the same result):

Ruby:

stuff = [] #An empty array - as you can see there is no need to define the size
stuff << "Java", "Ruby", "Python" #Add some elements
#Initialize the array with the values
stuff = ["Java", "Ruby", "Python"]
#Yet another method yielding the same result
stuff = %w(Java Ruby Python)

In my opinion, these forms (especially the last one) are more straightforward and can save a lot of typing.

Another major shortcoming of Java arrays is that besides the [] operator you have only the methods inherited from Object and a single instance variable : length [4]. This means that even essential functionality like sorting, selecting elements based on something etc. has to be done via a ‘third party’ function, like this:


Java:

Arrays.sort(languages);

which seemed quite normal to me when I have been learning Java and have had no previous experience with dynamic languages, but now it looks kind of annoying.

Another Java-container-woe compared to Ruby is that in Java, an array is an array. A list is a list. A stack is a stack.
If you are wondering what the hell I am talking about, check out these Ruby code snippets:

Ruby:

stuff = %w(Java Ruby Python)
#Add the string "Perl" to the array
stuff << "Perl"
#Prepend the string "Ocaml" 
stuff.unshift "Ocaml"  
=> ["OCaml", "Java", "Ruby", "Python", "Perl"]
#Use the array as a stack
stuff.pop 
=> "Perl"  #stuff is now ["OCaml", "Java", "Ruby", "Python"] 
stuff.push "C", "LISP"
=> ["OCaml", "Java", "Ruby", "Python", "C", "LISP"]
#Update C to C++ 
stuff[4] = "C++"
=> ["OCaml", "Java", "Ruby", "Python", "C++", "LISP"]
#Remove the fisrt element
stuff.shift
=> "OCaml" #stuff is now ["Java", "Ruby", "Python", "C++", "LISP"] 
#Let's just stick with Java and Ruby - slice out the  rest!
stuff.slice!(2..4)
=> ["Python", "C++", "LISP"] #stuff is now ["Java", "Ruby"]

As you can see, the Ruby Array class offers functionality that could be achieved only by mixing up several Java containers into one (to my knowledge, at least) [5]. In practice, this usually speeds things up a lot.

Another thing that really annoys me when using containers in Java is the lack of this functionality:

Ruby:

stuff = ["OCaml", "Java", "Ruby", "Python", "C++", "LISP"]
#Lua is just gaining steam, add it to the 7th place
stuff[7] = "Lua"
=> ["OCaml", "Java", "Ruby", "Python", "C", "LISP", nil, "Lua"]

i.e. that if I am adding an element to an index which is bigger than the size of the array, the empty space inbetween is filled with nils. Now seriously, who would not exchange this for the Java behaviour (an IndexOutOfBoundsException is thrown) – after all, if I would need this functionality (which is VERY seldomly the case) in Ruby, I could check it myself and raise an exception if I don’t like what I see.

I wanted to write a bit about differences between hashes and files in Ruby and Java, but the post is already longer now then I wanted it to be so I guess I will just show some concrete code snippets to conclude.

Random Code Snippets

In this section I would like to present some real cases I have been solving with Ruby recently. Since I am still new to Ruby, I was totally amazed just how much more simpler, shorter yet much more understandable the code can be in Ruby compared to Java.

Files and Regular Expressions

As Bruce Eckel once put it, In Java, it’s a research project to open a file. Well, I have to agree. Maybe I am the only one Java programmer (besides Bruce) who – even after using Java professionally for five years – still can not write to a file without using google first. Maybe I should learn it one day?

Regular expression support in java is OK (at least one does not have to use external packages as in the pre-1.4 era), however, compared to Ruby the syntax is quite heavy.

Let’s see a demonstration on the following task: Open the file ‘test.txt’ and write all the sentences to the console (one sentence per line) which contain the word ‘Ruby’. First, the Java solution:


Java

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Test 
{
	public static void main(String[] args)
	{
	    try {
	        BufferedReader in = new BufferedReader(new FileReader("test.txt"));
	        StringBuffer sb = new StringBuffer();
	        String str;
	        while ((str = in.readLine()) != null) 
	          { sb.append(str + "\n"); }	        
	        in.close();
	        String result = sb.toString();
	        Pattern sentencePattern = Pattern.compile("(.*?\\.)\\s+?");
	        Pattern javaPattern = Pattern.compile("Ruby");
	        Matcher matcher = sentencePattern.matcher(result);
	        while (matcher.find()) {
	            String match = matcher.group();
	            Matcher matcher2 = javaPattern.matcher(match);
	            if (matcher2.find())
	            	System.err.println(match);
	        }
	    } catch (IOException e) 
	      {
	    	e.printStackTrace();
	      }		
	}
}

It is quite straightforward what this relatively simple code snippet doing – but if this is straightforward, what should I say about the Ruby version?

Ruby

File.read('test.txt').scan(/.*?\. /).each { |s| puts s if s =~ /Ruby/ }

Well, umm… I guess this example quite much expresses the point I am talking about from the beginning: sometimes less is more, a.k.a. Succinctness is Power!

Again, I wanted to show much more examples, but I have the feeling that since the article is already too long, no one would read it :-) It is a big pity since I did not even talk about hashes, blocks, closures, metaprogramming (well, I will mention it briefly in the last (really :-) ) example) and other goodies – maybe in part 2?

If this is still not enough…

Although I find it very easy and natural to express a lots of things in Ruby, the language can not offer anything I would ever need. However, there is a powerful concept to invoke in such situations, called metaprogramming.

A few days ago I needed to test some algorithms on trees, so I needed to hack up a lot of tree test data. Here is how I would go about this in Java using the example below (let’s assume in both languages that we have a simple data structure Tree):

               a
            /      \
          b         c
         /  \      / | \
        d    e    f  g  h

Java

Tree a = new Tree("a");

Tree b = new Tree("b");
Tree c = new Tree("c");
a.addChild(b);
a.addChild(c);

Tree d = new Tree("d");
Tree e = new Tree("e");
b.addChild(d);
b.addchild(e);

Tree f = new Tree("f");
Tree g = new Tree("g");
Tree h = new Tree("h");
c.addChild(f);
c.addChild(g);
c.addChild(h);

Another possibility would be to create an XML file with the description of the tree and parse it from there. This solution is even more convenient since though you have to write the parsing code, you just have to edit an XML file once it is written. One possibility how the tree of this example could look something like


XML

  <node name="a">
      <node name="b">
          <node name="d"/>
          <node name="e"/>
      </node>
      <node name="c">
          <node name="f"/>
          <node name="g"/>
          <node name="h"/>
      </node>
  </node>

The latter solution is quite cool. After all you do not need to write any code, just alter the XML file and that’s it.

Now let’s see the Ruby solution I came up with:

Ruby

tree = a {
            b { d e }
            c { f g h }
          }

Well… suddenly even the XML file seems too heavy, does not it? :-) Not to mention the fact that the latter example is pure Ruby code – there is no need to open an external file and parse it – you just run it and the variable tree will contain your tree. That’s it.

Of course Ruby can not handle this code as it is – for this we need to invoke some metaprogramming magic.
[6]

Metaprogramming is a way to drive Ruby with Ruby. Java (especially J2EE) is usually driven by XML (which is not always really a good thing in my opinion) As you could see, Ruby is driven by Ruby instead :-)

This example merely scratched the surface of Ruby’s possibilities through metaprogramming. However, as with the other examples, my goal was not to advocate a concrete pattern/method over a different one, but rather to show how a specific toolset can change the way of thinking about the task at hand, and the way of code design/implementation in general.

Final thoughts

When I was a child, I spoke as a child, I understood as a child, I thought as a child: but when I became a man, I put away childish things. – The Bible, I Corinthians 13:11

This thought pretty well expresses how I felt about Java/C++/(substitute any non-dynamic language here) when I came to know (some of) Ruby’s true dynamism and expressive power through terse yet powerful idioms which transformed my whole thinking about programming. Of course I do not claim that I ‘became a man’ because that’s still a very long way to go, but still, even with my very limited knowledge of Ruby, the way to express things in Java/C++ now seems… well… childish ;-) . [7]


Creating a site on online tutorials is not a completely bad idea. One should include courses for 642-054, 642-054, 642-176, ruby on rails, cissp certification. If you have the knowledge for creating the tutorials, the rest is very simple. By using dot5hosting company’s site you can purchase a web hosting package that is economical. Even dedicated servers can be found for low prices. Then through the use of ip telephony one can directly reach its potential clients. Other internet marketing methods should also be considered to create awareness the site.


Notes

[1] I wonder whether this déjà vu will happen to me in the future once again: I have had this ‘Wow, everything is an object’ feeling when coming from C++ to Java; Then after coming from Java to Python; and most recently, after coming from Python to Ruby. Back


[2] Some examples that Chronic can handle:

  summer
  6 in the morning
  tomorrow
  this tuesday
  last winter
  this morning
  3 years ago
  1 week hence
  7 hours before tomorrow at noon
  3rd wednesday in november
  4th day last week

Kudos…
Back


[3] In the previous array initialization example this was not needed since it is trivial when all the elements are stated beforehand.Back


[4]which somewhat confusing given that all the other containers use the method size() (and not a field!) to determine the count of elements. To nicely mesh with the confusion, the String object provides the method length() (and not a field, as with array) to query the number of characters… Back


[5] I mean it is not possible to construct any container – other than an array – with literals, you can use the [] operator on arrays only, you can not get the i-th element of a stack etc. Back


[6] The code I have been using to accomplish this task relies on the method_missing idiom:

class Object
  @stack = []
  @parent = nil

  def method_missing(method_name, *args, &block)
    tree = Tree.new(method_name)
    @parent.add_child(tree) if @parent != nil
    if block_given?
      @stack ||=[]
      @parent = tree
      @stack.push @parent
      yield block
      @stack.pop
      @parent = @stack.last
    end
    tree
  end
end

Back


[7] This does not necessarily mean that Java is bad and Ruby is good – just that it was the ‘Ruby way’ that struck a chord in me after trying/playing around with programming in many programming languages. Many of the features I adore in Ruby are there in Java as well, but they did not ‘came through’ whereas with Ruby there was a point of enlightenment when I really understood a lot of generic, non-language specific principles. Back

W3C Mozilla DOM Connector was released today

I have announced the upcoming release of the W3C Mozilla DOM Connector in one if my previous posts, and now it has finally arrived. You can view it at


http://svn.rubyrailways.com/W3CConnector/

or check it out with svn:

svn co http://svn.rubyrailways.com/W3CConnector/

For a description about the connector, please refer to my previous post. If you would like to try it out, here is how:

#this code snippet gives you a DOM document of the currently loaded page:
  nsIWebBrowser brow = getWebBrowser();
  nsIWebNavigation nav =
      (nsIWebNavigation)
      brow.queryInterface(nsIWebNavigation.NS_IWEBNAVIGATION_IID);
  nsIDOMDocument doc = (nsIDOMDocument) nav.getDocument();
  Document mozDoc = (Document)
org.mozilla.dom.NodeFactory.getNodeInstance(doc);

From now on, you can use all the existing java/dom libraries such as an XPath2 engine like saxon, xalan, whatever you want working on mozilla documents.

This means tremendous power compared to (in their category outstanding, but still limited) tools like RubyfulSoup or Mechanize, stemming from the power of XPath to query XML documents.
A simple example – dumping DOM of the html document to stdout:

public static void writeDOM(Node n)
      throws IOException
  {
      try {
          StreamResult sr = new StreamResult(System.out);
          TransformerFactory trf = TransformerFactory.newInstance();
          Transformer tr = trf.newTransformer();
          tr.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
          tr.transform(new DOMSource(n), sr);
      }
      catch (TransformerException e) {
          throw new IOException();
      }
   }

Cool, isn’t it?

At the moment, I am discussing different integration issues with the Mozilla guys, since the connector should be the part of Mozilla and the Eclipse editor in the future.

W3C Mozilla DOM Connector will be released soon!

I am happy to announce that the much anticipated W3C Connector, after lots of coding, testing, bug fixing and several months of successful usage in a commercial product was proven worthy to be released to the public. If everything goes well, it will hit the streets next week.

OK, but what the heck is the W3C Connector?

The W3C Connector is a Java package which can be used to access the Mozilla DOM tree from Java, while implementing the standard org.w3c.* interfaces. This means you can use it with any standard Java package that is expecting org.w3c.* interfaces ( Xerces, Saxon, Jaxen, … ) to execute effective queries on the Mozilla DOM (XML/XSLT/XPath/XQuery operations for example).

Technically,the W3C Connector is an implementation of the standard org.w3c.* interfaces. The implementing classes are calling Javier Pedemonte’s JavaXPCOM package, which in turn wraps the Mozilla XPCOM in order to gain access to the Mozilla DOM. See the image for an illustration:

This is very nice and all, but why should I care about it?

If you ever wanted to do (or have done) a screen scraping application, where you needed to understand the underlying document to some extent (regular expressions were not sufficient) you should know that there are many pitfalls along the way:

  • First of all, malformed HTML code: Despite the continuous efforts of the W3C and other organizations/individuals to remedy this problem by promoting X(HT)ML and other machine parsable formats, a lots of web pages still have malformed code in them. Based on the level of non-standardness, parsing such a page can be more than a moderate technical problem: in practice there are pages which can not be parsed to produce an usable input.
  • You can not use a standard query language like XPath or XQuery – these languages require a XML input, which you can not ensure because of the previous point, so you are left to roll your own code to process the parsed data.

Of course this is not a big problem for a crafted programmer, mainly if he is equipped with tools like HTMLTidy to address the first point, RubyfulSoup or similar to tackle the second. However, even these (and other) tools and a cool programming language are still just easing up the pain of effective screen scraping, but not offering a generic solution. If you want to scrap a lot and different pages, these problems in practice will cripple your efforts (or at least make it last very long time in practice).

How does the W3C Connector solve this problem?

By solving both points: The Mozilla DOM is a structure reflecting how gecko (the mozilla rendering engine) renders the page, and it always translates to valid XML (no unclosed tags or otherwise malformed code), and because of implementing the org.w3c.* interfaces you can use very robust and effective XPath packages (like Saxon) to query the document for effective HTML extraction.

There are of course a lot of other possible uses – the connector is not a tool itself, but a gateway to the world of W3C compliant XML tools – it is up to you how to leverage the power it gives you.

The Big Brother

The W3C Connector will be released officially as the part of theATF project. The code is under the last review at the moment, it is possible that I will come out with a preview release before the official one.

Java and Ruby (on Rails)

Being a professional Java programmer myself, i collected some links that might bridge the gap for Java programmers who would like to take a peek at Ruby:

Introductory materials

Java – Ruby integration

  • JRuby – A 1.8.2 compatible Ruby interpreter written in 100% pure Java. Charles Oliver Nutter, one of the JRuby developers in a discussion claimed that “We’re currently working to make JRuby more EJB and J2EE-friendly, so you’re certain to see more of these opportunities.”

The question from the top of every RoR/Java FAQ:

Ajax Goodies:

Not exactly java, but nice Rails+Ajax technology showoff