
Archive for the 'Tutorial' Category

acts_as_state_machine for Dummies, part I

Monday, May 5th, 2008

I recently applied this great plugin a few times to tackle different tasks and would like to share with you the joys of thinking in state machines!

Disclaimer: This installment is ‘just’ an intro to the topic (a teaser, if you like). It doesn’t contain actual instructions or code on how to use acts_as_state_machine - that comes in part II. The original intent was to write up a small tutorial with some example code, but I started with an intro and the article grew so long that I decided to split it up - so stay tuned for part deux!

State what…?!?

A finite state machine (FSM for short), a.k.a. finite state automaton (FSA), is a 5-tuple (Σ,S,s0,δ,F)… OK, just kidding. I doubt too many people are interested in the rigorous definition of the FSM (for the rest, here it is), so let’s see a more down-to-earth description.

According to Wikipedia, a finite state machine is “a model of behavior composed of a finite number of states, transitions between those states, and actions”. Somewhat better than those deltas and sigmas and stuff, but if you are not the abstract thinker type, it might take some time to wrap your brain around it. I believe a good example can help here!

Everybody knows and loves regular expressions - but it’s probably not such a widely known fact that regular expression matching can be solved with an FSM (and in fact, a lot of implementations use some kind of FSM on steroids). So let’s see a simple example. Suppose we would like to match a string against the following simple regexp:

ab+(a|c)b*

First we have to construct the FSM, which will be fed with the string we would like to match. An FSM for the above regular expression might look like this:

fsm_correct.png

String matching against this FSM basically means answering the question ‘starting from the initial state, can we reach a finish state after feeding the FSM the whole string?’. Pretty easy - the only thing we have to define is ‘feeding’.

So let’s take the string ‘abbbcbbb’ as an example and feed the FSM! The process looks like this:

  1. we are in q0, the initial state (where the ’start’ label is). Starting to consume the string
  2. we receive the first character, it’s an ‘a‘. We have an ‘a‘ arrow to state q1, so we make a transition there
  3. we receive the next character, ‘b‘. We have two b-arrows: to q1 and q2. We choose to go to q1 (in fact, staying in q1) - remember, the question is whether we _can_ reach the finish, not whether all roads lead to the finish - so the choice is ours!
  4. identical to the above
  5. after the two above steps, we are still in q1. We still get a ‘b‘ but this time we decide to move to q2.
  6. we are in q2 and the input is ‘c‘. We have no analysis-paralysis here since the only thing we can do is to move to q4 - so let’s do that!
  7. Whoa! We reached the finish line! (q4 is one of the terminal states). However, we didn’t consume the whole string yet, so we can’t yet tell whether the regexp matches or not.
  8. So we eat the rest of the string (the ‘how’ is left as an exercise to the reader) and return ‘match!’

Let’s see a very simple non-matching example on the string ‘abac’

  1. in q0
  2. got an ‘a‘, move to q1
  3. in q1, got a ‘b‘, move to q2
  4. in q2, got an ‘a‘, move to q3 - we reached the finish, but still have a character to consume
  5. in q3, got a ‘c‘… oops. We have no ‘c’ arrow from q3 so we are stuck. return ‘no match!’
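
The two walkthroughs above can be sketched in a few lines of Ruby. This is only an illustration: the transition table is reconstructed from the steps described in the text (the figure itself may differ in details), and the names TRANSITIONS, ACCEPTING and fsm_match? are made up for this sketch. Instead of ‘choosing’ an arrow, it simply keeps track of every state we could be in.

```ruby
require 'set'

# Transition table reconstructed from the walkthrough above (an assumption,
# since only the prose description of the figure is available here).
TRANSITIONS = {
  [:q0, 'a'] => [:q1],
  [:q1, 'b'] => [:q1, :q2],
  [:q2, 'a'] => [:q3],
  [:q2, 'c'] => [:q4],
  [:q3, 'b'] => [:q3],
  [:q4, 'b'] => [:q4]
}
ACCEPTING = Set[:q3, :q4]

# Feeds the string to the FSM one character at a time, tracking the set of
# all states reachable so far. Matches if a finish state is reachable once
# the whole string is consumed.
def fsm_match?(string)
  states = Set[:q0]
  string.each_char do |char|
    states = states.flat_map { |s| TRANSITIONS.fetch([s, char], []) }.to_set
    return false if states.empty? # stuck: no arrow for this character
  end
  states.any? { |s| ACCEPTING.include?(s) }
end

puts fsm_match?('abbbcbbb') # true
puts fsm_match?('abac')     # false
```

Tracking all reachable states at once is what makes the ‘the choice is ours’ step mechanical: we never have to guess which b-arrow to take.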

Of course, real-life scenarios are much more complicated than the one above, and sometimes FSMs are not enough (for example, to my knowledge it’s not possible to tell whether a number is prime with a vanilla FSM - though a ‘regexp’ doing just that, with the help of backreferences, has been floating around for some time), but this example serves fine to illustrate the concept.

This is cool and all but why should I care?!?

Well, yeah, you are obviously not going to model an FSM the next time you would like to match a regexp - that would be wheel reinvention at its finest. However, there are some practical scenarios where an FSM can come in handy:

  • sometimes the logic flow is just too complicated to model - an if-forest is rarely a good solution (on the flip side, don’t model an if-else with an FSM :-) )
  • you want to encapsulate complex logic flow into a pattern rather than clutter your code with it
  • you are in a stateless world - for example HTTP
  • asynchronous and/or distributed processing where you explicitly need to maintain your state and act upon it

Some real-life examples of FSM usage in the Ruby/Rails world are why the lucky stiff’s Hpricot (using Ragel) and Rick Olson’s restful_authentication plugin (using acts_as_state_machine).

The Next Episode

In the next installment I’d like to focus on the practical usage of the acts_as_state_machine plugin - I’ll attempt to create an asynchronous messaging system in a Rails app using it.


Great Ruby on Rails REST resources

Wednesday, May 30th, 2007

If I had to choose the single most not-really-well-understood, mystified, unsuccessfully demystified, explained and still not-really-grasped topic in the Rails world (and beyond), my vote would definitely go to REST. It seems to me that there are two types of people in the world: those who don’t get REST (and think it’s a basic postulate of rocket science explained through quantum theory) and those who get it and don’t understand the former group (unless they came from there, that is).
I have been playing around with RESTful Rails recently. Below is my collection of Rails REST howtos, tutorials and other resources I have found so far, which were adequate for my transition from the first group to the second :-) .

You should definitely begin with REST 101 - then check out the other stuff as well!

Please leave a comment if you know some more (just for completeness’ sake - I think the above resources should be enough to grasp RESTful Rails both theoretically and practically).



Partitioning Sets in Ruby

Thursday, April 26th, 2007

While hacking on various tasks, I have needed to partition a set of elements quite a few times. I attacked the problem with different homegrown implementations, mostly involving select-ing every element belonging to the same basket in turn. Fortunately, I recently ran across Set#divide, which does exactly this… No more wheel reinvention! Let’s see a concrete example.

I have an input file like this:

a 53 2 3
b 8 62 1 23
a 9 0 31
b 4 45 4 16 7
b 1 23
c 3 42 2 31 4 6
a 1 3 22
a 7 83 1 23 3
b 1 14 4 15 16 2
c 5 16 2 34

The goal is to sum up all the numbers in rows beginning with the same character (e.g. to sum up all the numbers that are in a row beginning with ‘a’). The result should look like:

[{"a"=>241}, {"b"=>246}, {"c"=>145}]

This is an ideal task for divide! Let’s see one possible solution for the problem:

    require 'set'

    input = Set.new File.readlines('input.txt').map { |e| e.chomp }
    groups = input.divide { |x, y| x[0] == y[0] }
    # build the array of hashes
    p groups.inject([]) { |a, g|
      # build the hash for the number sequences with the same letter
      a << g.inject(Hash.new(0)) { |h, v|
        # for every sequence, sum the numbers it contains
        h[v[0..0]] += v[2..-1].split(' ').inject(0) { |c, x|
          c + x.to_i }; h
      }; a
    }

The output is:

    [{"a"=>241}, {"b"=>246}, {"c"=>145}]

Great - it works! Now let’s take a look into the code…

The 3rd line loads the lines into a set like this:

    #<Set: {"b 1 23 ", "c 5 16 2 34", "a 9 0 31", "a 7 83 1 23 3", "b 1 14 4 15 16 2", "a 53 2 3", "c 3 42 2 31 4 6", "b 4 45 4 16 7", "b 8 62 1 23", "a 1 3 22 "}>

The real thing happens on line 4. After its execution, groups looks like:

    #<Set: {#<Set: {"a 9 0 31", "a 7 83 1 23 3", "a 53 2 3", "a 1 3 22 "}>, #<Set: {"b 1 23 ", "b 1 14 4 15 16 2", "b 8 62 1 23", "b 4 45 4 16 7"}>, #<Set: {"c 5 16 2 34", "c 3 42 2 31 4 6"}>}>

As you can see, the set is correctly partitioned now - with almost no effort! We did not even need to require an external library…
The rest of the code is out of the scope of this article (everybody is always complaining about the long articles here, so I am trying to keep them short) - and anyway, the remaining snippet is just a bunch of calls to inject. If inject does not feel too natural to you, don’t worry - it took me months to get used to it, and some people (despite fully understanding it and being able to use it) never reach for it - I guess it’s a matter of taste…
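
For readers who would rather skip the inject chain, here is a minimal sketch of the same computation. It assumes Ruby 2.4 or newer (for sum), inlines the sample rows instead of reading input.txt, and uses the one-argument form of the divide block; the variable names are mine, not part of the original solution.

```ruby
require 'set'

# The sample rows from the input file above, inlined for the sketch.
lines = Set.new([
  'a 53 2 3', 'b 8 62 1 23', 'a 9 0 31', 'b 4 45 4 16 7', 'b 1 23',
  'c 3 42 2 31 4 6', 'a 1 3 22', 'a 7 83 1 23 3', 'b 1 14 4 15 16 2',
  'c 5 16 2 34'
])

# With a one-argument block, divide puts elements with the same block
# result (here: the leading letter) into the same subset.
groups = lines.divide { |line| line[0] }

# Sum the numbers of every row in a group, keyed by the group's letter.
totals = groups.each_with_object({}) do |group, sums|
  letter = group.first[0]
  sums[letter] = group.sum { |line| line.split.drop(1).sum(&:to_i) }
end

p totals
```

This builds a single hash rather than an array of one-entry hashes, which is usually easier to work with afterwards.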


Getting Beast up and Running on Dreamhost (for the Truly Lazy)

Thursday, March 22nd, 2007

Though Dreamhost offers phpBB as one of their one-click install goodies (ergo it is the easiest of all forums to install, since you almost don’t have to do anything), I have been looking for something different. To me, phpBB’s interface was always quite unintuitive and too heavy - I wanted something smaller, easier, more compact. The problem was that I did not know what I should search for - until I came across Beast, a lightweight forum written in Ruby on Rails. It was love at first sight!

When it comes to tools I am using, I am really language agnostic - this very blog uses WordPress (PHP), I am using Trac (Python) to track my projects, mediaWiki (PHP) is my preferred wiki etc - so even if it may seem so, I did not choose beast because it is written in Rails (although +1 for that :-) ), but because of the design and ease of use. My first thought after trying it was ‘wow, this is as easy to use as a 37signals app’ - it’s really that intuitive and well designed!

Well, this sounds fine and all, but installation on Dreamhost was a different story. Thank God I found a superb, step-by-step HOWTO here. However, even after following all the steps, I got ‘incomplete headers’ and other problems, which I managed to fix - here are some additional comments on the HOWTO (numbered after its steps):

6. You can forget about this point; as the HOWTO says, it is already installed on DH and it will work without any problems.
7. Forget about ‘development’ and ‘test’, however be sure to get ‘production’ right, as the next step will not work otherwise. It should look something like this:

production:
  adapter: mysql
  database: beast_prod
  host: mysql.myhost.com
  username: us3r
  password: p4ss
  port: 3306
8. For me it worked only *with* the RAILS_ENV=production parameter specified.
9. You can change the salt to anything - it just must not stay the same. The easiest thing is to add or remove a random character from the string.
12. The shebang should be updated to #!/usr/bin/ruby
13. The || should be removed, i.e. it should read:
ENV['RAILS_ENV'] = 'production'
14. Make sure you change the permission of those directories only - I have changed everything recursively, destroying the executable flag of dispatch.fcgi :-) .

Now you should apply the ‘GetText patch’ - it can be found later in the thread. After that, you should be up and running!

After playing around, I have found that the user listing is not working - fortunately I have found this as well in the forum. The solution is:
app/views/users/index.rhtml line 3 should be modified to

<% form_tag '', :method => 'get' do -%>
Enjoy this great forum!


Web 2.0 Tutorial

Thursday, January 18th, 2007

First of all, I have to make a disappointing confession: this is not a Web 2.0 tutorial - but fear not, at least the answer to the logical and absolutely valid question (i.e. why the hell is the article entitled ‘Web 2.0 tutorial’ then?) will be provided.

Although this blog’s tagline is ‘Ruby, Rails, Web2.0′ and I am blogging/planning to blog about all these topics in the future, I did not have an exclusively-and-only-about-Web2.0 post yet (as far as I remember). That’s why it strikes me odd that according to google analytics, a lot of people are finding this site via the keyword combination ‘Web2.0 tutorial’. This post was inspired by them and for them!

Since this trend is nearly as old as this blog - and it seems to continue, and even rise, as time goes by - I am now really curious what the heck people are imagining behind the term ‘Web2.0 tutorial’. Why? Well, there are several reasons to ponder:

  • Nobody knows what Web 2.0 actually is (or if somebody does, the others don’t agree :-) ). Since it was coined by Tim O’Reilly back in 2005, ‘Web 2.0’ has been redefined, argued about, glorified, despised, parodied, upgraded to Web 3.0, regarded as vapor, bubble etc. (and who knows what else…) countless times - just one thing did not happen: a commonly accepted, concise (or even lengthy) definition with which everybody would agree. You won’t find anybody interested in the Web today who does not have his own definition of Web2.0 - however, these definitions (although more overlapping and similar than ever) vary from person to person.

  • The conjunction itself is kind of absurd - even if we accept that there is a common understanding of the term ‘Web2.0′, it definitely has more facets: Look (Apple aqua reinvented, round corners galore, reflections of reflections etc), social aspect (digg, del.icio.us, youTube, myspace et al), theoretical backend (ontologies, folksonomies, openAPIs, microformats, mashups etc), standards (XHTML (2.0! :-) ), RDF, FOAF, ATOM, SVG, SOAP), innovative ways of communication and catering to the users (WS, REST, Podcasts, Videocasts), typical Web2.0-purpose pages (wikis, blogs), development tools and frameworks (AJAX, Ruby on Rails, …) and other buzzwords :-)

  • Even if we define Web2.0 as a collection of the things from the previous point, the term ‘Web 2.0 tutorial’ is too broad to get you many relevant results (I believe - maybe some smart webmasters versed in the ways of SEO trickery have already noticed the craving for a Web2.0 tutorial and wrote up a few for you :-) ). Just as someone would not search for a ‘programming language tutorial’ (but a ‘Ruby tutorial’ instead) or a ‘sport tutorial’ (rather a ‘squash tutorial’), searching for a real ‘Web2.0 tutorial’ could be ineffective, too. I suggest looking for a ‘rounded corners tutorial’, ‘mashup tutorial’ or ‘Ruby on Rails tutorial’ etc. instead. Additionally, if you are really keen on the Web2.0-ness of these documents, don’t forget to add ‘Web2.0’ to the query - just in case :-) .

  • Related to the previous point: attack the problem from bottom up rather than the other way around - i.e. try to look for solutions of concrete problems and assemble them into a Web2.0 style whatever once you are done, rather than trying to do something which is Web2.0 in the first place. In my opinion you should think like ‘I would like to create a great mashup in Ruby on Rails with AJAX and a Web2.0 look - how should I go about this?’ rather than ‘Let’s see a good Web 2.0 tutorial and then I will cook something great’. You should strive for creating great looking websites with great content and functionality, and people will like it and use it - whether you call it Web2.0, Web3.0 or whatever - even if the URL of the site will be www.thissiteisnotweb2.0.com :-) .

Now that I have mentioned ‘Web2.0′ and ‘Web 2.0 tutorial’ more times in this article, I guess I’ll be receiving even more hits through this query - though this was definitely not the reason for writing this article. However, if you already got this far, please take a few seconds and share with us your thoughts on this. After all Web2.0 is also about collaboration, you know. Heck, I might even write a few Web2.0 tutorials in the future - just tell me what a ‘Web2.0 tutorial’ means… :-) .


Implementing ‘15 Exercises for Learning a new Programming Language’

Thursday, November 16th, 2006

A short time ago in a galaxy not so far, far away I came across a nice blog post: 15 Exercises for Learning a new Programming Language.

One could argue if these are *really* the most appropriate 15(+) exercises to learn a new programming language - however, the task of answering this rather complex question is left as an exercise for the reader. Instead of this I will show you their implementation in Ruby - rubyrailways.com style.

Why did I bother to solve these problems (including not really trivial ones, like a scientific calculator with a GUI)? Well, actually to learn a new programming language! I still consider myself a beginner Ruby apprentice just playing it by ear in my somewhat scarce free time, so I thought that systematically implementing a task list like this would be a great step forward for me compared to just coding random things at random times. Fortunately I was perfectly right!

Before we move on to the code, one last disclaimer: the fact that I am still a Ruby n00b implies that the code can be somewhat hairy/not optimal/[insert any other language than Ruby here]-ish, so don’t use these snippets as textbook solutions of the problems or anything like that. I would be glad if someone could suggest a bit of refactoring of the bad parts, but I also hope that there are some nice parts which you can learn from (actually I am quite sure about this, since I used some magick formulas from a few Ruby (grand)masters in some cases).

OK, enough talk for now. Let’s see the stuff!

1. Problem: “Display series of numbers (1,2,3,4, 5….etc) in an infinite loop. The program should quit if someone hits a specific key (Say ESCAPE key).”

Solution: Hmm, well, errr…uh-oh… I could not solve this problem fully (what a terrific start :-) ). If Henry Ford were sitting beside me now, he would say: you can hit any key to exit - so long as it’s ‘C’ - and one more piece of advice: don’t forget to hold CTRL while doing so :-) . More on this after the code snippet:

    i = 0
    loop { print "#{i+=1}, " }

Comments: If anyone knows how to add code which will cause this program to stop on a specific keypress (say ‘ESC’), please, please, please drop me a note. Researching this took at least 10% of the time I spent solving all the tasks, and I was nearly spitting blood when I gave up :-) . It seems (to me) that there is no simple (i.e. no threads and similar) and clean platform-independent solution for this problem. I guess (hope) the author’s idea here was something other than introducing threading or writing platform-specific code…
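
One possible direction, offered as a sketch rather than a definitive answer: poll the input with a zero timeout on every iteration, so no threads are needed. It assumes a reasonably recent Ruby with the io/console standard library and a POSIX-like terminal (I have not verified it on Windows consoles); the method name count_until_escape and its parameters are made up for this illustration.

```ruby
require 'io/console'

# Prints 1, 2, 3, ... until ESC arrives on `input` (or `input` is closed).
# Returns the last number printed. Name and parameters are illustrative.
def count_until_escape(input = $stdin, output = $stdout)
  i = 0
  loop do
    output.print "#{i += 1}, "
    # Poll with a zero timeout so the loop never blocks waiting for a key.
    next unless IO.select([input], nil, nil, 0)
    key = input.read_nonblock(1, exception: false)
    break if key.nil? || key == "\e"
  end
  i
end

# On a real terminal, raw mode hands over each key without waiting for Enter:
#   $stdin.raw { count_until_escape }
```

Note that raw mode also turns off the usual CTRL+C handling, so with the commented usage line ESC really is the way out.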

2. Problem: “Fibonacci series, swapping two variables, finding maximum/minimum among a list of numbers.”

Solution:

    # Fibonacci series
    Fib = Hash.new { |h, n| n < 2 ? h[n] = n : h[n] = h[n - 1] + h[n - 2] }
    puts Fib[50]

    # Swapping two variables
    x, y = 1, 2
    x, y = y, x

    # Finding maximum/minimum among a list of numbers
    puts [1,2,3,4,5,6].max
    puts [7,8,9,10,11].min

Comments: The Fibonacci code was written by Andrew Johnson (found via Ruby Quiz). I like it so much that I think it would be a shame to present a trivial version here. I guess the rest of the code is self-explanatory.

3. Problem: “Accepting series of numbers, strings from keyboard and sorting them ascending, descending order.”

Solution:

    a = []
    loop { break if (c = gets.chomp) == 'q'; a << c }
    p a.sort
    p a.sort { |x, y| y <=> x }

Comments: This version is accepting strings - I think anybody who got to this point can adapt it to work with numbers.
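
For the numeric variant, a small sketch of one way to adapt it (the method name sorted_numbers is mine). The conversion matters: sorted as strings, "10" comes before "9".

```ruby
# Converts the entered strings to numbers and returns them in ascending and
# descending order. Float() raises ArgumentError on non-numeric input.
def sorted_numbers(entries)
  numbers = entries.map { |entry| Float(entry) }
  ascending = numbers.sort
  [ascending, ascending.reverse]
end

asc, desc = sorted_numbers(%w[10 9 2.5 100])
p asc   # [2.5, 9.0, 10.0, 100.0]
p desc  # [100.0, 10.0, 9.0, 2.5]
```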

4. Problem: “Reynolds number is calculated using formula (D*v*rho)/mu Where D = Diameter, V= velocity, rho = density mu = viscosity Write a program that will accept all values in appropriate units (Don’t worry about unit conversion) If number is < 2100, display Laminar flow, If it’s between 2100 and 4000 display 'Transient flow' and if more than '4000', display 'Turbulent Flow' (If, else, then...)"

Solution:

    vars = %w{D V Rho Mu}

    vars.each do |var|
      print "#{var} = "
      val = gets
      # eval runs whatever is typed in, so only feed it trusted input
      eval("#{var}=#{val.chomp}")
    end

    reynolds = (D*V*Rho)/Mu.to_f

    if (reynolds < 2100)
      puts "Laminar Flow"
    elsif (reynolds > 4000)
      puts "Turbulent Flow"
    else
      puts "Transient Flow"
    end

Comments: Can you spot the trick in the part which fills in the variables? They don’t go out of scope after the loop ends because they are constants. Another possibility would be to use $global variables, but I guess that is usually not a very good programming practice.

5. Problem: “Modify the above program such that it will ask for ‘Do you want to calculate again (y/n), if you say ‘y’, it’ll again ask the parameters. If ‘n’, it’ll exit. (Do while loop) While running the program give value mu = 0. See what happens. Does it give ‘DIVIDE BY ZERO’ error? Does it give ‘Segmentation fault..core dump?’. How to handle this situation. Is there something built in the language itself? (Exception Handling)”

Solution:

    vars = { "d" => nil, "v" => nil, "rho" => nil, "mu" => nil }

    begin
      vars.keys.each do |var|
        print "#{var} = "
        val = gets
        vars[var] = val.chomp.to_i
      end

      reynolds = (vars["d"]*vars["v"]*vars["rho"]) / vars["mu"].to_f
      puts reynolds

      if (reynolds < 2100)
        puts "Laminar Flow"
      elsif (reynolds > 4000)
        puts "Turbulent Flow"
      else
        puts "Transient Flow"
      end

      print "Do you want to calculate again (y/n)? "
    end while gets.chomp != "n"

Comments: As you can see, I could not use the same trick here when asking for the variables, because when somebody wants to calculate again, Ruby will complain (although only by printing a warning) that the constants have already been initialized. Therefore I went for the hash solution. I think the do-you-want-to-calculate-again part is straightforward, so I won’t analyze it here.
“While running the program give value mu = 0.”
Ruby gives a rather interesting result in this case: Infinity :-) . That is because the divisor is converted to a float, and floating point division by zero does not raise an error.
“Is there something built in the language itself?”
Sure: exception handling. Integer division by zero raises ZeroDivisionError, which can be caught with a rescue clause.
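
To make the difference concrete, here is a minimal sketch (the method name reynolds_number is illustrative, not from the solution above). With integer arguments the division raises and is rescued; with a float divisor the result is simply Infinity.

```ruby
# Returns the Reynolds number, or nil when integer division by zero occurs.
def reynolds_number(d, v, rho, mu)
  (d * v * rho) / mu
rescue ZeroDivisionError
  nil
end

p reynolds_number(2, 3, 4, 2)    # integer division
p reynolds_number(2, 3, 4, 0)    # ZeroDivisionError is rescued, nil is returned
p reynolds_number(2, 3, 4, 0.0)  # no exception: the result is Infinity
```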

6. Problem: “Scientific calculator supporting addition, subtraction, multiplication, division, square-root, square, cube, sin, cos, tan, Factorial, inverse, modulus”

Solution:
Since this code snippet is longer, it would look ugly here - you can download it from here instead.

Screenshot:

screenshot of the scientific calculator in action

If you would like to try it, you will need the Tk bindings for Ruby (maybe you have them already, here on Ubuntu I did not). Also note that only the regular 0-9 keys (and of course the mouse) work, the numpad ones do not. One more little detail: % stands for modulo, not percent.

Comments: Phew, this was a real challenge, mostly because I had never done any GUI in Ruby before. I was amazed that I could code up a relatively feature-rich calculator in 100+ lines of code, without any golfing or trying to optimize for shortness. What I want to say is that the shortness is not a credit to my programming skills (since I did not even try to golf) but to the superb terseness of Ruby. OK, of course there are some problems (e.g. cube, cos, tan and inverse are not implemented), but the usability/amount of code ratio is unbelievably high.

The GUI is also not the nicest since I have used Tk - wxRuby or qt-ruby would produce much nicer results, but since I did not code any GUI in Ruby previously, I have decided to try the good-old-skool Tk for the first time.

7. Problem: “Printing output in different formats (say rounding up to 5 decimal places, truncating after 4 decimal places, padding zeros to the right and left, right and left justification)(Input output operations)”

Solution:

    # rounding to 5 decimal places
    puts sprintf("%.5f", 124.567896)

    # truncating after 4 decimal places
    def truncate(number, places)
      (number * (10 ** places)).floor / (10 ** places).to_f
    end

    puts truncate(124.56789, 4)

    # padding zeroes to the left
    puts 'hello'.rjust(10, '0')

    # padding zeroes to the right
    puts 'hello'.ljust(10, '0')

    # right justification
    puts ">>#{'hello'.rjust(20)}<<"

    # left justification
    puts ">>#{'hello'.ljust(20)}<<"

Comments: An amazing lot of things can be done with sprintf() - I could have solved nearly all of these problems with it - but that would not really be rubyish, so I opted for built-in (and one homegrown) functions. However, mastering (s)printf() is very handy, since nearly all the big players (C (of course :-) ), C++, Java, PHP, …) have it, so you get a powerful function in several languages for the price of learning one. As you can see, r/ljust is a nice one, too.
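
For comparison, a short sketch of what the sprintf-only route might look like for some of the tasks (format is Ruby’s alias for sprintf; the variable names are just for illustration):

```ruby
rounded     = format('%.5f', 124.567896) # "124.56790"
zero_padded = format('%010.3f', 3.14159) # "000003.142"
right_just  = format('%10s', 'hello')    # "     hello"
left_just   = format('%-10s|', 'hello')  # "hello     |"

puts rounded, zero_padded, right_just, left_just
```

The width comes right after the percent sign, a leading 0 switches the padding character to zero for numbers, and a minus sign left-justifies.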

8. Problem: “Open a text file and convert it into HTML file. (File operations/Strings)”

Solution: Well, this problem was not specified in a great detail, to say the least - or to put it otherwise, the solvers are given a great freedom to provide a solution spiced up with their fantasy. This is what I came up with:

    doc = <<~DOC
      This is the first line in the first paragraph. Nothing really interesting here, just plain text.

      This is the second paragraph. Let's see some *strong* markup in action, and also /italic/. So far so good.

      This is the last paragraph, with one more <strong>strong tag</strong>.
    DOC

    final_doc = <<~FINAL_DOC
      <html>
        <head>
          <title>Text to HTML fun!</title>
        </head>
        <body>
          <p>
          embed_doc_here
          </p>
        </body>
      </html>
    FINAL_DOC

    # The italic rule runs first: once the text contains closing tags such as
    # </strong>, their slashes would confuse the /something/ pattern.
    rules = {'/something/' => '<i>something</i>',
             '*something*' => '<strong>something</strong>'}

    rules.each do |k, v|
      re = Regexp.escape(k).sub(/something/) { "(.+?)" }
      doc.gsub!(Regexp.new(re)) do
        content = $1
        v.sub(/something/) { content }
      end
    end

    doc.gsub!("\n\n") { "</p>\n<p>" }

    final_doc.sub!(/embed_doc_here/) { doc }

    puts final_doc

Comments: As you can see, besides that the text is wrapped around with a minimal HTML, every occurrence of words between asterisks is outputted in strong and between slashes in italic. You can add as many such rules as you like, they will be (hopefully) substituted in the final output.

9. Problem: “Time and Date : Get system time and convert it in different formats ‘DD-MON-YYYY’, ‘mm-dd-yyyy’, ‘dd/mm/yy’ etc.”

Solution: Well, it was not really clear (for me) what should be the difference between ‘yyyy’ and ‘YYYY’ (resp. ‘dd’ vs ‘DD’) so again I had to use my imagination. However, I guess it does not matter too much, the solution has to be changed by 1-2 characters only if the original author had something different on his mind.

    require 'date'

    time = Time.now
    # 'DD-MON-YYYY', e.g. 12-Nov-2006 in my interpretation
    puts time.strftime("%d-%b-%Y")

    # 'mm-dd-yyyy', e.g. 11-12-2006 in my interpretation
    puts time.strftime("%m-%d-%Y")

    # 'dd/mm/yy', e.g. 12/11/06 in my interpretation
    puts time.strftime("%d/%m/%y")

10. Problem: “Create files with date and time stamp appended to the name”

Solution:

    # Create files with date and time stamp appended to the name
    require 'date'

    def file_with_timestamp(name)
      t = Time.now
      File.open("#{name}-#{t.strftime('%m.%d')}-#{t.strftime('%H.%M')}", 'w')
    end

    my_file = file_with_timestamp('test.txt')
    my_file.write('This is a test!')
    my_file.close

Comments: Maybe a more elegant solution would be to subclass File and override its constructor - but that may well be overkill, so I went with a plain helper method in this case :-) .

11. Problem: “Input is HTML table. Remove all tags and put data in a comma/tab separated file.”

Solution: Since web extraction is both my PhD topic and my everyday job (and even my free-time activity :-) ), I will present 3 solutions for this problem. First, the classic old-school regexp way (by Paul Lutus), then with Hpricot and finally with scRUBYt!, a simple yet powerful Ruby web extraction framework currently being developed by me.

    table = <<~DOC
      <table>
        <tr>
          <td>1</td>
          <td>2</td>
        </tr>
        <tr>
          <td>3</td>
          <td>4</td>
          <td>5</td>
        </tr>
        <tr>
          <td>6</td>
        </tr>
      </table>
    DOC

    rows = table.scan(%r{<tr>.*?</tr>}m)

    rows.each do |row|
      fields = row.scan(%r{<td>(.*?)</td>}m)
      puts fields.join(",")
    end

Now for the Hpricot solution (in the further examples, let’s assume that table is initialized as in the previous example):

    require 'rubygems'
    require 'hpricot'

    h_table = Hpricot(table)

    rows = h_table/"//tr"
    rows.each do |row|
      child_text = (row/"//td").collect { |elem| elem.innerHTML }
      puts child_text.join(',')
    end

and last, but not least scRUBYt!

    require 'scrubyt'

    table_data = P.table do
                   P.cell1
                 end

    table_data.generalize :cell

    puts table_data.to_csv

Some explanation: first of all, at the moment scRUBYt! is available on my hard disk (and partially in my head) only - it should be released around XMAS 2006. I am using this solution for a little bit of self-promotion :-) .

The example works like this: extract something (in this case an HTML <table>) which has something (in this case a <td>) which has ‘1’ as its text (well, in reality much more is going on in the background, but roughly along these lines). This little code snippet will extract the first <td>s of ALL <table>s on an HTML page. With the ‘generalize’ call we tell the extractor that it should not extract just the first <td> in a table (which is the default setting), but all of them.

scRUBYt! can handle much, much, MUCH more complicated examples than this (like an ebay or amazon page) and has loads of sophisticated functions… so stay tuned!

12. Problem: “Extract uppercase words from a file, extract unique words.”

Solution: (you can find some_uppercase_words.txt here and some_repeating_words.txt here)

    File.read('some_uppercase_words.txt').split.each { |word| puts word if word =~ /^[A-Z]+$/ }
  2.  
    words = File.read('some_repeating_words.txt').split
    histogram = words.inject(Hash.new(0)) { |hash, x| hash[x] += 1; hash }
    histogram.each { |k, v| puts k if v == 1 }

13. Problem: “Implement word wrapping feature (Observe how word wrap works in windows ‘notepad’).”

Solution: Unfortunately I am not a Windows user and I have seen notepad a *quite* long time ago - so I am not sure the task and it’s implementation are fully in-line - I have tried my best. Here we go:

  input = "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."

  def wrap(s, len)
    result = ''
    line_length = 0
    s.split.each do |word|
      if line_length.zero?
        # first word on a line: no leading space needed
        result += word
        line_length = word.length
      elsif line_length + 1 + word.length <= len
        # the word (plus a separating space) still fits on this line
        result += ' ' + word
        line_length += 1 + word.length
      else
        # the word does not fit: start a new line with it instead of dropping it
        result += "\n" + word
        line_length = word.length
      end
    end
    result
  end

  puts wrap(input, 30)
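
A more compact variant, offered as a sketch rather than a replacement: the same greedy wrapping can be done with gsub and a bounded quantifier. The method name wrap_with_regex is my own invention, and the sketch assumes that no single word is longer than the line width and that the input contains no newlines.

```ruby
# Greedy word wrap via a regular expression (hypothetical helper name).
# Assumes every word fits within +width+ and the text has no newlines.
def wrap_with_regex(text, width)
  # Take up to +width+ characters that are followed by whitespace or the
  # end of the string, and replace that trailing whitespace with a newline.
  text.gsub(/(.{1,#{width}})(\s+|\z)/, "\\1\n")
end

puts wrap_with_regex("Lorem ipsum dolor sit amet, consectetur adipisicing elit", 30)
```

The regex engine backtracks from the longest possible chunk until it finds a word boundary, which is the same "fit as many words as possible" rule the loop-based version spells out by hand.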

14. Problem: “Adding/removing items in the beginning, middle and end of the array.”

Solution:

  x = [1, 3]

  # adding to the beginning
  x.unshift(0)

  # adding to the end
  x << 4

  # adding to the middle (insert at index 2)
  x.insert(2, 2)

  # removing from the beginning
  x.shift

  # removing from the end
  x.pop

  # removing from the middle (delete removes by value, not by index)
  x.delete(2)

  # we have arrived at the original array!
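
One detail worth spelling out: Array#delete removes elements by value, while Array#delete_at removes by position. In the snippet above the two happen to give the same result; the following small sketch (the helper names remove_value and remove_index are made up for illustration) shows where they differ.

```ruby
# Array#delete removes by VALUE (every matching element);
# Array#delete_at removes by INDEX (a single position).
# Both helpers work on a copy so the caller's array is left untouched.
def remove_value(array, value)
  copy = array.dup
  copy.delete(value)
  copy
end

def remove_index(array, index)
  copy = array.dup
  copy.delete_at(index)
  copy
end

p remove_value([10, 20, 30, 20], 20)  # both 20s are removed
p remove_index([10, 20, 30, 20], 1)   # only the element at index 1 is removed
```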

15. Problem: “Are these features supported by your language: Operator overloading, virtual functions, references, pointers etc.”

Solution: Well, this is not a real problem (not in Ruby, at least). Ruby is a very high level language and these things are a must :) .

Finally, you can download all the solutions in a single archive from here. I would like to see the implementation of these tasks in both Ruby (different (more optimal) solutions of course) as well as in anything else. If you set out to do something like that, be sure to drop me a note.


Sometimes less is more

Thursday, October 19th, 2006

Update: A lot of people were disappointed that 10.minutes.ago etc. is not working in pure Ruby. Well, after executing the line require 'active_support' it does - I think this is a fairly small thing to do to enable these powerful features.

Every guide published on favorable writing principles emphasizes the power of brief and concise style. This is especially true in the case of technical texts, and in my opinion, in the case of well-designed programming languages as well.

Note the word well-designed. I did not say in the case of any (programming) language, since that would just not be true: conciseness can come at the cost of readability. (If you ever tried to read kanji, you know what I am talking about ;-) .) However, I am claiming that in the case of a really well-designed programming language, succinctness helps readability, reduces bloat and leads to easier and faster understanding of the code. In my experience, the amount of boilerplate you have to write shrinks as the language gets terser, ultimately leading to a coding style where you (nearly) don’t need to write boilerplate at all.

I will demonstrate this on a few Java vs. Ruby code examples. However, this is NOT a Ruby-bashing-Java article, but a few examples of idioms and interesting constructs; C++ vs Haskell or Lisp could serve equally well (sometimes even better), but since I am currently working with Java and Ruby on a daily basis, it is easier for me to use them.

If you are a pro Ruby and/or Java programmer, and/or you think the article is too long for you, please jump to the “Random Code Snippets” section.

Possibly the most straightforward reason why Ruby code is more readable even in shorter form is that really everything is an object [1] in Ruby-land. For example, in Java primitives need wrapper classes to ‘become’ objects, while in Ruby they are first class objects on their own. This makes constructs like

10.times { print "ho" }  #=> "hohohohohohohohohoho"
or (will output the same string)
print "ho" * 10 #=> "hohohohohohohohohoho"
possible.
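
To make the "everything is an object" claim a bit more tangible, here is a tiny sketch in plain Ruby (no libraries needed) that simply asks a few literals whether they respond to methods.

```ruby
# Literals are ordinary objects: they answer method calls directly.
puts 10.respond_to?(:times)   # integers have methods such as times
puts "ho".respond_to?(:*)     # String#* is just a method on the string
puts nil.to_s.empty?          # even nil is an object with methods
puts 10.is_a?(Object)         # and everything descends from Object
```

Each line prints true, which is exactly why there is no need for wrapper classes.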

There are a handful of other reasons which make Ruby more readable and elegant, but before I get bogged down in the explanation too much, let’s see the examples!

Whetting your appetite

In the first part I will describe some basic constructs which would make the life of any Java developer much easier. These techniques are neat, but they are not using any really sophisticated stuff yet: I will try to take a look at those in the next bigger section.

The empty program

Java:

class Test
{
    public static void main(String args[]) {}
}

Ruby:


I did not forget the Ruby snippet; you cannot see anything there because a Ruby program doing nothing is exactly 0 characters long. On the other hand, the Java version is slightly longer. It is kind of weird to explain to a newcomer what ‘class’, ‘public’, ‘static’, ‘void’, ‘String’, the [] operator and several braces here and there mean, and why they are needed if the program does literally nothing.

Fun with numbers

Note: For some of the next examples you will need to use Rails Active Support.
Java:

if (11 % 2 == 1) System.err.println("Odd!"); // => Odd!

Ruby:

print "Odd!" if 11.odd? #=> Odd!
Doesn’t the Ruby example make more sense (even to a non-programmer)? I believe it does. More of this type:
Java:
102 * 1024 * 1024 + 24 * 1024 + 10 // => 106979338

Ruby:

102.megabytes + 24.kilobytes + 10.bytes #=> 106979338

OK, maybe this is an unfair comparison since Java does not have (?) those functions. However, the point is that even if it had, the best I could come up with would look like:

Util.megaBytes(102) + Util.kiloBytes(24) + Util.bytes(10) // => 106979338

Which is far from the elegance and readability of the Ruby example.
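
For the curious: helpers like these are possible because Ruby lets you reopen existing classes, including Integer. What follows is only a minimal sketch of the idea in plain Ruby; it is not Active Support's actual implementation, and the method bodies are my own assumption of the simplest thing that could work.

```ruby
# A minimal sketch of how such helpers could be defined by reopening
# Integer. This is NOT Active Support's real code, just the underlying idea.
class Integer
  def bytes
    self
  end

  def kilobytes
    self * 1024
  end

  def megabytes
    self * 1024 * 1024
  end
end

puts 102.megabytes + 24.kilobytes + 10.bytes  # => 106979338
```

Because the methods live on the number itself, the call reads left to right like the sentence it stands for, which a static Util class cannot offer.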
In the next example we will assume that we have a Java function similar to ordinalize in Ruby.
Java:

System.err.println("Currently in the " + Util.ordinalize(2) + " trimester");

Ruby:

 print "Currently in the #{2.ordinalize} trimester"    #=> "Currently in the 2nd trimester"

In this example we can observe string interpolation: anything wrapped in #{} inside double quotes gets evaluated and substituted into the string, giving a more readable form than chaining pieces together with + as in Java (which is cool mainly if you have several variables inside the double quotes).
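
A short plain-Ruby sketch of interpolation, without Active Support. The helper trimester_message and its suffix parameter are made up for illustration; note the last line, where single quotes switch interpolation off.

```ruby
# Interpolation evaluates any Ruby expression inside #{} within double quotes.
# trimester_message is a hypothetical helper; the suffix is passed in by hand
# because plain Ruby has no ordinalize.
def trimester_message(number, suffix)
  "Currently in the #{number}#{suffix} trimester"
end

puts trimester_message(2, "nd")
puts "Sum: #{1 + 2}"                # expressions work too, not just variables
puts 'No interpolation: #{1 + 2}'   # single quotes leave the text as written
```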

Dates

In my opinion, handling dates and times is a great PITA in Java, especially if you are implementing some complex code.
Java:

System.out.println("Running time: " + (3600 + 15 * 60 + 10) + " seconds");

Ruby:

puts "Running time: #{1.hour + 15.minutes + 10.seconds} seconds"

Java:

new Date(new Date().getTime() - 20 * 60 * 1000)

Ruby:

20.minutes.ago

Java:

Date d1 = new GregorianCalendar(2006, 9, 9, 11, 0).getTime(); // months are 0-based: 9 = October
Date d2 = new Date(d1.getTime() - (20 * 60 * 1000));

Ruby:

20.minutes.until("2006-10-9 11:00:00".to_time)

I think you do not have to be biased towards Ruby at all to admit which code makes more sense instantly…
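
If you would rather not pull in Active Support, the same arithmetic is still fairly short in plain Ruby, because subtracting a number from a Time subtracts that many seconds. This is just a sketch; the MINUTE constant and the minutes_before helper are names I made up.

```ruby
# Plain Ruby, no Active Support: Time arithmetic works in seconds.
MINUTE = 60

# Hypothetical helper: the moment +minutes+ minutes before +time+.
def minutes_before(time, minutes)
  time - minutes * MINUTE
end

meeting = Time.utc(2006, 10, 9, 11, 0, 0)
puts minutes_before(meeting, 20)    # 20 minutes before the meeting
puts minutes_before(Time.now, 20)   # the plain-Ruby cousin of 20.minutes.ago
```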

I have recently found a very cool way of parsing dates in Ruby: using Chronic. However, I would not like to present it here since it is not a feature of the language, ‘just’ a nifty natural-language date parser [2].

A little bit more advanced stuff

Classes

Java:

class Circle
{
  private Coordinate center;
  private float radius;

  public void setCenter(Coordinate center)
  {
    this.center = center;
  }

  public Coordinate getCenter()
  {
    return center;
  }

  public void setRadius(float radius)
  {
    this.radius = radius;
  }

  public float getRadius()
  {
    return radius;
  }
}

Ruby:

class Circle
  attr_accessor :center, :radius
end

Believe it or not, the two code snippets are equivalent. The getter and setter methods in the Ruby code are generated automatically, so not only do you not have to write them, they are not even there to clutter the code.
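
To see what attr_accessor saves you, here is a sketch that writes the same accessors out by hand and compares the two classes. VerboseCircle is a made-up name for the hand-written version.

```ruby
# Hand-written accessors: roughly what attr_accessor generates for you.
class VerboseCircle
  def center
    @center
  end

  def center=(value)
    @center = value
  end

  def radius
    @radius
  end

  def radius=(value)
    @radius = value
  end
end

# The one-line equivalent.
class Circle
  attr_accessor :center, :radius
end

# Both classes expose exactly the same public methods.
p Circle.public_instance_methods(false).sort
p VerboseCircle.public_instance_methods(false).sort
```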

I have seen Java guys argue that stuff like this (i.e. the public static void main … thing, getters/setters and other boilerplate code) is a non-issue because it can be generated by any decent IDE like Eclipse (or by tools like XDoclet etc). Well, as for their generation, let us say this is true. But for the readability of code it is absolutely not!

For example, take getters/setters: every variable in Java adds 8 more lines of code (not counting the blank lines between the method declarations) compared to the Ruby attr_accessor idiom. That is, a simple class definition having 10 fields in Java will have 80+ lines of code compared to 1 line of the same code in Ruby. For me, this definitely means a big difference.

Arrays (and other containers)

This section was inspired by a blog entry by Steve Yegge.

Arrays are interesting citizens of Java: they are not really objects in the “classical” sense, so they have very limited functionality compared to first-class Java objects. On the other hand, they offer a huge advantage over the other container classes: they can be easily initialized.
Java:

String languages[] = new String[] {"Java", "Ruby", "Python", "Perl"};
instead of
List<String> languages = new LinkedList<String>();
languages.add("Java");
languages.add("Ruby");
languages.add("Python");
languages.add("Perl");
which is kind of lame when you quickly need to hack up some testing data.

However, they have also some serious problems: you have to define the number of the elements upon construction time, like so:
Java:

String someOtherLanguages[] = new String[15];
which sometimes really cripples their functionality. [3]

How does this work in Ruby? Let’s look at three different examples (all three code snippets produce the same result):
Ruby:

stuff = [] # An empty array - as you can see there is no need to define the size
stuff << "Java" << "Ruby" << "Python" # Add some elements
# Initialize the array with the values
stuff = ["Java", "Ruby", "Python"]
# Yet another method yielding the same result
stuff = %w(Java Ruby Python)

In my opinion, these forms (especially the last one) are more straightforward and can save a lot of typing.