"Lets" organize our Ruby Code

(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Friday, May 30, 2008

This post is for folks interested in the Invocation Construction Kit for Ruby. There is a new invocation, Ick::Syntax::Lets:


lets(
    :person => Person.find(:first, ...),
    :place  => City.select { ... },
    :thing  => %w(ever loving blue eyed)
) {
    "#{person.name} lives in #{place} where he is known as the '#{thing.join(' ')} thing.'"
}

Quite simply, #lets provides you with a way of making block-local variables. Ick already includes #let, which does almost exactly the same thing. However, #let is limited to just one variable:


let(Person.find(:first, ...)) { |person| ... }
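
As a rough illustration of the idea, and not Ick's actual implementation, a single-variable let can be approximated in plain Ruby by yielding the value to the block. The name let_sketch is made up for this example:

```ruby
# Hypothetical stand-in for #let: bind one value to one block parameter.
def let_sketch(value)
  yield value
end

greeting = let_sketch("world") { |name| "hello, #{name}" }
```

The block parameter name exists only inside the block, which is the whole point.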

Expanding #let to multiple variables with its existing syntax just doesn’t scale properly:


let(
    Person.find(:first, ...),
    City.select { ... },
    %w(ever loving blue eyed)
) { |person, place, thing|
    "#{person.name} lives in #{place} where he is known as the '#{thing.join(' ')} thing.'"
}

Thus, the #lets syntax uses a hash so that the variable names are close to the expressions denoting their values.
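
To make the hash-of-names idea concrete without any code rewriting, here is a minimal sketch that approximates it with a Struct and instance_eval. This is an assumption-laden stand-in, not how Ick does it (that is described further down), and it has a real cost: inside the block, self is the struct rather than the caller, so the caller's methods and instance variables are out of reach. The name lets_sketch and the sample values are hypothetical.

```ruby
# Hypothetical approximation of #lets: each hash key becomes a reader
# method on a throwaway Struct, and the block runs with that struct as self.
def lets_sketch(bindings, &block)
  names = bindings.keys.sort_by { |name| name.to_s }
  scope = Struct.new(*names).new(*bindings.values_at(*names))
  scope.instance_eval(&block)
end

sentence = lets_sketch(
  :place => "New York",
  :thing => %w(ever loving blue eyed)
) {
  "#{place} is home to the '#{thing.join(' ')} thing.'"
}
```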

That’s all. If you are interested in why anyone would need #lets, read on…

Scope Creep

Recently there has been a lot of interest in exploring the extremes of OO style. One of the tenets of such a style is to work with small classes and short methods. Why do you suppose we want to do that?

Well, the underlying principle is to chunk things into small, workable pieces with clear purposes. This principle abounds everywhere: posts just like this are broken up into sections with headings and the prose is subdivided into paragraphs. So I quite agree with the principle behind small classes and short methods.

__________. You just knew that a word like “However” would follow a paragraph like that, didn’t you?

The principle of subdivision is terrific. But are classes and methods the only mechanisms we have for subdividing code? In some languages, that may be true. In other languages, that is not true.

For example, since version 1.1 Java has permitted classes to be nested inside of other classes, calling them “inner classes.” This is a very powerful technique of organizing code: If class A is the only class that ever uses class B, why should class B live in its own file, visible to every other class? Placing class B inside of class A is a big win: it is immediately clear that A is the only user of B, and when looking at the code in your IDE you do not see class B promoted to equal standing with A: it is clearly subordinate to A.

Of course, placing class B inside of class A makes A a bigger class. Is that wrong? Perhaps it is at some times, but it’s a big win at others. Remember we said B is subordinate to A? An inner class can be made private, explicitly telling the compiler and other programmers that it is limited in scope.
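
Ruby has a rough analogue. The sketch below nests one class inside another and hides it with private_constant, which is available in Ruby 1.9.3 and later (so it postdates this post). Invoice and LineItem are hypothetical names chosen for illustration:

```ruby
class Invoice
  # LineItem is only meaningful to Invoice, so it lives inside it.
  class LineItem
    attr_reader :cents

    def initialize(cents)
      @cents = cents
    end
  end
  private_constant :LineItem # Invoice::LineItem is now hidden from outside

  def initialize(*amounts)
    @items = amounts.map { |cents| LineItem.new(cents) }
  end

  def total
    @items.inject(0) { |sum, item| sum + item.cents }
  end
end
```

Code inside Invoice can still say LineItem, while a reference to Invoice::LineItem from elsewhere raises NameError.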

Limiting scope is a very powerful organizing technique in code. We can debate how important it is that the compiler enforce scope; however, given the importance of writing code for humans to read and understand, I’m personally in favour of any technique that sends a strong signal to your fellow programmers explaining the intended structure and organization of the code.

Small Methods

There are techniques for organizing methods into classes or modules, and techniques for organizing classes and modules into larger classes and modules. But what about individual lines of code? Are methods really the only mechanism we have for organization at the lowest level?

I think not. Even within methods, there are certain practices that logically chunk your code into small, workable pieces with clear purposes. For example, Algol, Pascal, Modula, and many other languages support nested functions or procedures:


procedure print(var j: integer);

  function next(k: integer): integer;
  begin
    next := k + 1
  end;

begin
  writeln('The total is: ', j);
  j := next(j)
end;

It is clear that #next is used solely by #print. Since most OO languages do not permit nested procedures, the programmer is left to choose between making #next a private method or carving #print out and making it a strategy class. Making #next a private method does keep it from prying eyes outside of the class; however, it implies that all methods of the class may want to use it, which is clearly not the case.

If we are allowed to nest objects, we can fake nested procedures with objects. Ruby’s Proc class makes it easy:


def print(j)
    # `next` is a reserved word in Ruby, so the nested proc is named next_value
    next_value = proc { |k|
        k + 1
    }
    p "the total is: #{j}"
    j = next_value.call(j)
end

(There is a fairly major difference in what these two snippets of code do thanks to the difference between call-by-value and call-by-reference, but let’s wave our hands furiously and stick to the point about nesting functions.)
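
To make the hand-waving concrete: the Pascal version declares j as a var parameter, so its assignment updates the caller's variable, whereas in Ruby reassigning a parameter inside a method only rebinds the method's own local name. A small sketch, with bump as a hypothetical name:

```ruby
def bump(j)
  j = j + 1 # rebinds the method's local j only
  j
end

total = 41
bumped = bump(total)
# total is still 41 here; only bumped holds 42
```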

Containing Variables

In imperative languages, one of our biggest headaches is managing mutable local variables. If you only need one in one particular place, it is helpful to have a way of nesting the variable definition. In Java:


for(int i=1; i<11; i++) {
    StringBuffer j = new StringBuffer();
    // ...
}

And in Perl (thanks Chromatic):


{
  my $person = Person->find( first => ... );
  my $place = City->select( sub { ... } );
  my $thing = [qw( ever loving blue eyed )];

  return $person->name() . " lives in $place where he is known as the "
    . join( ' ', @$thing ) . ' thing.';
}

In the Java example, the variables i and j are limited in scope to the body of the for loop. This is called block scoping, and I personally love it. Block scoping permits us to make small, self-contained blocks of code and use them inline. If we need to move them or change them, we know which things are limited in scope to the block and which reach outside of the block and thus might be affected by our changes.
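
Ruby, by contrast, does not scope variables to if, while, or begin/end. A variable first assigned inside a block is local to that block, but a block can also reassign any local that already exists outside it. The sketch below shows all three behaviours; the variable names are arbitrary:

```ruby
if true
  leaked = "visible after the if"
end
after_if = defined?(leaked) # "local-variable": if does not create a scope

[1].each { contained = "gone after the block" }
after_block = defined?(contained) # nil: first assigned inside a block

counter = 0
[1, 2, 3].each { |n| counter += n } # the block updates the outer counter
```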

In JavaScript we can fake block scoping using procedures, but it’s syntactically noisy. Can we do the same thing in Ruby? Yes, and that’s exactly what #lets does:


lets(
    :person => Person.find(:first, ...),
    :place  => City.select { ... },
    :thing  => %w(ever loving blue eyed)
) {
    "#{person.name} lives in #{place} where he is known as the '#{thing.join(' ')} thing.'"
}

This is a block with three block-local variables in it, just like you might find in Java. It breaks a complicated expression up into smaller pieces (“How to find the person,” “How to find the place,” “What kind of thing we have,” and finally “How to describe it all as a string”). And those pieces are all grouped together so you know they are not used elsewhere.

A Hack, wrapped in a Workaround, inside a Kludge

Unlike the other elements of Ick, #lets rewrites your Ruby code. The code example above is actually rewritten as:


proc { |__121217088531733__|
  lambda do |person, place, thing|
    "#{person.name} lives in #{place} where he is known as the '#{thing.join(" ")} thing.'"
  end.call(
    __121217088531733__[:person], 
    __121217088531733__[:place], 
    __121217088531733__[:thing]
  )
}.call(
    :person => Person.find(:first, ...),
    :place  => City.select { ... },
    :thing  => %w(ever loving blue eyed)
)

The rewritten code is then evaluated. At the moment, this happens every time you call #lets, so it is expensive. Even by Ruby standards. Hopefully, there will be a future version of Ick::Syntax that only rewrites by need or perhaps just once when the code is first read.
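
Because the listing above still contains elided arguments, here is a runnable sketch of the same shape with stand-in values. The literal strings and the parameter name bindings are placeholders for illustration, not anything Ick generates:

```ruby
sentence = proc { |bindings|
  # The inner lambda's parameters are the block-local variables.
  lambda do |person, place, thing|
    "#{person} lives in #{place} where he is known as the '#{thing.join(' ')} thing.'"
  end.call(
    bindings[:person],
    bindings[:place],
    bindings[:thing]
  )
}.call(
  :person => "Ben",
  :place  => "New York",
  :thing  => %w(ever loving blue eyed)
)
```

The outer proc receives the whole hash once, and the inner lambda spreads it into named parameters.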

So, #lets takes the syntactic noise of using procs to create block structure and hides it from us. How?

Well, my first cut at it used Ruby2Ruby and regular expressions. Then I was inspired by Aanand Prasad’s Haskell-style monad do-notation for Ruby to work directly with s-expressions. Although the code is technically longer, it’s actually much, much better. For meta-syntactic programming, you need to work with the abstract syntax tree.
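
To see what an s-expression view of Ruby code looks like on a current Ruby, here is a small sketch using Ripper from the standard library. This is a swapped-in tool, not what Ick uses: ParseTree targets Ruby 1.8, and Ripper's node names and tree shape differ from ParseTree's.

```ruby
require 'ripper'
require 'pp'

# Parse a snippet into nested arrays of symbols, strings and positions.
tree = Ripper.sexp("lambda { |person| person.name }")
pp tree
```

The result is ordinary nested arrays, so it can be walked and rebuilt with plain Array methods rather than regular expressions over source text.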

And with ParseTree and Ruby2Ruby, you have all the tools you need to write code that writes code, with the teeny-weeny proviso that what you want to do is translate your Ruby into Lisp, manipulate the Lisp, and then translate it back into Ruby:


def rewrite_sexp(names_to_values, sexp)
  mono_parameter = :"__#{Time.now.to_i}#{rand(100000)}__"
  # the next four assignments are exactly why we want #lets:
  sorted_symbols = (names_to_values || {}).keys.map(&:to_s).sort.map(&:to_sym)
  parameters = if sorted_symbols.size == 1
    s(:dasgn_curr, sorted_symbols.first)
  else
    s(:masgn, s(:array, *sorted_symbols.map { |sym| s(:dasgn_curr, sym) }) )
  end
  values = s(:array,
    *sorted_symbols.map { |sym|
      s(:call, s(:dvar, mono_parameter), :[], s(:array, s(:lit, sym)))
    }
  )
  body = sexp.last

  s(:defn, :__anonymous__, 
    s(:bmethod, 
      s(:dasgn_curr, mono_parameter), 
      s(:call, 
        s(:iter, 
          s(:fcall, :lambda), 
          parameters,
          body
        ), 
        :call, 
        values
      )
    )
  )
end
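
To see what the parameters and values pieces look like, the sketch below reruns just those two steps with plain arrays. It assumes a stand-in s helper that returns its arguments as an Array (the real one builds Sexp objects), a fixed mono_parameter instead of a generated one, and a made-up input hash:

```ruby
# Stand-in for the s(...) helper: plain nested arrays instead of Sexp objects.
def s(*args)
  args
end

names_to_values = { :thing => 3, :person => 1, :place => 2 }
mono_parameter = :__bindings__ # fixed here; generated in the real code

# Sort the names so parameters and values line up in the same order.
sorted_symbols = names_to_values.keys.map { |k| k.to_s }.sort.map { |k| k.to_sym }

parameters = s(:masgn, s(:array, *sorted_symbols.map { |sym| s(:dasgn_curr, sym) }))
values = s(:array,
  *sorted_symbols.map { |sym|
    s(:call, s(:dvar, mono_parameter), :[], s(:array, s(:lit, sym)))
  }
)
```

Each entry in values is the tree for an expression like __bindings__[:person], and sorting the names keeps the call's arguments lined up with the lambda's parameters.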

You know what to do:

sudo gem install ick

Cheers, and thanks very much to Ryan Davis, Eric Hodel, Matt Mower, and Aanand Prasad.

¶ 3:02 PM

Comments on “"Lets" organize our Ruby Code”:

Wow - let as syntax sugar for lambda. It's so Scheme-ish it makes my eyes a little misty.

For the non-Rubyists who read this post, it's worth mentioning that in many (though not all) of the curly brace languages you can use braces for the exclusive purpose of introducing a nested scope without necessarily using a for or if or whatever.

Perhaps it would make sense for Ruby to allow nested begin/end pairs for the same purpose. Just a thought.

# posted by James Iry : 9:02 PM

James:

As you probably know, in Scheme the macro let is implemented by rewriting using lambda. I wonder where I got the idea???

It’s a personal opinion, but I wouldn’t call it Syntactic Sugar if it introduces a new metaphor or mental model. To me, it’s only Syntactic Sugar if the writer and reader think “lambda” in their heads but use the macro to save typing characters.

# posted by Reginald Braithwaite : 10:48 PM

In Perl:

{
  my $person = Person->find( first => ... );
  my $place = City->select( sub { ... } );
  my $thing = [qw( ever loving blue eyed )];

  return $person->name() . " lives in $place where he is known as the "
    . join( ' ', @$thing ) . ' thing.';
}

No lexical scoping metaprogramming fakery needed.

# posted by chromatic : 11:59 PM

Reginald:

Yup - that's why my eyes are getting misty. It's beautiful to see some Scheme in another language. Scheme remains one of my first loves (but I'm a bit of a language slut and have a lot of loves :-)

As for what is and isn't syntactic sugar, I get your point about new metaphors. I was not using the phrase "syntax sugar" dismissively! Abstraction is what makes programming possible at all.

But it's often useful for a programmer to know how to peek behind the magic curtain when needed. If you think of Haskell's "do" solely as a way to work with effectful computations or Scala's "for" as a clever way to work with collections then in either case you'll never make it to the point where you build your own monads.

At other levels it's powerful to see an OO style object as an encapsulation of identity, state and behavior - or as a closure of several functions that share the same environment - or as a hash table from a name to function (at least in some dynamic OO languages) - or... Each of these different conceptions will lead to different design alternatives. The first conception will likely lead to lots of interacting mutable objects. The hashtable interpretation will support designs based on dynamic metaprogramming. The closure interpretation will allow you to embed functional concepts in those OO languages that don't support it very well (I'm looking at you Java!)

# posted by James Iry : 1:13 AM

Chromatic:

So Perl has something in common with Java? Nice to know :-)

# posted by Reginald Braithwaite : 8:01 AM

Reg, it's probably more accurate to say that Ruby (and Python) get variable declaration wrong than to be amazed that other languages get it right.

# posted by chromatic : 4:16 PM

Chromatic:

Speaking as someone who learned Algol (in 1974!), Scheme and C++ long before learning Ruby, I am entirely with you on this subject and always have been.

# posted by Reginald Braithwaite : 5:49 PM

Don't get me wrong -- if Java had a similar problem, any workaround would be several times more hideous. Let's not even think about C++.

Credit to Ruby for allowing language-level customization (I mean, not hacking bytecode directly) to fix the problem. Still, minus a few points for having the problem in the first place.

# posted by chromatic : 9:35 PM

Silly question, but can't "let" detect if it is getting a hash parameter and effectively do what "lets" does in that case?

In other words, is there really a need for two different methods?

# posted by Sam Ruby : 10:14 PM

Sam:

What would you do if you wanted one parameter that happened to be a hash, such as let(My_active_record_model.attributes) { |attrs| ... }?

I considered this possibility and wondered about detecting it by looking at whether the block takes parameters or not.

So let(???) { |x| ... } is the original kind of let and let(???) { ... } behaves like lets.

What do you think?

# posted by Reginald Braithwaite : 10:19 PM

Ah, but what if I want to do a bunch of let nesting? Won't all those curly braces get noisy?

Here's my new syntax suggestion for Ruby: have variables be immutable, and declared like this --

let x = 5 in
blah;
blah;
blah;
let y = 10 in
foo;
bar;
baz

# posted by Robert Fischer : 8:52 AM

Reg --

I'd vote for one method that auto-detects the hash.

~~ Robert.

# posted by Robert Fischer : 8:54 AM

Reginald,

Can you discuss how the "let method" is superior to standard variable declaration?

def local(&b)
  b.call
end

local do
  person = Person.find(:first, ...)
  place = City.select { ... }
  thing = %w(ever loving blue eyed)
  "#{person.name} lives in #{place} where he is known as the '#{thing.join(' ')} thing.'"
end

This way will not create new variables accessible outside the block, but will change the value of existing variables. Is this the major complaint?

Thanks!

# posted by Mike Harris : 12:48 PM

This way will not create new variables accessible outside the block, but will change the value of existing variables. Is this the major complaint?

Yes, in Ruby 1.9 local may blow existing variables away.

# posted by Reginald Braithwaite : 12:53 PM
