Not going dark

(This is a snapshot of my old weblog. New posts and selected republished essays can be found at raganwald.com.)

Tuesday, June 17, 2008

Rubyforge is now hosting an “initial pre-release of a preview of an alpha of an undocumented proof-of-concept” of the rewrite gem. More on this presently, but the important thing (and really the only thing of interest) about rewrite is that when you use rewrite to write:

Person.find_by_last_name("Braithwaite").andand.first_name

...You are not opening the Object, Person or Nil classes to add an andand method, nor are you creating a weird temporary BlankSlate object. Rewrite does its thing by rewriting Ruby code: it performs syntactic metaprogramming, much as Lisp macros rewrite Lisp code.

what problem does rewrite solve?

Recall that when you use the “standard” implementation of things like andand or try, you are openly modifying core classes like Object.

Therefore, you are reaching out and touching every line of code in your project. You probably aren’t breaking everything, but even if the chance of introducing a bug by adopting something like “try” is infinitesimal for each source code file in your project, the chance grows greater and greater as your application grows.

The problem is that you are introducing a change on Object, and everything depends on Object. This is very different than introducing a change in your code. In that case, only the other bits of code that directly depend on your code are at risk.

Also, imagine if you introduce try and are careful not to break anything. Now somebody else wakes up one day and decides they need a method that works like Prototype’s Try.these. They call it “try.” They just broke your code, dude! Not only are you making everything dependent upon your version of try, but your code is dependent upon everyone else not breaking try as well. It’s a train-wreck waiting to happen.
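To make the clash concrete, here is a minimal sketch. Both definitions of try are my own illustrations, not code from any particular library: the first is the usual open-class version, and the second is a hypothetical Try.these-style method that returns the result of the first callable that does not raise.

```ruby
# Library one: the "standard" open-class try.
class Object
  def try(method_name)
    send(method_name) if respond_to?(method_name)
  end
end

BEFORE_CLASH = "braithwaite".try(:upcase)

# Library two, loaded later: a hypothetical Try.these-style method that
# calls each argument in turn and returns the first result that does not raise.
class Object
  def try(*attempts)
    attempts.each do |attempt|
      begin
        return attempt.call
      rescue StandardError
        next
      end
    end
    nil
  end
end

# The same line of application code, unchanged, now runs library two's try.
# A Symbol has no call method, so the attempt is rescued and nil comes back.
AFTER_CLASH = "braithwaite".try(:upcase)
```

Nothing raises and nothing warns (unless you run with -w). The call site simply starts returning nil, which is the kind of breakage that shows up far away from its cause.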

Rewrite restricts things like andand or try to your code and your code alone. Sure, if you introduce a bug in your code, you may break things that directly depend on your code. But if you introduce “try” using rewrite instead of modifying Object, you will not reach out across your project and break something entirely unrelated that happens to have defined its own version of try in a completely different way.

how does it work?

Rewrite takes your code, converts it to an sexp with Parse Tree, then rewrites the sexp using one or more rewriters you specify. Finally, it converts the sexp back to Ruby with Ruby2Ruby and evals it. It does this when the code is first read, not every time it is invoked, so we mitigate the “do not use andand in a tight loop” problem.
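The pipeline can be sketched in miniature. What follows is not rewrite's implementation. The sexp shapes are simplified ones I made up for illustration (they are not ParseTree's real output), the tree is built by hand instead of being parsed, and to_ruby is a toy stand-in for Ruby2Ruby. All the method names are hypothetical.

```ruby
# Simplified, hypothetical sexp shapes:
#   [:lit, value]                    a literal
#   [:lvar, :name]                   a local variable
#   [:call, receiver, :name, *args]  a method call

# Rewrite step: turn `x.andand.foo(args)` into a nil-guarded call on a temp.
def rewrite_andand(sexp, counter = [0])
  return sexp unless sexp.is_a?(Array)

  rewritten = sexp.map { |child| rewrite_andand(child, counter) }
  type, receiver, name, *args = rewritten
  if type == :call && receiver.is_a?(Array) &&
     receiver[0] == :call && receiver[2] == :andand
    temp = :"__temp_#{counter[0] += 1}__" # generated parameter name
    [:guard, temp, receiver[1], [:call, [:lvar, temp], name, *args]]
  else
    rewritten
  end
end

# Generation step: a toy stand-in for Ruby2Ruby.
def to_ruby(sexp)
  type, *rest = sexp
  case type
  when :lit  then rest[0].inspect
  when :lvar then rest[0].to_s
  when :call
    receiver, name, *args = rest
    "#{to_ruby(receiver)}.#{name}(#{args.map { |arg| to_ruby(arg) }.join(', ')})"
  when :guard
    temp, value, body = rest
    "lambda { |#{temp}| #{temp}.nil? ? nil : #{to_ruby(body)} }.call(#{to_ruby(value)})"
  end
end

# Rewrite once, generate source, then eval it.
def run_rewritten(sexp)
  eval(to_ruby(rewrite_andand(sexp)))
end

# Hand-built tree for: "braithwaite".andand.upcase
tree = [:call, [:call, [:lit, "braithwaite"], :andand], :upcase]
puts to_ruby(rewrite_andand(tree))
puts run_rewritten(tree)
```

The generated source contains no call to andand at all. The rewriting happens once, before the code runs, so the cost is not paid on every invocation.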

For example, rewrite converts this:


emails.find_by_email(email).try(:destroy)

Into:


lambda { |receiver, method|
  receiver.send(method) if receiver.respond_to? method
}.call(emails.find_by_email(email), :destroy)

And this:


 numbers.andand.inject(base_sum()) { |total, number| total + number }

Into:


 lambda { |__1234567890__|
   if __1234567890__.nil?
     nil
   else
     __1234567890__.inject(base_sum()) { |total, number| total + number }
   end
 }.call(numbers)

Note that with the examples, the names “andand” and “try” completely go away. If someone else defines a try method elsewhere, it will not affect your code because your code never executes a method called try.
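Here is a small, runnable check of that claim, written by hand in the shape of the rewritten output shown above. The Email class, its hostile try method, and the two wrapper methods are hypothetical stand-ins so the snippet can run on its own.

```ruby
# A stand-in record. Email and destroy are hypothetical.
class Email
  def destroy
    :destroyed
  end

  # Someone else's unrelated "try", defined in a completely different way.
  def try(*)
    raise "not the try you were looking for"
  end
end

def destroy_if_possible(record)
  # The rewritten form of: record.try(:destroy)
  lambda { |receiver, method|
    receiver.send(method) if receiver.respond_to? method
  }.call(record, :destroy)
end

def sum_if_present(numbers)
  # The rewritten form of: numbers.andand.inject(0) { |total, number| total + number }
  lambda { |__1234567890__|
    if __1234567890__.nil?
      nil
    else
      __1234567890__.inject(0) { |total, number| total + number }
    end
  }.call(numbers)
end

p destroy_if_possible(Email.new)
p destroy_if_possible(nil)
p sum_if_present([1, 2, 3])
p sum_if_present(nil)
```

The hostile try on Email is never invoked, because the code that actually executes never sends a message named try.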

the most important open problem to solve in this area

At the moment, creating new rewriters is a huge PITA. I don’t think that it's OK that a language makes it harder for a library creator than for an application developer, so I am working on making it easy to write things like andand or try.

puzzles for language weenies

Note that the andand example above uses gensym for its parameter while the try example does not. Why? What could break if it used a name like “receiver?” If you can figure out the salient difference between these two example rewrites, you can probably explain why rewriting try produces exactly the same semantics as the original open classes implementation, but rewriting andand produces a subtle change in semantics.
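If you would rather see the first half of the puzzle than solve it, here is one possible illustration. The method names are hypothetical and the rewrites are written out by hand. The point it assumes is that the andand rewrite pastes the caller's own arguments and block inside the lambda body, so a fixed parameter name can capture a variable the caller already owns.

```ruby
# Caller's intent in both methods:
#   numbers.andand.inject(receiver) { |total, number| total + number }
# where `receiver` is an ordinary variable belonging to the caller.

def naive_rewrite(numbers, receiver)
  # Fixed parameter name: inside the lambda, `receiver` now means the
  # lambda's parameter (the numbers), not the caller's variable.
  lambda { |receiver|
    if receiver.nil?
      nil
    else
      receiver.inject(receiver) { |total, number| total + number }
    end
  }.call(numbers)
end

def hygienic_rewrite(numbers, receiver)
  # Generated parameter name: the caller's `receiver` is left alone.
  lambda { |__1234567890__|
    if __1234567890__.nil?
      nil
    else
      __1234567890__.inject(receiver) { |total, number| total + number }
    end
  }.call(numbers)
end

p hygienic_rewrite([1, 2, 3], 100)

begin
  naive_rewrite([1, 2, 3], 100)
rescue TypeError => error
  puts "naive rewrite broke: #{error.class}"
end
```

The try rewrite does not have this exposure, because everything the caller wrote is passed in as an argument to call and none of it is evaluated inside the lambda body.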

¶ 11:30 PM

Comments on “Not going dark”:

Very cool - can't wait to start hacking on this.

I tried grabbing the gem, though, and there was no lib dir, or really anything but the tests. Also, the svn url listed on rubyforge doesn't seem to exist.

# posted by

James Golick : 8:48 AM

If you're concerned with namespace conflicts, why not just do something like the following? Try.new(emails.find_by_email(email)).destroy

Except for readability concerns, would it not be equivalent without all the metaprogramming stuff?

# posted by

codebrulee : 9:23 AM

I tried grabbing the gem, though, and there was no lib dir, or really anything but the tests.

Ah! I suppose that I really must populate manifest.txt before running rake deploy.

Please try it now, although I cannot take responsibility for anything that happens if you actually try this on your code.

I know that the MIT License already says that, but this time I really, really, really mean it.

# posted by

Reginald Braithwaite : 10:11 AM

Except for readability concerns, would it not be equivalent?

I think that readability concerns are the only thing at stake here. If they weren’t, we could just as easily write:

email_model = emails.find_by_email(email)
email_model.destroy if email_model.respond_to? :destroy

Also, there are some other things where the readability difference is incredibly substantial. More on this as I work towards the Ruby.rewrite(Ruby) presentation at RubyFringe.

# posted by

Reginald Braithwaite : 10:14 AM

Looking forward to hearing more of your thoughts on readability with this stuff...

# posted by

codebrulee : 10:28 AM

Seems a lot of trouble to go through just to get Scala's lexically scoped class extensions. :-)

# posted by

Daniel : 11:57 AM

Daniel:

In all seriousness, this goes from “interesting, possibly useful” to “disruptive” if I can succeed in making rewriters easy to write.

Until then, this is a lot like arguing about Ruby the language. If I show a programmer Enumerable#map, he can say that his existing language has for loops. But if I show him how he can write his own methods that take blocks as arguments, a light goes on.

Likewise, implementing things like andand and try using rewriters is not the goal. Nor is it really the goal to help people implement one-liner syntactic rewriters.

The goal is to make it easy for people to invent entirely new things.

Some of those new things could be implemented with Scala’s existing semantics. Some of them could be implemented with C#’s existing semantics too.

I have no idea where this could go, but it is generally a mistake to look at something and think it has no potential because you can replicate its early results with something that already exists.

In the early days, you use a new tool to replicate existing phenomena because that is all you know how to imagine. But as you get familiar with the new tool, new possibilities occur to you and you take it in new directions.

# posted by

Reginald Braithwaite : 12:12 PM

I suppose if you're already doing something in Ruby, or for some reason must use Ruby this is fantastic to have. On the other hand it is syntactically very awkward compared to the similar Lisp expression. Lisp is so much faster and has infinitely more tractable syntax.

# posted by

rob : 5:00 PM

Rob:

My experience with Lisp is that statements of the following form are nearly always true:

I suppose if you're already doing something in Language X, or for some reason must use Language X, Feature Y is fantastic to have. On the other hand it is syntactically very awkward/semantically less clean compared to the similar Lisp expression.

Nevertheless, I am choosing to write in Ruby at the moment of my own free will, and since I am making that choice, I also choose to sharpen my Ruby saw while I am at it :-)

# posted by

Reginald Braithwaite : 6:01 PM

Therefore, you are reaching out and touching every line of code in your project. You probably aren’t breaking everything, but even if the chance of introducing a bug by adopting something like “try” is infinitesimal for each source code file in your project, the chance grows greater and greater as your application grows.

I'm not sure I buy this. First, a piece of code is either buggy or it's not. The number of places it's used doesn't affect its bugginess. #try and #andand are simple enough that you don't need too many code paths to find a bug in debug.

Second, while someone writing their own #try might be conceivable, is it really very likely that someone would use #andand for a method name? And in the very unlikely event that they did, wouldn't any reasonable test suite catch this problem immediately?

So while the rewrite stuff sounds cool and useful, I don't accept some of the assumptions the article is based on.

///ark

# posted by

Mark Wilden : 7:50 PM

I would be very interested in seeing a ruby-to-(real)-sexprs compiler that could be used to do interesting things with scheme and/or lisp backends.

This just seems like halfway hackery to me, though it is kinda fun.

# posted by

Justin George : 9:15 PM

Mark: I don't understand your point at all. You point out that #try is a grenade that could potentially blow up in your face. You say #andand is a bit less combustible. If there exists an example of a situation where it is a problem, then it is a problem.

Also: you say that a test case will find the bug quickly. Then what? What happens when I try to import a large and crucial PDF generator or symbolic math library into my large and pre-existing application and find a conflict? What does it cost to rewrite one or the other?

I have no idea in the abstract and therefore I cannot give a predictable answer to my boss about what it will cost to integrate a new library. "Should be a day's work...unless there is a name clash that requires me to rewrite tens of thousands of lines of code in which case it is months of work."

# posted by

Paul Prescod : 10:39 AM

On the problem of global overloading, I remember seeing something on the ruby mailing list about namespaces. (Years ago, after 1.8, before Rails)

When you open a class inside a namespace, the changes you make there only affect code that runs under the same namespace.

So you can create MyPrettyApp namespace, open and change Object there, and I can create my InsanelyGreatLib namespace and make my own mess with Object and you can use my lib in your app and we'll all be happy.

Sorta like scala's thing, I believe.

Does anybody know whatever happened to that idea?

# posted by

Caio : 9:52 PM

I like the direction namespaces represent, although my impression is that once you get past the really trivial examples, namespaces could become really, really complicated, really, really quickly.

So a well-designed namespaces feature would be terrific, while a poorly-designed namespaces feature would be just like Java Generics: really pretty for the demo--making sure collections are homogenous--but excruciating in real life.

# posted by

Reginald Braithwaite : 9:49 AM

I implemented something similar that allowed you to do something along the lines of:

scope = Scope.new(String => Facets::String, Fixnum => [ MyFixnum, Math ])

scope.lexical do
    42.sqrt
    "my_str".camelcase
end

scope.dynamic do
    42.sin
    "my_str".to_rx
end

But it went out of scope (pun intended) the day Binding.of_caller died.

It was also slow as it had to install and uninstall bindings for each block.

Your approach looks much cleaner and much faster.

If you're not familiar with them, you might want to take a look at Fluid and Behaviors.

And what's with all the references to our smart cousin Scala? Where's the love for our cute baby daughter Groovy and its categories? :-)

# posted by

chocolateboy : 10:50 AM

Also, with lexically-scoped monkey patching, you don't need andand. You can just splice in a version of NilClass#method_missing that returns self:

class MyNilClass
    def method_missing(*args)
        self
    end
end

scope = Scope.new(NilClass => MyNilClass)

scope.lexical do
    test = { 'foo' => 1, 'bar' => 2 }
    puts test['baz']['whatever'] # => nil
end

# posted by

chocolateboy : 11:14 AM

chocolateboy:

We have walked similar paths. The limitation I encountered was this: If I patch a class or an object within my scope, what is the behaviour of objects of that class within methods I am not defining?

For example, if I patch NilClass just so that I can avoid andand, do all the nils in the program have this behaviour while my code is executing within the scope? If so, we have a leak.

Also, and this is a small point, patching NilClass gives a very different set of semantics than andand. As discussed in another post, andand using rewrite has call-by-name semantics. Also, and this may matter, all methods should return nil, not just those that Nil doesn’t support.

So nil.to_s => '' but nil.andand.to_s => nil

# posted by

Reginald Braithwaite : 1:40 PM

We have walked similar paths. The limitation I encountered was this: If I patch a class or an object within my scope, what is the behaviour of objects of that class within methods I am not defining?

For example, if I patch NilClass just so that I can avoid andand, do all the nils in the program have this behaviour while my code is executing within the scope? If so, we have a leak.

The way scope.rb handles this is to 1) create a local variable in the block with eval 2) install the custom methods and 3) when those methods are called, check to see if they're in the correct lexical scope by looking up the sentinel variable in the caller's bindings (using Binding.of_caller or, now, ruby-debug). If the sentinel variable isn't defined, then fallback to the original method:

b = Debugger.current_context.binding_n(0)
vars = eval("local_variables()", b)
if vars.include? sentinel
    # correct scope - call the patched method
else
    # not in scope - fallback to the original method
end

Also, and this is a small point, patching NilClass gives a very different set of semantics than andand. As discussed in another post, andand using rewrite has call-by-name semantics. Also, and this may matter, all methods should return nil, not just those that Nil doesn’t support.

So nil.to_s => '' but nil.andand.to_s => nil

Fair enough. I haven't played with andand, to be honest. But it looks like rewrite holds a lot more promise than just providing a hygienic implementation of andand, however useful that might be.

# posted by

chocolateboy : 7:47 AM
