Thursday, April 26, 2012

Why R is Popular and Scheme Isn't

I really enjoyed John Cook's talk on Why People Use R. The premise: to programmers, R is slow and messy. Yet, to statisticians, It Just Works. And it works so well that they have no choice but to make it popular. I'm not a statistician, and so have no need or appreciation for R. But I've worked with another DSL that was plenty ugly yet mighty popular. It's none other than PHP.

While PHP has spent years of effort making itself into a legitimate general-purpose language (it's even got anonymous functions now!), its roots very much match those of R. PHP started its life as a DSL for building web pages (back when the notion of a web application was in its infancy). It was a slow (compared to C, anyway) and clunky language. Like R, it contained shortcuts that users of the language loved, like having nl2br() as a built-in primitive, or the use of register_globals, but that were far from smart programming language decisions. You didn't learn PHP from the spec; you learned it from tutorials and from copying and pasting code. Sure, PHP had some good ideas (like tightly coupling the language to a database), but it also contained plenty of duds.

So, Just Worksness trumps language beauty when it comes to DSLs. That's certainly a key point to consider when designing a DSL. One that is far from obvious to programmers.

It's also instructive to consider a language like Scheme, which is nearly the opposite of R in every way. You can, in fact, learn Scheme by printing out the spec and reading it (at least, until R5RS you could). But why would you want to bother? Scheme, out of the box, does almost nothing. Yet, in terms of elegance for building abstractions, Scheme brings plenty to the table. Clearly, specific functionality is preferred over language elegance.

In reality, people don't download a pure version of Scheme. They use a system like Racket that not only provides the basic core, but also provides the libraries to do interesting things. Still, I think this is one more explanation as to why Scheme isn't popular. With no specialized niche to plant the seed (another example: Perl for text processing), there isn't a base of users clamoring for the language.

And you know, I'm OK with that.

4 comments:

  1. Scheme isn't popular?!

  2. Thanks! Glad you enjoyed the talk.

    You made an interesting point about elegance versus it-just-works. It would be nice to have both, but we have to compromise. The more capable you are of dealing with things that don't quite work the first time, the more emphasis you're able to place on elegance or other factors. The less capable you are, the less you can afford to be picky about elegance.

  3. Just a couple of comments. The statement that R (and S) is a DSL is factually incorrect. S was designed from the beginning as a general-purpose language with an aptitude to perform data analysis. Check with Chambers or read Chambers' "Programming with Data"; or even better read a fraction of the 1100+ functions in R-core, which will do pretty much anything (and more) that a language like C or Python can do.

    As a user of R and Racket, I'd say that the batteries included in Racket are not sufficient to make it appealing, at least not to a scientific programmer, as it lacks sophisticated numerical, visualization and data transformation libraries. Racket has many qualities and is more usable than other Schemes, but I think it has less industrial traction than NewLisp. If one has to use a Lisp in the real world, Common Lisp + Quicklisp or Clojure + Java libraries may be more practical choices, and seem more popular in industry.

  4. Well said. I hadn't thought of Perl or PHP as DSLs, but I guess you're right.
