Worse is Better and Clojure

I’ve been writing code in Clojure now for a few weeks and I’m really enjoying the simplicity and power of the language. I think that the progress being made right now in the Clojure community is great and that there are definitely good things to come. During my first week or so of learning the language, I couldn’t help but think back to the Worse is Better series of papers. For those that haven’t read the paper, I’d highly recommend it, along with the rebuttal found here and another here. I wrote a blog entry about it 3 or 4 years ago, but unfortunately it looks like it’s been taken down. It was on a company blog that appears to have been replaced with another blog system.

I remember reading the article for the first time and realizing how right Richard Gabriel was, and how much I wanted him to be wrong. The realization that the best solution to a problem isn’t always the right solution floored me. As someone who enjoys a hard problem and works hard to find the best solution I can, I found the C analogy very thought provoking.

This brings me to Clojure. Clojure seems like it might well be the compromise talked about in Worse is Better, yet with enough of the essence of Lisp to still have the right solution. There’s no doubt that the Clojure folks have had to make some compromises to fit into the mould of the JVM. An example is tail recursion in Clojure, implemented via the recur special form. As a user of the language, I would obviously prefer tail calls to just work, without having to mark them myself. That is a hard problem in the context of the JVM, so I understand the decision. This felt to me like the PC loser-ing problem in the context of Worse is Better. Although the “right” decision might be to crack the hard problem or, worse yet, wait for tail calls on the JVM, requiring recur seems like a small trade-off that is still workable.
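To make the trade-off concrete, here is a minimal sketch of an explicit tail call using loop and recur. The sum-to function is a hypothetical helper made up for illustration; it is not from the paper or from Clojure itself.

```clojure
;; Hypothetical helper: sums the integers 1..n using an explicit accumulator.
(defn sum-to [n]
  (loop [i n
         acc 0]
    (if (zero? i)
      acc
      ;; recur rebinds i and acc and jumps back to loop,
      ;; so the call stack does not grow with n.
      (recur (dec i) (+ acc i)))))

(sum-to 10)       ; 55
(sum-to 100000)   ; 5000050000
```

The cost is that the programmer has to write recur in tail position explicitly. In exchange, the compiler rejects a recur that is not actually in tail position, so the constant-stack behaviour is at least checked.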

Another good call by the Clojure folks, in my opinion, is the Java integration. Below is a quote from Worse is Better on integration:

In the worse-is-better world, integration is linking your .o files together, freely intercalling functions, and using the same basic data representations. You don’t have a foreign loader, you don’t coerce types across function-call boundaries, you don’t make one language dominant, and you don’t make the woes of your implementation technology impact the entire system.

Sound familiar? Not only is calling Java from Clojure seamless, there’s actually syntax sugar (through macros) to make calling Java code easier. There’s no need to convert everything over to a specific Clojure object format or anything like that; it just works. You might have to make a Java collection seq-able or something similar, but it’s pretty minimal fuss. There are also facilities for Clojure code to create Java proxies and Java interfaces (though I’ve not used them), which allow Java code to integrate with Clojure code. It seems to me that the Java integration in Clojure very much fits with the quote from Richard Gabriel. I believe this tight integration will be the path into Clojure for many developers.
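As a rough illustration of the integration described above, the sketch below calls Java from Clojure and then hands a Clojure-built object back to a Java API. The names and by-length vars are hypothetical, chosen only for this example.

```clojure
;; Call an instance method on a java.lang.String with the dot form.
(.toUpperCase "clojure")          ; "CLOJURE"

;; Construct a Java object with the trailing-dot sugar; doto chains calls on it.
(def names
  (doto (java.util.ArrayList.)
    (.add "Gabriel")
    (.add "Hickey")))

;; A java.util.List can be passed straight to the sequence functions.
(map count names)                 ; (7 6)

;; proxy creates an object that implements a Java interface. Here it is a
;; Comparator ordering strings by length, usable by any Java code.
(def by-length
  (proxy [java.util.Comparator] []
    (compare [a b]
      (- (count a) (count b)))))

;; A static Java method sorts the Java list in place using the proxy.
(java.util.Collections/sort names by-length)
(vec names)                       ; ["Hickey" "Gabriel"]
```

Nothing is converted at the boundary in either direction: the strings are java.lang.String, the list is a plain ArrayList, and the Comparator is an ordinary Java object as far as Collections/sort is concerned.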

Emacs Talk Online

A video of the talk I gave at Lambda Lounge last Thursday can be found here.

Design By Contract with Clojure

I just learned about the design by contract features of Clojure, and I’m impressed by the simplicity. It’s implemented using regular Clojure metadata (i.e. no new language constructs to support this). {Small correction to this previous statement. It looks like metadata, and can be read as metadata, but is actually compiled into the function (i.e. can’t be modified at runtime). Thanks for the correction Alex.} Several times I have wanted DbC in Java and have tried some of the libraries written for Java. The Java ones were generally built on comments or annotations. Bottom line is that they just didn’t feel seamlessly integrated into the language, and they seemed to have a short shelf life. By short shelf life, I mean there were a lot of proofs of concept and abandoned projects, but none that were viable over the long term.

DbC in Clojure

The way it’s implemented in Clojure is pretty slick. First we take a normal function definition:

(defn pos-add [& args]
  (apply + args))

It doesn’t really do anything interesting, it just delegates to the plus operator, but it should only be used with non-negative integers. If you’ve not seen the & symbol, it collects all of the function’s arguments into a sequence. So a precondition of this function is that every argument passed to pos-add is zero or greater. With that precondition added, the code looks like:

(defn pos-add [& args]
  {:pre  [(not-any? neg? args)]
   :post [(<= 0 %)]}
  (apply + args))

So there are two new pieces. The first is :pre, which takes a vector of expressions; all of the expressions must return true for the precondition to pass. In the example above there is only one expression, and it ensures that there are not any negative numbers among the arguments. The second is :post, which works the same way but runs after the body, with % bound to the function’s return value; here it ensures that the result is 0 or greater. The postcondition isn’t of much value here, but I added it to demonstrate where it would go. Calling the function is the same as calling any other function, but if the pre/post conditions are not met, an AssertionError is thrown. Below are some basic tests for the function:

(is (= 10 (pos-add 1 2 3 4)))
(is (zero? (pos-add 0 0 0 0)))
(is (= 5 (pos-add 1 1 2 1)))
(is (thrown? AssertionError (pos-add 1 2 3 -4)))
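Outside of a test, a failed condition can be caught like any other error. The sketch below uses a hypothetical safe-quot function (not from the original example) with several precondition expressions, to show that each one is checked and that the error message names the expression that failed.

```clojure
;; Hypothetical function: integer division that rejects a zero or negative
;; divisor and a negative dividend, and promises a non-negative result.
(defn safe-quot [n d]
  {:pre  [(integer? n) (integer? d) (pos? d) (not (neg? n))]
   :post [(<= 0 %)]}
  (quot n d))

(safe-quot 7 2)   ; 3

;; The third precondition fails here, so the body never runs.
(def failure-message
  (try
    (safe-quot 7 0)
    (catch AssertionError e
      (.getMessage e))))

failure-message   ; "Assert failed: (pos? d)"
```

The conditions are compiled down to ordinary assertions, so this sketch assumes assertions are enabled, which is the default.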

What led me to Clojure’s DbC features was reading On Clojure, where there was a proposal for a new DbC syntax. I like DbC in the original style, but I think that the one at On Clojure has some additional benefits because it can provide some hints as to what types are expected in the function. In Smalltalk Best Practice Patterns, Kent Beck recommends naming variables after the type that is expected. So if the method is findByName, the parameter would be aString to give the caller a hint as to what is expected. What was detailed in the On Clojure blog not only provided hints about the type, but would also let you know the acceptable values just by looking at the function declaration and the accepted parameters.

I would like to see the pre/post condition information somehow worked into the documentation generated by Clojure. Seems like it would be a very useful feature for callers of APIs.

Emacs Talk at Lambda Lounge

I’m giving a talk tomorrow at the Lambda Lounge on Emacs. I’m planning on spending about half the talk on some Emacs conceptual basics. When it comes to the basics I’m going to try to avoid the “this key does this” and “that key does that”. I think going over cursor movements etc. will just be lost. If you’re truly new to Emacs, you’ll need to go through the Emacs Tutorial (which comes with Emacs, there’s a link to it on startup) to get the basic movements down. Instead I’m planning to go over the concepts of buffers in Emacs, the modeline, the minibuffer, the .emacs file etc. I’m also planning on spending a decent amount of time going over the help system, both the info system and the built-in Elisp documentation. Then I’m planning on going over some Emacs Lisp basics, Eshell, org-mode and a little on Dired. Finally, I’m planning on doing some development demos, specifically highlighting REPL environment integration into Emacs with SLIME, Tuareg Mode and maybe some Ruby.

Clojure Apply

I ran into the following situation a few times: I had a list of items, and I was attempting to call a function that accepted the items as individual arguments. As an example, I had a list of contact information, where the first element is a contact name, the second is that contact’s info, the third is another contact name and so on. This is exactly the shape the sorted-map function works with in Clojure, only it expects the items as individual arguments, not as a list. The doc for sorted-map is below:

clojure.core/sorted-map
([& keyvals])
keyval => key val
Returns a new sorted map with supplied mappings.

An example of how to call it is something like:

user> (sorted-map 1 2 3 4)
{1 2, 3 4}

What I had instead was something like [1 2 3 4]. An easy way to convert the list to individual arguments is the apply function:

clojure.core/apply
([f args* argseq])
Applies fn f to the argument list formed by prepending args to argseq.

Swapping out the direct sorted-map call for an apply call looks like:

user> (apply sorted-map '(1 2 3 4))
{1 2, 3 4}

Note it returns the same result as calling the function directly. Using apply saved me from writing code to iterate over each item and manually add them to the map.
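Applied to the contact list described above, a small sketch might look like the following. The contacts vector and its values are made up for illustration.

```clojure
;; Hypothetical flat list: name, info, name, info, ...
(def contacts ["alice" "alice@example.com" "bob" "bob@example.com"])

;; apply spreads the vector out as individual arguments to sorted-map.
(apply sorted-map contacts)
;; {"alice" "alice@example.com", "bob" "bob@example.com"}

;; Arguments given before the final sequence are prepended to it,
;; as the args* in the docstring describes.
(apply sorted-map "carol" "carol@example.com" contacts)
```

Both a vector and a quoted list work as the final argument, since apply only needs something it can treat as a sequence.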

Being Lazy with Clojure

I’ve done development in some other functional languages, most of it in OCaml. OCaml is a functional language, but it is not lazy by default. There are libraries that can make data structures lazy, but you have to go out of your way to use them. Clojure is not a lazy language either, but most of its sequence functions return lazy sequences by default. One interesting ramification of this showed up when I was reading a file. I’m still learning the language, so I figured I’d parse a CSV type of file that contained contact information that I dumped from a contact application I use. I wanted to get the lines of the CSV into a list so I could take a look at them from a Clojure perspective. To do this, I wrote some code like the following:


(with-open [rdr (reader "/some/directory/contactinfo.csv")]
  (def lst (line-seq rdr)))

The code basically opens the file, executes the code in the body and then closes the file (think of the def lst… part as being in the try block, with the close in the finally). I expected line-seq to read each line and store it in lst. Much to my surprise, when I attempted to look at the first element of lst, I received an error message:

java.io.IOException: Stream closed
[Thrown class java.lang.RuntimeException]

This pointed out two interesting things to me. First, when I accessed the list (since it was lazy) it tried to pull the first line out of the file. Since I used the with-open macro, the file was already closed by then. Next, I realized that if it was trying to read from the closed file, line-seq was not behaving as I was expecting: it had not read the lines up front. The book I’m reading by Stuart Halloway did discuss this, I just forgot. My first reaction was to pass a function that does the parsing to the function above. This would cause all of the operations on the file to occur before the file is closed. The file I was parsing ended up not being easily parsed line by line, since most of the data items spanned several lines. I found the right function was slurp:


clojure.core/slurp
([f] [f enc])
Reads the file named by f using the encoding enc into a string
and returns it.

Slurp will just pull all of the file into a String. I was able to differentiate the contact title from the contact details easily with the Clojure re-partition function:


clojure.contrib.str-utils/re-partition
([re string])
Splits the string into a lazy sequence of substrings, alternating
between substrings that match the pattern and the substrings
between the matches. The sequence always starts with the substring
before the first match, or an empty string if the beginning of the
string matches.

The only trick to using it was to go back through the list and discard the substrings between the matches. I was easily able to do this with the filter function. More on Clojure and lazy lists to come.
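Two points from this post can be sketched together. The first is a common way to keep line-seq usable with with-open: force the sequence with doall before the reader is closed. The second swaps in re-seq from clojure.core as an alternative to re-partition plus filter, since it returns only the matches. This is a minimal sketch under stated assumptions: the dump string, the "Contact:" title format and the dump-reader helper are all hypothetical stand-ins for the real file, used so the code runs without anything on disk.

```clojure
(import '(java.io BufferedReader StringReader))

;; Hypothetical stand-in for the contents of the contact dump.
(def dump "Contact: Alice\nphone 555-0100\nContact: Bob\nphone 555-0101\n")

;; Hypothetical helper: a fresh reader over the sample text each time.
(defn dump-reader []
  (BufferedReader. (StringReader. dump)))

;; doall walks the whole lazy sequence while the reader is still open,
;; so the lines remain available after with-open closes it.
(def all-lines
  (with-open [rdr (dump-reader)]
    (doall (line-seq rdr))))

(count all-lines)   ; 4

;; Without doall, the unread part of the sequence outlives the reader,
;; and realizing it later hits the closed stream.
(def escaped
  (with-open [rdr (dump-reader)]
    (line-seq rdr)))

(try
  (doall escaped)
  (catch Exception e
    :stream-closed))   ; :stream-closed

;; re-seq returns only the substrings that match, so there is nothing
;; between the matches to filter out afterwards.
(re-seq #"Contact: \w+" dump)   ; ("Contact: Alice" "Contact: Bob")
```

Forcing with doall holds every line in memory at once, which is the same cost slurp pays, so it suits small files like a contact dump rather than large ones.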

Squeak Going 100% Open Source

Good news for Squeak developers. It looks like they are in the final throes of converting all code to the MIT or the Apache license. They still need an official release before the conversion is complete, but it looks like it’s just a formality. There have been several problems with the Squeak License. For one, the license has a few clauses that are different from most of the mainstream Open Source licenses: an indemnity clause that could require software distributors to pay legal fees to Apple in case of a lawsuit, and another that adds export restrictions. At one time, some of the pieces (like the fonts) could not be used in “for profit” software. So all around good news for Smalltalkers!

Emacs and Clojure

I’m spending some time learning Clojure and am, so far, thoroughly enjoying it. Writing the code is a breeze in Emacs. Historically, Emacs has always been a top notch environment for writing Lisp code. Thanks to the swank-clojure module, writing Clojure in Emacs is just as easy as writing Common Lisp in Emacs. I think this is a great benefit to the Clojure community. One of the problems many languages face is the state of the tool set early in the life of the language. Languages that come to mind are Haskell, OCaml, Ruby, Groovy, JavaFX and so on. Ruby, Groovy and JavaFX are coming along, but early on, it was a struggle. Clojure instead was able to lean on SLIME for that IDE-like support. SLIME has been around for quite a while and already had robust support for Lisp. Since Clojure could leverage this, it got most of the IDE stuff for free.

Moving On

Yesterday I accepted a position at Revelytix. I’ll be starting in March. They’re a tech company that is working on Semantic Web technologies. I’m looking forward to the cutting edge technologies and digging into the Semantic Web. I’ll be working with Alex Miller and a few others as we start on development of a new product. Expect some posts on the Semantic Web, Clojure and Scala!

Spring Remoting – A Step Toward SOA?

Spring Remoting

Spring Remoting is an RMI type of facility built into the Spring framework. Basically you define an interface and an implementation on a remote application. Spring then places a proxy in your application and, when the proxy is called, goes over HTTP to the remote implementation and returns the result as if the implementation were local. It’s really quite easy to implement using Spring. A few lines of configuration for where to find the remote implementation, a few lines to expose the remote implementation over HTTP and you’re set. This ends up being a very cheap way to start having services exposed in your applications. There are definitely some downsides to this approach. The first is that it’s only Java. There are some options to use the Hessian/Burlap extensions, but deeper object graphs have difficulty travelling across the wire. Another is the potential set of dependency problems that can occur when using an RMI-like solution.

RMI and Dependencies

Probably the most significant downside to RMI doesn’t really show up until you have used it for a while. Maybe you just have a few services that need to be exposed, Spring Remoting seems easy, so you use it. But then it grows, maybe other applications use it and it becomes more critical. The question is what objects are being transferred over RMI. So if you call the service and try to find the address associated with user John Doe, how is the address returned? Probably as some type of Address class. Then the big question: where does the Address class live? The problem is that the Address class needs to be available to both the server (which knows how to look up addresses) and each of the clients calling it. Changes to the service or to the objects can have significant ripple effects in the application. The problem is easy to understand, but it slowly creeps up on a project and becomes a dependency nightmare. I thought that this was a logical first step toward a true web service. The problem with this line of thinking is that if it stays this way too long, there’s already too much damage and the refactor is too costly.

Why not start with a web service? Web services are somewhat expensive to create, so you have to make sure one is necessary. First you must develop some form of input to be accepted. Maybe this is XML or JSON, and whether or not there is a proper schema doesn’t really matter. It still needs to be thought about and defined, formally or informally. Next, code needs to be written to translate between the request and the business objects of the back end system. The same translation needs to happen for the response. The client also needs to translate to/from this same intermediate format. There are obviously things that can make this easier, like code generation, but it’s still additional work. In the early phases of a project, where the inputs/outputs might be changing substantially, this can lead to developers thrashing with the services and producing very little.

Hibernate and RMI

Another potential RMI gotcha is attempting to transfer Hibernate POJOs. First, Hibernate POJOs are special. They have lazy loaded collections and other proxied objects that are more complex than just your basic JDK objects. The immediate consequence of this is that every caller of the RMI service needs to have not only the POJO classes on its classpath, but also the Hibernate jars. The more subtle consequence is what happens when one of those lazily loaded collections is transferred to the caller. The objects can’t be lazy loaded from the client, since the client doesn’t have the database connection etc. From here you really have three options. The first option is to enable remote lazy loading (example here). I’ve not used this; it seems far too complex and error prone. The second option involves just marking all associations non-lazy (or using joins). Lazy fetching is a nice performance feature of Hibernate and the service will no longer be able to leverage it. The third option is to add a custom object serializer to Spring Remoting that will exchange the lazy collections for real collections. This will remove the dependency on Hibernate and essentially force non-lazy loading of all associations. All of these solutions make RMI less attractive and all of them are a good indication that you should rethink the need for service remoting, or rethink using RMI over a proper web service.

Other workarounds – is it worth it?

There are several techniques that can reduce the symptoms of these problems. Aside from the Hibernate workarounds, you can define interfaces for each request and response, covering the data passed into and returned from the RMI service. This will reduce the amount of data available to the service, require well defined input and output, and make it easier to refactor to a web service later. All of this adds up to a decent amount of extra work. I think in the end, the extra time involved equals or exceeds what a proper web service would have taken.

Lesson Learned

I think that the lesson I have learned is that Spring remoting does not give you cheap services. Rather it gives you services with a low cost of entry, but that cost climbs much more quickly. With web services, you pay more up front, and less over the long term. I think maybe the best of both worlds is to use RMI/Spring remoting for the very early stages of the project (i.e. before going to prod) so that the service can be ironed out. What input data is really needed? What should be returned? Do we know most of what the service needs to do? With answers to these questions (which will only be known after some development) we are better armed for creating a real web service. At this point, the RMI implementation can be swapped and refactored to a web service, hopefully avoiding the longer term RMI issues discussed above.
