Mocking Clojure Protocols with Atticus

Atticus is a mocking library written by Hugo Duncan. For more information on Atticus, you can see Hugo’s blog post from May here. I have added the ability to mock protocols to Atticus and would like some feedback as to the best approach for binding the mocked protocol instance. There’s a survey below, but first a little background on the implementation.

Mocking Protocols

What makes protocol instances a bit tough to mock is that they’re not just straight Clojure functions. They dip a bit into Java and a bit into Clojure. As a result, Atticus without modification can’t mock them. Below is an example of some code that uses this new functionality:

(defprotocol Squared 
  (square [impl x]))
 
(deftest mock-protocol-test
  (expects
   [(instance Squared
	      (square [impl y] (once (* y y))))]
   (is (= 9 (square instance 3)))))

Originally I had a marker function named mock-protocol-instance, which didn’t serve much purpose and was a bit awkward. After talking with Hugo, I switched it to the above syntax. The first item in that list is instance, the symbol that the mocked protocol instance is bound to. The next item is the protocol, followed by its functions. The (once…) wrapped around the body of the function is existing functionality in Atticus that expands into the code to ensure the function was called exactly once. An example of mocking a regular function in Atticus is below:

(deftest test-cube
  (expects
   [(cube [y] (once (* y y y)))]
   (is (= 8 (cube 2)))))

The difference between this code and the first example is that the binding syntax here is familiar, because it is similar to letfn. Below is an example of letfn:

(letfn [(add5 [x] (+ x 5))
	(subtract5 [x] (- x 5))]
  (= 7 (add5 (subtract5 7))))

In the style above, the first element of each binding is the function name, so anything that refers to add5 in the body of the letfn gets the function bound in the letfn. This letfn type of binding makes sense for Atticus when mocking functions: both have the same goal, binding a function temporarily. Where this gets a little more tricky is in mocking the protocol. In the first example above, the first element in the list is special: it’s the symbol bound to the protocol instance. This is really more appropriate for a let style of binding, where one element is the symbol and the other is an expression. Unfortunately, switching to a let style binding for the expects macro would make the syntax a little more cumbersome for mocking functions, because you would have to add “fn”. This would probably look something like this:

(deftest mock-protocol-test-alt
  (expects
    [instance (Squared
                (square [impl y] (once (* y y))))
     cube (fn [y] (once (* y y y)))]
    (is (= 9 (square instance 3)))
    (is (= 8 (cube 2)))))

The above is just spitballing, but the key point is that expects would use a let binding style. The first binding is a let style binding of the protocol instance and the second binds a function.
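For readers who have not used Atticus, the two kinds of binding discussed above can be approximated in plain Clojure. This is only a sketch of the idea and not how Atticus is implemented: reify stands in for the mocked protocol instance, with-redefs stands in for the mocked function, and the calls atom and run-mock-sketch are hypothetical names standing in for the once check.

```clojure
(defprotocol Squared
  (square [impl x]))

(defn cube [y] (* y y y))

(defn run-mock-sketch
  "Returns [square-result square-call-count cube-result]."
  []
  (let [calls    (atom 0)                  ; hypothetical stand-in for `once`
        ;; let style: a symbol bound to an expression producing the instance
        instance (reify Squared
                   (square [_ y]
                     (swap! calls inc)
                     (* y y)))
        sq       (square instance 3)
        ;; letfn style: a named function temporarily rebound
        cb       (with-redefs [cube (fn [y] (* y y y))]
                   (cube 2))]
    [sq @calls cb]))
```

The protocol instance needs a name to be passed around, while the function mock only needs to replace an existing name, which is the asymmetry the two binding styles are trying to express.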

Put it to a vote!

The question is, which one is better? Is sticking with the letfn binding and the brevity that it allows worth it, even though the protocol mocking is a bit different? The letfn style is the first and second code examples above. Or is it confusing enough to warrant a little extra code around mocking functions (the example immediately above)? Is there another approach that would be better? Below is just a quick survey on which one is preferable. Thanks for giving your input!



Clojure Futures

Futures in Clojure are basically a way to execute some bit of code on a background thread. I was using them as a way to allow timeouts for long-running code. In this entry I’ll give a rundown on how to use futures.

Future Basics

It is pretty easy to start using futures in Clojure. Most of the relevant function names start with future. First I’ll start by creating a future that calls a long-running sleep function:

(def f 
  (future
    (println "Starting to sleep...")
    (Thread/sleep 600000)
    (println "Done sleeping.")))

There are two ways to create a future in Clojure: the first is the future macro (used above), the other is the future-call function. The future macro is just syntactic sugar around future-call. Whatever you pass in to future will be wrapped in a no-arg function and passed to future-call. The future-call function just invokes the passed-in function on a background thread. The above code, even though it sleeps for ten minutes, will return immediately. The object returned is a future (I’m intentionally not getting into details about what is returned right now). This returned future allows you to peek inside the execution of the code passed in and get information such as whether it is done or has been canceled:

(future-done? f)
=> false
 
(future-cancelled? f)
=> false

You can then decide to cancel the future:

(future-cancel f)
=> true
 
(future-cancel f)
=> false

future-cancel returns false when it was not able to cancel the future. For example, if we try to cancel the future a second time, it returns false because the future has already been canceled. Another useful set of functions gets the value returned by the executing future. Let’s say we’re still computing Fibonacci numbers in the slow, recursive way:

(defn fib [x]
  (cond (zero? x) 0
        (= 1 x) 1
        :else (+ (fib (- x 1)) (fib (- x 2)))))

Computing (fib 40) on my laptop takes about 10 seconds. Below is code that runs it in a future and then uses deref to pull the value out:

(def f (future-call #(fib 40)))
=> #'user/f
 
(time @f)
=> "Elapsed time: 8075.452326 msecs"
102334155
 
(time @f)
=> "Elapsed time: 0.082342 msecs"
102334155

The first deref (@) call will block until the future has completed running and then return the value. So timing the deref, it took about 8 seconds until returning the value. But if it has already computed the value, it will just return that computed value. So the second deref returns very quickly.
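The points above (future is sugar over future-call, and a finished future hands back its cached value) can be checked with a small sketch. The names evaluations, slow-answer, via-macro and via-fn are made up for this illustration, and a short sleep stands in for real work.

```clojure
;; Counts how many times the body actually runs.
(def evaluations (atom 0))

(defn slow-answer []
  (swap! evaluations inc)
  (Thread/sleep 100)   ; stand-in for a slow computation
  42)

;; The macro wraps its body in a no-arg function...
(def via-macro (future (slow-answer)))
;; ...which is what future-call takes directly.
(def via-fn (future-call slow-answer))

;; Dereferencing the same future twice does not re-run the body.
(def results [@via-macro @via-fn @via-macro])
```

After this runs, results holds three 42s while evaluations is 2: one evaluation per future, regardless of how often each one is dereferenced.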

Timeouts

One thing that is hidden from users of futures in Clojure is what is actually returned when calling (future…). I say that it’s hidden because when you use future-cancel, future-done? etc., it doesn’t matter what kind of object is returned from the future call. It’s some object that you are able to pass to these other functions and it just works. Where you do need to know more about the implementation is when you want your future to time out. The object returned by future-call is an implementation of java.util.concurrent.Future. With this information, you can use the Java APIs for the Future. Below is some code that creates a future that will time out before it finishes:

(def f (future-call #(fib 50)))
(.get f 10 java.util.concurrent.TimeUnit/SECONDS)

The last line does the same thing that we did before when we deref’d the future, but this time there is a limit on how long we will wait for the result. If the timeout is reached (in this case 10 seconds), it will throw a java.util.concurrent.TimeoutException. Armed with the knowledge that the future is actually a java.util.concurrent.Future, the deref is just a more Clojure-like way of calling the get method on the future:

(def f (future-call #(fib 40)))
=> #'user/f
 
(time (.get f))
=> "Elapsed time: 10713.176672 msecs"
102334155
 
(time (.get f))
=> "Elapsed time: 0.075569 msecs"
102334155
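Since the timed get throws rather than returning, it is usually wrapped. Below is a minimal sketch of one way to do that; get-or-cancel is a hypothetical helper name, and the choice to cancel the future on timeout is an assumption about what the caller wants. Clojure 1.3 and later also give deref a timeout arity, shown at the end, which returns a default value instead of throwing.

```clojure
(import '(java.util.concurrent Future TimeUnit TimeoutException))

(defn get-or-cancel
  "Waits up to `ms` milliseconds for `fut`. On timeout, cancels the future
  and returns `default` instead of letting the exception escape."
  [^Future fut ms default]
  (try
    (.get fut (long ms) TimeUnit/MILLISECONDS)
    (catch TimeoutException _
      (future-cancel fut)
      default)))

;; A future that takes far longer than we are willing to wait.
(def slow (future (Thread/sleep 5000) :finished))
(def slow-result (get-or-cancel slow 50 :timed-out))

;; deref with a timeout (Clojure 1.3+). Unlike the helper above it does
;; not cancel the future, so we cancel it ourselves.
(def slow-2 (future (Thread/sleep 5000) :finished))
(def slow-2-result (deref slow-2 50 :timed-out))
(future-cancel slow-2)
```

Here slow-result and slow-2-result are both :timed-out, and (future-cancelled? slow) is true afterwards.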

Hey, where did my console output go?

I use Emacs and Slime for my development environment. One thing that I noticed was that code running in a future does not write to the console like code running from the Slime REPL. This is because code running in the Slime REPL sees the thread-local var bindings established on the REPL thread (such as *out*), whereas the future code runs on a different thread that does not have those bindings. bound-fn makes it easy to fix this problem:

(def f (future-call #(println "Hello World")))
=> #'user/f
 
(def f (future-call (bound-fn [] (println "Hello World"))))
=> Hello World
#'user/f
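The same effect can be shown without a REPL by using a dynamic var of our own. This is a sketch with made-up names (*label*, call-on-new-thread, seen). It uses a plain java.lang.Thread on purpose: from Clojure 1.3 onward future itself conveys the caller's bindings, so the original symptom may not reproduce on newer versions, whereas a raw thread never inherits them.

```clojure
(def ^:dynamic *label* "root value")

(defn call-on-new-thread
  "Runs the no-arg function `f` on a fresh java.lang.Thread and returns its result."
  [f]
  (let [result (promise)]
    (doto (Thread. ^Runnable (fn [] (deliver result (f))))
      (.start)
      (.join))
    @result))

(def seen
  (binding [*label* "repl value"]
    {;; a plain fn sees only the root binding on the other thread
     :plain (call-on-new-thread (fn [] *label*))
     ;; bound-fn captures the bindings in effect where it was created
     :bound (call-on-new-thread (bound-fn [] *label*))}))
```

seen ends up as {:plain "root value", :bound "repl value"}, which mirrors println writing to the wrong *out* in the first case and to the REPL's *out* in the second.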


Easy Java Interop with Clojure

This past week I started writing some code to work with Amazon EC2 instances. I started with the JClouds library. It’s a great library for spinning up public AMIs in a cloud-neutral way; however, it didn’t do everything that I needed. Some of the things I needed were EC2 specific, so that’s not so surprising. I fell back to the AWS SDK for Java, which basically just wraps calls to the Amazon web services. Using that library, I wrote some Clojure functions that wrapped the Java calls to do what I needed: starting up an existing EBS-backed EC2 instance, stopping an EBS instance and determining what state an EBS VM is in. This led to Clojure code that builds up request objects and interprets response POJOs. The API is a little awkward, even from Java. Starting, stopping and describing an instance all require the same thing: one or more instance ids. In the Amazon API, there is a separate request class for each operation (DescribeInstancesRequest, StopInstancesRequest and StartInstancesRequest), and each class has a setInstanceIds method, rather than the operation simply being a method that takes a list of Strings (or something similar). Working with this API helped me learn more about the Clojure Java interop functions.

dot dot

The first feature that made my life easier was the .. macro, which chains Java calls. It takes code that in Java would look like this:

instance.getState().getName()

And puts a similar feel in Clojure:

(.. instance getState getName)

Without this macro, you would have to write:

(.getName (.getState instance))

It takes the return value of the first part (the inner expression above) and passes it into the second. The code above works great for chained method calls, but doesn’t help much with side effects.
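The AWS classes are not needed to try this out. Below is a small sketch using only java.lang.String, comparing the nested form, the .. form and the threading macro -> (an alternative not discussed above). The var names are made up for the example.

```clojure
(def greeting "  Hello, World  ")

;; Nested calls read inside-out.
(def nested (.toUpperCase (.trim greeting)))

;; `..` reads left to right; a list form passes arguments to that method.
(def chained (.. greeting trim toUpperCase (substring 0 5)))

;; `->` gives the same left-to-right shape, with explicit dots on each call.
(def threaded (-> greeting .trim .toUpperCase (.substring 0 5)))
```

nested is "HELLO, WORLD", while chained and threaded are both "HELLO".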

doto

I found doto useful when I needed to call setters while constructing POJOs. Calls to determine the status of a running EC2 instance return an object graph of several nested objects, and they were particularly awkward to test, since there were a decent number of objects to construct. Before I realized doto could help me, I had code that looked like the below:

(defn single-instance-result-example []
    (let [reservation-list (ArrayList.)
           instance-list (ArrayList.)
           reservation (Reservation.)
           instance (Instance.)
           instance-result (DescribeInstancesResult.)]
       (.setInstanceId instance "testinstance")
       (.add instance-list instance)
       (.setInstances reservation instance-list)
       (.add reservation-list reservation)
       (.setReservations instance-result reservation-list)
       instance-result))

I then transformed this into code that used nested dotos:

(defn single-instance-result-example []
  (doto (DescribeInstancesResult.)
    (.setReservations
      (doto (ArrayList.)
        (.add
          (doto (Reservation.)
            (.setInstances
              (doto (ArrayList.)
                (.add
                  (doto (Instance.)
                    (.setInstanceId "testinstance")))))))))))

I must admit that the dotos here took some time to get comfortable with. One interesting thing to note is that the above does not include all of the intermediate references to objects. In the first example I always passed the objects in, such as (.setInstanceId instance “testinstance”). This is no longer necessary with doto, because doto supplies the object as the first argument of each form. The intermediate let-bound variables are also not necessary. Seeing the above code, I felt like I still had some room for improvement. In the above example and in other areas of my code I was seeing a common pattern:

(doto (ArrayList.)
    (.add (doto...))
    (.add (doto...))
     ...)

So I created a quick macro that I called doto-list that would bundle that piece up:

(defmacro doto-list [& forms]
     `(doto (ArrayList.)
         ~@(map (fn [item] `(.add ~item)) forms)))

Which then made the function look like:

(defn single-instance-result-example []
  (doto (DescribeInstancesResult.)
    (.setReservations
      (doto-list
        (doto (Reservation.)
          (.setInstances
            (doto-list
              (doto (Instance.)
                (.setInstanceId "test")))))))))

Which I think is a nice improvement when I’m creating a decent amount of ArrayLists.
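To see what the macro does without the AWS classes on the classpath, here is a sketch that exercises doto-list with plain strings. It also shows an alternative that is not part of the original approach: passing a Clojure vector to the java.util.ArrayList copy constructor. The var names from-macro and from-vector are invented for the example.

```clojure
(import 'java.util.ArrayList)

(defmacro doto-list
  "Expands to (doto (ArrayList.) (.add form1) (.add form2) ...)."
  [& forms]
  `(doto (ArrayList.)
     ~@(map (fn [item] `(.add ~item)) forms)))

;; Each form is evaluated and added in order.
(def from-macro (doto-list "a" (str "b" "c") "d"))

;; Alternative: a Clojure vector is a java.util.Collection, so the
;; ArrayList copy constructor accepts it directly.
(def from-vector (ArrayList. ["a" (str "b" "c") "d"]))
```

Both produce a mutable ArrayList containing "a", "bc" and "d". The macro keeps the doto look of the surrounding code, while the constructor form needs no macro at all.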

memfn

The next piece I used in integrating with the AWS APIs was memfn. The call to describeInstances returns a List of Instance objects. There are quite a few fields on the Instance object (20+) and I was only interested in a few. Furthermore, I did not want the callers of my functions to have to know they were dealing with Java objects. One option was bean, which transforms an object into a map of its bean properties. It seemed like it would work for me, but it would pull over the fields I cared about along with a lot that I didn’t. It would also require knowledge of the Java object and the generated map structure. I thought a better way to do this would be memfn. The memfn macro takes a method name (and optionally argument names) and returns a function that takes an object as its first parameter. The function invokes the method on that object when called. A call through memfn looks like:

((memfn getPublicIpAddress) some-object)

This seemed closer to what I wanted, but what I really wanted was a function called “public-ip” that you could pass an instance to. So I ended up attaching a name to the memfn function that was returned by creating a little macro:

(defmacro defmemfn
     [name method-name & args]
     `(def ~name (memfn ~method-name ~@args)))

A call to defmemfn looks like:

(defmemfn public-ip getPublicIpAddress)

And asking for a public ip looks like:

(public-ip instance)
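The macro can be tried against java.lang.String instead of the AWS Instance class. In this sketch the names upper and sub-string are made up, and the last definition shows the anonymous-function form, a common alternative to memfn.

```clojure
(defmacro defmemfn
  "Defines `name` as a function that calls the Java method `method-name`
  on its first argument, passing along any remaining arguments."
  [name method-name & args]
  `(def ~name (memfn ~method-name ~@args)))

;; No extra arguments: wraps String.toUpperCase().
(defmemfn upper toUpperCase)

;; With arguments: wraps String.substring(int, int). The symbols
;; `start` and `end` only name the parameters of the generated function.
(defmemfn sub-string substring start end)

(def shouted (map upper ["a" "b"]))
(def id-part (sub-string "i-12345" 2 7))

;; The same thing as an anonymous function, without memfn.
(def shouted-2 (map #(.toUpperCase %) ["a" "b"]))
```

shouted and shouted-2 are both ("A" "B") and id-part is "12345". Callers only see ordinary Clojure functions, which was the goal with public-ip.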

Up Next – Testing

In conclusion, I think that working with the AWS APIs was easy thanks to Clojure’s great Java integration. Testing this code was also much easier than I thought, which I’ll cover in the next post.


Narrowing the Scope of Globals with Let

I’ve been reading On Lisp by Paul Graham. It’s about becoming a better Lisp programmer. It’s written for Common Lisp, but I have found quite a bit of it carries over into Clojure. One interesting code snippet I found in the book was on using let when two functions require the same value. Previously I would have done this through a def, like below:

(def step-by 7)
(defn increment [x] (+ x step-by))
(defn decrement [x] (- x step-by))

Paul Graham instead approaches it like:

(let [step-by 7]
  (defn increment1 [x] (+ x step-by))
  (defn decrement1 [x] (- x step-by)))

Obviously the example above is trivial; however, there are times when shared immutable data is necessary, such as database connection properties or connection URI information. I often have a small number of functions that need that sort of data. For these kinds of situations, I like this second approach. It more narrowly scopes the variable and I can’t think of any drawbacks. I think this also highlights a difference in approach between imperative languages and functional languages in general. Paul (on page 30 of On Lisp) describes imperative language code structure as more “solid and blockish” whereas functional code tends to be more “fluid”. The first example above fits the blockish imperative model where you define your variables and functions at the same level. This is exactly how I would go about it in Java. I’ve noticed that the Clojure code I’ve been writing, like the code I’ve written in OCaml, is definitely more fluid, with a less rigid block structure. I’ve gone through about 6 chapters of the book thus far and am looking forward to reading through the chapters on macros.
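A quick sketch makes the scoping difference visible: the functions are still defined at the top level of the namespace, but the shared value is not. The name step-by comes from the example above; the results named here are only for illustration.

```clojure
(let [step-by 7]
  ;; defn inside let still creates namespace-level vars, but the
  ;; functions close over the local step-by.
  (defn increment [x] (+ x step-by))
  (defn decrement [x] (- x step-by)))

(def incremented (increment 1))
(def round-trip (decrement (increment 10)))

;; step-by is a local, so no var of that name exists in the namespace.
(def step-by-var (resolve 'step-by))
```

incremented is 8, round-trip is 10, and step-by-var is nil. One trade-off worth keeping in mind (my own observation, not from the book) is that changing the shared value at the REPL means re-evaluating the whole let form rather than a single def.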


Installing Parliament on Ubuntu

What is Parliament?

Parliament is an open source triple store that is an improved version of DAMLDB. There is some good information on using the triple store in the User’s Guide. What’s interesting about Parliament is how it stores the triples. Relational databases are a more common implementation of an RDF store, but Parliament goes a different way. Parliament takes a linked-list style of approach: it uses BerkeleyDB for storing the URI values and then stores the triple of references in a linked list. For more information on their approach, there’s a great paper on it that can be found here.

Building Parliament for Ubuntu

Parliament has binaries on its website for Windows and Mac, but none for Ubuntu (or any other Linux distro). Parliament is written in C++ and Java, so make sure you have the g++ package and the JDK installed. You’ll also need a Subversion client to get the source and Ant to build the Java code. Below are the steps I went through to get Parliament to run on Linux:

  1. Build Boost Jam
    • Download and unzip Boost Jam
    • Run build.sh in the boost-jam directory
    • Put jam executable in your PATH
  2. Build the Boost C++ Libraries
    • Download and unzip Boost
    • cd into the boost directory and build Boost with the following command (modify accordingly if you’re not on a 64-bit system)
      bjam -q --build-dir=linux/build --stagedir=linux/stage
      --layout=versioned --with-test architecture=combined address-model=64
      variant=debug,release threading=multi link=shared runtime-link=shared
      stage
      
  3. Build and Install BerkeleyDB
    • Download and unzip BerkeleyDB 4.7.x (http://www.oracle.com/technology/software/products/berkeley-db/htdocs/popup/db/4.7.25/db-targz.html)
    • cd into /build/unix and type ../dist/configure
    • run make
    • run sudo make install
  4. Create the following environment variables (if you don’t already have them)
    • JAVA_HOME=/usr/lib/jvm/java-6-sun
    • BOOST_ROOT=/path/to/boost/boost_1_42_0
    • BDB_HOME=/usr/local/BerkeleyDB.4.7/
    • BOOST_BUILD_PATH=$BOOST_ROOT/tools/build/v2
    • BOOST_TEST_LOG_LEVEL=message
  5. Building Parliament
    • Check out the Parliament source: svn checkout --username anonsvn https://projects.semwebcentral.org/svn/parliament/trunk
    • Copy the parliament_dir/doc/Linux/*.jam files to ~/
    • The Parliament build uses pushd and popd, which are not built into /bin/dash (which is where /bin/sh is symlinked in Ubuntu). To fix this, I changed the /bin/sh symlink to /bin/bash
    • Copy build.properties.template from the Parliament source directory to build.properties
    • Comment out the various build architectures (for Mac and Windows) and make sure the line below is uncommented: nativeBuildParams=toolset=gcc-4.4 address-model=64 variant=release
  6. Source Changes
    • When I tried to build Parliament the first time, I received an error that the method remove could not be found when compiling Parliament/KbCore/FileHandle.cpp. It was due to the lines of code below:
      #if defined(PARLIAMENT_WINDOWS)
      	if (!DeleteFile(filePath.c_str()))
      #else
      	if (remove(filePath.c_str()) == -1)
      #endif
      

      I added an include to the top of the file:

      #if !defined(PARLIAMENT_WINDOWS)
      #	include <errno.h>
      #	include <fcntl.h>
      #	include <sys/stat.h>
      #      include <stdio.h> //<-- Added this
      

      to fix the problem.

    • The Ant build file includes the Mac environment variable for the C libraries, but not the Linux one. I changed the build.xml in the Parliament directory to include either the Linux or Mac environment variable depending on the build architecture:
      <condition property="libraryEnvVariable" value="DYLD_LIBRARY_PATH"> <!-- Line 290 -->
        <os family="mac"/>
      </condition>
      <condition property="libraryEnvVariable" value="LD_LIBRARY_PATH">
        <and>
          <os family="unix"/>
          <not><os family="mac"/></not>
        </and>
      </condition>
      ...
      <env key="${libraryEnvVariable}" path="${artifactsDir}/${nativeArtifactsDir}"/> <!-- Line 302 -->
      
  7. From the source tree root, run ant
  8. Copy Parliament-v2.6.7.0-InsertPlatformHere.zip from the target directory to your install directory (can be anywhere)
  9. In the install directory copy all of the files from gcc-4.4/release/64/ to the ParliamentKB directory
  10. Run the StartParliament.sh script to start Parliament


Clojure Protocols Part 3

Recently there have been some changes to the Clojure Protocols code out on Github. Not huge changes, but enough that the examples I wrote in Part 1 and Part 2 will no longer work. I thought I’d finish out my protocol blog entries by showing how I used protocols and include the new syntax. I also have a better understanding of how reify can be used (thanks Meikel) and will include some of that. First, the goal of the protocol usage. I have been working on some comparisons and evaluations of triplestores. Triplestores can be used to store RDF data, which is a series of subject/predicate (or property)/object triples. There are many triplestores out there, and many of them have several interfaces. For example, Oracle has a JDBC interface that uses stored procedures and a Jena API that incorporates pieces of the Jena framework. This was some pretty low-hanging fruit from an abstraction perspective. Whether inserting a new triple in Oracle JDBC, Jena (with Oracle) or one of the other triplestore implementations, on the surface it is the same: take this subject, predicate and object and store it. The same could be said for querying it with SPARQL or deleting entries. I ended up with a protocol named TriplestoreOperations like below:

(ns revelytix.triplestore-operations)

(defprotocol TriplestoreOperations
  "Interface for the various operations allowed by a triple store"
  (create-graph [impl graph-name] "Creates a new graph of name graph-name")
  (delete-graph [impl graph-name] "Deletes graph graph-name if graph exists")
  (insert-quad [impl graph-name subject predicate object]
    "Creates a new triple, data is assumed to be a full URI")
  ;; ...
  )

The defprotocol syntax is the same as before. The first argument is used to pass in the implementation of TriplestoreOperations. The graph-name, or model in Oracle terms, is what is going to hold the triples. The protocol exists in one namespace (called triplestore-operations above) and the implementations of the protocol are in separate namespaces. The first is an Oracle JDBC implementation of TriplestoreOperations. It’s parameterized by the database connection details and the name of the table to store the data in.

(ns oracle.oracle-jdbc
  (:use clojure.contrib.sql
        triplestore-operations))

(deftype OracleJdbcOperations [db table-name]
  TriplestoreOperations
  (delete-graph [impl graph-name]
    (let [drop-model-string (create-sql-string DROP-MODEL-SQL graph-name)
          drop-table-string (create-sql-string DROP-TABLE-SQL table-name)]
      (with-connection db
        (with-open [drop-model-statement (.prepareCall (connection) drop-model-string)]
          (do
            (drop-entailment-if-exists db graph-name "RDFS")
            (.execute drop-model-statement)
            (do-commands drop-table-string))))))
  (create-graph [impl graph-name]
    (let [createModelString (create-sql-string CREATE-MODEL-SQL graph-name table-name)
          createTableString (create-sql-string CREATE-TABLE-SQL table-name)]
      (do (with-connection db
            (with-open [createModelStatement
                        (.prepareCall (connection) createModelString)]
              (do-commands createTableString)
              (.execute createModelStatement))))))
  (insert-quad [impl graph-name subject predicate object]
    (create-family-triple table-name db graph-name subject predicate object))
  ;; ...
  )

(defn create-oracle-jdbc-triplestore-instance [table-name]
  (OracleJdbcOperations *oracle-jdbc-props* table-name)) ;; Awkward, see below

One difference between the above code and the code in Part 1 or Part 2 is that the implementation parameter disappeared in the previous version of deftype. So the create-graph function above would have had only a single parameter. I like the change; I found the original code a little confusing, wondering where the first parameter went etc. The next implementation of the TriplestoreOperations protocol was a Jena implementation. The below code makes use of reify and feels a little more like idiomatic Clojure and less like the implementation of a protocol is something special and different from just functions. I like the reify syntax over deftype and I’ve been moving my code over to use it. I’m going to cut a decent portion of the implementation below because it mostly calls Java APIs and is a bit noisy:

(ns jena-operations
  (:use triplestore-operations)
  ;; ...
  )

(defn create-jena-operations-instance [jena-support-impl]
  (reify TriplestoreOperations
    (create-graph [impl modelString] nil)
    (delete-graph [impl modelString]
      (with-triplestore-connection
        ;; ...
        ))
    (insert-quad [impl modelString subject predicate object]
      (with-triplestore-connection
        ;; ...
        ))
    ;; ...
    ))

The reify call above also creates a new instance of the protocol TriplestoreOperations, with the functions defined inline. There’s no need to separately create an instance of the type, as is done in the previous example. The end result of deftype or reify is the same from a functionality perspective; there’s just a different way to get there. Reading through some of the docs, it looks like reify is more dynamic and deftype results in generated code. One difference between Jena and the Oracle JDBC interface is that graphs don’t need to be created explicitly using Jena, so that method does nothing. The above code is slightly different from the earlier parts as well in that the implementation parameter no longer disappears. Another interesting part is that the JenaOperations instance is parametrized by another protocol called JenaSupport. What I have found is that many vendors support the Jena APIs, but they implement them slightly differently. It’s definitely not as pluggable as something like JDBC. This JenaOperations implementation is generic for the Jena APIs and is used by several triplestores with Jena implementations. The JenaSupport protocol abstracts things like getting a Jena connection, creating the correct implementation of Model etc., which differ from implementation to implementation.
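Because the Oracle and Jena implementations depend on external libraries, the deftype versus reify comparison is easier to try with a stand-in. The sketch below implements the three protocol functions shown earlier against an in-memory atom. InMemoryOperations, triples, create-in-memory-instance and create-logging-instance are hypothetical names for this illustration, and the trailing-dot constructor (InMemoryOperations.) is the form used by released Clojure versions, which may differ from the snapshot used in this post.

```clojure
(defprotocol TriplestoreOperations
  "Interface for the various operations allowed by a triple store"
  (create-graph [impl graph-name] "Creates a new graph of name graph-name")
  (delete-graph [impl graph-name] "Deletes graph graph-name if graph exists")
  (insert-quad [impl graph-name subject predicate object]
    "Creates a new triple, data is assumed to be a full URI"))

;; deftype: a named type whose fields (here `store`) are visible in the methods.
(deftype InMemoryOperations [store]
  TriplestoreOperations
  (create-graph [impl graph-name]
    (swap! store assoc graph-name #{})
    impl)
  (delete-graph [impl graph-name]
    (swap! store dissoc graph-name)
    impl)
  (insert-quad [impl graph-name subject predicate object]
    (swap! store update-in [graph-name] (fnil conj #{}) [subject predicate object])
    impl))

(defn create-in-memory-instance []
  (InMemoryOperations. (atom {})))

(defn triples
  "Returns the set of triples stored in graph-name, or nil if the graph is absent."
  [^InMemoryOperations ops graph-name]
  (get @(.store ops) graph-name))

;; reify: an anonymous instance that closes over its surroundings (here `log`).
(defn create-logging-instance [log]
  (reify TriplestoreOperations
    (create-graph [impl graph-name]
      (swap! log conj [:create graph-name]) nil)
    (delete-graph [impl graph-name]
      (swap! log conj [:delete graph-name]) nil)
    (insert-quad [impl graph-name subject predicate object]
      (swap! log conj [:insert graph-name subject predicate object]) nil)))
```

Callers use create-graph, delete-graph and insert-quad the same way on either instance; only the construction differs.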

Development Gotchas

I have found a few issues when developing Clojure code that uses protocols. I’m using Leiningen and Lein Swank for development of the code. First, I found that if I had AOT compilation enabled and had run lein install, the protocol definition resulted in compiled code in the classes directory of the project. This caused a problem when I tried to change a protocol definition. I’d make a change in Emacs, load the file with the updated protocol code, and the code would behave as if I had made no change to the protocol at all. What was happening was that the old version of the code, the one that had the interface code generated, was still on the classpath in the classes directory. Removing that code (through lein clean or something similar) allowed my changes to take effect. This problem stumped me for a couple of hours. I can avoid this entirely by just not using AOT compilation (I don’t really need it), but others might not be able to.

Another gotcha I found was in the loading of files that use implementations of protocols. In the example above, let’s say I have a test file (I’ll call it test-A) that executes functions from TriplestoreOperations on the JenaOperations implementation, which in turn uses the Oracle implementation of JenaSupport. Just loading the test-A.clj file does not cause the loading of the Jena implementation of TriplestoreOperations, or the Oracle version of JenaSupport. Rather, it just complains that there is not an implementation of TriplestoreOperations for ‘nil’. Loading those files individually fixes the problem; it just doesn’t happen automatically for me.


Clojure Protocols Part 2

Stale code warning

There have been small changes to the protocols code in Clojure. The below post is still useful, but a few details of the example code are different. See part 3 for the updated syntax.

Clojure Protocols Part 2

This is the second in the series of blog entries on Clojure protocols. The first can be found here. This entry continues by using protocols to implement Java interfaces and by reifying interfaces/protocols inline in a function invocation. First I’ll use reify to define an implementation of the TextOutput interface inline in the function call. I’ll change the italics syntax to the MediaWiki italics format:

(println (output-string (reify TextOutput
		      (output-string [x] (str "''" x "''"))) "stuff"))
''stuff''

The acceptable things to reify are Java interfaces, protocols or Object. I’ve not yet found a use for reify in code that I have written. Regular Java interfaces can also be passed to deftype to define Clojure implementations of Java interfaces. An implementation of Comparator looks like the below:

(deftype ThreeCompare [] java.util.Comparator
	       (compare [o1 o2]
			(cond (= o1 3) -1
			      (= o2 3) 1
			      :else (.compareTo o1 o2)))
	       (equals [other] (isa? other ThreeCompare)))

The deftype above implements the java.util.Comparator interface so that, when sorting a list of numbers, any values of 3 are always put first in the list, followed by the rest in ascending order. This can be used like any Java implementation of Comparator:

(def java-list (java.util.ArrayList. (list 1 2 3 4 5 6 7 8)))
(java.util.Collections/sort java-list (ThreeCompare))
(println java-list)
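For comparison, here is a sketch of the same comparator written with reify in the syntax of released Clojure versions, where the instance is an explicit first parameter (named _ here because it is unused). The name three-first and the extra equality check, which keeps the ordering consistent when both values are 3, are my own additions rather than part of the original example.

```clojure
(def three-first
  (reify java.util.Comparator
    (compare [_ o1 o2]
      (cond
        (= o1 o2) 0                 ; equal values, including 3 vs 3
        (= o1 3)  -1                ; 3 sorts before everything else
        (= o2 3)  1
        :else     (clojure.core/compare o1 o2)))))

(def java-list (java.util.ArrayList. [1 2 3 4 5 6 7 8]))
(java.util.Collections/sort java-list three-first)

(def sorted (vec java-list))
```

sorted is [3 1 2 4 5 6 7 8]: the 3 first, then the rest in ascending order.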

A nice side benefit of deftype is something that reminded me of records in OCaml:

(deftype Point [x y])
(defn midpoint [point1 point2]
	    (Point (/ (+ (:x point1) (:x point2)) 2)
		   (/ (+ (:y point1) (:y point2)) 2)))
(println (midpoint (Point -1 2) (Point 3 -6)))
#:Point{:x 1, :y -2}

The above code defines a new type Point and a midpoint function that takes two points and returns a new Point that represents the midpoint of the two points.
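In released versions of Clojure, keyword lookup such as (:x point1) is provided by defrecord rather than deftype, so the closest current equivalent looks like the sketch below. It assumes Clojure 1.3 or later for the ->Point factory function; the var name middle is made up.

```clojure
(defrecord Point [x y])

(defn midpoint [point1 point2]
  (->Point (/ (+ (:x point1) (:x point2)) 2)
           (/ (+ (:y point1) (:y point2)) 2)))

(def middle (midpoint (->Point -1 2) (->Point 3 -6)))
```

middle is a Point with :x 1 and :y -2, and because records have value equality it is equal to (->Point 1 -2).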

Default Implementations

One feature I was looking for when I first incorporated protocols into some existing Clojure code was the concept of a default implementation of a protocol within a namespace. I think this would be a pretty typical usage of a protocol, you might have several implementations, but generally you’re only working with one at a time. In my case, I was testing three implementations of a protocol for an integration test. I wanted to run the same tests on all three implementations of the protocol. This presented a problem because the deftest macro doesn’t allow passed in parameters and yet each function that I called needed to be parametrized based on the implementation I was testing. I first attacked this problem with a bound variable and then just had all functions called on the protocol use the bound variable as their implementation. Then when a switch to another implementation was needed, I’d change the implementation assigned to the bound variable. This worked for me because it was just test code, but I think this will come up more in the future.
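The bound-variable approach described above can be sketched as follows. All names here (*output-impl*, emphasize, implementations, run-against-all) are hypothetical, the protocol is the TextOutput example from this series written in the released syntax with an explicit first parameter, and ^:dynamic assumes Clojure 1.3 or later.

```clojure
(defprotocol TextOutput
  (output-string [impl s]))

;; The "default implementation" for code in this namespace.
(def ^:dynamic *output-impl* nil)

;; Code under test never names an implementation; it uses the bound one.
(defn emphasize [s]
  (output-string *output-impl* s))

(def implementations
  {:underscore (reify TextOutput (output-string [_ s] (str "_" s "_")))
   :mediawiki  (reify TextOutput (output-string [_ s] (str "''" s "''")))})

(defn run-against-all
  "Calls the no-arg function `f` once per implementation, with that
  implementation bound, and returns a map of label to result."
  [f]
  (into {}
        (for [[label impl] implementations]
          [label (binding [*output-impl* impl] (f))])))

(def outcomes (run-against-all #(emphasize "stuff")))
```

outcomes is {:underscore "_stuff_", :mediawiki "''stuff''"}. The same wrapper could call a test function instead of emphasize, which gets around deftest not accepting parameters.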


Clojure Protocols Part 1

Stale code warning

There have been small changes to the protocols code in Clojure. The below post is still useful, but a few details of the example code are different. See part 3 for the updated syntax.

Clojure Protocols

Protocols are a new feature in Clojure, set to be released in the next version. They provide polymorphism in a very Clojure-ish way. I think it’s a great lightweight polymorphism implementation that has a lot of potential. In true Clojure style, I think it meets the polymorphism objective and yet doesn’t need to totally change the way you already write your code in Clojure. I’m breaking this entry into more than one piece to show some different ways that Clojure protocols can be used. Because it’s so new, there are not a lot of docs out there on it, but Rich has written some good documentation on the macros themselves. If you want to try these examples, make sure you’re running the 1.2 version of Clojure (from Clojars or a local build from the Clojure git repo). First I’ll start by defining a simple protocol:

(defprotocol TextOutput
  (output-string [x string]))

In Java terms, I’m defining a TextOutput interface (a Java interface actually is being created, but more on that later) that has a single function named output-string and includes no implementation details. The input to this function is a little tricky, though. I specified a parameter x and another one called string. The first parameter is used to pass the implementation of the interface into the function. You don’t need to write code to handle the parameter x, and when you write your implementation, you’ll act like it doesn’t exist. A wiki-style implementation that outputs an italicized string would look like:

(deftype ItalicsOutput [] TextOutput
  (output-string [string] (str "_" string "_")))

I have begun thinking about this in Java terms as a class ItalicsOutput that implements the TextOutput interface. Here in the output-string function, I only specify one parameter (not two). Next you can use this implementation with the following code:

(output-string (ItalicsOutput) "stuff")
"_stuff_"

I’m telling Clojure I want it to execute the output-string function on the implementation (ItalicsOutput) (more on this below) with the argument “stuff”. I think the version below is a little more readable:

(def italics-impl (ItalicsOutput))
(output-string italics-impl "stuff")
"_stuff_"

This just assigns the instantiated implementation to a variable, which can then be used. These implementations can also have parameters, like:

(deftype PrefixedOutput [prefix-string] TextOutput
  (output-string [string] (str prefix-string " " string)))

I think passing an argument in makes the instantiation step a little easier to understand:

(def prefix-with-more (PrefixedOutput "more"))
(output-string prefix-with-more "stuff")
"more stuff"

Both implementations can be used together as well:

(defn print-all []
  (let [italics-impl (ItalicsOutput)
        prefix-with-more (PrefixedOutput "more")]
    (println (output-string italics-impl "stuff"))
    (println (output-string prefix-with-more "stuff"))))

With output that would look like:

(print-all)
_stuff_
more stuff
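
As the stale code warning notes, the deftype syntax changed before protocols were released. What follows is a sketch of the same example in the syntax of released Clojure (1.2 and later), where each method names its receiver explicitly as the first argument and instances are created with the trailing-dot constructor form. It is a translation of the code above, not the author's part 3 code.

```clojure
(defprotocol TextOutput
  (output-string [x string]))

;; Methods declare the receiver explicitly (conventionally `this` or `_`).
(deftype ItalicsOutput []
  TextOutput
  (output-string [_ string] (str "_" string "_")))

;; Fields such as prefix-string are in scope inside method bodies.
(deftype PrefixedOutput [prefix-string]
  TextOutput
  (output-string [_ string] (str prefix-string " " string)))

(defn print-all []
  ;; (ItalicsOutput.) is constructor sugar for (new ItalicsOutput).
  (let [italics-impl (ItalicsOutput.)
        prefix-with-more (PrefixedOutput. "more")]
    (println (output-string italics-impl "stuff"))
    (println (output-string prefix-with-more "stuff"))))

(print-all)
;; prints:
;; _stuff_
;; more stuff
```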


SICP – Chapter 1

I have begun reading through Structure and Interpretation of Computer Programs with a study group (an offshoot of the Lambda Lounge). It’s a classic computer science textbook that I’ve wanted to read for a while now, and I’m amazed that I have avoided reading it thus far. Maybe I missed the SICP cut-off because the book is older. I have to say, I’m impressed with the pace of the book. It’s partially a function of the language, but I really like how the text gets to the bare metal, in that it builds everything from the ground up. Scheme allows it to do this because many things that are syntax in other languages (like basic arithmetic operations) are not syntax in Scheme. Languages like Java have operations like addition and division built into the syntax of the language. My answers to the exercises in the textbook are here. They are written in Clojure and I’ve been pretty surprised at how closely Clojure code corresponds with Scheme code.
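
To make the point about syntax concrete, here is a short sketch in Clojure, which shares this property with Scheme: + and / are ordinary functions, so they can be passed around like any other value.

```clojure
;; Arithmetic operators are plain functions called in prefix position.
(+ 1 2 3)                   ;; => 6

;; Because + is a value, it can be handed to higher-order functions.
(reduce + [1 2 3 4])        ;; => 10
(map + [1 2 3] [10 20 30])  ;; => (11 22 33)

;; Division is also just a function; on integers Clojure returns a ratio.
(/ 6 4)                     ;; => 3/2
```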

The Good Stuff

As I described above, I think Abelson and Sussman getting to the bare metal of Scheme is a real benefit. I’m sure it took a lot of restraint to start small and not use the fancy macros or functions early on. I really liked the way that they described the benefits of tail recursion. I have been asked on several occasions to give such a description, and I have to say their approach using visuals is much better than mine. I will definitely be borrowing theirs when asked that question in the future. Building on that, I thought that the exercises in 1.2 did a good job of covering how to go about making something tail recursive. I think their coverage of higher order functions was thorough, and I look forward to them revisiting the flexibility of this in the coming chapters.

The Bad Stuff

I thought their coverage of recurrences and asymptotic growth was particularly bad. I think recurrences are a very tough topic, and section 1.2.2 barely skims the surface, which is not enough preparation for an exercise like 1.13. Maybe the students at MIT had a prerequisite that covered that or something, but I spent many hours in grad school trying to understand recurrences and I know I would have been drowning with such light coverage of the topic. Maybe asymptotic growth will be covered in depth in another section, but I thought that just skimmed the surface as well. The only other negative comment I have is that the amount of math makes the book less approachable. I know why they did it: it’s the only base that could be built upon easily for their bare metal sort of approach. I also don’t think you need a very extensive math background to read it; they don’t come in expecting a whole lot. But math is intimidating to many people, and just having something that smells and looks like hard math will turn people away.

Conclusion

In conclusion, chapter 1 from SICP has been worth the time and definitely a good start for a foundation in computer science. I think that the first chapter is a good read for any software developer. I’m looking forward to continuing with the rest of the book.


Worse is Better and Clojure

I’ve been writing code in Clojure now for a few weeks and I’m really enjoying the simplicity and power of the language. I think that the progress being made right now in the Clojure community is great and that there are definitely good things to come. I couldn’t help but think back to the Worse is Better series of papers during the first week or so I was learning the language. For those who haven’t read the paper, I’d highly recommend it, along with the rebuttal found here and another here. I wrote a blog entry about it 3 or 4 years ago, but unfortunately it looks like it’s been taken down. It was on a company blog and it looks like it’s been replaced with another blog system.

I remember reading the article for the first time and realizing how right Richard Gabriel was and how I wanted him to be wrong. The realization that the best solution to the problem isn’t always the right solution floored me. As someone who enjoys a hard problem and tries hard to come up with the best solution I can, I found the C analogy very thought provoking. This brings me to Clojure. Clojure seems like it might well be the compromise talked about in Worse is Better, yet with enough of the essence of Lisp to still have the right solution. There’s no doubt that the Clojure folks have had to make some compromises to fit into the mould of the JVM. An example is tail recursion in Clojure, implemented via the recur special form. As a user of the language, I obviously would prefer tail calls to just work, without me having to ask for them. That is a hard problem in the context of the JVM, so I understand the decision. This felt to me like the PC loser-ing problem in the context of Worse is Better. Although the right decision might be to crack the hard problem or, worse yet, wait for tail calls on the JVM, this seems like a small trade-off that is still workable.
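
A small sketch of what this trade-off looks like in practice. The function names are made up for illustration. The first version calls itself by name in tail position, which still consumes a JVM stack frame per call; the second uses recur, which the compiler turns into a loop.

```clojure
;; Self-call in tail position: correct, but each call uses a stack frame,
;; so a large enough n can throw StackOverflowError.
(defn count-down-naive [n]
  (if (zero? n)
    :done
    (count-down-naive (dec n))))

;; Same logic with recur: runs in constant stack space.
(defn count-down [n]
  (if (zero? n)
    :done
    (recur (dec n))))

(count-down 1000000)
;; => :done
```

The cost is the explicit keyword. In exchange, the compiler rejects a recur that is not in tail position, so a call that was meant to be a loop cannot silently grow the stack.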

Another good call by the Clojure folks, in my opinion, is the Java integration. Below is a quote from Worse is Better on integration:

In the worse-is-better world, integration is linking your .o files together, freely intercalling functions, and using the same basic data representations. You don’t have a foreign loader, you don’t coerce types across function-call boundaries, you don’t make one language dominant, and you don’t make the woes of your implementation technology impact the entire system.

Sound familiar? Not only is calling Java from Clojure seamless, there’s actually syntax sugar (through macros) to make calling Java code easier. There’s no need to convert everything over to a specific Clojure object format or anything like that; it just works. You might have to make a Java collection seq-able or something similar, but it’s pretty minimal fuss. There are also facilities for Clojure code to create Java proxies and Java interfaces (though I’ve not used them). This allows Java code to integrate with Clojure code. It seems to me that the Java integration in Clojure very much fits with the quote from Richard Gabriel. I believe this tight integration will be the path into Clojure for many developers.
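
A brief sketch of the interop features mentioned above, using only JDK classes. The var names (names, greeter) are illustrative.

```clojure
;; Method-call and constructor sugar.
(.toUpperCase "clojure")            ;; => "CLOJURE"
(def names (java.util.ArrayList.))  ;; same as (new java.util.ArrayList)

;; doto threads one object through several method calls.
(doto names
  (.add "rich")
  (.add "hugo"))

;; A Java collection can be consumed directly by sequence functions.
(map count names)                   ;; => (4 4)

;; proxy creates an object implementing a Java interface from Clojure.
(def greeter
  (proxy [java.util.concurrent.Callable] []
    (call [] "hello from Clojure")))

(.call ^java.util.concurrent.Callable greeter)
;; => "hello from Clojure"
```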
