Previously: Debugging.
In this chapter, we’ll discuss some of Clojure’s mechanisms for polymorphism: writing programs that do different things depending on what kind of inputs they receive. We’ll show ways to write open functions, which can be extended to new conditions later on, without changing their original definitions. Along the way, we’ll investigate Clojure’s type system in more detail–discussing interfaces, protocols, how to construct our own datatypes, and the relationships between types which let us write flexible programs.
Thus far, our functions have taken one type of input. For example:
(defn append
"Adds an element x to the end of a vector v."
[v x]
(conj v x))
scratch.polymorphism=> (append [1 2] 3)
[1 2 3]
But we might want to append to more than vectors. What if we wanted to append something to the end of a list?
scratch.polymorphism=> (append '(1 2) 3)
(3 1 2)
Since conj
prepends to lists, our append
function doesn’t work correctly here. We could redefine append
in a way that works for both vectors and lists–for instance, using concat
:
(defn append-concat
"Adds an element x to the end of a collection coll by concatenating a
single-element list (x) to the end of coll."
[coll x]
(concat coll (list x)))
But this is less than ideal: concat
produces a wrapper object every time we call append-concat
, which introduces unnecessary overhead when working with vectors. What we would like is a function which does different things to different types of inputs. This is the heart of polymorphism.
A Simple Approach
We have a function type
which returns the type of an object. What if append asked for the type of collection it was being asked to append to, and did different things based on that type? Let’s check the types of lists and vectors:
(type [1 2])
clojure.lang.PersistentVector
(type '(1 2))
clojure.lang.PersistentList
Okay, so we could try checking whether the type of our collection is a PersistentVector, and if so, use conj
to append an element efficiently!
(defn append
"Adds an element x to the end of a collection coll. Coll may be either a
vector or a list."
[coll x]
(condp = (type coll)
clojure.lang.PersistentVector
(conj coll x)
clojure.lang.PersistentList
(concat coll (list x))))
As an aside: we’re using condp =
instead of case
, even though case
might seem like the obvious solution here. That’s because case
uses optimizations which require that each case is a a compile-time constant, and classes like clojure.lang.PersistentVector
aren’t actually constant in that sense. Don’t worry too much about this—it’s not important for understanding this chapter. The important question is: does this approach of checking the type at runtime work? Can we append to both vectors and lists?
scratch.polymorphism=> (append [1 2] 3)
[1 2 3]
scratch.polymorphism=> (append '(1 2) 3)
(1 2 3)
It does! We’ve written a polymorphic function which can take two different kinds of input, and does different things depending on what type of input was provided. Just to confirm, let’s try an empty list:
scratch.polymorphism=> (append '() 3)
IllegalArgumentException No matching clause: class clojure.lang.PersistentList$EmptyList scratch.polymorphism/append (polymorphism.clj:7)
Oh shoot. Are empty lists… a different type?
scratch.polymorphism=> (type '())
clojure.lang.PersistentList$EmptyList
Indeed, they are. Empty lists have a special type in Clojure: clojure.lang.PersistentList
is not the same type as clojure.lang.PersistentList$EmptyList
. Why, then, are they mostly interchangeable? What is it that lets ()
and (1 2 3)
behave as if they were both the same type of thing?
Subtypes
Most languages have a notion of a relationship between types. The exact nature of these relationships is complex and language-specific, but informally, most languages have a way to express that type A
is a subtype of type B
, and conversely, B
is a supertype of A
. For instance, type Cat
might be a subtype of type Animal
. This allows us to write functions which depend only on properties of Animal
, in such a way that they work automatically on Cat
s, Dog
s, Fish
, and so on. This is another form of polymorphism!
Some languages organize their types into a tree, such that each type is a subtype of exactly one other type ( except for a single “all-inclusive” type, often called Top
or Object
). We might say, for instance, that Cat
s are Animal
s, AlarmClock
s are Electronic
s, and both Animal
s and Electronic
s are Object
s.
This sounds straightforward enough, but types rarely fall into this kind of tree-like hierarchy neatly. For instance, both Cat
s and AlarmClock
s can yowl at you when you’d really prefer to be sleeping. Perhaps both should be subtypes of Noisemaker
! But not all Animal
s are Noisemaker
s, nor are all Noisemaker
s Animal
s. Down this path lies madness! For this reason, most type systems allow a type to have multiple supertypes: a Cat
can be both a Noisemaker
and an Animal
. In the JVM—the program which underlies Clojure—there are (and I speak very loosely here: we’re going to ignore primitives and smooth over all kinds of internal details) two kinds of types, and both of these kinds of relationships are in play.
The types of JVM values—things like java.lang.Long
, java.lang.String
, clojure.lang.PersistentVector
, etc.—are called classes. If you have a value like 2
or ["foo" :bar]
in Clojure, that value’s type is a class. Each class is a subtype of exactly one other class, except for Object
, the JVM’s Top class.
The other kind of JVM type is called an interface (or an abstract class—we’ll use “interface” to refer to both throughout this chapter) and it defines the behavior for a type. In essence, an interface defines a collection of functions which take an instance of that interface as their first argument. Both classes and interfaces can be a subtype of any number of interfaces. Clojure uses interfaces to define the behavior of things like “a list” or “something you can look up values in”, and provides a variety of classes, each optimized for a different kind of work, which are subtypes of those interfaces. These shared interfaces are why we can have two types of lists which work the same way.
We can see these relationships between types in Clojure with the supers
function, which returns the supertypes of a given type:
scratch.polymorphism=> (supers clojure.lang.PersistentList$EmptyList)
#{clojure.lang.Obj clojure.lang.IPersistentCollection clojure.lang.IMeta clojure.lang.IObj clojure.lang.Sequential java.lang.Iterable java.io.Serializable clojure.lang.IPersistentStack java.lang.Object clojure.lang.IHashEq clojure.lang.IPersistentList clojure.lang.Seqable clojure.lang.ISeq clojure.lang.Counted java.util.List java.util.Collection}
scratch.polymorphism=> (supers clojure.lang.PersistentList)
#{clojure.lang.Obj clojure.lang.IPersistentCollection clojure.lang.IReduce clojure.lang.IMeta clojure.lang.IObj clojure.lang.Sequential java.lang.Iterable java.io.Serializable clojure.lang.IPersistentStack java.lang.Object clojure.lang.IHashEq clojure.lang.IPersistentList clojure.lang.Seqable clojure.lang.ISeq clojure.lang.ASeq clojure.lang.Counted java.util.List java.util.Collection clojure.lang.IReduceInit}
A few of these types, like java.lang.Object
, are actual classes. The rest are interfaces. Note that these sets are almost identical: empty and non-empty lists share almost all their supertypes. Both, for example, are subtypes of clojure.lang.Counted
, which means that they keep track of how many elements they contain—the count
function uses Counted
to count collections efficiently. Both are clojure.lang.Seqable
, which means they can be interpreted as a sequence of objects—that’s why we can call map
, filter
, and so on over lists. Most relevant for our purposes, both are kinds of clojure.lang.IPersistentList
, which defines the core of how lists work: using cons
to prepend elements. Let’s change our append
function to use the IPersistentList
type instead, and see if it lets us append to empty lists.
(defn append
"Adds an element x to the end of a collection coll. Coll may be either a
vector or a list."
[coll x]
(condp = (type coll)
clojure.lang.PersistentVector
(conj coll x)
clojure.lang.IPersistentList
(concat coll (list x))))
scratch.polymorphism=> (append '() 1)
IllegalArgumentException No matching clause: class clojure.lang.PersistentList$EmptyList scratch.polymorphism/append (polymorphism.clj:7)
Ah, of course. We’re asking if the types of coll
is equal to clojure.lang.IPersistentList
, but they’re not actually the same type. What we want to know is if the type of coll
is a subtype of clojure.lang.IPersistentList
. Let’s check if any of coll
’s supertypes match as well:
(defn append
"Adds an element x to the end of a collection coll. Coll may be either a
vector or a list."
[coll x]
(let [t (type coll)
types (conj (supers t) t)]
(cond (types clojure.lang.PersistentVector)
(conj coll x)
(types clojure.lang.IPersistentList)
(concat coll (list x))
true (str "Sorry, I don't know how to append to a "
(type coll) ", which has supertypes " types))))
scratch.polymorphism=> (append '() 1)
(1)
We’ve generalized our function from depending on specific types to depending on a type or its supertypes. What about… a lazy sequence, like the ones returned by map
?
scratch.polymorphism=> (append (map inc [1 2 3]) 5)
"Sorry, I don't know how to append to a class clojure.lang.LazySeq, which has supertypes #{java.util.List clojure.lang.IHashEq java.io.Serializable clojure.lang.IObj clojure.lang.IPersistentCollection clojure.lang.ISeq java.util.Collection java.lang.Iterable clojure.lang.Seqable clojure.lang.IPending clojure.lang.Sequential java.lang.Object clojure.lang.IMeta clojure.lang.Obj}"
We could add another clause for LazySeq
to our definition of append
—but would it actually be any different from how we append to lists? If we plan to concat
for both, perhaps we should search for a type that sequences and lists have in common.
(require '[clojure.set :as set])
scratch.polymorphism=> (set/intersection (supers clojure.lang.IPersistentList) (supers clojure.lang.LazySeq))
#{clojure.lang.IPersistentCollection clojure.lang.Seqable clojure.lang.Sequential}
These types have three supertypes in common. One is IPersistentCollection
, which defines how any Clojure collection works, including sets, maps, etc. Another is Seqable
, which means that the collection can be interpreted as a sequence of values—this too applies to sets and maps. The final type in common is Sequential
, which applies only to collections with a well-defined order: lists and vectors, but not sets and maps. If we think of append
as operating only over ordered collections, we should define it in terms of Sequential, rather than Seqable.
(defn append
"Adds an element x to the end of any sequential collection--faster for vectors."
[coll x]
(let [t (type coll)
types (conj (supers t) t)]
(cond (types clojure.lang.PersistentVector)
(conj coll x)
(types clojure.lang.Seqable)
(concat coll (list x))
true (str "Sorry, I don't know how to append to a "
(type coll) ", which has supertypes " types))))
scratch.polymorphism=> (append (map inc [1 2 3]) 5)
(2 3 4 5)
Now our function is even more general: it can accept vectors, lists, and lazy sequences of all kinds, while being smart about it: for vectors, it efficiently adds elements to the end using conj
, and for other Sequential types, it falls back to using concat
.
This idea—checking a value’s type and supertypes—is so useful that there’s a special function for it. We say that a value v
is an instance of type T
if v
’s type, or any of its supertypes, is T
. We can use the instance?
function to ask if this is so!
scratch.polymorphism=> (instance? clojure.lang.PersistentVector [])
true
scratch.polymorphism=> (instance? clojure.lang.PersistentVector (list))
false
Thanks to the instance?
function, we don’t need to compute the set of types and supertypes ourselves.
(defn append
"Adds an element x to the end of any sequential collection--faster for
vectors."
[coll x]
(cond (instance? clojure.lang.PersistentVector coll)
(conj coll x)
(instance? clojure.lang.IPersistentList coll)
(concat coll (list x))
true (str "Sorry, I don't know how to append to a "
(type coll))))
Wonderful! The supertype machinery disappears, and we’re left with something that asks succinctly about how a value might behave.
This is a perfectly valid way to write a polymorphic function, but it has an important limitation. Whenever someone finds or creates a new type they’d like to append to, they have to edit the append
function to add support for that type. This is one half of a classic dilemma in programming languages known as the expression problem. It would be nice if we could define functions piece by piece, so that we could add support for different types without changing the original definition of the function. This is the motivation behind Clojure’s multimethods.
Multimethods
A multimethod is a special kind of function. Instead of a function body, it has a dispatch function, which takes the arguments to the function and tells us not what to return, but how to find a particular implementation of that function. We define the implementations (essentially, the function bodies) separately.
To define a multimethod, use defmulti
:
(defmulti append
"Appends an x to collection coll."
(fn [coll x] (type coll)))
Here, we’re defining an append
function. This will overwrite our append
function from earlier, so you can rename or delete the original to avoid the conflict, if you like. Like defn
, we provide a docstring. Unlike defn
, we follow that with a dispatch function, which takes two arguments (coll
and x
) and returns the type of coll
. The return value of the dispatch function is how Clojure decides which implementation to use. All together, this defmulti
says “the behavior of append
, a function of two arguments, depends on the type of its first argument.”
Next, we need to provide an implementation of the append
function. We do this with defmethod
:
(defmethod append clojure.lang.PersistentVector
[coll x]
(conj coll x))
When append
’s dispatch function returns clojure.lang.PersistentVector
, we take the arguments coll
and x
, and use conj
to append x
to coll
. This is the same implementation as our original polymorphic function for vectors, but we’ve decoupled the plumbing from the implementation: one function decides which implementation to run, and the implementation does the work. This decoupling means we can add additional implementations (again using defmethod
) without changing our existing implementation!
(defmethod append clojure.lang.Sequential
[coll x]
(concat coll (list x)))
This implementation of append
takes a clojure.lang.Sequential
as its first argument, and uses concat
to add x to the end. Now our append
function can take either a vector or any sequential object:
scratch.polymorphism=> (append [1 2] 3)
[1 2 3]
scratch.polymorphism=> (append (map inc [1 2]) 4)
(2 3 4)
That’s odd! We dispatched using (type coll)
, which, for (map inc ...)
, would have been a LazySeq
. But we didn’t define any method for LazySeq
. Why… why did this work?
The answer is that Clojure doesn’t compare multimethod dispatch values via =
. It compares them using a function we haven’t seen before: isa?
.
scratch.polymorphism=> (doc isa?)
-------------------------
clojure.core/isa?
([child parent] [h child parent])
Returns true if (= child parent), or child is directly or indirectly derived from
parent, either via a Java type inheritance relationship or a
relationship established via derive. h must be a hierarchy obtained
from make-hierarchy, if not supplied defaults to the global
hierarchy
So isa?
tells us whether two things are equal (using =
), or whether child
is related to parent
via Java types, or via “a relationship established via derive”, whatever that is. The fact that isa?
knows about Java type relationships means that we can use a supertype (e.g. Sequential
) rather than listing every specific type (e.g. PersistentList
, LazySeq
, etc).
scratch.polymorphism=> (isa? clojure.lang.PersistentList clojure.lang.Counted)
true
scratch.polymorphism=> (isa? clojure.lang.PersistentList clojure.lang.PersistentVector)
false
isa?
has another trick up its sleeve–it says it can use relationships defined via derive
. What does that do?
scratch.polymorphism=> (doc derive)
-------------------------
clojure.core/derive
([tag parent] [h tag parent])
Establishes a parent/child relationship between parent and
tag. Parent must be a namespace-qualified symbol or keyword and
child can be either a namespace-qualified symbol or keyword or a
class. h must be a hierarchy obtained from make-hierarchy, if not
supplied defaults to, and modifies, the global hierarchy.
Huh. So this lets us establish relationships between symbols or keywords. And classes, too—though classes can only be children. Let’s give that a shot.
(derive ::milk ::dairy)
(derive ::dairy ::grocery)
scratch.polymorphism=> (isa? ::milk ::milk)
true
scratch.polymorphism=> (isa? ::milk ::furniture)
false
scratch.polymorphism=> (isa? ::milk ::dairy)
true
scratch.polymorphism=> (isa? ::milk ::grocery)
true
With these derive
statements, we’ve built a web of relationships between these keywords. Now isa?
not only knows that milk is a kind of dairy, but also (because dairy is a kind of grocery) that milk is a kind of grocery. And we know that milk is not furniture—I’m pretty sure that’s true. Note that we’re using qualified keywords here (beginning with a ::
), which prevents us from accidentally changing the relationships in other namespaces.
We’re not limited to defining 1:1 relationships. Milk can be a grocery and refrigerated. Apples can also be groceries.
(derive ::milk ::refrigerated)
(derive ::apples ::grocery)
scratch.polymorphism=> (isa? ::milk ::grocery)
true
scratch.polymorphism=> (isa? ::milk ::refrigerated)
true
scratch.polymorphism=> (isa? ::apples ::grocery)
true
We can see the all the things that milk is by using the parents
function. That’s kind of like supertypes, only these aren’t types: they’re just plain old keywords.
scratch.polymorphism=> (parents ::milk)
#{:scratch.polymorphism/refrigerated :scratch.polymorphism/dairy}
And we can see all the things that are refrigerated using descendents
. That’s kind of like subtypes:
scratch.polymorphism=> (descendants ::grocery)
#{:scratch.polymorphism/milk :scratch.polymorphism/apples :scratch.polymorphism/dairy}
Now imagine we represented our groceries as maps. Something like {:item-type ::milk, :size :gallon}
. When we get home from running errands, we’d like a function to put those grocery maps away—but how they’re stored should depend on the :item-type
of the grocery item. We could write:
(defmulti put-away
"Stores an item when we get home."
:item-type)
This takes advantage of the fact that keywords are functions: :item-type
will look up the type of the item, and use that to choose an implementation.
In general, we can put groceries in the pantry, and refrigerated items, we’ll put in the fridge.
(defmethod put-away ::grocery
[item]
(println "Putting a" (name (:size item)) "of" (name (:item-type item))
"in the pantry"))
(defmethod put-away ::refrigerated
[item]
(println "Storing a" (name (:size item)) "of" (name (:item-type item))
"in the fridge"))
Now we can store some apples, and see them go into the pantry.
scratch.polymorphism=> (put-away {:item-type ::apples, :size :large-bag})
Putting a large-bag of apples in the pantry
How about milk?
scratch.polymorphism=> (put-away {:item-type ::milk, :size :gallon})
IllegalArgumentException Multiple methods in multimethod 'put-away' match dispatch value: :scratch.polymorphism/milk -> :scratch.polymorphism/grocery and :scratch.polymorphism/refrigerated, and neither is preferred clojure.lang.MultiFn.findAndCacheBestMethod (MultiFn.java:178)
Ah, that’s interesting. Since milk is both a grocery and refrigerated, either of these implementations could apply to it. We can tell Clojure how to resolve the ambiguity using prefer-method
:
(prefer-method put-away ::refrigerated ::grocery)
scratch.polymorphism=> (put-away {:item-type ::milk, :size :gallon})
Storing a gallon of milk in the fridge
Very good! We’ve established that the ::refrigerated
item type takes precedence over the ::grocery
item type. It’s important to prevent spoilage!
You can use multimethods wherever you need to extend a function’s behavior later. This is especially useful when you intend your code to be used by other people—if someone else were to use our grocery-storage system, they could define new types of items, and be able to tell put-away
exactly how to handle those new item types. We didn’t talk about garbage bags, pencils, or medication here, but because put-away
is a multimethod, someone else could define something like {:item-type ::medication}
, and extend put-away
to store it correctly.
Throughout this example, we’ve talked about “item types”, but… we used keywords, like ::apples
, to represent those types. These aren’t types in the sense of Clojure’s type system, but we could use them like types. In a very real sense, what we’ve done here is define our own tiny language, with its own itty bitty type system, completely separate from Clojure’s. The core ideas are the same: we use subtype relationships to write code which depends only on general things (e.g. “refrigerated things”) automatically cover more specific things (e.g. “milk”).
Multimethods are powerful and general thanks to their dispatch functions. However, because those dispatch functions get involved in every call to a multimethod, they’re a bit slower than regular function calls. When performance matters, we turn to interfaces and protocols.
Interfaces
The idea of a polymorphic function which decides what to do based on the type of its arguments is so common, and so useful, that most languages provide special facilities for it. We call this “type dispatch”: the type of the value being passed chooses which particular code the language invokes. We wrote a version of type dispatch using multimethods and the type
function. Many languages, such as Haskell and Java, build type dispatch into every function—types are attached to each argument, and used to decide between alternative implementations.
To support this feature in Java, the JVM has a fast, built-in mechanism for type dispatch using interfaces. We aren’t limited to using the interfaces given to us by Clojure and the JVM. We can define our own interfaces, and use them to get extra-speedy type dispatch, using the definterface
macro.
(definterface IAppend
(append [x]))
We’ve defined a new type: specifically, an interface. The name of our interface is IAppend
. We’ve also stated that if a value coll
is an instance of type IAppend, then there must be a method, named append
. These methods are (and I know this is confusing) not the multimethods we discussed earlier. These methods are JVM methods: a sort of primitive function. Methods take arguments, evaluate code, and return results, like functions. Unlike Clojure’s functions, they aren’t values: you can’t ask them for docstrings, or pass them around to map
or filter
. We’ve provided only a single method here, but if we liked, we could define several in the same definterface
.
The append
method we defined takes two arguments. Yes, two. Interfaces always take a first argument, which in this case must be an instance of IAppend
. Since the first argument is mandatory, definterface
doesn’t ask us to write it down. This is a bit weird, and contradicts how function definitions work everywhere else, but we’re stuck with this behavior for historical reasons. Long story short: (append [x]
tells us that our first argument is an IAppend
, and our second argument is some object called x
. And that’s it! Like a multimethod, there’s no function body: we provide that later. Unlike a multimethod, there’s no dispatch function. The JVM will always dispatch based on the type of the first argument.
“All right”, you might say. “It’s great that we have type to express that something is appendable, and an append
… method, whatever that is, exists. But how do we make an appendable thing?”
For this, we need new tools.
Making An Appendable Thing
We have an interface, IAppend
, and we’d like to make an instance of that type. The quickest way to make an object of some type is to use a macro called reify
: a fancy philosophical word that means “make a concrete thing out of an abstract concept.” In Clojure, reify
takes interfaces, and definitions for how the methods in those interfaces should work, and returns an object which is an instance of those interfaces. For instance, perhaps we want an object to keep track of a grocery list:
(defn grocery-list
"Creates an appendable grocery list. Takes a vector of
groceries to buy."
[to-buy]
(reify IAppend
(append [this x]
(grocery-list (conj to-buy x)))))
There are two parts here: the first is a function, grocery-list
, which we’re going to call when we want to make a new grocery list. The second is the (reify IAppend ...)
, which constructs a value. That value will be an instance of type IAppend
; at compile time, reify
summons a new, anonymous class from the void, and makes sure that class is a subtype of IAppend
. Each call to this (reify ...)
constructs a new instance of that anonymous class.
Inside the reify
, we’ve provided definitions for how to handle IAppend
’s methods: when someone calls the append
method with this
(some value which this reify constructed) and x
, we add x
to the end of the to-buy
vector using conj
, and call grocery-list
to make a new GroceryList
out of it. That way we can keep appending more things later.
An interesting thing to note: like fn
, reify
can use variables, like to-buy
, from the surrounding code. When grocery-list
returns, the object constructed by reify
remembers the value of to-buy
, and can use it later. We say that reify
, like fn
, closes over those variables: reify
and fn
are closures. That’s a fancy bit of programming jargon you can use to get strangers to stop talking to you at parties.
Let’s try it out:
scratch.polymorphism=> (grocery-list [:eggs])
#object[scratch.polymorphism$grocery_list$reify__1950 0x70e02b5 "scratch.polymorphism$grocery_list$reify__1950@70e02b5"]
This is not a particularly helpful representation of a grocery list. If you squint, you can see the namespace (scratch.polymorphism
) and function (grocery_list
) in there, and also reify
, since we used reify
to make this value. The _1950
is a unique number that helps the computer tell this particular reify apart from others. In fact, this whole first part is the automatically generated class which reify
defined for us. 0x70e02b5
is a number that identifies where this particular instance of that class lives in memory. Unhelpfully, nothing here tells us about the to-buy list we provided ([:eggs]
).
One thing we do know, though, is that this is something we can append to.
scratch.polymorphism=> (supers (type (grocery-list [:eggs])))
#{clojure.lang.IObj scratch.polymorphism.IAppend java.lang.Object clojure.lang.IMeta}
Remember how many types were in (supers clojure.lang.PersistentVector)
? Objects made with reify
are far simpler. There’s IAppend
: that’s the interface type we defined earlier. There’s java.lang.Object
, of course. clojure.lang.IObj
and IMeta
mean that our reify object has metadata. Wait—what is this thing’s metadata anyway?
scratch.polymorphism=> (meta (grocery-list [:eggs]))
{:line 12, :column 3}
Huh! That’s the line and column number, of the reify
expression which made this object. But what about appending? How do we use append
with this thing?
scratch.polymorphism=> (append (grocery-list [:eggs]) :tofu)
"Sorry, I don't know how to append to a class scratch.polymorphism$grocery_list$reify__1491"
Oh, wait, hang on—that’s our append
function from before. We wanted to call the append method we defined using definterface
: methods and functions are different things, even if they have the same name. To make a method call, we put a .
in front of the method name:
scratch.polymorphism=> (.append (grocery-list [:eggs]) :tofu)
#object[scratch.polymorphism$grocery_list$reify__1950 0x40eb00f0 "scratch.polymorphism$grocery_list$reify__1950@40eb00f0"]
If we wanted a function for append
, we could write one which calls the method. We might call this a wrapper function, since it wraps the append method up in a nice functional package. This version of append
we can use with reduce
or partial
, and so on.
(defn append
"Appends x to the end of coll."
[coll x]
(.append coll x))
Moving on: we’ve called our append
method, and it gave us… another unhelpful grocery-list. It’d be great if we had a more reasonable way to print these lists to the console. In the JVM, the Object
class defines a method called toString
. That’s how str
(typically) makes strings out of things. Let’s expand our reify
to define a different toString
method:
(defn grocery-list
"Creates an appendable (via IAppend) grocery list. Takes a vector of
groceries to buy."
[to-buy]
(reify
IAppend
(append [this x]
(grocery-list (conj to-buy x)))
Object
(toString [this]
(str "To buy: " to-buy))))
In general, reify
takes a type followed by method definitions for that particular type, then another type, and any number of methods for that type, and so on. Our grocery lists were already Object
before, and they were given simple, default definitions for all of Object
’s methods–that’s how the REPL was able to show us #object[scratch.polymorphism$grocery_list$reify__1950 ...]
. But now our call to reify
states explicitly: when interpreted as an Object
, here’s how the toString
method works.
Let’s see it in action!
scratch.polymorphism=> (str (grocery-list [:eggs]))
"To buy: [:eggs]"
Hey, that’s more helpful! This is another kind of polymorphism at work: the toString
method (and by extension, the str
function) does different things depending on the type of object it’s given. And what’s neat is that unlike our initial polymorphic function append
—where we had a single function definition which had to know about all the types we wanted to call… we didn’t have to change toString
or str
’s definitions. The plumbing—looking up what code to evaluate—is handled automatically. As with multimethods, we’re free to define behaviors for new types without having to change the definitions for other types.
Let’s try out our append
method again and see if it works:
scratch.polymorphism=> (str (.append (grocery-list [:eggs]) :tomatoes))
"To buy: [:eggs :tomatoes]"
Hey, that’s great! We can see the results of appending to our grocery list. What about appending to lists and vectors, though? Can we use .append
with them?
scratch.polymorphism=> (.append [1 2] 3)
IllegalArgumentException No matching method found: append for class clojure.lang.PersistentVector clojure.lang.Reflector.invokeMatchingMethod (Reflector.java:53)
The reflector is a part of Clojure which figures out what definition of a method to use for a given type. It failed to find a matching method for append
, given a clojure.lang.PersistentVector
—which makes sense, because we haven’t made clojure.lang.PersistentVector
a subtype of IAppend
. Let’s do that next!
I have terrible news: we can’t do this. Interfaces are a one-way street: when we define a new type (as we did with reify
), we can say how that type works with any number of interfaces. But when we define an interface, we don’t get to say how it works with existing types. That’s just how the JVM’s type system works.
“But this is awful!” You might exclaim. “The whole reason we defined an interface was so that we could write polymorphic functions like append
, which could append to many kinds of objects. Instead, we’re limited to polymorphism only over types which we ourselves define!”
This is the other half of the expression problem we mentioned earlier: existing (regular) functions can’t be extended to new types, and existing types can’t be extended to new interfaces. We solved the function-extension problem with multimethods and interfaces… but how do we solve the interface-extension problem?
Protocols
In Clojure, a protocol is like an interface which can be extended to existing types. It defines a named type, together with functions whose first argument is an instance of that type. Where interfaces are built into the JVM, protocols are a Clojure-specific construct. To define a protocol, we use defprotocol
:
(defprotocol Append
"This protocol lets us add things to the end of a collection."
(append [coll x]
"Appends x to the end of collection coll."))
If you still have the append
function we wrote earlier, this append
function will replace it; you’ll see a message like Warning: protocol #'scratch.polymorphism/Append is overwriting function append
at the REPL. You can delete or rename the original append
function if you like.
We’ve named our protocol Append
(not to be confused with the interface IAppend
), and given it a bit of documentation to remind us what it’s for. It has one function, named append
, which takes two arguments: coll
and x
. We can give a docstring for the append
function too. Like an interface, we don’t define how the function works: we’re simply saying it exists. Unlike interfaces, these are real functions, not methods. Their first arguments are explicit, they have docstrings, we don’t need to use a .
to call them, and they can be passed around to other functions.
We can ask for our protocol’s documentation at the repl, just like we can for functions and namespaces:
scratch.polymorphism=> (doc Append)
-------------------------
scratch.polymorphism/Append
This protocol lets us add things to the end of a collection.
And likewise, functions defined in defprotocol
can be inspected, just like those made with defn
.
scratch.polymorphism=> (doc append)
-------------------------
scratch.polymorphism/append
([coll x])
Appends x to the end of collection coll.
If we try to use the append
function with a grocery list, it’s going to fail: the grocery list reify
is a subtype of the interface IAppend
, but we haven’t told it how the protocol Append
works yet:
scratch.polymorphism=> (append (grocery-list [:eggs]) :tomatoes)
IllegalArgumentException No implementation of method: :append of protocol: #'scratch.polymorphism/Append found for class: scratch.polymorphism$grocery_list$reify__1758 clojure.core/-cache-protocol-fn (core_deftype.clj:568)
This error tells us that the append
function doesn’t have an implementation (a function body) for the type scratch.polymorphism$grocery_list$reify__1758
. We can fix that by changing our reify
to use the Append
protocol, instead of the IAppend
interface. This is a one-character change: protocol functions and interface methods are defined in reify
in exactly the same way.
(defn grocery-list
"Creates an appendable (via IAppend) grocery list. Takes a vector of
groceries to buy."
[to-buy]
(reify
Append
(append [this x]
(grocery-list (conj to-buy x)))
Object
(toString [this]
(str "To buy: " to-buy))))
Now we can use our append
function with grocery lists!
scratch.polymorphism=> (str (append (grocery-list [:eggs]) :tomatoes))
"To buy: [:eggs :tomatoes]"
So far, we’ve done exactly what we did with interfaces. In fact, when we called defprotocol
, it not only defined a protocol: it also defined an interface as well. But unlike interfaces, we can extend our protocol to cover existing types. To do this, we use extend-protocol
:
(extend-protocol Append
clojure.lang.IPersistentVector
(append [v x]
(conj v x)))
This expresses that the Append
protocol’s functions (i.e. append
) can now be used on anything which is an IPersistentVector
. When we call (append v x)
with a vector v
, we return the result of (conj v x)
. Let’s try it out:
scratch.polymorphism=> (append [1 2] 3)
[1 2 3]
Fantastic! What about other sequential collections?
(extend-protocol Append
clojure.lang.IPersistentVector
(append [v x]
(conj v x))
clojure.lang.Sequential
(append [v x]
(concat v (list x))))
extend-protocol
can take several types, and the function definitions for each of them. Here, we’re extending Append
over both IPersistentVector
and Sequential
—and providing definitions for how append
works in each case. If you want to extend a single type to multiple protocols, use extend-type
. Both extend-protocol
and extend-type
can be called as often as you like: all their definitions get merged together.
We can even extend a protocol over nil
! We could add this to the existing extend-protocol
, or write it separately. This is another advantage of protocols over interfaces.
(extend-protocol Append
nil
(append [v x]
[x]))
scratch.polymorphism=> (append nil 2)
[2]
Named Datatypes
We’ve used reify
to make an object which satisfies some interfaces or protocols. Like an anonymous function (fn [x] ...)
, reify
creates an anonymous type. Because the reify
type has no (predictable) name, we can’t extend protocols to it later. How do we make a type with a name–like clojure.lang.PersistentVector
, or clojure.lang.LazySeq
?
There are two tools at our disposal here: deftype
and defrecord
. Both define new named types—classes, to be exact. The deftype
macro produces a very basic datatype, whereas defrecord
defines a type which behaves, in many respects, like a Clojure map. First, deftype
:
(deftype GroceryList [to-buy]
Append
(append [this x]
(GroceryList. (conj to-buy x)))
Object
(toString [this]
(str "To buy: " to-buy)))
We’re defining a new type, named GroceryList
. Objects of this type keep track of a single variable, called to-buy
. Just as with reify
, we provide a sequence of types we’d like GroceryLists to be a subtype of, and provide implementations for their functions or methods. The only difference is that in append
, we construct a new grocery list using (GroceryList. to-buy)
. We use the name of the class followed by a period .
to make a new instance of Grocerylist.
Let’s try creating one of these GroceryLists.
scratch.polymorphism=> (GroceryList. [:eggs])
#object[scratch.polymorphism.GroceryList 0x370dbd33 "To buy: [:eggs]"]
Voilà! An instance of GroceryList. We’ve got the full name of the type: GroceryList
, preceded by the namespace scratch.polymorphism
. There’s a memory address, and then our string representation. Can we append to it?
scratch.polymorphism=> (append (GroceryList. [:eggs]) :spinach)
#object[scratch.polymorphism.GroceryList 0x3c612037 "To buy: [:eggs :spinach]"]
Indeed we can. What else can we do with a GroceryList?
scratch.polymorphism=> (supers GroceryList)
#{clojure.lang.IType scratch.polymorphism.Append java.lang.Object}
Not much. There’s clojure.lang.IType
, which just means “this thing is a Clojure datatype”. There’s our Append
protocol, and java.lang.Object
, of course—almost everything is a subtype of Object. As it turns out, deftype
is pretty bare-bones.
We do get a few things for free with deftype
. We can access the fields by using .some-field-name
, like so:
scratch.polymorphism=> (.to-buy (GroceryList. [:eggs]))
[:eggs]
And we also get a function that takes a to-buy
list and builds a new GroceryList
. These “constructor functions” take one argument for each field in the deftype
.
scratch.polymorphism=> (->GroceryList [:strawberries])
#object[scratch.polymorphism.GroceryList 0x44cc69b3 "To buy: [:strawberries]"]
This is a small wrapper around (GroceryList. to-buy)
. It’s there because GroceryList.
, like a method, isn’t a full-fledged Clojure function. Like methods, we can’t use GroceryList.
with map
or apply
, or other things that expect functions. But we can use ->GroceryList
in these contexts!
scratch.polymorphism=> (map GroceryList. [[:twix] [:kale :bananas]])
CompilerException java.lang.ClassNotFoundException: GroceryList., compiling:(/tmp/form-init2122621676255621718.clj:1:1)
scratch.polymorphism=> (map ->GroceryList [[:twix] [:kale :bananas]])
(#object[scratch.polymorphism.GroceryList 0x552db723 "To buy: [:twix]"] #object[scratch.polymorphism.GroceryList 0x4d81eefd "To buy: [:kale :bananas]"])
The types constructed by deftype
are so basic that they lack properties we’ve taken for granted so far—like equality:
scratch.polymorphism=> (= (GroceryList. [:cheese]) (GroceryList. [:cheese]))
false
The only thing a GroceryList is equal to is itself.
scratch.polymorphism=> (let [gl (GroceryList. [:fish])] (= gl gl))
true
This is Clojure being conservative—it doesn’t know if, say, two GroceryLists with the same to-buy
list can really be considered equivalent. It’s up to us to define that by providing an implementation for the equals
method—another part of Object
.
(deftype GroceryList [to-buy]
Append
(append [this x]
(GroceryList. (conj to-buy x)))
Object
(toString [this]
(str "To buy: " to-buy))
(equals [this other]
(and (= (type this) (type other))
(= to-buy (.to-buy other)))))
scratch.polymorphism=> (= (GroceryList. [:cheese]) (GroceryList. [:cheese]))
true
Want to make all grocery lists equal? Go wild!
(deftype GroceryList [to-buy]
Append
(append [this x]
(GroceryList. (conj to-buy x)))
Object
(toString [this]
(str "To buy: " to-buy))
(equals [this other]
(= (type this) (type other))))
scratch.polymorphism=> (= (GroceryList. [:ketchup]) (GroceryList. [:mayo]))
true
So, deftype
gives us the power to construct our own, primitive types. But most of the time, we don’t want this degree of control: defining exactly how to print our values, how to compare two values together, and so on. After all, plain old maps are a great way to model data. They’re easy to print and easy to manipulate. It’d be nice if we could create a type—to take advantage of protocols—but have it still work like a map. Clojure calls this kind of type a record.
(defrecord GroceryList [to-buy]
Append
(append [this x]
(GroceryList. (conj to-buy x))))
The defrecord
macro looks almost exactly like deftype
: it takes the name of the type we’re defining, the names of the fields each instance will keep track of, and then a series of types with method implementations. As with deftype
, we can construct instances of our GroceryList
type using GroceryList.
or the ->GroceryList
function.
scratch.polymorphism=> (GroceryList. [:beans])
#scratch.polymorphism.GroceryList{:to-buy [:beans]}
Unlike deftype
, we get a nice, concise string representation for free. The first part shows the type name, and after that it looks just like a map, showing the fields of this GroceryList
and their corresponding values.
We don’t have to define our own equality either: two records are equal if they’re of the same type, and their fields are equal.
scratch.polymorphism=> (= (GroceryList. [:beans]) (GroceryList. [:beans]))
true
scratch.polymorphism=> (= (GroceryList. [:beans]) {:to-buy [:beans]})
false
A GroceryList works like a map, but it’s not the same type: records aren’t equal to maps, even if they have the same keys and values.
Like deftype
, we can access the fields of a record using .to-buy
:
scratch.polymorphism=> (.to-buy (GroceryList. [:bread]))
[:bread]
But since records work like maps, we can also access them using get
, or by using keywords as functions:
scratch.polymorphism=> (get (GroceryList. [:bread]) :to-buy)
[:bread]
scratch.polymorphism=> (:to-buy (GroceryList. [:bread]))
[:bread]
And we can alter those fields using assoc
and update
, just like maps. Let’s replace our shopping list with onions:
scratch.polymorphism=> (-> (GroceryList. [:chicken])
(assoc :to-buy [:onion]))
#scratch.polymorphism.GroceryList{:to-buy [:onion]}
… and add some beets:
scratch.polymorphism=> (-> (GroceryList. [:chicken])
(assoc :to-buy [:onion])
(update :to-buy conj :beets))
#scratch.polymorphism.GroceryList{:to-buy [:onion :beets]}
Just as with maps, these updates are immutable: they don’t alter the original GroceryList. Instead, they create copies with our requested changes. We aren’t limited to the fields we explicitly defined in the defrecord
, either. Let’s tack on a :note
to our grocery list:
scratch.polymorphism=> (assoc (GroceryList. [:cherries]) :note "Tart cherries if possible!")
#scratch.polymorphism.GroceryList{:to-buy [:cherries], :note "Tart cherries if possible!"}
This is possible because records (unlike deftypes) always carry around an extra map—just in case they need to store additional fields we didn’t define up front. The assoc
function tries to update a field if it can, and if there’s no field by that name, it stores it in the record’s extra map.
Both deftype
and defrecord
produce named types, which means we can extend protocols over them after the fact. Let’s add a new protocol for printing out things nicely to the console—something we could use to print our grocery list and the items on it.
(defprotocol Printable
(print-out [x] "Print out the given object, nicely formatted."))
Now we can define how to print a GroceryList
. Let’s add a basic print-out
function that works on any object, while we’re at it:
(extend-protocol Printable
GroceryList
(print-out [gl]
(println "GROCERIES")
(println "---------")
(doseq [item (:to-buy gl)]
(print "[ ] ")
(print-out item)
(println)))
Object
(print-out [x]
(print x)))
Like we saw earlier, we can use (:to-buy gl)
to get the items on the grocery list. We go through each one in turn using doseq
, and call that particular item item
. With that item, we print out a pair of brackets "[ ] "
. Then we do something a bit strange: we call print-out
again, but this time, with the item
in question. The Object
implementation takes over from there.
scratch.polymorphism=> (print-out (GroceryList. [:cilantro :carrots :pork :baguette]))
GROCERIES
---------
[ ] :cilantro
[ ] :carrots
[ ] :pork
[ ] :baguette
Nice! This actually looks like a real grocery list. What if we wanted to keep track of how many carrots to buy? We could introduce a new type to keep track of things that come in a certain quantity:
(defrecord CountedItem [thing quantity]
Printable
(print-out [this]
(print-out thing)
(print (str " (" quantity "x)"))))
We’ve defined how to print out a counted item: first we print out the thing, then the quantity in parentheses. Let’s give that a shot:
scratch.polymorphism=> (print-out (GroceryList. [:cilantro (CountedItem. :carrots 2) :pork :baguette]))
GROCERIES
---------
[ ] :cilantro
[ ] :carrots (2x)
[ ] :pork
[ ] :baguette
Neat! We didn’t have to change GroceryList
at all to get this behavior. Because it used the polymorphic protocol function print-out
, it automatically knew how to work with our new CountedItem
type.
When To Use Deftype and Defrecord
If you’re coming from an object-oriented language (e.g. Ruby, Java), or a typed language with algebraic datatypes (e.g. Haskell, ML), you might see defprotocol
, deftype
, and defrecord
, and think: “Ah, finally. Here are the tools I’ve been waiting for.” You might start by wanting to model a person, and immediately jump to (defrecord Person [name pronouns age])
. While this is valid, you should take a step back, and ask: do I need polymorphism here? Are there going to be functions that take people and animals? Or do I simply want to keep track of some data?
If you don’t need polymorphism, there’s a good chance your data can be modeled in Clojure as plain old maps, sets, vectors, and so on. Need to represent a person? How about:
{:name "Morgan"
:pronouns [:they :them]
:age 56}
No defrecord
required. Sticking to maps keeps your data in a shape that can be easily manipulated using standard Clojure functions. It’s easy to store this data on disk, or send it across the network. It’s easier to share this kind of data with other people. And it’s more concise to print at the console, which makes debugging your programs easier.
Conversely, you’ll want to use defrecord
and deftype
when maps aren’t sufficient: when you need polymorphism, when you need to participate in existing protocols or interfaces, or when multimethod performance is too slow. Records are often faster and more memory-efficient than maps, so even if you don’t need the polymorphism, it can be worthwhile to define a record or so when map performance bogs you down. This is something you’ll want to find out by measuring your code, though, rather than simply assuming.
If you’re reaching for records for type safety: it’s not going to be as helpful as you’d like. Functions like assoc
work equally well across all kinds of records, and the compiler won’t warn you about using the wrong keyword. Sticking to methods eliminates some of those risks, but it’s nothing like the type guardrails in Java or Haskell. Clojure programs generally rely more on tests and contracts to prevent these type errors. There are also static type systems like core.typed, which we’ll discuss later.
Review
When a function’s behavior depends on the type of values it is provided, we call that function polymorphic. Many of Clojure’s core functions, like conj
or reduce
, are polymorphic: we can conj
into maps, vectors, sets, and lists, and each does something different. Often, our own code is implicitly polymorphic by virtue of using other polymorphic functions: (defn add-bird [coll] (conj coll :bird))
can add birds to lots of different things.
When we need a function whose behavior explicitly depends on its arguments, we can use ad-hoc approaches, like if
, cond
, or case
. The instance?
, type
, and supers
functions let us choose what to do based on the type of the value.
When we need an open function—one whose behavior can be extended to new things later—we use a multimethod, an interface, or a protocol. Multimethods are the most general approach: they use a dispatch function, which receives the function’s arguments and decides which implementation to call. They’re not limited to dispatching by argument type: they can use arbitrary values and relationships between keywords, defined with derive
. They also offer fine-grained control when that dispatch would be ambiguous. This flexibility comes with a performance penalty: Clojure has to evaluate the dispatch function every time the multimethod is called.
When a function’s behavior depends on the type of the first argument, use protocols or interfaces. Interfaces can’t be extended to existing types; protocols can. Protocols have some ergonomic advantages: they define regular functions, rather than methods, and come with documentation—though there’s nothing stopping you from writing your own documented wrapper functions, or using definterface+, which does so automatically. Interfaces are slightly faster; prefer them when performance matters.
To create instances of a new type, we have reify
. Like (fn [x] ...)
, reify
generates an anonymous type—it can have interfaces and protocols as supertypes, and provides implementations for those types, but has no (predictable, meaningful) name. When we want to name our types—perhaps so that other people can extend them later—we use deftype
and defrecord
. Most of the time, defrecord
is most useful: they work like maps out of the box. However, deftype
is available should we need to construct bare-bones types with unusual behaviors.
We haven’t talked about the details of classes or inheritance in this discussion. These are important for Java interop, but we don’t use these concepts often in Clojure. A topic for later discussion!
Problems
-
Write a
sorted
function that usescond
andinstance?
to convert lists to sorted lists (using(sort ...)
), and sets to sorted sets (using(into (sorted-set) ...)
). -
Rewrite
sorted
as a multimethod. Usingdefmethod
, extendsorted
to handle maps. -
Add a
checked-off
field to theGroceryList
type, and use it to store a set of items that are already in the cart. Write acheck-off
function that takes a grocery list and checks off an item on it, by adding that item to thechecked-off
set:(check-off my-list eggs)
-
Write a
remaining
function which takes aGroceryList
and returns the items that haven’t been checked off yet. -
Change the definition of
print-out
forGroceryList
to take thechecked-off
set into account, printing an[x]
in front of checked-off items. -
Imagine Clojure had no built-in sets. Make up a
Set
protocol with some basic operations, likeadd-element
,has-element?
, andremove-element
. -
Using a vector or list to store your elements, write a basic implementation of your
Set
protocol. Experiment to make sure adding the same item twice doesn’t add two copies. -
Try making larger and larger sets–say, with ten, a thousand, and a hundred thousand elements. Use
(time (has-element? some-set 123))
to see how your set performance changes with size. Why is this? -
Write a different implementation of a
Set
, which uses a map to store its elements. Compare its performance to your list-based set. -
The
deref
function uses an interface calledclojure.lang.IDeref
to return the current value of a container type. Usingdeftype
, define your own container type. Try@(MyContainer. :hi)
to verify that you can get the value of your container (:hi
) back. -
[advanced] So far, we’ve worked only with immutable types.
deftype
lets us define mutable types by tagging a field with^:unsynchronized-mutable
, like so:(deftype DangerBox [^:unsynchronized-mutable value] ...)
. Design aMutable
protocol with a(write! box value)
function to overwrite the value of a mutable container. Using(set! field value)
, build your own mutable container type which supports bothMutable
andIDeref
. Confirm that you can change its value usingwrite!
, and read it back using@
. -
[advanced] Use your mutable container as a counter by reading its current state and writing back a value one greater–e.g. via
(write! box (inc @box))
. Usingdotimes
, perform many updates in a row, and verify that the final value of the counter is the same as the number you passed todotimes
. -
[advanced] Run this
dotimes
increment loop in two threads at once, usingfuture
. Is the final counter value what you expected? Why? How does this compare to using an(atom)
withswap!
?
FYI, the link to the previous blog entry is broken. It should have led to hxxps://aphyr.com/posts/319-clojure-from-the-ground-up-debugging