<p>Aphyr, <a href="https://aphyr.com/posts/352-clojure-from-the-ground-up-polymorphism">Clojure from the ground up: polymorphism</a>, 2020-08-27.</p>
<p>Previously: <a href="/posts/319-clojure-from-the-ground-up-debugging">Debugging</a>.</p>
<p>In this chapter, we’ll discuss some of Clojure’s mechanisms for <em>polymorphism</em>: writing programs that do different things depending on what kind of inputs they receive. We’ll show ways to write <em>open</em> functions, which can be extended to new conditions later on, without changing their original definitions. Along the way, we’ll investigate Clojure’s type system in more detail–discussing <em>interfaces</em>, <em>protocols</em>, how to construct our own datatypes, and the relationships between types which let us write flexible programs.</p>
<p>Thus far, our functions have taken one type of input. For example:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Adds an element x to the end of a vector v."</span>
<span class="p">[</span><span class="nv">v</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">v</span> <span class="nv">x</span><span class="p">))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
</code></pre>
<p>But we might want to append to <em>more</em> than vectors. What if we wanted to append something to the end of a list?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="mi">3</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
</code></pre>
<p>Since <code>conj</code> prepends to lists, our <code>append</code> function doesn’t work correctly here. We could redefine <code>append</code> in a way that works for both vectors and lists–for instance, using <code>concat</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append-concat</span>
<span class="s">"Adds an element x to the end of a collection coll by concatenating a</span>
<span class="s"> single-element list (x) to the end of coll."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">)))</span>
</code></pre>
<p>But this is less than ideal: <code>concat</code> wraps its result in a lazy sequence every time we call <code>append-concat</code>, which introduces unnecessary overhead when working with vectors. What we would like is a function which does different things to different types of inputs. This is the heart of <em>polymorphism</em>.</p>
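<p>The cost described above can be seen at the REPL. The following is a small sketch, repeating the <code>append-concat</code> definition so that it stands on its own: the value that comes back is a lazy sequence rather than a vector, so vector operations such as indexed lookup with <code>get</code> no longer apply to it.</p>
<pre><code>(defn append-concat
  "Adds an element x to the end of coll using concat (same as above)."
  [coll x]
  (concat coll (list x)))

;; conj keeps a vector a vector...
(type (conj [1 2] 3))             ; => clojure.lang.PersistentVector
;; ...but concat always hands back a lazy sequence.
(type (append-concat [1 2] 3))    ; => clojure.lang.LazySeq
(vector? (append-concat [1 2] 3)) ; => false

;; Indexed lookup works on the vector, but not on the lazy sequence.
(get (conj [1 2] 3) 2)            ; => 3
(get (append-concat [1 2] 3) 2)   ; => nil
</code></pre>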
<h2><a href="#a-simple-approach" id="a-simple-approach">A Simple Approach</a></h2>
<p>We have a function <code>type</code> which returns the type of an object. What if <code>append</code> asked for the type of collection it was being asked to append to, and did different things based on that type? Let’s check the types of lists and vectors:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">type</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">])</span>
<span class="nv">clojure.lang.PersistentVector</span>
<span class="p">(</span><span class="nf">type</span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span><span class="p">))</span>
<span class="nv">clojure.lang.PersistentList</span>
</code></pre>
<p>Okay, so we could try checking whether the type of our collection is a PersistentVector, and if so, use <code>conj</code> to append an element efficiently!</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Adds an element x to the end of a collection coll. Coll may be either a</span>
<span class="s"> vector or a list."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">condp</span> <span class="nb">= </span><span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)</span>
<span class="nv">clojure.lang.PersistentVector</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">coll</span> <span class="nv">x</span><span class="p">)</span>
<span class="nv">clojure.lang.PersistentList</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">))))</span>
</code></pre>
<p>As an aside: we’re using <code>condp =</code> instead of <code>case</code>, even though <code>case</code> might seem like the obvious solution here. That’s because <code>case</code> uses optimizations which require that each case is a compile-time constant, and classes like <code>clojure.lang.PersistentVector</code> aren’t actually constant in that sense. Don’t worry too much about this; it’s not important for understanding this chapter. The important question is: does this approach of checking the type at runtime <em>work</em>? Can we append to both vectors and lists?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p>It does! We’ve written a <em>polymorphic function</em> which can take two different kinds of input, and does different things depending on what type of input was provided. Just to confirm, let’s try an empty list:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="o">'</span><span class="p">()</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">IllegalArgumentException</span> <span class="nv">No</span> <span class="nv">matching</span> <span class="nv">clause</span><span class="err">:</span> <span class="nb">class </span><span class="nv">clojure.lang.PersistentList$EmptyList</span> <span class="nv">scratch.polymorphism/append</span> <span class="p">(</span><span class="nf">polymorphism.clj</span><span class="ss">:7</span><span class="p">)</span>
</code></pre>
<p>Oh shoot. Are empty lists… a <em>different type</em>?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">type</span> <span class="o">'</span><span class="p">())</span>
<span class="nv">clojure.lang.PersistentList$EmptyList</span>
</code></pre>
<p>Indeed, they are. Empty lists have a special type in Clojure: <code>clojure.lang.PersistentList</code> is not the same type as <code>clojure.lang.PersistentList$EmptyList</code>. Why, then, are they mostly interchangeable? What is it that lets <code>()</code> and <code>(1 2 3)</code> behave as if they were both the same type of thing?</p>
<h2><a href="#subtypes" id="subtypes">Subtypes</a></h2>
<p>Most languages have a notion of a <em>relationship</em> between types. The exact nature of these relationships is complex and language-specific, but informally, most languages have a way to express that type <code>A</code> is a <em>subtype</em> of type <code>B</code>, and conversely, <code>B</code> is a supertype of <code>A</code>. For instance, type <code>Cat</code> might be a subtype of type <code>Animal</code>. This allows us to write functions which depend only on properties of <code>Animal</code>, in such a way that they work automatically on <code>Cat</code>s, <code>Dog</code>s, <code>Fish</code>, and so on. This is another form of polymorphism!</p>
<p>Some languages organize their types into a tree, such that each type is a subtype of exactly one other type (except for a single “all-inclusive” type, often called <code>Top</code> or <code>Object</code>). We might say, for instance, that <code>Cat</code>s are <code>Animal</code>s, <code>AlarmClock</code>s are <code>Electronic</code>s, and both <code>Animal</code>s and <code>Electronic</code>s are <code>Object</code>s.</p>
<p>This sounds straightforward enough, but types rarely fall into this kind of tree-like hierarchy neatly. For instance, both <code>Cat</code>s and <code>AlarmClock</code>s can yowl at you when you’d really prefer to be sleeping. Perhaps both should be subtypes of <code>Noisemaker</code>! But not all <code>Animal</code>s are <code>Noisemaker</code>s, nor are all <code>Noisemaker</code>s <code>Animal</code>s. Down this path lies madness! For this reason, most type systems allow a type to have <em>multiple</em> supertypes: a <code>Cat</code> can be <em>both</em> a <code>Noisemaker</code> and an <code>Animal</code>. In the JVM—the program which underlies Clojure—there are (and I speak very loosely here: we’re going to ignore <a href="https://www.baeldung.com/java-primitives-vs-objects">primitives</a> and smooth over all kinds of internal details) two kinds of types, and both of these kinds of relationships are in play.</p>
<p>The types of JVM values—things like <code>java.lang.Long</code>, <code>java.lang.String</code>, <code>clojure.lang.PersistentVector</code>, etc.—are called <em>classes</em>. If you have a value like <code>2</code> or <code>["foo" :bar]</code> in Clojure, that value’s type is a class. Each class is a subtype of exactly one other class, except for <code>Object</code>, the JVM’s Top class.</p>
<p>The other kind of JVM type is called an <em>interface</em> (or an <a href="https://pythonconquerstheuniverse.wordpress.com/2011/05/24/java-abc-vs-interface/"><em>abstract class</em></a>—we’ll use “interface” to refer to both throughout this chapter) and it defines the behavior for a type. In essence, an interface defines a collection of functions which take an instance of that interface as their first argument. Both classes and interfaces can be a subtype of any number of interfaces. Clojure uses interfaces to define the behavior of things like “a list” or “something you can look up values in”, and provides a variety of classes, each optimized for a different kind of work, which are <em>subtypes</em> of those interfaces. These shared interfaces are why we can have two types of lists which work the same way.</p>
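<p>As a rough illustration of these two kinds of types, we can ask the JVM directly. This sketch uses the reflection methods <code>isInterface</code>, <code>getSuperclass</code>, and <code>getInterfaces</code> from <code>java.lang.Class</code>, which this chapter has not otherwise introduced; it speaks about classes as loosely as the text above does.</p>
<pre><code>;; Counted is an interface; PersistentVector is a class.
(.isInterface clojure.lang.Counted)            ; => true
(.isInterface clojure.lang.PersistentVector)   ; => false

;; A class has exactly one superclass...
(.getSuperclass clojure.lang.PersistentVector) ; => clojure.lang.APersistentVector
;; ...except Object, which has none.
(.getSuperclass java.lang.Object)              ; => nil

;; A class may implement several interfaces at once.
;; This returns a vector of the interfaces PersistentList declares directly.
(vec (.getInterfaces clojure.lang.PersistentList))
</code></pre>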
<p>We can see these relationships between types in Clojure with the <code>supers</code> function, which returns the <em>supertypes</em> of a given type:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">supers</span> <span class="nv">clojure.lang.PersistentList$EmptyList</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="nv">clojure.lang.Obj</span> <span class="nv">clojure.lang.IPersistentCollection</span> <span class="nv">clojure.lang.IMeta</span> <span class="nv">clojure.lang.IObj</span> <span class="nv">clojure.lang.Sequential</span> <span class="nv">java.lang.Iterable</span> <span class="nv">java.io.Serializable</span> <span class="nv">clojure.lang.IPersistentStack</span> <span class="nv">java.lang.Object</span> <span class="nv">clojure.lang.IHashEq</span> <span class="nv">clojure.lang.IPersistentList</span> <span class="nv">clojure.lang.Seqable</span> <span class="nv">clojure.lang.ISeq</span> <span class="nv">clojure.lang.Counted</span> <span class="nv">java.util.List</span> <span class="nv">java.util.Collection</span><span class="p">}</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">supers</span> <span class="nv">clojure.lang.PersistentList</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="nv">clojure.lang.Obj</span> <span class="nv">clojure.lang.IPersistentCollection</span> <span class="nv">clojure.lang.IReduce</span> <span class="nv">clojure.lang.IMeta</span> <span class="nv">clojure.lang.IObj</span> <span class="nv">clojure.lang.Sequential</span> <span class="nv">java.lang.Iterable</span> <span class="nv">java.io.Serializable</span> <span class="nv">clojure.lang.IPersistentStack</span> <span class="nv">java.lang.Object</span> <span class="nv">clojure.lang.IHashEq</span> <span class="nv">clojure.lang.IPersistentList</span> <span class="nv">clojure.lang.Seqable</span> <span class="nv">clojure.lang.ISeq</span> <span class="nv">clojure.lang.ASeq</span> <span class="nv">clojure.lang.Counted</span> <span class="nv">java.util.List</span> <span class="nv">java.util.Collection</span> <span class="nv">clojure.lang.IReduceInit</span><span class="p">}</span>
</code></pre>
<p>A few of these types, like <code>java.lang.Object</code>, are actual classes. The rest are interfaces. Note that these sets are almost identical: empty and non-empty lists share almost all their supertypes. Both, for example, are subtypes of <code>clojure.lang.Counted</code>, which means that they keep track of how many elements they contain—the <code>count</code> function uses <code>Counted</code> to count collections efficiently. Both are <code>clojure.lang.Seqable</code>, which means they can be interpreted as a sequence of objects—that’s why we can call <code>map</code>, <code>filter</code>, and so on over lists. Most relevant for our purposes, both are kinds of <code>clojure.lang.IPersistentList</code>, which <a href="https://www.javadoc.io/doc/org.clojure/clojure/1.10.1/clojure/lang/IPersistentList.html">defines</a> the core of how lists work: using <code>cons</code> to prepend elements. Let’s change our <code>append</code> function to use the <code>IPersistentList</code> type instead, and see if it lets us append to empty lists.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Adds an element x to the end of a collection coll. Coll may be either a</span>
<span class="s"> vector or a list."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">condp</span> <span class="nb">= </span><span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)</span>
<span class="nv">clojure.lang.PersistentVector</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">coll</span> <span class="nv">x</span><span class="p">)</span>
<span class="nv">clojure.lang.IPersistentList</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">))))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="o">'</span><span class="p">()</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">IllegalArgumentException</span> <span class="nv">No</span> <span class="nv">matching</span> <span class="nv">clause</span><span class="err">:</span> <span class="nb">class </span><span class="nv">clojure.lang.PersistentList$EmptyList</span> <span class="nv">scratch.polymorphism/append</span> <span class="p">(</span><span class="nf">polymorphism.clj</span><span class="ss">:7</span><span class="p">)</span>
</code></pre>
<p>Ah, of course. We’re asking if the type of <code>coll</code> is <em>equal</em> to <code>clojure.lang.IPersistentList</code>, but they’re not actually the same type. What we want to know is if the type of <code>coll</code> is a <em>subtype</em> of <code>clojure.lang.IPersistentList</code>. Let’s check if any of <code>coll</code>’s <em>supertypes</em> match as well:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Adds an element x to the end of a collection coll. Coll may be either a</span>
<span class="s"> vector or a list."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">t</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)</span>
<span class="nv">types</span> <span class="p">(</span><span class="nb">conj </span><span class="p">(</span><span class="nf">supers</span> <span class="nv">t</span><span class="p">)</span> <span class="nv">t</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">cond </span><span class="p">(</span><span class="nf">types</span> <span class="nv">clojure.lang.PersistentVector</span><span class="p">)</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">coll</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">types</span> <span class="nv">clojure.lang.IPersistentList</span><span class="p">)</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">))</span>
<span class="nv">true</span> <span class="p">(</span><span class="nb">str </span><span class="s">"Sorry, I don't know how to append to a "</span>
<span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)</span> <span class="s">", which has supertypes "</span> <span class="nv">types</span><span class="p">))))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="o">'</span><span class="p">()</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="mi">1</span><span class="p">)</span>
</code></pre>
<p>We’ve generalized our function from depending on <em>specific</em> types to depending on a type <em>or its supertypes</em>. What about… a lazy sequence, like the ones returned by <code>map</code>?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">(</span><span class="nb">map inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span> <span class="mi">5</span><span class="p">)</span>
<span class="s">"Sorry, I don't know how to append to a class clojure.lang.LazySeq, which has supertypes #{java.util.List clojure.lang.IHashEq java.io.Serializable clojure.lang.IObj clojure.lang.IPersistentCollection clojure.lang.ISeq java.util.Collection java.lang.Iterable clojure.lang.Seqable clojure.lang.IPending clojure.lang.Sequential java.lang.Object clojure.lang.IMeta clojure.lang.Obj}"</span>
</code></pre>
<p>We could add another clause for <code>LazySeq</code> to our definition of <code>append</code>—but would it actually be any <em>different</em> from how we append to lists? If we plan to <code>concat</code> for both, perhaps we should search for a type that sequences and lists have in common.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">require</span> <span class="o">'</span><span class="p">[</span><span class="nv">clojure.set</span> <span class="ss">:as</span> <span class="nv">set</span><span class="p">])</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">set/intersection</span> <span class="p">(</span><span class="nf">supers</span> <span class="nv">clojure.lang.IPersistentList</span><span class="p">)</span> <span class="p">(</span><span class="nf">supers</span> <span class="nv">clojure.lang.LazySeq</span><span class="p">))</span>
<span class="o">#</span><span class="p">{</span><span class="nv">clojure.lang.IPersistentCollection</span> <span class="nv">clojure.lang.Seqable</span> <span class="nv">clojure.lang.Sequential</span><span class="p">}</span>
</code></pre>
<p>These types have three supertypes in common. One is <code>IPersistentCollection</code>, which defines how <em>any</em> Clojure collection works, including sets, maps, etc. Another is <code>Seqable</code>, which means that the collection can be <em>interpreted</em> as a sequence of values—this too applies to sets and maps. The final type in common is <code>Sequential</code>, which applies only to collections <em>with a well-defined order</em>: lists and vectors, but not sets and maps. If we think of <code>append</code> as operating only over <em>ordered</em> collections, we should define it in terms of Sequential, rather than Seqable.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Adds an element x to the end of any sequential collection--faster for vectors."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">t</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)</span>
<span class="nv">types</span> <span class="p">(</span><span class="nb">conj </span><span class="p">(</span><span class="nf">supers</span> <span class="nv">t</span><span class="p">)</span> <span class="nv">t</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">cond </span><span class="p">(</span><span class="nf">types</span> <span class="nv">clojure.lang.PersistentVector</span><span class="p">)</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">coll</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nf">types</span> <span class="nv">clojure.lang.Sequential</span><span class="p">)</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">))</span>
<span class="nv">true</span> <span class="p">(</span><span class="nb">str </span><span class="s">"Sorry, I don't know how to append to a "</span>
<span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)</span> <span class="s">", which has supertypes "</span> <span class="nv">types</span><span class="p">))))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">(</span><span class="nb">map inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span> <span class="mi">5</span><span class="p">)</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)</span>
</code></pre>
<p>Now our function is even <em>more</em> general: it can accept vectors, lists, and lazy sequences of all kinds, while being <em>smart</em> about it: for vectors, it efficiently adds elements to the end using <code>conj</code>, and for other Sequential types, it falls back to using <code>concat</code>.</p>
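<p>The difference between <code>Sequential</code> and <code>Seqable</code> can be checked with the same <code>supers</code> technique. In this sketch, the helper name <code>has-super?</code> is made up for the example and is not part of Clojure:</p>
<pre><code>(defn has-super?
  "True if type t is among the supertypes of x's type. Illustrative helper."
  [x t]
  (contains? (supers (type x)) t))

;; Sets and maps can be read as sequences, but have no well-defined order.
(has-super? #{1 2} clojure.lang.Seqable)             ; => true
(has-super? #{1 2} clojure.lang.Sequential)          ; => false
(has-super? {:a 1} clojure.lang.Sequential)          ; => false

;; Vectors and lazy sequences are ordered.
(has-super? [1 2] clojure.lang.Sequential)           ; => true
(has-super? (map inc [1 2]) clojure.lang.Sequential) ; => true
</code></pre>
<p>Since sets and maps are <code>Seqable</code> but not <code>Sequential</code>, a check against <code>Sequential</code> keeps unordered collections out of the <code>concat</code> branch.</p>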
<p>This idea—checking a value’s type <em>and</em> supertypes—is so useful that there’s a special function for it. We say that a value <code>v</code> is an <em>instance</em> of type <code>T</code> if <code>v</code>’s type, or any of its supertypes, is <code>T</code>. We can use the <code>instance?</code> function to ask if this is so!</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.PersistentVector</span> <span class="p">[])</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.PersistentVector</span> <span class="p">(</span><span class="nf">list</span><span class="p">))</span>
<span class="nv">false</span>
</code></pre>
<p>Thanks to the <code>instance?</code> function, we don’t need to compute the set of types and supertypes ourselves.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Adds an element x to the end of any sequential collection--faster for</span>
<span class="s"> vectors."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">cond </span><span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.PersistentVector</span> <span class="nv">coll</span><span class="p">)</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">coll</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.Sequential</span> <span class="nv">coll</span><span class="p">)</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">))</span>
<span class="nv">true</span> <span class="p">(</span><span class="nb">str </span><span class="s">"Sorry, I don't know how to append to a "</span>
<span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">))))</span>
</code></pre>
<p>Wonderful! The supertype machinery disappears, and we’re left with something that asks succinctly about how a value might behave.</p>
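<p>Because <code>instance?</code> considers supertypes, a single check against an interface covers every class that implements it. A few REPL-style checks, offered as a sketch:</p>
<pre><code>;; Empty and non-empty lists are different classes, but both are IPersistentLists.
(instance? clojure.lang.IPersistentList '())        ; => true
(instance? clojure.lang.IPersistentList '(1 2))     ; => true

;; Lazy sequences are Sequential; sets are not.
(instance? clojure.lang.Sequential (map inc [1 2])) ; => true
(instance? clojure.lang.Sequential #{1 2})          ; => false

;; Every value of a class type is an instance of Object.
(instance? java.lang.Object [1 2])                  ; => true
</code></pre>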
<p>This is a perfectly valid way to write a polymorphic function, but it has an important limitation. Whenever someone finds or creates a new type they’d like to append to, they have to edit the <code>append</code> function to add support for that type. This is one half of a classic dilemma in programming languages known as <a href="https://wiki.c2.com/?ExpressionProblem">the expression problem</a>. It would be nice if we could define functions piece by piece, so that we could add support for different types <em>without</em> changing the original definition of the function. This is the motivation behind Clojure’s <em>multimethods</em>.</p>
<h2><a href="#multimethods" id="multimethods">Multimethods</a></h2>
<p>A <em>multimethod</em> is a special kind of function. Instead of a function body, it has a <em>dispatch function</em>, which takes the arguments to the function and tells us not what to return, but how to find a particular <em>implementation</em> of that function. We define the implementations (essentially, the function bodies) separately.</p>
<p>To define a multimethod, use <code>defmulti</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmulti </span><span class="nv">append</span>
<span class="s">"Appends an x to collection coll."</span>
<span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">coll</span><span class="p">)))</span>
</code></pre>
<p>Here, we’re defining an <code>append</code> function. This will overwrite our <code>append</code> function from earlier, so you can rename or delete the original to avoid the conflict, if you like. Like <code>defn</code>, we provide a docstring. Unlike <code>defn</code>, we follow that with a <em>dispatch function</em>, which takes two arguments (<code>coll</code> and <code>x</code>) and returns the type of <code>coll</code>. The return value of the dispatch function is how Clojure decides which implementation to use. All together, this <code>defmulti</code> says “the behavior of <code>append</code>, a function of two arguments, depends on the type of its first argument.”</p>
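<p>To see what the dispatch function produces, we can call it on its own. In this sketch, <code>append-dispatch</code> is a hypothetical name given to the same function passed to <code>defmulti</code> above; Clojure does not define it for us.</p>
<pre><code>;; The same dispatch function as in the defmulti, bound to a name for inspection.
(def append-dispatch
  (fn [coll x] (type coll)))

;; The second argument is ignored; only the type of coll matters.
(append-dispatch [1 2] 3)           ; => clojure.lang.PersistentVector
(append-dispatch '(1 2) 3)          ; => clojure.lang.PersistentList
(append-dispatch (map inc [1 2]) 3) ; => clojure.lang.LazySeq
</code></pre>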
<p>Next, we need to provide an <em>implementation</em> of the <code>append</code> function. We do this with <code>defmethod</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmethod </span><span class="nv">append</span> <span class="nv">clojure.lang.PersistentVector</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">coll</span> <span class="nv">x</span><span class="p">))</span>
</code></pre>
<p>When <code>append</code>’s dispatch function returns <code>clojure.lang.PersistentVector</code>, we take the arguments <code>coll</code> and <code>x</code>, and use <code>conj</code> to append <code>x</code> to <code>coll</code>. This is the same implementation as our original polymorphic function for vectors, but we’ve decoupled the plumbing from the implementation: one function decides <em>which</em> implementation to run, and the implementation does the work. This decoupling means we can add additional implementations (again using <code>defmethod</code>) without changing our existing implementation!</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmethod </span><span class="nv">append</span> <span class="nv">clojure.lang.Sequential</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">coll</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">)))</span>
</code></pre>
<p>This implementation of <code>append</code> takes a <code>clojure.lang.Sequential</code> as its first argument, and uses <code>concat</code> to add <code>x</code> to the end. Now our <code>append</code> function can take either a vector or any sequential object:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">(</span><span class="nb">map inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">])</span> <span class="mi">4</span><span class="p">)</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">)</span>
</code></pre>
<p>That’s odd! We dispatched using <code>(type coll)</code>, which, for <code>(map inc ...)</code>, would have been a <code>LazySeq</code>. But we didn’t define any method for <code>LazySeq</code>. Why… why did this work?</p>
<p>The answer is that Clojure doesn’t compare multimethod dispatch values via <code>=</code>. It compares them using a function we haven’t seen before: <code>isa?</code>.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">doc </span><span class="nv">isa?</span><span class="p">)</span>
<span class="nv">-------------------------</span>
<span class="nv">clojure.core/isa?</span>
<span class="p">([</span><span class="nv">child</span> <span class="nv">parent</span><span class="p">]</span> <span class="p">[</span><span class="nv">h</span> <span class="nv">child</span> <span class="nv">parent</span><span class="p">])</span>
<span class="nv">Returns</span> <span class="nv">true</span> <span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="nv">child</span> <span class="nv">parent</span><span class="p">)</span>, <span class="nb">or </span><span class="nv">child</span> <span class="nv">is</span> <span class="nv">directly</span> <span class="nb">or </span><span class="nv">indirectly</span> <span class="nv">derived</span> <span class="nv">from</span>
<span class="nv">parent</span>, <span class="nv">either</span> <span class="nv">via</span> <span class="nv">a</span> <span class="nv">Java</span> <span class="nv">type</span> <span class="nv">inheritance</span> <span class="nv">relationship</span> <span class="nb">or </span><span class="nv">a</span>
<span class="nv">relationship</span> <span class="nv">established</span> <span class="nv">via</span> <span class="nv">derive.</span> <span class="nv">h</span> <span class="nv">must</span> <span class="nv">be</span> <span class="nv">a</span> <span class="nv">hierarchy</span> <span class="nv">obtained</span>
<span class="nv">from</span> <span class="nv">make-hierarchy</span>, <span class="k">if </span><span class="nb">not </span><span class="nv">supplied</span> <span class="nv">defaults</span> <span class="nv">to</span> <span class="nv">the</span> <span class="nv">global</span>
<span class="nv">hierarchy</span>
</code></pre>
<p>So <code>isa?</code> tells us whether two things are equal (using <code>=</code>), <em>or</em> whether <code>child</code> is related to <code>parent</code> via Java types, <em>or</em> via “a relationship established via derive”, whatever that is. The fact that <code>isa?</code> knows about Java type relationships means that we can use a supertype (e.g. <code>Sequential</code>) rather than listing every specific type (e.g. <code>PersistentList</code>, <code>LazySeq</code>, etc).</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="nv">clojure.lang.PersistentList</span> <span class="nv">clojure.lang.Counted</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="nv">clojure.lang.PersistentList</span> <span class="nv">clojure.lang.PersistentVector</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
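<p>The same check resolves the <code>LazySeq</code> puzzle from a moment ago. A minimal sketch, assuming a fresh REPL; the name <code>lazy-type</code> is made up for this example:</p>
<pre><code>;; `map` returns a lazy sequence, so this is the dispatch value that
;; (type coll) produced in our call to append.
(def lazy-type (type (map inc [1 2])))

;; LazySeq is not *equal* to Sequential...
(= lazy-type clojure.lang.Sequential)     ; false

;; ...but LazySeq implements that interface, so isa? is satisfied, and
;; the method registered for Sequential applies.
(isa? lazy-type clojure.lang.Sequential)  ; true
</code></pre>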
<p><code>isa?</code> has another trick up its sleeve–it says it can use relationships defined via <code>derive</code>. What does <em>that</em> do?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">doc </span><span class="nv">derive</span><span class="p">)</span>
<span class="nv">-------------------------</span>
<span class="nv">clojure.core/derive</span>
<span class="p">([</span><span class="nv">tag</span> <span class="nv">parent</span><span class="p">]</span> <span class="p">[</span><span class="nv">h</span> <span class="nv">tag</span> <span class="nv">parent</span><span class="p">])</span>
<span class="nv">Establishes</span> <span class="nv">a</span> <span class="nv">parent/child</span> <span class="nv">relationship</span> <span class="nv">between</span> <span class="nv">parent</span> <span class="nv">and</span>
<span class="nv">tag.</span> <span class="nv">Parent</span> <span class="nv">must</span> <span class="nv">be</span> <span class="nv">a</span> <span class="nv">namespace-qualified</span> <span class="nb">symbol or keyword </span><span class="nv">and</span>
<span class="nv">child</span> <span class="nv">can</span> <span class="nv">be</span> <span class="nv">either</span> <span class="nv">a</span> <span class="nv">namespace-qualified</span> <span class="nb">symbol or keyword or </span><span class="nv">a</span>
<span class="nv">class.</span> <span class="nv">h</span> <span class="nv">must</span> <span class="nv">be</span> <span class="nv">a</span> <span class="nv">hierarchy</span> <span class="nv">obtained</span> <span class="nv">from</span> <span class="nv">make-hierarchy</span>, <span class="k">if </span><span class="nv">not</span>
<span class="nv">supplied</span> <span class="nv">defaults</span> <span class="nv">to</span>, <span class="nb">and </span><span class="nv">modifies</span>, <span class="nv">the</span> <span class="nv">global</span> <span class="nv">hierarchy.</span>
</code></pre>
<p>Huh. So this lets us establish relationships between symbols or keywords. And classes, too—though classes can only be children. Let’s give that a shot.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">derive</span> <span class="ss">::milk</span> <span class="ss">::dairy</span><span class="p">)</span>
<span class="p">(</span><span class="nf">derive</span> <span class="ss">::dairy</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::milk</span> <span class="ss">::milk</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::milk</span> <span class="ss">::furniture</span><span class="p">)</span>
<span class="nv">false</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::milk</span> <span class="ss">::dairy</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::milk</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p>With these <code>derive</code> statements, we’ve built a web of relationships between these keywords. Now <code>isa?</code> not only knows that milk is a kind of dairy, but also (because dairy is a kind of grocery) that milk is a kind of grocery. And we know that milk is <em>not</em> furniture—I’m <em>pretty</em> sure that’s true. Note that we’re using qualified keywords here (beginning with a <code>::</code>), which prevents us from accidentally changing the relationships in other namespaces.</p>
<p>We’re not limited to defining 1:1 relationships. Milk can be a grocery <em>and</em> refrigerated. Apples can <em>also</em> be groceries.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">derive</span> <span class="ss">::milk</span> <span class="ss">::refrigerated</span><span class="p">)</span>
<span class="p">(</span><span class="nf">derive</span> <span class="ss">::apples</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::milk</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::milk</span> <span class="ss">::refrigerated</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">isa?</span> <span class="ss">::apples</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p>We can see the immediate parents of milk using the <code>parents</code> function. Parents are kind of like supertypes, only these aren’t types: they’re just plain old keywords.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">parents</span> <span class="ss">::milk</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:scratch.polymorphism/refrigerated</span> <span class="ss">:scratch.polymorphism/dairy</span><span class="p">}</span>
</code></pre>
<p>And we can see all the things that are groceries using <code>descendants</code>. That’s kind of like subtypes:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">descendants</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:scratch.polymorphism/milk</span> <span class="ss">:scratch.polymorphism/apples</span> <span class="ss">:scratch.polymorphism/dairy</span><span class="p">}</span>
</code></pre>
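<p>There is also <code>ancestors</code>, which follows the chain all the way up instead of stopping at immediate parents. The sketch below rebuilds a few of the same relationships in a private hierarchy obtained from <code>make-hierarchy</code> (the <code>h</code> argument mentioned in the docstrings above). The name <code>food</code> is an arbitrary choice for illustration.</p>
<pre><code>;; Given an explicit hierarchy, derive returns a new hierarchy value
;; rather than modifying the global one, so we thread it through each call.
(def food
  (-> (make-hierarchy)
      (derive ::milk ::dairy)
      (derive ::dairy ::grocery)
      (derive ::milk ::refrigerated)))

;; parents: only the direct relationships.
(= (parents food ::milk)
   #{::dairy ::refrigerated})             ; true

;; ancestors: everything reachable, directly or indirectly.
(= (ancestors food ::milk)
   #{::dairy ::grocery ::refrigerated})   ; true

;; isa? accepts the hierarchy as its first argument too.
(isa? food ::milk ::grocery)              ; true
</code></pre>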
<p>Now imagine we represented our groceries as maps. Something like <code>{:item-type ::milk, :size :gallon}</code>. When we get home from running errands, we’d like a function to put those grocery maps away—but <em>how</em> they’re stored should depend on the <code>:item-type</code> of the grocery item. We could write:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmulti </span><span class="nv">put-away</span>
<span class="s">"Stores an item when we get home."</span>
<span class="ss">:item-type</span><span class="p">)</span>
</code></pre>
<p>This takes advantage of the fact that keywords are functions: <code>:item-type</code> will look up the type of the item, and use that to choose an implementation.</p>
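<p>A quick check of that claim, as a sketch: calling a keyword on a map performs the same lookup as <code>get</code>, so a bare <code>:item-type</code> is shorthand for a one-argument dispatch function. The names <code>item</code> and <code>item-type-of</code> are made up for this example.</p>
<pre><code>(def item {:item-type :milk, :size :gallon})

;; A keyword called as a function looks itself up in its argument.
(:item-type item)                          ; :milk

;; The equivalent dispatch function, written out in full.
(defn item-type-of
  [m]
  (get m :item-type))

(= (:item-type item) (item-type-of item))  ; true
</code></pre>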
<p>In general, we can put groceries in the pantry, but refrigerated items go in the fridge.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmethod </span><span class="nv">put-away</span> <span class="ss">::grocery</span>
<span class="p">[</span><span class="nv">item</span><span class="p">]</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"Putting a"</span> <span class="p">(</span><span class="nb">name </span><span class="p">(</span><span class="ss">:size</span> <span class="nv">item</span><span class="p">))</span> <span class="s">"of"</span> <span class="p">(</span><span class="nb">name </span><span class="p">(</span><span class="ss">:item-type</span> <span class="nv">item</span><span class="p">))</span>
<span class="s">"in the pantry"</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defmethod </span><span class="nv">put-away</span> <span class="ss">::refrigerated</span>
<span class="p">[</span><span class="nv">item</span><span class="p">]</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"Storing a"</span> <span class="p">(</span><span class="nb">name </span><span class="p">(</span><span class="ss">:size</span> <span class="nv">item</span><span class="p">))</span> <span class="s">"of"</span> <span class="p">(</span><span class="nb">name </span><span class="p">(</span><span class="ss">:item-type</span> <span class="nv">item</span><span class="p">))</span>
<span class="s">"in the fridge"</span><span class="p">))</span>
</code></pre>
<p>Now we can store some apples, and see them go into the pantry.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">put-away</span> <span class="p">{</span><span class="ss">:item-type</span> <span class="ss">::apples</span>, <span class="ss">:size</span> <span class="ss">:large-bag</span><span class="p">})</span>
<span class="nv">Putting</span> <span class="nv">a</span> <span class="nv">large-bag</span> <span class="nv">of</span> <span class="nv">apples</span> <span class="nv">in</span> <span class="nv">the</span> <span class="nv">pantry</span>
</code></pre>
<p>How about milk?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">put-away</span> <span class="p">{</span><span class="ss">:item-type</span> <span class="ss">::milk</span>, <span class="ss">:size</span> <span class="ss">:gallon</span><span class="p">})</span>
<span class="nv">IllegalArgumentException</span> <span class="nv">Multiple</span> <span class="nv">methods</span> <span class="nv">in</span> <span class="nv">multimethod</span> <span class="ss">'put-away</span><span class="o">'</span> <span class="nv">match</span> <span class="nv">dispatch</span> <span class="nv">value</span><span class="err">:</span> <span class="ss">:scratch.polymorphism/milk</span> <span class="nb">-> </span><span class="ss">:scratch.polymorphism/grocery</span> <span class="nb">and </span><span class="ss">:scratch.polymorphism/refrigerated</span>, <span class="nb">and </span><span class="nv">neither</span> <span class="nv">is</span> <span class="nv">preferred</span> <span class="nv">clojure.lang.MultiFn.findAndCacheBestMethod</span> <span class="p">(</span><span class="nf">MultiFn.java</span><span class="ss">:178</span><span class="p">)</span>
</code></pre>
<p>Ah, that’s interesting. Since milk is both a grocery <em>and</em> refrigerated, <em>either</em> of these implementations could apply to it. We can tell Clojure how to resolve the ambiguity using <code>prefer-method</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">prefer-method</span> <span class="nv">put-away</span> <span class="ss">::refrigerated</span> <span class="ss">::grocery</span><span class="p">)</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">put-away</span> <span class="p">{</span><span class="ss">:item-type</span> <span class="ss">::milk</span>, <span class="ss">:size</span> <span class="ss">:gallon</span><span class="p">})</span>
<span class="nv">Storing</span> <span class="nv">a</span> <span class="nv">gallon</span> <span class="nv">of</span> <span class="nv">milk</span> <span class="nv">in</span> <span class="nv">the</span> <span class="nv">fridge</span>
</code></pre>
<p>Very good! We’ve established that the <code>::refrigerated</code> item type takes precedence over the <code>::grocery</code> item type. It’s important to prevent spoilage!</p>
<p>You can use multimethods wherever you need to extend a function’s behavior later. This is especially useful when you intend your code to be used by other people—if someone else were to use our grocery-storage system, they could define new types of items, and be able to tell <code>put-away</code> exactly how to handle those new item types. We didn’t talk about garbage bags, pencils, or medication here, but because <code>put-away</code> is a multimethod, someone else could define something like <code>{:item-type ::medication}</code>, and extend <code>put-away</code> to store it correctly.</p>
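<p>Here is roughly what such an extension might look like. This is a sketch, not part of the grocery system above: <code>storage-spot</code> is a hypothetical, stripped-down stand-in for <code>put-away</code> that returns a location instead of printing, and the <code>::rice</code>, <code>::medication</code>, and <code>::pencils</code> item types are invented for the example. It also shows the special <code>:default</code> dispatch value, which catches anything no other method matches.</p>
<pre><code>;; The "library" part: a multimethod and its original implementations.
(defmulti storage-spot
  "Returns where an item should be stored."
  :item-type)

(defmethod storage-spot ::grocery
  [item]
  "pantry")

;; Fallback for dispatch values with no matching method. Without it,
;; an unknown item type throws an IllegalArgumentException.
(defmethod storage-spot :default
  [item]
  "junk drawer")

;; The "someone else" part, written later: a new item type and a new
;; implementation. Nothing above needs to change.
(defmethod storage-spot ::medication
  [item]
  "medicine cabinet")

(derive ::rice ::grocery)

(storage-spot {:item-type ::rice})        ; "pantry"
(storage-spot {:item-type ::medication})  ; "medicine cabinet"
(storage-spot {:item-type ::pencils})     ; "junk drawer"
</code></pre>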
<p>Throughout this example, we’ve talked about “item types”, but… we used keywords, like <code>::apples</code>, to represent those types. These aren’t types in the sense of Clojure’s type system, but we could use them <em>like</em> types. In a very real sense, what we’ve done here is define our own tiny language, with its own itty bitty type system, completely separate from Clojure’s. The core <em>ideas</em> are the same: we use subtype relationships so that code which depends only on general things (e.g. “refrigerated things”) automatically covers more specific things (e.g. “milk”).</p>
<p>Multimethods are powerful and general thanks to their dispatch functions. However, because those dispatch functions get involved in every call to a multimethod, they’re a bit slower than regular function calls. When performance matters, we turn to <em>interfaces</em> and <em>protocols</em>.</p>
<h2><a href="#interfaces" id="interfaces">Interfaces</a></h2>
<p>The idea of a polymorphic function which decides what to do based on the type of its arguments is so common, and so useful, that most languages provide special facilities for it. We call this “type dispatch”: the type of the value being passed chooses which particular code the language invokes. We wrote a version of type dispatch using multimethods and the <code>type</code> function. Many languages, such as Haskell and Java, build type dispatch into <em>every</em> function—types are attached to each argument, and used to decide between alternative implementations.</p>
<p>To support this feature in Java, the JVM has a fast, built-in mechanism for type dispatch using interfaces. We aren’t limited to using the interfaces given to us by Clojure and the JVM. We can define our own interfaces, and use them to get extra-speedy type dispatch, using the <code>definterface</code> macro.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">definterface </span><span class="nv">IAppend</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">x</span><span class="p">]))</span>
</code></pre>
<p>We’ve defined a new type: specifically, an interface. The name of our interface is <code>IAppend</code>. We’ve also stated that if a value <code>coll</code> is an instance of type <code>IAppend</code>, then it must have a <em>method</em> named <code>append</code>. These methods are (and I know this is confusing) <em>not</em> the multimethods we discussed earlier. These methods are <em>JVM</em> methods: a sort of primitive function. Methods take arguments, evaluate code, and return results, like functions. Unlike Clojure’s functions, they aren’t values: you can’t ask them for docstrings, or pass them around to <code>map</code> or <code>filter</code>. We’ve provided only a single method here, but if we liked, we could define several in the same <code>definterface</code>.</p>
<p>The <code>append</code> method we defined takes two arguments. Yes, <em>two</em>. Interface methods always take a first argument, which in this case must be an instance of <code>IAppend</code>. Since the first argument is mandatory, <code>definterface</code> doesn’t ask us to write it down. This is a bit weird, and contradicts how function definitions work everywhere else, but we’re stuck with this behavior for historical reasons. Long story short: <code>(append [x])</code> tells us that our first argument is an <code>IAppend</code>, and our second argument is some object called <code>x</code>. And that’s it! Like a multimethod, there’s no function body: we provide that later. Unlike a multimethod, there’s no dispatch function. The JVM will always dispatch based on the type of the first argument.</p>
<p>“All right”, you might say. “It’s great that we have a type to express that something is appendable, and an <code>append</code>… method, whatever that is, exists. But how do we make <em>an appendable thing</em>?”</p>
<p>For this, we need new tools.</p>
<h2><a href="#making-an-appendable-thing" id="making-an-appendable-thing">Making An Appendable Thing</a></h2>
<p>We have an interface, <code>IAppend</code>, and we’d like to make an <em>instance</em> of that type. The quickest way to make an object of some type is to use a macro called <code>reify</code>: a fancy philosophical word that means “make a concrete thing out of an abstract concept.” In Clojure, <code>reify</code> takes interfaces, and definitions for how the methods in those interfaces should work, and returns an object which is an instance of those interfaces. For instance, perhaps we want an object to keep track of a grocery list:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">grocery-list</span>
<span class="s">"Creates an appendable grocery list. Takes a vector of</span>
<span class="s"> groceries to buy."</span>
<span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="p">(</span><span class="nf">reify</span> <span class="nv">IAppend</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">grocery-list</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">)))))</span>
</code></pre>
<p>There are two parts here: the first is a function, <code>grocery-list</code>, which we’re going to call when we want to make a new grocery list. The second is the <code>(reify IAppend ...)</code>, which constructs a value. That value will be an instance of type <code>IAppend</code>; at compile time, <code>reify</code> summons a new, anonymous class from the void, and makes sure that class is a subtype of <code>IAppend</code>. Each call to this <code>(reify ...)</code> constructs a new instance of that anonymous class.</p>
<p>Inside the <code>reify</code>, we’ve provided definitions for <em>how</em> to handle <code>IAppend</code>’s methods: when someone calls the <code>append</code> method with <code>this</code> (some value which this reify constructed) and <code>x</code>, we add <code>x</code> to the end of the <code>to-buy</code> vector using <code>conj</code>, and call <code>grocery-list</code> to make a new grocery list out of it. That way we can keep appending more things later.</p>
<p>An interesting thing to note: like <code>fn</code>, <code>reify</code> can use variables, like <code>to-buy</code>, from the surrounding code. When <code>grocery-list</code> returns, the object constructed by <code>reify</code> <em>remembers</em> the value of <code>to-buy</code>, and can use it later. We say that <code>reify</code>, like <code>fn</code>, <em>closes over</em> those variables: <code>reify</code> and <code>fn</code> are <em>closures</em>. That’s a fancy bit of programming jargon you can use to get strangers to stop talking to you at parties.</p>
<p>Let’s try it out:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span>
<span class="o">#</span><span class="nv">object</span><span class="p">[</span><span class="nv">scratch.polymorphism$grocery_list$reify__1950</span> <span class="mi">0</span><span class="nv">x70e02b5</span> <span class="s">"scratch.polymorphism$grocery_list$reify__1950@70e02b5"</span><span class="p">]</span>
</code></pre>
<p>This is not a particularly helpful representation of a grocery list. If you squint, you can see the namespace (<code>scratch.polymorphism</code>) and function (<code>grocery_list</code>) in there, and also <code>reify</code>, since we used <code>reify</code> to make this value. The <code>_1950</code> is a unique number that helps the computer tell this particular reify apart from others. In fact, this whole first part is the automatically generated class which <code>reify</code> defined for us. <code>0x70e02b5</code> is a hash code which the JVM assigns to this particular instance of that class: it helps tell instances apart, but says nothing about what’s inside them. Unhelpfully, <em>nothing</em> here tells us about the to-buy list we provided (<code>[:eggs]</code>).</p>
<p>One thing we <em>do</em> know, though, is that this is something we can append to.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">supers</span> <span class="p">(</span><span class="nf">type</span> <span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])))</span>
<span class="o">#</span><span class="p">{</span><span class="nv">clojure.lang.IObj</span> <span class="nv">scratch.polymorphism.IAppend</span> <span class="nv">java.lang.Object</span> <span class="nv">clojure.lang.IMeta</span><span class="p">}</span>
</code></pre>
<p>Remember how many types were in <code>(supers clojure.lang.PersistentVector)</code>? Objects made with <code>reify</code> are far simpler. There’s <code>IAppend</code>: that’s the interface type we defined earlier. There’s <code>java.lang.Object</code>, of course. <code>clojure.lang.IObj</code> and <code>IMeta</code> mean that our reify object has metadata. Wait—what <em>is</em> this thing’s metadata anyway?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">meta </span><span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">]))</span>
<span class="p">{</span><span class="ss">:line</span> <span class="mi">12</span>, <span class="ss">:column</span> <span class="mi">3</span><span class="p">}</span>
</code></pre>
<p>Huh! That’s the line and column number of the <code>reify</code> expression which made this object. But what about appending? How do we use <code>append</code> with this thing?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span> <span class="ss">:tofu</span><span class="p">)</span>
<span class="s">"Sorry, I don't know how to append to a class scratch.polymorphism$grocery_list$reify__1491"</span>
</code></pre>
<p>Oh, wait, hang on—that’s our <code>append</code> function from before. We wanted to call the append <em>method</em> we defined using <code>definterface</code>: methods and functions are different things, even if they have the same name. To make a method call, we put a <code>.</code> in front of the method name:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">.append</span> <span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span> <span class="ss">:tofu</span><span class="p">)</span>
<span class="o">#</span><span class="nv">object</span><span class="p">[</span><span class="nv">scratch.polymorphism$grocery_list$reify__1950</span> <span class="mi">0</span><span class="nv">x40eb00f0</span> <span class="s">"scratch.polymorphism$grocery_list$reify__1950@40eb00f0"</span><span class="p">]</span>
</code></pre>
<p>If we wanted a function for <code>append</code>, we could write one which calls the method. We might call this a <em>wrapper</em> function, since it wraps the append method up in a nice functional package. This version of <code>append</code> we can use with <code>reduce</code> or <code>partial</code>, and so on.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">append</span>
<span class="s">"Appends x to the end of coll."</span>
<span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">.append</span> <span class="nv">coll</span> <span class="nv">x</span><span class="p">))</span>
</code></pre>
<p>Moving on: we’ve called our <code>append</code> method, and it gave us… another unhelpful grocery-list. It’d be great if we had a more reasonable way to <em>print</em> these lists to the console. In the JVM, the <code>Object</code> class defines a method called <code>toString</code>. That’s how <code>str</code> (typically) makes strings out of things. Let’s expand our <code>reify</code> to define a <em>different</em> <code>toString</code> method:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">grocery-list</span>
<span class="s">"Creates an appendable (via IAppend) grocery list. Takes a vector of</span>
<span class="s"> groceries to buy."</span>
<span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="p">(</span><span class="nf">reify</span>
<span class="nv">IAppend</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">grocery-list</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">)))</span>
<span class="nv">Object</span>
<span class="p">(</span><span class="nf">toString</span> <span class="p">[</span><span class="nv">this</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="s">"To buy: "</span> <span class="nv">to-buy</span><span class="p">))))</span>
</code></pre>
<p>In general, <code>reify</code> takes a type followed by method definitions for that particular type, then another type, and any number of methods for <em>that</em> type, and so on. Our grocery lists were already <code>Object</code>s before, and they were given simple, default definitions for all of <code>Object</code>’s methods. That’s how the REPL was able to show us <code>#object[scratch.polymorphism$grocery_list$reify__1950 ...]</code>. But now our call to <code>reify</code> states explicitly: when interpreted as an <code>Object</code>, here’s how the <code>toString</code> method works.</p>
<p>Let’s see it in action!</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">str </span><span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">]))</span>
<span class="s">"To buy: [:eggs]"</span>
</code></pre>
<p>Hey, that’s more helpful! This is another kind of polymorphism at work: the <code>toString</code> method (and by extension, the <code>str</code> function) does different things depending on the type of object it’s given. And what’s neat is that <em>unlike</em> our initial polymorphic function <code>append</code>, where a single function definition had to know about <em>all</em> the types we wanted to call it with, we didn’t have to change <code>toString</code> or <code>str</code>’s definitions. The plumbing (looking up what code to evaluate) is handled automatically. As with multimethods, we’re free to define behaviors for new types <em>without</em> having to change the definitions for other types.</p>
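<p>The same trick works for interfaces that Clojure itself defines. As a sketch, assuming we’d like <code>count</code> to work on a grocery list, we can have <code>reify</code> implement <code>clojure.lang.Counted</code>, the interface we saw with <code>isa?</code> earlier. The name <code>counted-list</code> is hypothetical, and this version leaves out <code>IAppend</code> to keep the example short.</p>
<pre><code>(defn counted-list
  "Creates a grocery list which knows how many items it holds."
  [to-buy]
  (reify
    ;; Counted declares a single method, count. clojure.core/count calls
    ;; it for any object which implements the interface.
    clojure.lang.Counted
    (count [this]
      ;; Inside the body, count is still the ordinary core function.
      (count to-buy))

    Object
    (toString [this]
      (str "To buy: " to-buy))))

(count (counted-list [:eggs :tofu]))  ; 2
(str (counted-list [:eggs :tofu]))    ; "To buy: [:eggs :tofu]"
</code></pre>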
<p>Let’s try out our <code>append</code> method again and see if it works:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">str </span><span class="p">(</span><span class="nf">.append</span> <span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span> <span class="ss">:tomatoes</span><span class="p">))</span>
<span class="s">"To buy: [:eggs :tomatoes]"</span>
</code></pre>
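<p>With readable output, we can also see why a wrapper function is handy. The sketch below repeats the definitions from above so that it runs on its own, then hands the <code>append</code> <em>function</em> to <code>reduce</code> and <code>partial</code>, which is something we can’t do with a bare method. The name <code>add-to-staples</code> is made up for the example.</p>
<pre><code>;; Copies of the definitions from the text, so this snippet stands alone.
(definterface IAppend
  (append [x]))

(defn grocery-list
  [to-buy]
  (reify
    IAppend
    (append [this x]
      (grocery-list (conj to-buy x)))
    Object
    (toString [this]
      (str "To buy: " to-buy))))

(defn append
  "Appends x to the end of coll."
  [coll x]
  (.append coll x))

;; reduce calls (append list item) once for each item in the vector.
(str (reduce append (grocery-list []) [:eggs :tofu :rice]))
; "To buy: [:eggs :tofu :rice]"

;; partial fixes the first argument. The original list is never changed;
;; every call returns a new list.
(def add-to-staples (partial append (grocery-list [:salt])))
(str (add-to-staples :pepper))
; "To buy: [:salt :pepper]"
</code></pre>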
<p>Hey, that’s great! We can see the results of appending to our grocery list. What about appending to lists and vectors, though? Can we use <code>.append</code> with them?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">.append</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">IllegalArgumentException</span> <span class="nv">No</span> <span class="nv">matching</span> <span class="nv">method</span> <span class="nv">found</span><span class="err">:</span> <span class="nv">append</span> <span class="nb">for class </span><span class="nv">clojure.lang.PersistentVector</span> <span class="nv">clojure.lang.Reflector.invokeMatchingMethod</span> <span class="p">(</span><span class="nf">Reflector.java</span><span class="ss">:53</span><span class="p">)</span>
</code></pre>
<p>The <em>reflector</em> is a part of Clojure which figures out what definition of a method to use for a given type. It <em>failed</em> to find a matching method for <code>append</code>, given a <code>clojure.lang.PersistentVector</code>—which makes sense, because we haven’t made <code>clojure.lang.PersistentVector</code> a subtype of <code>IAppend</code>. Let’s do that next!</p>
<p>I have terrible news: we <em>can’t</em> do this. Interfaces are a one-way street: when we define a new type (as we did with <code>reify</code>), we can say how that type works with any number of interfaces. But when we define an interface, we <em>don’t</em> get to say how it works with existing types. That’s just how the JVM’s type system works.</p>
<p>“But this is awful!” you might exclaim. “The whole reason we defined an interface was so that we could write polymorphic functions like <code>append</code>, which could append to <em>many</em> kinds of objects. Instead, we’re limited to polymorphism only over types which we ourselves define!”</p>
<p>This is the other half of the <a href="https://wiki.c2.com/?ExpressionProblem">expression problem</a> we mentioned earlier: existing (regular) functions can’t be extended to new types, and existing types can’t be extended to new interfaces. We solved the function-extension problem with multimethods and interfaces… but how do we solve the interface-extension problem?</p>
<h2><a href="#protocols" id="protocols">Protocols</a></h2>
<p>In Clojure, a <em>protocol</em> is like an interface which can be extended to existing types. It defines a named type, together with functions whose first argument is an instance of that type. Where interfaces are built into the JVM, protocols are a Clojure-specific construct. To define a protocol, we use <code>defprotocol</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defprotocol </span><span class="nv">Append</span>
<span class="s">"This protocol lets us add things to the end of a collection."</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">coll</span> <span class="nv">x</span><span class="p">]</span>
<span class="s">"Appends x to the end of collection coll."</span><span class="p">))</span>
</code></pre>
<p>If you still have the <code>append</code> function we wrote earlier, this <code>append</code> function will replace it; you’ll see a message like <code>Warning: protocol #'scratch.polymorphism/Append is overwriting function append</code> at the REPL. You can delete or rename the original <code>append</code> function if you like.</p>
<p>We’ve named our protocol <code>Append</code> (not to be confused with the interface <code>IAppend</code>), and given it a bit of documentation to remind us what it’s for. It has one function, named <code>append</code>, which takes two arguments: <code>coll</code> and <code>x</code>. We can give a docstring for the <code>append</code> function too. Like an interface, we <em>don’t</em> define how the function works: we’re simply saying it exists. Unlike interfaces, these are real functions, not methods. Their first arguments are explicit, they have docstrings, we don’t need to use a <code>.</code> to call them, and they can be passed around to other functions.</p>
<p>We can ask for our protocol’s documentation at the REPL, just like we can for functions and namespaces:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">doc </span><span class="nv">Append</span><span class="p">)</span>
<span class="nv">-------------------------</span>
<span class="nv">scratch.polymorphism/Append</span>
<span class="nv">This</span> <span class="nv">protocol</span> <span class="nv">lets</span> <span class="nv">us</span> <span class="nv">add</span> <span class="nv">things</span> <span class="nv">to</span> <span class="nv">the</span> <span class="nv">end</span> <span class="nv">of</span> <span class="nv">a</span> <span class="nv">collection.</span>
</code></pre>
<p>And likewise, functions defined in <code>defprotocol</code> can be inspected, just like those made with <code>defn</code>.</p>
<pre><code>scratch.polymorphism=> (doc append)
-------------------------
scratch.polymorphism/append
([coll x])
Appends x to the end of collection coll.
</code></pre>
<p>If we try to use the <code>append</code> function with a grocery list, it’s going to fail: the grocery list <code>reify</code> is a subtype of the <em>interface</em> <code>IAppend</code>, but we haven’t told it how the <em>protocol</em> <code>Append</code> works yet:</p>
<pre><code>scratch.polymorphism=> (append (grocery-list [:eggs]) :tomatoes)
IllegalArgumentException No implementation of method: :append of protocol: #'scratch.polymorphism/Append found for class: scratch.polymorphism$grocery_list$reify__1758 clojure.core/-cache-protocol-fn (core_deftype.clj:568)
</code></pre>
<p>This error tells us that the <code>append</code> function doesn’t have an <em>implementation</em> (a function body) for the type <code>scratch.polymorphism$grocery_list$reify__1758</code>. We can fix that by changing our <code>reify</code> to use the <code>Append</code> protocol, instead of the <code>IAppend</code> interface. This is a one-character change: protocol functions and interface methods are defined in <code>reify</code> in exactly the same way.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">grocery-list</span>
<span class="s">"Creates an appendable (via Append) grocery list. Takes a vector of</span>
<span class="s"> groceries to buy."</span>
<span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="p">(</span><span class="nf">reify</span>
<span class="nv">Append</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">grocery-list</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">)))</span>
<span class="nv">Object</span>
<span class="p">(</span><span class="nf">toString</span> <span class="p">[</span><span class="nv">this</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="s">"To buy: "</span> <span class="nv">to-buy</span><span class="p">))))</span>
</code></pre>
<p>Now we can use our <code>append</code> function with grocery lists!</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">str </span><span class="p">(</span><span class="nf">append</span> <span class="p">(</span><span class="nf">grocery-list</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span> <span class="ss">:tomatoes</span><span class="p">))</span>
<span class="s">"To buy: [:eggs :tomatoes]"</span>
</code></pre>
<p>So far, we’ve done exactly what we did with interfaces. In fact, when we called <code>defprotocol</code>, it didn’t just define a protocol: it also defined an interface. But unlike interfaces, we can extend our protocol to cover <em>existing</em> types. To do this, we use <code>extend-protocol</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">extend-protocol</span> <span class="nv">Append</span>
<span class="nv">clojure.lang.IPersistentVector</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">v</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">v</span> <span class="nv">x</span><span class="p">)))</span>
</code></pre>
<p>This expresses that the <code>Append</code> protocol’s functions (i.e. <code>append</code>) can now be used on anything which is an <code>IPersistentVector</code>. When we call <code>(append v x)</code> with a vector <code>v</code>, we return the result of <code>(conj v x)</code>. Let’s try it out:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
</code></pre>
<p>Fantastic! What about other sequential collections?</p>
<pre><code><span></span><span class="p">(</span><span class="nf">extend-protocol</span> <span class="nv">Append</span>
<span class="nv">clojure.lang.IPersistentVector</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">v</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">v</span> <span class="nv">x</span><span class="p">))</span>
<span class="nv">clojure.lang.Sequential</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">v</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">concat </span><span class="nv">v</span> <span class="p">(</span><span class="nb">list </span><span class="nv">x</span><span class="p">))))</span>
</code></pre>
<p><code>extend-protocol</code> can take several types, and the function definitions for each of them. Here, we’re extending <code>Append</code> over both <code>IPersistentVector</code> and <code>Sequential</code>—and providing definitions for how <code>append</code> works in each case. If you want to extend a single type to multiple protocols, use <code>extend-type</code>. Both <code>extend-protocol</code> and <code>extend-type</code> can be called as often as you like: all their definitions get merged together.</p>
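<p>As a rough sketch of the <code>extend-type</code> direction, the example below extends one type to two protocols at once. The namespace <code>scratch.extend-type-sketch</code> and the <code>Prepend</code> protocol are made up for illustration; they aren’t part of this chapter’s code.</p>
<pre><code>(ns scratch.extend-type-sketch)

(defprotocol Append
  (append [coll x] "Appends x to the end of collection coll."))

;; Hypothetical second protocol, invented for this sketch.
(defprotocol Prepend
  (prepend [coll x] "Adds x to the front of collection coll."))

;; extend-type names the type first, then each protocol with its functions.
(extend-type clojure.lang.IPersistentVector
  Append
  (append [v x]
    (conj v x))
  Prepend
  (prepend [v x]
    (into [x] v)))

(append [1 2] 3)   ; => [1 2 3]
(prepend [1 2] 0)  ; => [0 1 2]
</code></pre>
<p>Where <code>extend-protocol</code> groups definitions by protocol, <code>extend-type</code> groups them by type. Which one reads better depends on whether you’re adding a new protocol or a new type.</p>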
<p>We can even extend a protocol over <code>nil</code>! We could add this to the existing <code>extend-protocol</code>, or write it separately. This is another advantage of protocols over interfaces.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">extend-protocol</span> <span class="nv">Append</span>
<span class="nv">nil</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">v</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">[</span><span class="nv">x</span><span class="p">]))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="nv">nil</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">[</span><span class="mi">2</span><span class="p">]</span>
</code></pre>
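<p>Because protocol functions are ordinary Clojure functions, we can hand them to things like <code>reduce</code> and <code>map</code>, and we can ask whether a value is covered by a protocol using <code>satisfies?</code>. Here’s a small self-contained sketch; it repeats the definitions from this section in a made-up namespace, <code>scratch.protocol-fn-sketch</code>, so it can be evaluated on its own.</p>
<pre><code>(ns scratch.protocol-fn-sketch)

(defprotocol Append
  (append [coll x] "Appends x to the end of collection coll."))

(extend-protocol Append
  clojure.lang.IPersistentVector
  (append [v x]
    (conj v x))
  clojure.lang.Sequential
  (append [v x]
    (concat v (list x)))
  nil
  (append [v x]
    [x]))

;; Starting from nil, the first append yields a vector; the rest conj onto it.
(reduce append nil [1 2 3])          ; => [1 2 3]

;; Starting from a list, every step goes through the Sequential definition.
(reduce append '(1 2) [3 4])         ; => (1 2 3 4)

;; Each element dispatches on its own type.
(map #(append % :x) [[1] '(2) nil])  ; => ([1 :x] (2 :x) [:x])

;; satisfies? reports whether the protocol has been extended to a value's type.
(satisfies? Append [1 2])            ; => true
(satisfies? Append "a string")       ; => false
</code></pre>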
<h2><a href="#named-datatypes" id="named-datatypes">Named Datatypes</a></h2>
<p>We’ve used <code>reify</code> to make an object which satisfies some interfaces or protocols. Like an anonymous function <code>(fn [x] ...)</code>, <code>reify</code> creates an <em>anonymous type</em>. Because the <code>reify</code> type has no (predictable) name, we can’t extend protocols to it later. How do we make a type with a name–like <code>clojure.lang.PersistentVector</code>, or <code>clojure.lang.LazySeq</code>?</p>
<p>There are two tools at our disposal here: <code>deftype</code> and <code>defrecord</code>. Both define new named types—classes, to be exact. The <code>deftype</code> macro produces a very basic datatype, whereas <code>defrecord</code> defines a type which behaves, in many respects, like a Clojure map. First, <code>deftype</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">deftype </span><span class="nv">GroceryList</span> <span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="nv">Append</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">)))</span>
<span class="nv">Object</span>
<span class="p">(</span><span class="nf">toString</span> <span class="p">[</span><span class="nv">this</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="s">"To buy: "</span> <span class="nv">to-buy</span><span class="p">)))</span>
</code></pre>
<p>We’re defining a new type, named <code>GroceryList</code>. Objects of this type keep track of a single field, called <code>to-buy</code>. Just as with <code>reify</code>, we provide a sequence of types we’d like GroceryLists to be a subtype of, and provide implementations for their functions or methods. The only difference is that in <code>append</code>, we construct the new grocery list with <code>(GroceryList. (conj to-buy x))</code> instead of calling the <code>grocery-list</code> function. We use the name of the class followed by a period <code>.</code> to make a new instance of GroceryList.</p>
<p>Let’s try creating one of these GroceryLists.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span>
<span class="o">#</span><span class="nv">object</span><span class="p">[</span><span class="nv">scratch.polymorphism.GroceryList</span> <span class="mi">0</span><span class="nv">x370dbd33</span> <span class="s">"To buy: [:eggs]"</span><span class="p">]</span>
</code></pre>
<p>Voilà! An instance of GroceryList. We’ve got the full name of the type: <code>GroceryList</code>, preceded by the namespace <code>scratch.polymorphism</code>. There’s a hexadecimal number that looks like a memory address (it’s the object’s identity hash code), and then our string representation. Can we append to it?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">append</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">])</span> <span class="ss">:spinach</span><span class="p">)</span>
<span class="o">#</span><span class="nv">object</span><span class="p">[</span><span class="nv">scratch.polymorphism.GroceryList</span> <span class="mi">0</span><span class="nv">x3c612037</span> <span class="s">"To buy: [:eggs :spinach]"</span><span class="p">]</span>
</code></pre>
<p>Indeed we can. What else can we do with a GroceryList?</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">supers</span> <span class="nv">GroceryList</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="nv">clojure.lang.IType</span> <span class="nv">scratch.polymorphism.Append</span> <span class="nv">java.lang.Object</span><span class="p">}</span>
</code></pre>
<p>Not much. There’s <code>clojure.lang.IType</code>, which just means “this thing is a Clojure datatype”. There’s <code>scratch.polymorphism.Append</code>, the interface which <code>defprotocol</code> generated for our <code>Append</code> protocol, and <code>java.lang.Object</code>, of course: almost <em>everything</em> is a subtype of Object. As it turns out, <code>deftype</code> is pretty bare-bones.</p>
<p>We <em>do</em> get a few things for free with <code>deftype</code>. We can access the fields by using <code>.some-field-name</code>, like so:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">.to-buy</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:eggs</span><span class="p">]))</span>
<span class="p">[</span><span class="ss">:eggs</span><span class="p">]</span>
</code></pre>
<p>And we also get a function that takes a <code>to-buy</code> list and builds a new <code>GroceryList</code>. These “constructor functions” take one argument for each field in the <code>deftype</code>.</p>
<pre><code>scratch.polymorphism=> (->GroceryList [:strawberries])
#object[scratch.polymorphism.GroceryList 0x44cc69b3 "To buy: [:strawberries]"]
</code></pre>
<p>This is a small wrapper around <code>(GroceryList. to-buy)</code>. It’s there because <code>GroceryList.</code>, like a method, isn’t a full-fledged Clojure function. Like methods, we can’t use <code>GroceryList.</code> with <code>map</code> or <code>apply</code>, or other things that expect functions. But we <em>can</em> use <code>->GroceryList</code> in these contexts!</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">map </span><span class="nv">GroceryList.</span> <span class="p">[[</span><span class="ss">:twix</span><span class="p">]</span> <span class="p">[</span><span class="ss">:kale</span> <span class="ss">:bananas</span><span class="p">]])</span>
<span class="nv">CompilerException</span> <span class="nv">java.lang.ClassNotFoundException</span><span class="err">:</span> <span class="nv">GroceryList.</span>, <span class="nv">compiling</span><span class="err">:</span><span class="p">(</span><span class="nf">/tmp/form-init2122621676255621718.clj</span><span class="ss">:1:1</span><span class="p">)</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">map </span><span class="nv">->GroceryList</span> <span class="p">[[</span><span class="ss">:twix</span><span class="p">]</span> <span class="p">[</span><span class="ss">:kale</span> <span class="ss">:bananas</span><span class="p">]])</span>
<span class="p">(</span><span class="o">#</span><span class="nv">object</span><span class="p">[</span><span class="nv">scratch.polymorphism.GroceryList</span> <span class="mi">0</span><span class="nv">x552db723</span> <span class="s">"To buy: [:twix]"</span><span class="p">]</span> <span class="o">#</span><span class="nv">object</span><span class="p">[</span><span class="nv">scratch.polymorphism.GroceryList</span> <span class="mi">0</span><span class="nv">x4d81eefd</span> <span class="s">"To buy: [:kale :bananas]"</span><span class="p">])</span>
</code></pre>
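<p>To make the “one argument for each field” rule concrete, here’s a short sketch using a hypothetical two-field type, <code>Pair</code>, in a made-up namespace. It also shows that we can still use the <code>Pair.</code> form where a function is expected, so long as we wrap it in a function ourselves.</p>
<pre><code>(ns scratch.constructor-sketch)

;; Pair is a hypothetical two-field type, invented for this sketch.
(deftype Pair [left right])

;; ->Pair takes one argument per field, in order, so it works with apply...
(def p (apply ->Pair [:a :b]))
(.left p)   ; => :a
(.right p)  ; => :b

;; ...and with map over two collections at once.
(map #(.right %) (map ->Pair [1 2] [:x :y]))  ; => (:x :y)

;; Wrapping the constructor form in an anonymous function also works.
(.left (first (map #(Pair. % nil) [1 2])))    ; => 1
</code></pre>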
<p>The types constructed by <code>deftype</code> are <em>so</em> basic that they lack properties we’ve taken for granted so far—like equality:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cheese</span><span class="p">])</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cheese</span><span class="p">]))</span>
<span class="nv">false</span>
</code></pre>
<p>The <em>only</em> thing a GroceryList is equal to is <em>itself</em>.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">gl</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:fish</span><span class="p">])]</span> <span class="p">(</span><span class="nb">= </span><span class="nv">gl</span> <span class="nv">gl</span><span class="p">))</span>
<span class="nv">true</span>
</code></pre>
<p>This is Clojure being conservative—it doesn’t know if, say, two GroceryLists with the same <code>to-buy</code> list can <em>really</em> be considered equivalent. It’s up to us to define that by providing an implementation for the <code>equals</code> method—another part of <code>Object</code>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">deftype </span><span class="nv">GroceryList</span> <span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="nv">Append</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">)))</span>
<span class="nv">Object</span>
<span class="p">(</span><span class="nf">toString</span> <span class="p">[</span><span class="nv">this</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="s">"To buy: "</span> <span class="nv">to-buy</span><span class="p">))</span>
<span class="p">(</span><span class="nf">equals</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">other</span><span class="p">]</span>
<span class="p">(</span><span class="nb">and </span><span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">type</span> <span class="nv">this</span><span class="p">)</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">other</span><span class="p">))</span>
<span class="p">(</span><span class="nb">= </span><span class="nv">to-buy</span> <span class="p">(</span><span class="nf">.to-buy</span> <span class="nv">other</span><span class="p">)))))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cheese</span><span class="p">])</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cheese</span><span class="p">]))</span>
<span class="nv">true</span>
</code></pre>
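<p>One thing the definition above leaves out: Java’s contract for <code>equals</code> expects equal objects to return equal <code>hashCode</code>s too, and hash-based sets and maps rely on that. Here’s a minimal sketch, assuming we’re happy to hash a GroceryList by its <code>to-buy</code> vector. The namespace name is made up, and <code>Append</code> is omitted to keep the example short.</p>
<pre><code>(ns scratch.equality-sketch)

(deftype GroceryList [to-buy]
  Object
  (toString [this]
    (str "To buy: " to-buy))
  (equals [this other]
    (and (= (type this) (type other))
         (= to-buy (.to-buy other))))
  ;; Equal lists have equal to-buy vectors, and therefore equal hashes.
  (hashCode [this]
    (hash to-buy)))

(= (GroceryList. [:cheese]) (GroceryList. [:cheese]))              ; => true

;; A set treats the two equal lists as a single element.
(count (set [(GroceryList. [:cheese]) (GroceryList. [:cheese])]))  ; => 1
</code></pre>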
<p>Want to make all grocery lists equal? Go wild!</p>
<pre><code><span></span><span class="p">(</span><span class="kd">deftype </span><span class="nv">GroceryList</span> <span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="nv">Append</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">)))</span>
<span class="nv">Object</span>
<span class="p">(</span><span class="nf">toString</span> <span class="p">[</span><span class="nv">this</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="s">"To buy: "</span> <span class="nv">to-buy</span><span class="p">))</span>
<span class="p">(</span><span class="nf">equals</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">other</span><span class="p">]</span>
<span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">type</span> <span class="nv">this</span><span class="p">)</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">other</span><span class="p">))))</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:ketchup</span><span class="p">])</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:mayo</span><span class="p">]))</span>
<span class="nv">true</span>
</code></pre>
<p>So, <code>deftype</code> gives us the power to construct our own, primitive types. But most of the time, we don’t <em>want</em> this degree of control: defining exactly how to print our values, how to compare two values together, and so on. After all, plain old maps are a great way to model data. They’re easy to print and easy to manipulate. It’d be nice if we could create a type—to take advantage of protocols—but have it still work like a map. Clojure calls this kind of type a <em>record</em>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defrecord </span><span class="nv">GroceryList</span> <span class="p">[</span><span class="nv">to-buy</span><span class="p">]</span>
<span class="nv">Append</span>
<span class="p">(</span><span class="nf">append</span> <span class="p">[</span><span class="nv">this</span> <span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">to-buy</span> <span class="nv">x</span><span class="p">))))</span>
</code></pre>
<p>The <code>defrecord</code> macro looks almost exactly like <code>deftype</code>: it takes the name of the type we’re defining, the names of the fields each instance will keep track of, and then a series of types with method implementations. As with <code>deftype</code>, we can construct instances of our <code>GroceryList</code> type using <code>GroceryList.</code> or the <code>->GroceryList</code> function.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:beans</span><span class="p">])</span>
<span class="o">#</span><span class="nv">scratch.polymorphism.GroceryList</span><span class="p">{</span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:beans</span><span class="p">]}</span>
</code></pre>
<p><em>Unlike</em> <code>deftype</code>, we get a nice, concise string representation for free. The first part shows the type name, and after that it looks just like a map, showing the fields of this <code>GroceryList</code> and their corresponding values.</p>
<p>We don’t have to define our own equality either: two records are equal if they’re of the same type, and their fields are equal.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:beans</span><span class="p">])</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:beans</span><span class="p">]))</span>
<span class="nv">true</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:beans</span><span class="p">])</span> <span class="p">{</span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:beans</span><span class="p">]})</span>
<span class="nv">false</span>
</code></pre>
<p>A GroceryList works <em>like</em> a map, but it’s not the same type: records aren’t equal to maps, even if they have the same keys and values.</p>
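<p>A few quick checks can make “like a map, but not the same type” more concrete. This is a sketch in a made-up namespace, with a stripped-down <code>GroceryList</code> that implements no protocols:</p>
<pre><code>(ns scratch.record-map-sketch)

(defrecord GroceryList [to-buy])

(def gl (GroceryList. [:beans]))

;; Records implement the same map interface that ordinary maps do...
(map? gl)                                                     ; => true
(contains? (supers GroceryList) clojure.lang.IPersistentMap)  ; => true

;; ...but equality also takes the record's type into account.
(= gl {:to-buy [:beans]})                                     ; => false

;; Pouring the record into a plain map drops the type, leaving only the data.
(= (into {} gl) {:to-buy [:beans]})                           ; => true
</code></pre>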
<p>Like <code>deftype</code>, we can access the fields of a record using <code>.to-buy</code>:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">.to-buy</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:bread</span><span class="p">]))</span>
<span class="p">[</span><span class="ss">:bread</span><span class="p">]</span>
</code></pre>
<p>But since records work like maps, we can also access them using <code>get</code>, or by using keywords as functions:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">get </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:bread</span><span class="p">])</span> <span class="ss">:to-buy</span><span class="p">)</span>
<span class="p">[</span><span class="ss">:bread</span><span class="p">]</span>
<span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="ss">:to-buy</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:bread</span><span class="p">]))</span>
<span class="p">[</span><span class="ss">:bread</span><span class="p">]</span>
</code></pre>
<p>And we can alter those fields using <code>assoc</code> and <code>update</code>, just like maps. Let’s replace our shopping list with onions:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:chicken</span><span class="p">])</span>
<span class="p">(</span><span class="nb">assoc </span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:onion</span><span class="p">]))</span>
<span class="o">#</span><span class="nv">scratch.polymorphism.GroceryList</span><span class="p">{</span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:onion</span><span class="p">]}</span>
</code></pre>
<p>… and add some beets:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:chicken</span><span class="p">])</span>
<span class="p">(</span><span class="nb">assoc </span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:onion</span><span class="p">])</span>
<span class="p">(</span><span class="nf">update</span> <span class="ss">:to-buy</span> <span class="nb">conj </span><span class="ss">:beets</span><span class="p">))</span>
<span class="o">#</span><span class="nv">scratch.polymorphism.GroceryList</span><span class="p">{</span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:onion</span> <span class="ss">:beets</span><span class="p">]}</span>
</code></pre>
<p>Just as with maps, these updates are immutable: they don’t alter the original GroceryList. Instead, they create <em>copies</em> with our requested changes. We aren’t limited to the fields we explicitly defined in the <code>defrecord</code>, either. Let’s tack on a <code>:note</code> to our grocery list:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nb">assoc </span><span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cherries</span><span class="p">])</span> <span class="ss">:note</span> <span class="s">"Tart cherries if possible!"</span><span class="p">)</span>
<span class="o">#</span><span class="nv">scratch.polymorphism.GroceryList</span><span class="p">{</span><span class="ss">:to-buy</span> <span class="p">[</span><span class="ss">:cherries</span><span class="p">]</span>, <span class="ss">:note</span> <span class="s">"Tart cherries if possible!"</span><span class="p">}</span>
</code></pre>
<p>This is possible because records (unlike deftypes) always carry around an extra map—just in case they need to store additional fields we didn’t define up front. The <code>assoc</code> function tries to update a field if it can, and if there’s no field by that name, it stores it in the record’s extra map.</p>
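<p>The distinction between declared fields and extra keys shows up when we remove things. In the sketch below (again in a made-up namespace, with a bare <code>GroceryList</code>), removing an extra key leaves us with a record, but removing a declared field gives back a plain map. It also uses <code>map->GroceryList</code>, a second constructor function which <code>defrecord</code> generates to build a record from a map.</p>
<pre><code>(ns scratch.record-extras-sketch)

(defrecord GroceryList [to-buy])

(def gl (assoc (->GroceryList [:cherries]) :note "Tart cherries if possible!"))

;; Declared fields and extra keys read the same way through the map interface.
(:to-buy gl)                   ; => [:cherries]
(:note gl)                     ; => "Tart cherries if possible!"

;; Dropping an extra key keeps the record type.
(record? (dissoc gl :note))    ; => true

;; Dropping a declared field can't, so we get an ordinary map back.
(record? (dissoc gl :to-buy))  ; => false
(dissoc gl :to-buy)            ; => {:note "Tart cherries if possible!"}

;; map->GroceryList sorts keys into declared fields and extras for us.
(= gl (map->GroceryList {:to-buy [:cherries]
                         :note   "Tart cherries if possible!"}))  ; => true
</code></pre>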
<p>Both <code>deftype</code> and <code>defrecord</code> produce named types, which means we can extend protocols over them after the fact. Let’s add a new protocol for printing out things nicely to the console—something we could use to print our grocery list and the items on it.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defprotocol </span><span class="nv">Printable</span>
<span class="p">(</span><span class="nf">print-out</span> <span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="s">"Print out the given object, nicely formatted."</span><span class="p">))</span>
</code></pre>
<p>Now we can define how to print a <code>GroceryList</code>. Let’s add a basic <code>print-out</code> function that works on any object, while we’re at it:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">extend-protocol</span> <span class="nv">Printable</span>
<span class="nv">GroceryList</span>
<span class="p">(</span><span class="nf">print-out</span> <span class="p">[</span><span class="nv">gl</span><span class="p">]</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"GROCERIES"</span><span class="p">)</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"---------"</span><span class="p">)</span>
<span class="p">(</span><span class="nb">doseq </span><span class="p">[</span><span class="nv">item</span> <span class="p">(</span><span class="ss">:to-buy</span> <span class="nv">gl</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">print </span><span class="s">"[ ] "</span><span class="p">)</span>
<span class="p">(</span><span class="nf">print-out</span> <span class="nv">item</span><span class="p">)</span>
<span class="p">(</span><span class="nf">println</span><span class="p">)))</span>
<span class="nv">Object</span>
<span class="p">(</span><span class="nf">print-out</span> <span class="p">[</span><span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">print </span><span class="nv">x</span><span class="p">)))</span>
</code></pre>
<p>Like we saw earlier, we can use <code>(:to-buy gl)</code> to get the items on the grocery list. We go through each one in turn using <code>doseq</code>, and call that particular item <code>item</code>. With that item, we print out a pair of brackets <code>"[ ] "</code>. Then we do something a bit strange: we call <code>print-out</code> <em>again</em>, but this time, with the <code>item</code> in question. The <code>Object</code> implementation takes over from there.</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">print-out</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cilantro</span> <span class="ss">:carrots</span> <span class="ss">:pork</span> <span class="ss">:baguette</span><span class="p">]))</span>
<span class="nv">GROCERIES</span>
<span class="nv">---------</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:cilantro</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:carrots</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:pork</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:baguette</span>
</code></pre>
<p>Nice! This actually looks like a real grocery list. What if we wanted to keep track of how <em>many</em> carrots to buy? We could introduce a <em>new</em> type to keep track of things that come in a certain quantity:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defrecord </span><span class="nv">CountedItem</span> <span class="p">[</span><span class="nv">thing</span> <span class="nv">quantity</span><span class="p">]</span>
<span class="nv">Printable</span>
<span class="p">(</span><span class="nf">print-out</span> <span class="p">[</span><span class="nv">this</span><span class="p">]</span>
<span class="p">(</span><span class="nf">print-out</span> <span class="nv">thing</span><span class="p">)</span>
<span class="p">(</span><span class="nb">print </span><span class="p">(</span><span class="nb">str </span><span class="s">" ("</span> <span class="nv">quantity</span> <span class="s">"x)"</span><span class="p">))))</span>
</code></pre>
<p>We’ve defined how to print out a counted item: first we print out the thing, then the quantity in parentheses. Let’s give that a shot:</p>
<pre><code><span></span><span class="nv">scratch.polymorphism=></span> <span class="p">(</span><span class="nf">print-out</span> <span class="p">(</span><span class="nf">GroceryList.</span> <span class="p">[</span><span class="ss">:cilantro</span> <span class="p">(</span><span class="nf">CountedItem.</span> <span class="ss">:carrots</span> <span class="mi">2</span><span class="p">)</span> <span class="ss">:pork</span> <span class="ss">:baguette</span><span class="p">]))</span>
<span class="nv">GROCERIES</span>
<span class="nv">---------</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:cilantro</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:carrots</span> <span class="p">(</span><span class="mi">2</span><span class="nv">x</span><span class="p">)</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:pork</span>
<span class="p">[</span> <span class="p">]</span> <span class="ss">:baguette</span>
</code></pre>
<p>Neat! We didn’t have to change <code>GroceryList</code> at all to get this behavior. Because it used the polymorphic protocol function <code>print-out</code>, it <em>automatically</em> knew how to work with our new <code>CountedItem</code> type.</p>
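<p>The same trick works for types that were defined with no knowledge of the protocol at all. As a sketch, here’s a hypothetical <code>WeighedItem</code> record which gets its <code>print-out</code> implementation after the fact. The snippet repeats a minimal <code>Printable</code> so that it stands alone; evaluate it in a fresh namespace if you’d rather not redefine the protocol you’ve been working with.</p>
<pre><code>(defprotocol Printable
  (print-out [x] "Print out the given object, nicely formatted."))

(extend-protocol Printable
  Object
  (print-out [x]
    (print x)))

; A made-up type which doesn't mention Printable anywhere.
(defrecord WeighedItem [thing grams])

; Later on, we teach Printable about it.
(extend-protocol Printable
  WeighedItem
  (print-out [item]
    (print-out (:thing item))
    (print (str " (" (:grams item) "g)"))))

; with-out-str captures what would have been printed, as a string.
(with-out-str (print-out (WeighedItem. :pork 500)))
; ":pork (500g)"
</code></pre>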
<h2><a href="#when-to-use-deftype-and-defrecord" id="when-to-use-deftype-and-defrecord">When To Use Deftype and Defrecord</a></h2>
<p>If you’re coming from an object-oriented language (e.g. Ruby, Java), or a typed language with algebraic datatypes (e.g. Haskell, ML), you might see <code>defprotocol</code>, <code>deftype</code>, and <code>defrecord</code>, and think: “Ah, finally. Here are the tools I’ve been waiting for.” You might start by wanting to model a person, and immediately jump to <code>(defrecord Person [name pronouns age])</code>. While this is valid, you should take a step back, and ask: do I <em>need</em> polymorphism here? Are there going to be functions that take people <em>and</em> animals? Or do I simply want to keep track of some data?</p>
<p>If you don’t need polymorphism, there’s a good chance your data can be modeled in Clojure as plain old maps, sets, vectors, and so on. Need to represent a person? How about:</p>
<pre><code><span></span><span class="p">{</span><span class="ss">:name</span> <span class="s">"Morgan"</span>
<span class="ss">:pronouns</span> <span class="p">[</span><span class="ss">:they</span> <span class="ss">:them</span><span class="p">]</span>
<span class="ss">:age</span> <span class="mi">56</span><span class="p">}</span>
</code></pre>
<p>No <code>defrecord</code> required. Sticking to maps keeps your data in a shape that can be easily manipulated using standard Clojure functions. It’s easy to store this data on disk, or send it across the network. It’s easier to share this kind of data with other people. And it’s more concise to print at the console, which makes debugging your programs easier.</p>
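<p>To make that concrete, here’s a brief sketch of what “standard Clojure functions” buys us. The <code>morgan</code> var is just a name for the map above; the round trip through <code>pr-str</code> and <code>clojure.edn/read-string</code> is one possible way to serialize it, not the only one.</p>
<pre><code>(require '[clojure.edn :as edn])

(def morgan
  {:name     "Morgan"
   :pronouns [:they :them]
   :age      56})

(update morgan :age inc)          ; the same map, with :age 57
(select-keys morgan [:name :age]) ; {:name "Morgan", :age 56}

; pr-str gives us a string we could write to disk or send over the network;
; edn/read-string turns that string back into an equal map.
(= morgan (edn/read-string (pr-str morgan))) ; true
</code></pre>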
<p>Conversely, you’ll want to use <code>defrecord</code> and <code>deftype</code> when maps aren’t sufficient: when you need polymorphism, when you need to participate in existing protocols or interfaces, or when multimethod performance is too slow. Records are often faster and more memory-efficient than maps, so even if you don’t need the polymorphism, it can be worthwhile to define a record or two when map performance bogs you down. This is something you’ll want to find out by <em>measuring</em> your code, though, rather than simply assuming.</p>
<p>If you’re reaching for records for type safety: it’s not going to be as helpful as you’d like. Functions like <code>assoc</code> work equally well across <em>all</em> kinds of records, and the compiler won’t warn you about using the wrong keyword. Sticking to methods eliminates <em>some</em> of those risks, but it’s nothing like the type guardrails in Java or Haskell. Clojure programs generally rely more on tests and contracts to prevent these type errors. There are also static type systems like <a href="https://github.com/clojure/core.typed">core.typed</a>, which we’ll discuss later.</p>
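<p>As a quick illustration of that missing guardrail, consider the <code>Person</code> record from earlier, with a deliberately misspelled keyword. Nothing here raises an error or a warning:</p>
<pre><code>(defrecord Person [name pronouns age])

(def morgan (Person. "Morgan" [:they :them] 56))

; Oops: :agee is a typo, but assoc accepts it without complaint.
(def older (assoc morgan :agee 57))

(:age older)              ; 56, unchanged
(:agee older)             ; 57, stored in the record's extra map
(instance? Person older)  ; true
</code></pre>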
<h2><a href="#review" id="review">Review</a></h2>
<p>When a function’s behavior depends on the type of values it is provided, we call that function polymorphic. Many of Clojure’s core functions, like <code>conj</code> or <code>reduce</code>, are polymorphic: we can <code>conj</code> into maps, vectors, sets, and lists, and each does something different. Often, our own code is <em>implicitly</em> polymorphic by virtue of using other polymorphic functions: <code>(defn add-bird [coll] (conj coll :bird))</code> can add birds to lots of different things.</p>
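<p>For instance, <code>add-bird</code> never mentions a type, but inherits the behavior of <code>conj</code> for each collection it’s given:</p>
<pre><code>(defn add-bird [coll]
  (conj coll :bird))

(add-bird [1 2])    ; [1 2 :bird]   vectors grow at the end
(add-bird '(1 2))   ; (:bird 1 2)   lists grow at the front
(add-bird #{1 2})   ; a set containing 1, 2, and :bird
</code></pre>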
<p>When we need a function whose behavior explicitly depends on its arguments, we can use ad-hoc approaches, like <code>if</code>, <code>cond</code>, or <code>case</code>. The <code>instance?</code>, <code>type</code>, and <code>supers</code> functions let us choose what to do based on the <em>type</em> of the value.</p>
<p>When we need an <em>open</em> function—one whose behavior can be extended to new things <em>later</em>—we use a multimethod, an interface, or a protocol. Multimethods are the most general approach: they use a <em>dispatch function</em>, which receives the function’s arguments and decides which implementation to call. They’re not limited to dispatching by argument type: they can use arbitrary values and relationships between keywords, defined with <code>derive</code>. They also offer fine-grained control when that dispatch would be ambiguous. This flexibility comes with a performance penalty: Clojure has to evaluate the dispatch function every time the multimethod is called.</p>
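<p>As a reminder of how those pieces fit together, here’s a minimal sketch of a multimethod which dispatches on a keyword rather than a type. The <code>aisle</code> function and its <code>:kind</code> keywords are invented for this example.</p>
<pre><code>; Relationships between keywords: carrots and cilantro are both produce.
(derive ::carrots ::produce)
(derive ::cilantro ::produce)

(defmulti aisle
  "Which part of the store should we look in for this item?"
  :kind)

(defmethod aisle ::produce [_] "produce section")
(defmethod aisle ::pork    [_] "butcher counter")
(defmethod aisle :default  [_] "ask at the front desk")

(aisle {:kind ::carrots})   ; "produce section", via derive
(aisle {:kind ::pork})      ; "butcher counter"
(aisle {:kind ::baguette})  ; "ask at the front desk"
</code></pre>
<p>The dispatch function here is just the keyword <code>:kind</code>, which looks itself up in the argument. Clojure calls it on every invocation of <code>aisle</code>, which is where the extra cost comes from.</p>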
<p>When a function’s behavior depends on the type of the first argument, use protocols or interfaces. Interfaces can’t be extended to existing types; protocols can. Protocols have some ergonomic advantages: they define regular functions, rather than methods, and come with documentation—though there’s nothing stopping you from writing your own documented wrapper functions, or using <a href="https://github.com/aleph-io/potemkin#definterface">definterface+</a>, which does so automatically. Interfaces are slightly faster; prefer them when performance matters.</p>
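<p>A side-by-side sketch of the difference, using made-up names (<code>Shape</code>, <code>Square</code>, and <code>Loud</code>): the interface has to be named when the type is defined and is called like a Java method, while the protocol is an ordinary function that we can extend to a type which already exists, like <code>String</code>.</p>
<pre><code>(require '[clojure.string :as str])

; An interface: implementers must name it when the type is defined.
(definterface Shape
  (area []))

(deftype Square [side]
  Shape
  (area [this] (* side side)))

; Interface methods are called with Java interop syntax.
(.area (Square. 3)) ; 9

; A protocol: a regular, documented function, extensible after the fact.
(defprotocol Loud
  (shout [x] "Returns a louder version of x."))

(extend-protocol Loud
  String
  (shout [s] (str (str/upper-case s) "!")))

(shout "eggs") ; "EGGS!"
</code></pre>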
<p>To create instances of a new type, we have <code>reify</code>. Like <code>(fn [x] ...)</code>, <code>reify</code> generates an <em>anonymous</em> type: it can have interfaces and protocols as supertypes, and provides implementations for those types, but has no (predictable, meaningful) name. When we want to name our types, perhaps so that other people can extend them later, we use <code>deftype</code> and <code>defrecord</code>. Most of the time, <code>defrecord</code> is the more useful of the two: records work like maps out of the box. However, <code>deftype</code> is available should we need to construct bare-bones types with unusual behaviors.</p>
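<p>For comparison, here’s the same behavior written both ways. <code>Greeter</code>, <code>greeter</code>, and <code>FixedGreeter</code> are hypothetical names chosen for this sketch.</p>
<pre><code>(defprotocol Greeter
  (greet [greeter who] "Returns a greeting for who."))

; reify: an anonymous, one-off Greeter which closes over greeting.
(defn greeter [greeting]
  (reify Greeter
    (greet [_ who]
      (str greeting ", " who "!"))))

; defrecord: a named Greeter whose fields we can read like a map.
(defrecord FixedGreeter [greeting]
  Greeter
  (greet [_ who]
    (str greeting ", " who "!")))

(greet (greeter "Hello") "Morgan")     ; "Hello, Morgan!"
(greet (FixedGreeter. "Hi") "Morgan")  ; "Hi, Morgan!"
(:greeting (FixedGreeter. "Hi"))       ; "Hi"
</code></pre>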
<p>We <em>haven’t</em> talked about the details of classes or inheritance in this discussion. These are important for Java interop, but we don’t use these concepts often in Clojure. A topic for later discussion!</p>
<h2><a href="#problems" id="problems">Problems</a></h2>
<ul>
<li>
<p>Write a <code>sorted</code> function that uses <code>cond</code> and <code>instance?</code> to convert lists to sorted lists (using <code>(sort ...)</code>), and sets to sorted sets (using <code>(into (sorted-set) ...)</code>).</p>
</li>
<li>
<p>Rewrite <code>sorted</code> as a multimethod. Using <code>defmethod</code>, extend <code>sorted</code> to handle maps.</p>
</li>
<li>
<p>Add a <code>checked-off</code> field to the <code>GroceryList</code> type, and use it to store a set of items that are already in the cart. Write a <code>check-off</code> function that takes a grocery list and checks off an item on it, by adding that item to the <code>checked-off</code> set: <code>(check-off my-list :eggs)</code>.</p>
</li>
<li>
<p>Write a <code>remaining</code> function which takes a <code>GroceryList</code> and returns the items that <em>haven’t</em> been checked off yet.</p>
</li>
<li>
<p>Change the definition of <code>print-out</code> for <code>GroceryList</code> to take the <code>checked-off</code> set into account, printing an <code>[x]</code> in front of checked-off items.</p>
</li>
<li>
<p>Imagine Clojure had <em>no</em> built-in sets. Make up a <code>Set</code> protocol with some basic operations, like <code>add-element</code>, <code>has-element?</code>, and <code>remove-element</code>.</p>
</li>
<li>
<p>Using a vector or list to store your elements, write a basic implementation of your <code>Set</code> protocol. Experiment to make sure adding the same item twice doesn’t add two copies.</p>
</li>
<li>
<p>Try making larger and larger sets–say, with ten, a thousand, and a hundred thousand elements. Use <code>(time (has-element? some-set 123))</code> to see how your set performance changes with size. Why is this?</p>
</li>
<li>
<p>Write a different implementation of a <code>Set</code>, which uses a <em>map</em> to store its elements. Compare its performance to your list-based set.</p>
</li>
<li>
<p>The <code>deref</code> function uses an interface called <code>clojure.lang.IDeref</code> to return the <em>current value</em> of a container type. Using <code>deftype</code>, define your own container type. Try <code>@(MyContainer. :hi)</code> to verify that you can get the value of your container (<code>:hi</code>) back.</p>
</li>
<li>
<p>[advanced] So far, we’ve worked only with immutable types. <code>deftype</code> lets us define <em>mutable</em> types by tagging a field with <code>^:unsynchronized-mutable</code>, like so: <code>(deftype DangerBox [^:unsynchronized-mutable value] ...)</code>. Design a <code>Mutable</code> protocol with a <code>(write! box value)</code> function to overwrite the value of a mutable container. Using <code>(set! field value)</code>, build your own mutable container type which supports both <code>Mutable</code> and <code>IDeref</code>. Confirm that you can change its value using <code>write!</code>, and read it back using <code>@</code>.</p>
</li>
<li>
<p>[advanced] Use your mutable container as a counter by reading its current state and writing back a value one greater–e.g. via <code>(write! box (inc @box))</code>. Using <code>dotimes</code>, perform <em>many</em> updates in a row, and verify that the final value of the counter is the same as the number you passed to <code>dotimes</code>.</p>
</li>
<li>
<p>[advanced] Run this <code>dotimes</code> increment loop in two threads at once, using <code>future</code>. Is the final counter value what you expected? Why? How does this compare to using an <code>(atom)</code> with <code>swap!</code>?</p>
</li>
</ul>
<p><a href="https://aphyr.com/posts/319-clojure-from-the-ground-up-debugging">Clojure from the ground up: debugging</a>, by <a href="https://aphyr.com/">Aphyr</a>. Published 2014-08-26T21:27:05-05:00, updated 2014-08-26T21:27:05-05:00.</p>
<p>Previously: <a href="https://aphyr.com/posts/312-clojure-from-the-ground-up-modeling">Modeling</a>.</p>
<p>Writing software can be an exercise in frustration. Useless error messages, difficult-to-reproduce bugs, missing stacktrace information, obscure functions without documentation, and unmaintained libraries all stand in our way. As software engineers, our most useful skill isn’t so much <em>knowing how to solve a problem</em> as <em>knowing how to explore a problem that we haven’t seen before</em>. Experience is important, but even experienced engineers face unfamiliar bugs every day. When a problem doesn’t bear a resemblance to anything we’ve seen before, we fall back on <em>general cognitive strategies</em> to explore–and ultimately solve–the problem.</p>
<p>There’s an excellent book by the mathematician George Polya: <a href="http://www.amazon.com/How-Solve-It-Mathematical-Princeton/dp/069111966X">How to Solve It</a>, which tries to catalogue how successful mathematicians approach unfamiliar problems. When I catch myself banging my head against a problem for more than a few minutes, I try to back up and consider his <a href="http://math.berkeley.edu/~gmelvin/polya.pdf">principles</a>. Sometimes, just taking the time to slow down and reflect can get me out of a rut.</p>
<ol>
<li>Understand the problem.</li>
<li>Devise a plan.</li>
<li>Carry out the plan.</li>
<li>Look back.</li>
</ol>
<p>Seems easy enough, right? Let’s go a little deeper.</p>
<h2><a href="#understanding-the-problem" id="understanding-the-problem">Understanding the problem</a></h2>
<p>Well <em>obviously</em> there’s a problem, right? The program failed to compile, or a test spat out bizarre numbers, or you hit an unexpected exception. But try to dig a little deeper than that. Just having a careful description of the problem can make the solution obvious.</p>
<blockquote>
<p>Our audit program detected that users can double-withdraw cash from their accounts.</p>
</blockquote>
<p>What does your program do? Chances are your program is large and complex, so try to <em>isolate</em> the problem as much as possible. Find <em>preconditions</em> where the error holds.</p>
<blockquote>
<p>The problem occurs after multiple transfers between accounts.</p>
</blockquote>
<p>Identify specific lines of code from the stacktrace that are involved, specific data that’s being passed around. Can you find a particular function that’s misbehaving?</p>
<blockquote>
<p>The balance transfer function sometimes doesn’t increase or decrease the account values correctly.</p>
</blockquote>
<p>What are that function’s inputs and outputs? Are the inputs what you expected? What did you expect the result to be, given those arguments? It’s not enough to know “it doesn’t work”–you need to know exactly what <em>should</em> have happened. Try to find conditions where the function works correctly, so you can map out the boundaries of the problem.</p>
<blockquote>
<p>Trying to transfer $100 from A to B works as expected, as does a transfer of $50 from B to A. Running a million random transfers between accounts, sequentially, results in correct balances. The problem only seems to happen in production.</p>
</blockquote>
<p>If your function–or functions it calls–uses mutable state, like an agent, atom, or ref, the value of those references matters too. This is why you should avoid mutable state wherever possible: each mutable variable introduces another dimension of possible behaviors for your program. Print out those values when they’re read, and after they’re written, to get a description of what the function is actually doing. I am a huge believer in sprinkling <code>(prn x)</code> throughout one’s code to print how state evolves when the program runs.</p>
<blockquote>
<p>Each balance is stored in a separate atom. When two transfers happen at the same time involving the same accounts, the new value of one or both atoms may not reflect the transfer correctly.</p>
</blockquote>
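<p>As a sketch of what that sprinkling might look like, here is a hypothetical <code>transfer!</code> function (the real one isn’t shown in this chapter) instrumented with <code>prn</code> before and after it touches its atoms:</p>
<pre><code>; A stand-in for the function under investigation, not the real thing.
(defn transfer!
  "Moves amount from one balance atom to another, printing state as it goes."
  [from to amount]
  (prn :before :from @from :to @to :amount amount)
  (swap! from - amount)
  (swap! to + amount)
  (prn :after :from @from :to @to))

(def a (atom 10))
(def b (atom 10))

(transfer! a b 5)
; :before :from 10 :to 10 :amount 5
; :after :from 5 :to 15
</code></pre>
<p>Tagging each line with a keyword like <code>:before</code> makes it easier to tell the printouts apart once several of them are interleaved in the log.</p>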
<p>Look for <em>invariants</em>: properties that should always be true of a program. Devise a test to look for where those invariants are broken. Consider each individual step of the program: does it preserve all the invariants you need? If it doesn’t, what ensures those invariants are restored correctly?</p>
<blockquote>
<p>The total amount of money in the system should be constant–but sometimes changes!</p>
</blockquote>
<p>Draw diagrams, and invent a notation to talk about the problem. If you’re accessing fields in a vector, try drawing the vector as a set of boxes, and drawing the fields it accesses, step by step on paper. If you’re manipulating a tree, draw one! Figure out a way to write down the state of the system: in letters, numbers, arrows, graphs, whatever you can dream up.</p>
<pre><code>Transferring $5 from A to B in transaction 1, and $5 from B to A in transaction 2:
Transaction  |  A  |  B
-------------+-----+-----
txn1 read    |  10 |  10   ; Transaction 1 sees 10, 10
txn1 write A |   5 |  10   ; A and B now out-of-sync
txn2 read    |   5 |  10   ; Transaction 2 sees 5, 10
txn1 write B |   5 |  15   ; Transaction 1 completes
txn2 write A |  10 |  15   ; Transaction 2 writes based on out-of-sync read
txn2 write B |  10 |   5   ; Should have been 10, 10!
</code></pre>
<p>This doesn’t <em>solve</em> the problem, but helps us <em>explore</em> the problem in depth. Sometimes this makes the solution obvious–other times, we’re just left with a pile of disjoint facts. Even if things <em>look</em> jumbled-up and confusing, don’t despair! Exploring gives the brain the pieces; it’ll link them together over time.</p>
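<p>Once an interleaving is written down, we can also replay it by hand. The sketch below assumes each transfer reads both balances up front and then writes them back with <code>reset!</code>; it walks through the six steps of the diagram in a single thread, so the outcome is the same every time. Starting from 10 and 10, it finishes with only 15 in the system.</p>
<pre><code>(def a (atom 10))
(def b (atom 10))

(let [t1-a @a, t1-b @b]        ; txn1 read: sees 10, 10
  (reset! a (- t1-a 5))        ; txn1 write A
  (let [t2-a @a, t2-b @b]      ; txn2 read: sees 5, 10
    (reset! b (+ t1-b 5))      ; txn1 write B
    (reset! a (+ t2-a 5))      ; txn2 write A
    (reset! b (- t2-b 5))))    ; txn2 write B, based on its stale read of B

[@a @b]    ; [10 5]
(+ @a @b)  ; 15, though we started with 20
</code></pre>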
<p>Armed with a detailed <em>description</em> of the problem, we’re much better equipped to solve it.</p>
<h2><a href="#devise-a-plan" id="devise-a-plan">Devise a plan</a></h2>
<p>Our brains are excellent pattern-matchers, but not that great at tracking abstract logical operations. Try changing your viewpoint: rotating the problem into a representation that’s a little more tractable for your mind. Is there a similar problem you’ve seen in the past? Is this a well-known problem?</p>
<p>Make sure you know how to <em>check</em> the solution. With the problem isolated to a single function, we can write a test case that verifies the account balances are correct. Then we can experiment freely, and have some confidence that we’ve actually found a solution.</p>
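<p>Such a test case might look something like this sketch, written with <code>clojure.test</code>. The <code>transfer!</code> function here is a hypothetical stand-in for whatever function we’ve isolated; the test checks both the individual balances and the invariant that the total stays constant.</p>
<pre><code>(require '[clojure.test :refer [deftest is run-tests]])

; A stand-in for the function under test.
(defn transfer! [from to amount]
  (swap! from - amount)
  (swap! to + amount))

(deftest transfer-preserves-total
  (let [a (atom 10)
        b (atom 10)]
    (transfer! a b 5)
    (is (= [5 15] [@a @b]))
    (is (= 20 (+ @a @b)))))

; Runs every test in the current namespace and prints a summary.
(run-tests)
</code></pre>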
<p>Can you solve a <em>related</em> problem? If only concurrent transfers trigger the problem, could we solve the issue by ensuring transactions never take place concurrently–e.g. by wrapping the operation in a lock? Could we solve it by <em>logging</em> all transactions, and replaying the log? Is there a simpler variant of the problem that might be tractable–maybe one that always <em>overcounts</em>, but never <em>undercounts</em>?</p>
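<p>As one sketch of the lock idea: Clojure’s <code>locking</code> macro ensures only one thread at a time runs its body. The <code>transfer-lock</code> object and this read-then-write <code>transfer!</code> are assumptions made for the example, and plain Java threads are used so that the snippet finishes as soon as the work is done.</p>
<pre><code>; One lock shared by every transfer.
(def transfer-lock (Object.))

(defn transfer!
  [from to amount]
  (locking transfer-lock
    (reset! from (- @from amount))
    (reset! to (+ @to amount))))

(def a (atom 10))
(def b (atom 10))

; Eight threads each shuttle $5 back and forth a thousand times.
(let [work    (fn []
                (dotimes [_ 1000]
                  (transfer! a b 5)
                  (transfer! b a 5)))
      threads (mapv (fn [_] (Thread. ^Runnable work)) (range 8))]
  (doseq [^Thread t threads] (.start t))
  (doseq [^Thread t threads] (.join t)))

[@a @b] ; [10 10]
</code></pre>
<p>This trades concurrency for correctness: transfers between unrelated accounts now wait on each other too, which may or may not be acceptable.</p>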
<p>Consider your assumptions. We rely on layers of abstraction in writing software–that changing a variable is atomic, that lexical variables don’t change, that adding 1 and 1 always gives 2. Sometimes, parts of the computer <em>fail</em> to guarantee those abstractions hold. The CPU might–very rarely–fail to divide numbers correctly. A library might, for supposedly valid input, spit out a bad result. A numeric algorithm might fail to converge, and spit out wrong numbers. To avoid questioning <em>everything</em>, start in your own code, and work your way down to the assumptions themselves. See if you can devise tests that check the language or library is behaving as you expect.</p>
<p>Can you avoid solving the problem altogether? Is there a library, database, or language feature that does transaction management for us? Is integrating that library worth the reduced complexity in our application?</p>
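<p>One language feature that might fit is Clojure’s software transactional memory: refs, updated inside <code>dosync</code>. The following is only a sketch of the idea, with account balances held in refs instead of atoms; whether it suits a real application is exactly the kind of tradeoff worth weighing.</p>
<pre><code>(def a (ref 10))
(def b (ref 10))

(defn transfer!
  "Moves amount between two refs. Both changes commit together, or not at all."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! a b 5)
[@a @b] ; [5 15]

(transfer! b a 5)
[@a @b] ; [10 10]
</code></pre>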
<p>We’re not mathematicians; we’re engineers. Part theorist, yes, but also part mechanic. Some problems take a more abstract approach, and others are better approached by tapping it with a wrench and checking the service manual. If other people have solved your problem already, using their solution can be much simpler than devising your own.</p>
<p>Can you think of a way to get more diagnostic information? Perhaps we could log more data from the functions that are misbehaving, or find a way to dump and replay transactions from the live program. Some problems <em>disappear</em> when instrumented; these are the hardest to solve, but also the most rewarding.</p>
<p>Combine key phrases in a Google search: the name of the library you’re using, the type of exception thrown, any error codes or log messages. Often you’ll find a StackOverflow result, a mailing list post, or a GitHub issue that describes your problem. This works well when you know the technical terms for your problem–in our case, that we’re performing an <em>atomic</em>, <em>transactional</em> transfer between two variables. Sometimes, though, you don’t <em>know</em> the established names for your problem, and have to resort to blind queries like “variables out of sync” or “overwritten data”–which are much more difficult.</p>
<p>When you get stuck exploring on your own, try asking for help. Collect your description of the problem, the steps you took, and what you expected the program to do. Include any stacktraces or error messages, log files, and the smallest section of source code required to reproduce the problem. Also include the versions of software used–in Clojure, typically the JVM version (<code>java -version</code>), Clojure version (<code>project.clj</code>), and any other relevant library versions.</p>
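<p>If a REPL is handy, some of those versions can also be read from the running program itself. A small sketch:</p>
<pre><code>(clojure-version)                    ; the running Clojure version, as a string
(System/getProperty "java.version")  ; the JVM version
(System/getProperty "os.name")       ; the operating system, if it's relevant
</code></pre>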
<p>If the project has a GitHub page or public issue tracker, like Jira, you can try filing an issue there. Here’s a <a href="https://github.com/aphyr/riemann-dash/issues/66">particularly well-written issue</a> filed by a user on one of my projects. Note that this user included installation instructions, the command they ran, and the stacktrace it printed. The more specific a description you provide, the easier it is for someone else to understand your problem and help!</p>
<p>Sometimes you need to talk through a problem interactively. For that, I prefer IRC–many projects have a channel on <a href="https://libera.chat">the Libera IRC network</a> where you can ask basic questions. Remember to be respectful of the channel’s time; there may be hundreds of users present, and they have to sort through everything you write. Paste your problem description into a <em>pastebin</em> like <a href="https://gist.github.com/">Gist</a>, then mention the link in IRC with a short–say a few sentences–description of the problem. I try asking in a channel devoted to a specific library or program first, then back off to a more general channel, like #clojure. There’s no need to ask “Can I ask a question” first–just jump in.</p>
<p>Since the transactional problem we’ve been exploring seems like a general issue with atoms, I might ask in #clojure:</p>
<pre><code>aphyr > Hi! Does anyone know the right way to change multiple atoms at the same time?
aphyr > This function and test case (http://gist.github.com/...) seems to double-
or under-count when invoked concurrently.
</code></pre>
<p>Finally, you can join the project’s email list, and ask your question there. Turnaround times are longer, but you’ll often find a more in-depth response to your question via email. This applies especially if you and the maintainer are in different time zones, or if they’re busy with life. You can also ask specific questions on StackOverflow or other message boards; users there can be incredibly helpful.</p>
<p>Remember, other engineers are taking time away from their work, family, friends, and hobbies to help you. It’s always polite to give them time to answer first–they may have other priorities. A sincere thank-you is always appreciated–as is paying it forward by answering other users’ questions on the list or channel!</p>
<h3><a href="#dealing-with-abuse" id="dealing-with-abuse">Dealing with abuse</a></h3>
<p>Sadly, some women, LGBT people, and so on experience harassment on IRC or in other discussion circles. They may be asked inappropriate personal questions, insulted, threatened, assumed to be straight, to be a man, and so on. Sometimes other users will attack questioners for inexperience. Exclusion can be overt (“Read the fucking docs, faggot!”) or more subtle (“Hey dudes, what’s up?”). It only takes one hurtful experience like this to sour someone on an entire community.</p>
<p>If this happens to you, <b>place your own well-being first</b>. You are <em>not</em> obligated to fix anyone else’s problems, or to remain in a social context that makes you uncomfortable.</p>
<p>That said, be aware that other people in a channel may not share your culture. English may not be their main language, or they may have said something hurtful without realizing its impact. Explaining how the comment made you feel can jar a well-meaning but unaware person into reconsidering their actions.</p>
<p>Other times, people are just <em>mean</em>–and it only takes one to ruin everybody’s day. When this happens, you can appeal to a moderator. On IRC, moderators are sometimes identified by an <code>@</code> sign in front of their name; on forums, they may have a special mark on their username or profile. Large projects may have an official policy for reporting abuse on their website or in the channel topic. If there’s no policy, try asking whoever seems in charge for help. Most projects have a primary maintainer or community manager with the power to mute or ban malicious users.</p>
<p>Again, these ways of dealing with abuse are <em>optional</em>. You have no responsibility to provide others with endless patience, and it is not your responsibility to fix a toxic culture. You can always log off and try something else. There are many communities which will welcome and support you–it may just take a few tries to find the right fit.</p>
<p>If you don’t find community, you can <em>build</em> it. Starting your own IRC channel, mailing list, or discussion group with a few friends can be a great way to help each other learn in a supportive environment. And if trolls ever come calling, you’ll be able to ban them personally.</p>
<p>Now, back to problem-solving.</p>
<h2><a href="#execute-the-plan" id="execute-the-plan">Execute the plan</a></h2>
<p>Sometimes we can make a quick fix in the codebase, test it by hand, and move on. But for more serious problems, we’ll need a more involved process. I always try to get a reproducible test suite–one that runs in a matter of seconds–so that I can continually check my work.</p>
<p>Persist. Many problems require grinding away for some time. Mix blind experimentation with sitting back and planning. Periodically re-evaluate your work–have you made progress? Identified a sub-problem that can be solved independently? Developed a new notation?</p>
<p>If you get stuck, try a new tack. Save your approach as a comment or using <code>git stash</code>, and start fresh. Maybe using a different concurrency primitive is in order, or rephrasing the data structure entirely. Take a reading break and review the documentation for the library you’re trying to use. Read the <em>source code</em> for the functions you’re calling–even if you don’t understand exactly what it does, it might give you clues to how things work under the hood.</p>
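<p>For reading documentation and source without leaving the REPL, <code>clojure.repl</code> offers <code>doc</code> and <code>source</code>. A short sketch, using <code>swap!</code> as the function we’re curious about:</p>
<pre><code>(require '[clojure.repl :refer [doc source]])

; Prints the argument lists and docstring for swap!.
(doc swap!)

; Prints the definition of swap!, when its source is on the classpath.
(source swap!)
</code></pre>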
<p>Bounce your problem off a friend. Grab a sheet of paper or whiteboard, describe the problem, and work through your thinking with that person. Their understanding of the problem might be totally off-base, but can still give you valuable insight. Maybe they know exactly what the problem is, and can point you to a solution in thirty seconds!</p>
<p>Finally, take a break. Go home. Go for a walk. Lift heavy, run hard, space out, drink with your friends, practice music, read a book. Just before sleep, go over the problem once more in your head; I often wake up with a new algorithm or new questions burning to get out. Your unconscious mind can come up with unexpected insights if given time <em>away</em> from the problem!</p>
<p>Some folks swear by time in the shower, others by hiking, or with pen and paper in a hammock. Find what works for you! The important thing seems to be giving yourself time <em>away</em> from struggling with the problem.</p>
<h2><a href="#look-back" id="look-back">Look back</a></h2>
<p>Chances are you’ll know as soon as your solution works. The program compiles, transactions generate the correct amounts, etc. Now’s an important time to <em>solidify</em> your work.</p>
<p>Bolster your tests. You may have made the problem <em>less likely</em>, but not actually solved it. Try a more aggressive, randomized test; one that runs for longer, that generates a broader class of input. Try it on a copy of the production workload before deploying your change.</p>
<p>Identify <em>why</em> the new system works. Pasting something in from StackOverflow may get you through the day, but won’t help you solve similar problems in the future. Try to really understand <em>why</em> the program went wrong, and how the new pieces work together to prevent the problem. Is there a more general underlying problem? Could you generalize your technique to solve a related problem? If you’ll encounter this type of issue frequently, could you build a function or library to help build other solutions?</p>
<p>Document the solution. Write down your description of the problem, and why your changes fix it, as comments in the source code. Use that same description of the solution in your commit message, or attach it as a comment to the resources you used online, so that other people can come to the same understanding.</p>
<h2><a href="#debugging-clojure" id="debugging-clojure">Debugging Clojure</a></h2>
<p>With these general strategies in mind, I’d like to talk specifically about debugging <em>Clojure</em> code–especially understanding its <em>stacktraces</em>. Consider this simple program for baking cakes:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.debugging</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">bake</span>
<span class="s">"Bakes a cake for a certain amount of time, returning a cake with a new</span>
<span class="s"> :tastiness level."</span>
<span class="p">[</span><span class="nv">pie</span> <span class="nv">temp</span> <span class="nv">time</span><span class="p">]</span>
<span class="p">(</span><span class="nb">assoc </span><span class="nv">pie</span> <span class="ss">:tastiness</span>
<span class="p">(</span><span class="nf">condp</span> <span class="p">(</span><span class="nb">* </span><span class="nv">temp</span> <span class="nv">time</span><span class="p">)</span> <span class="nv"><</span>
<span class="mi">400</span> <span class="ss">:burned</span>
<span class="mi">350</span> <span class="ss">:perfect</span>
<span class="mi">300</span> <span class="ss">:soggy</span><span class="p">)))</span>
</code></pre>
<p>And in the REPL:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">bake</span> <span class="p">{</span><span class="ss">:flavor</span> <span class="ss">:blackberry</span><span class="p">}</span> <span class="mi">375</span> <span class="mf">10.25</span><span class="p">)</span>
<span class="nv">ClassCastException</span> <span class="nv">java.lang.Double</span> <span class="nv">cannot</span> <span class="nv">be</span> <span class="nb">cast </span><span class="nv">to</span> <span class="nv">clojure.lang.IFn</span> <span class="nv">scratch.debugging/bake</span> <span class="p">(</span><span class="nf">debugging.clj</span><span class="ss">:8</span><span class="p">)</span>
</code></pre>
<p>This is not particularly helpful. Let’s print a full stacktrace using <code>pst</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">pst</span><span class="p">)</span>
<span class="nv">ClassCastException</span> <span class="nv">java.lang.Double</span> <span class="nv">cannot</span> <span class="nv">be</span> <span class="nb">cast </span><span class="nv">to</span> <span class="nv">clojure.lang.IFn</span>
<span class="nv">scratch.debugging/bake</span> <span class="p">(</span><span class="nf">debugging.clj</span><span class="ss">:8</span><span class="p">)</span>
<span class="nv">user/eval1223</span> <span class="p">(</span><span class="nf">form-init4495957503656407289.clj</span><span class="ss">:1</span><span class="p">)</span>
<span class="nv">clojure.lang.Compiler.eval</span> <span class="p">(</span><span class="nf">Compiler.java</span><span class="ss">:6619</span><span class="p">)</span>
<span class="nv">clojure.lang.Compiler.eval</span> <span class="p">(</span><span class="nf">Compiler.java</span><span class="ss">:6582</span><span class="p">)</span>
<span class="nv">clojure.core/eval</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:2852</span><span class="p">)</span>
<span class="nv">clojure.main/repl/read-eval-print--6588/fn--6591</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:259</span><span class="p">)</span>
<span class="nv">clojure.main/repl/read-eval-print--6588</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:259</span><span class="p">)</span>
<span class="nv">clojure.main/repl/fn--6597</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:277</span><span class="p">)</span>
<span class="nv">clojure.main/repl</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:277</span><span class="p">)</span>
<span class="nv">clojure.tools.nrepl.middleware.interruptible-eval/evaluate/fn--591</span> <span class="p">(</span><span class="nf">interruptible_eval.clj</span><span class="ss">:56</span><span class="p">)</span>
<span class="nv">clojure.core/apply</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:617</span><span class="p">)</span>
<span class="nv">clojure.core/with-bindings*</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:1788</span><span class="p">)</span>
</code></pre>
<p>The first line tells us the <em>type</em> of the error: a <code>ClassCastException</code>. Then there’s some explanatory text: we can’t cast a <code>java.lang.Double</code> to a <code>clojure.lang.IFn</code>. The indented lines show the functions that led to the error. The first of these is the deepest function, where the error actually occurred: the <code>bake</code> function in the <code>scratch.debugging</code> namespace. In parentheses are the file name (<code>debugging.clj</code>) and line number (<code>8</code>) of the code that caused the error. Each following line shows the function that <em>called</em> the previous line. In the REPL, our code is invoked from a special function compiled by the REPL itself–with an automatically generated name like <code>user/eval1223</code>, and that function is invoked by the Clojure compiler, and the REPL tooling. Once we see something like <code>Compiler.eval</code> in a REPL trace, we can generally skip the rest.</p>
<p>As a general rule, we want to look at the <em>deepest</em> (earliest) point in the stacktrace <em>that we wrote</em>. Sometimes an error will arise from deep within a library or Clojure itself–but it was probably <em>invoked</em> by our code somewhere. We’ll skim down the lines until we find our namespace, and start our investigation at that point.</p>
<p>Our case is simple: the <code>bake</code> function, on line 8 of <code>debugging.clj</code>, seems to be the culprit.</p>
<pre><code><span></span> <span class="p">(</span><span class="nf">condp</span> <span class="p">(</span><span class="nb">* </span><span class="nv">temp</span> <span class="nv">time</span><span class="p">)</span> <span class="nv"><</span>
</code></pre>
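<p>For longer traces, skimming for our own namespace can be done programmatically. This is only a sketch: <code>our-frames</code> is a hypothetical helper, not part of Clojure, and it relies on the JVM class names of our functions beginning with the namespace name.</p>
<pre><code>(defn our-frames
  "Hypothetical helper: returns the stacktrace entries of exception e whose
  class names start with prefix, deepest first."
  [^Throwable e prefix]
  (->> (.getStackTrace e)
       (filter (fn [frame] (.startsWith (.getClassName frame) prefix)))
       (map str)))
</code></pre>
<p>At the REPL, <code>(our-frames *e "scratch.debugging")</code> would keep only the entries from our namespace. If the exception has a cause, the frames we care about may live on the cause instead.</p>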
<p>Now let’s consider the error itself: <code>ClassCastException: java.lang.Double cannot be cast to clojure.lang.IFn</code>. This implies we had a <code>Double</code> and tried to cast it to an <code>IFn</code>–but what does “cast” mean? For that matter, what’s a <code>Double</code>, or an <code>IFn</code>?</p>
<p>A quick Google search for <a href="https://www.google.com/search?q=java.lang.double">java.lang.Double</a> reveals that it’s a <em>class</em> (a Java type) with some <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Double.html">basic documentation</a>. “The Double class wraps a value of the primitive type <code>double</code> in an object” is not particularly informative–but the “class hierarchy” at the top of the page shows that a <code>Double</code> is a kind of <code>java.lang.Number</code>. Let’s experiment at the REPL:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mi">4</span><span class="p">)</span>
<span class="nv">java.lang.Long</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mf">4.5</span><span class="p">)</span>
<span class="nv">java.lang.Double</span>
</code></pre>
<p>Indeed: decimal numbers in Clojure appear to be doubles. One of the expressions in that <code>condp</code> call was probably a decimal. At first we might suspect the literal values <code>300</code>, <code>350</code>, or <code>400</code>–but those are <code>Long</code>s, not <code>Double</code>s. The only <code>Double</code> we passed in was the time duration <code>10.25</code>–which appears in <code>condp</code> as <code>(* temp time)</code>. That first argument was a <code>Double</code>, but <em>should</em> have been an <code>IFn</code>.</p>
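<p>One way to check this reasoning is to ask the REPL for the type of each value involved. This is a small sketch using the inputs from the failing call:</p>
<pre><code>user=> (map type [400 350 300])
(java.lang.Long java.lang.Long java.lang.Long)
user=> (type (* 375 10.25))
java.lang.Double
user=> (instance? Number (* 375 10.25))
true
</code></pre>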
<p><a href="https://www.google.com/search?q=clojure.lang.IFn">What the heck is an IFn?</a> Its <a href="https://github.com/clojure/clojure/blob/master/src/jvm/clojure/lang/IFn.java">source code</a> has a comment:</p>
<blockquote>
<p>IFn provides complete access to invoking any of Clojure’s API’s. You can also access any other library written in Clojure, after adding
either its source or compiled form to the classpath.</p>
</blockquote>
<p>So IFn has to do with <em>invoking</em> Clojure’s API. Ah–<code>Fn</code> probably stands for <em>function</em>–and this class is chock full of things like <code>invoke(Object arg1, Object arg2)</code>. That suggests that IFn is about <em>calling functions</em>. And the <code>I</code>? Google <a href="https://www.google.com/search?q=java+interface+starts+with+i">suggests</a> it’s a Java convention for an <em>interface</em>–whatever that is. Remember, we don’t have to understand <em>everything</em>–just enough to get by. There’s plenty to explore later.</p>
<p>Let’s check our hypothesis in the REPL:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.IFn</span> <span class="mf">2.5</span><span class="p">)</span>
<span class="nv">false</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.IFn</span> <span class="nv">conj</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">instance? </span><span class="nv">clojure.lang.IFn</span> <span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">inc </span><span class="nv">x</span><span class="p">)))</span>
<span class="nv">true</span>
</code></pre>
<p>So <code>Double</code>s aren’t IFns–but Clojure built-in functions, and anonymous functions, both are. Let’s double-check the docs for <code>condp</code> again:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">doc </span><span class="nv">condp</span><span class="p">)</span>
<span class="nv">-------------------------</span>
<span class="nv">clojure.core/condp</span>
<span class="p">([</span><span class="nv">pred</span> <span class="nv">expr</span> <span class="o">&</span> <span class="nv">clauses</span><span class="p">])</span>
<span class="nv">Macro</span>
<span class="nv">Takes</span> <span class="nv">a</span> <span class="nv">binary</span> <span class="nv">predicate</span>, <span class="nv">an</span> <span class="nv">expression</span>, <span class="nb">and </span><span class="nv">a</span> <span class="nb">set </span><span class="nv">of</span> <span class="nv">clauses.</span>
<span class="nv">Each</span> <span class="nv">clause</span> <span class="nv">can</span> <span class="nb">take </span><span class="nv">the</span> <span class="nv">form</span> <span class="nv">of</span> <span class="nv">either</span><span class="err">:</span>
<span class="nv">test-expr</span> <span class="nv">result-expr</span>
<span class="nv">test-expr</span> <span class="ss">:>></span> <span class="nv">result-fn</span>
<span class="nv">Note</span> <span class="ss">:>></span> <span class="nv">is</span> <span class="nv">an</span> <span class="nv">ordinary</span> <span class="nv">keyword.</span>
<span class="nv">For</span> <span class="nv">each</span> <span class="nv">clause</span>, <span class="p">(</span><span class="nf">pred</span> <span class="nv">test-expr</span> <span class="nv">expr</span><span class="p">)</span> <span class="nv">is</span> <span class="nv">evaluated.</span> <span class="nv">If</span> <span class="nv">it</span> <span class="nv">returns</span>
<span class="nv">logical</span> <span class="nv">true</span>, <span class="nv">the</span> <span class="nv">clause</span> <span class="nv">is</span> <span class="nv">a</span> <span class="nv">match.</span> <span class="nv">If</span> <span class="nv">a</span> <span class="nv">binary</span> <span class="nv">clause</span> <span class="nv">matches</span>, <span class="nv">the</span>
<span class="nv">result-expr</span> <span class="nv">is</span> <span class="nv">returned</span>, <span class="k">if </span><span class="nv">a</span> <span class="nv">ternary</span> <span class="nv">clause</span> <span class="nv">matches</span>, <span class="nv">its</span> <span class="nv">result-fn</span>,
<span class="nv">which</span> <span class="nv">must</span> <span class="nv">be</span> <span class="nv">a</span> <span class="nv">unary</span> <span class="nv">function</span>, <span class="nv">is</span> <span class="nv">called</span> <span class="nv">with</span> <span class="nv">the</span> <span class="nv">result</span> <span class="nv">of</span> <span class="nv">the</span>
<span class="nv">predicate</span> <span class="nv">as</span> <span class="nv">its</span> <span class="nv">argument</span>, <span class="nv">the</span> <span class="nv">result</span> <span class="nv">of</span> <span class="nv">that</span> <span class="nv">call</span> <span class="nv">being</span> <span class="nv">the</span> <span class="nv">return</span>
<span class="nv">value</span> <span class="nv">of</span> <span class="nv">condp.</span> <span class="nv">A</span> <span class="nv">single</span> <span class="nv">default</span> <span class="nv">expression</span> <span class="nv">can</span> <span class="nv">follow</span> <span class="nv">the</span> <span class="nv">clauses</span>,
<span class="nb">and </span><span class="nv">its</span> <span class="nv">value</span> <span class="nv">will</span> <span class="nv">be</span> <span class="nv">returned</span> <span class="k">if </span><span class="nv">no</span> <span class="nv">clause</span> <span class="nv">matches.</span> <span class="nv">If</span> <span class="nv">no</span> <span class="nv">default</span>
<span class="nv">expression</span> <span class="nv">is</span> <span class="nv">provided</span> <span class="nb">and </span><span class="nv">no</span> <span class="nv">clause</span> <span class="nv">matches</span>, <span class="nv">an</span>
<span class="nv">IllegalArgumentException</span> <span class="nv">is</span> <span class="nv">thrown.</span>
</code></pre>
<p>That’s a lot to take in! No wonder we got it wrong! We’ll take it slow, and look at the arguments.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">condp</span> <span class="p">(</span><span class="nb">* </span><span class="nv">temp</span> <span class="nv">time</span><span class="p">)</span> <span class="nv"><</span>
</code></pre>
<p>Our <code>pred</code> was <code>(* temp time)</code> (a <code>Double</code>), and our <code>expr</code> was the comparison function <code><</code>. For each clause, <code>(pred test-expr expr)</code> is evaluated, so that would expand to something like</p>
<pre><code><span></span><span class="p">((</span><span class="nb">* </span><span class="nv">temp</span> <span class="nv">time</span><span class="p">)</span> <span class="mi">400</span> <span class="nv"><</span><span class="p">)</span>
</code></pre>
<p>Which evaluates to something like</p>
<pre><code><span></span><span class="p">(</span><span class="mf">123.45</span> <span class="mi">400</span> <span class="nv"><</span><span class="p">)</span>
</code></pre>
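<p>But this isn’t a valid Lisp program! It starts with a number, not a function. Following the docstring’s <code>(pred test-expr expr)</code>, we wanted something like <code>(< 400 123.45)</code>, with the function first. Our arguments to <code>condp</code> are backwards!</p>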
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">bake</span>
<span class="s">"Bakes a cake for a certain amount of time, returning a cake with a new</span>
<span class="s"> :tastiness level."</span>
<span class="p">[</span><span class="nv">pie</span> <span class="nv">temp</span> <span class="nv">time</span><span class="p">]</span>
<span class="p">(</span><span class="nb">assoc </span><span class="nv">pie</span> <span class="ss">:tastiness</span>
<span class="p">(</span><span class="nf">condp</span> <span class="nb">< </span><span class="p">(</span><span class="nb">* </span><span class="nv">temp</span> <span class="nv">time</span><span class="p">)</span>
<span class="mi">400</span> <span class="ss">:burned</span>
<span class="mi">350</span> <span class="ss">:perfect</span>
<span class="mi">300</span> <span class="ss">:soggy</span><span class="p">)))</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.debugging</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">bake</span> <span class="p">{</span><span class="ss">:flavor</span> <span class="ss">:chocolate</span><span class="p">}</span> <span class="mi">375</span> <span class="mf">10.25</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:tastiness</span> <span class="ss">:burned</span>, <span class="ss">:flavor</span> <span class="ss">:chocolate</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">bake</span> <span class="p">{</span><span class="ss">:flavor</span> <span class="ss">:chocolate</span><span class="p">}</span> <span class="mi">450</span> <span class="mf">0.8</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:tastiness</span> <span class="ss">:perfect</span>, <span class="ss">:flavor</span> <span class="ss">:chocolate</span><span class="p">}</span>
</code></pre>
<p>Mission accomplished! We read the stacktrace as a <em>path</em> to a part of the program where things went wrong. We identified the deepest part of that path in <em>our</em> code, and looked for a problem there. We discovered that we had reversed the arguments to a function, and after some research and experimentation in the REPL, figured out the right order.</p>
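<p>To make the evaluation order concrete, here is roughly how the corrected <code>condp</code> behaves, written out by hand as a <code>cond</code>. This is a sketch, not the macro’s literal expansion, and <code>tastiness</code> is a hypothetical name used only for illustration.</p>
<pre><code>(defn tastiness
  "Hypothetical helper: spells out (condp < (* temp time) ...) by hand. Each
  clause calls (pred test-expr expr), so the predicate always comes first."
  [temp time]
  (let [expr (* temp time)]
    (cond
      (< 400 expr) :burned
      (< 350 expr) :perfect
      (< 300 expr) :soggy
      ; With no default clause, condp throws when nothing matches.
      :else (throw (IllegalArgumentException.
                     (str "No matching clause: " expr))))))
</code></pre>
<p>Here <code>(tastiness 375 10.25)</code> returns <code>:burned</code>, and <code>(tastiness 450 0.8)</code> returns <code>:perfect</code>, matching the REPL session above.</p>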
<p>An aside on types: some languages have a <em>stricter</em> type system than Clojure’s, in which the types of variables are explicitly declared in the program’s source code. Those languages can detect type errors–when a variable of one type is used in place of another, incompatible, type–and offer more precise feedback. In Clojure, the compiler does not generally enforce types at compile time, which allows for significant flexibility–but requires more rigorous testing to expose these errors.</p>
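<p>As a sketch of the kind of test that would have caught our reversed arguments, here is a small <code>clojure.test</code> suite. The namespace name <code>scratch.debugging-test</code> is an assumption, and <code>bake</code> is repeated so the example stands alone.</p>
<pre><code>(ns scratch.debugging-test
  (:require [clojure.test :refer [deftest is run-tests]]))

; A copy of the corrected bake function from above.
(defn bake
  [pie temp time]
  (assoc pie :tastiness
         (condp < (* temp time)
           400 :burned
           350 :perfect
           300 :soggy)))

(deftest bake-test
  ; Simply calling bake exercises the condp, so the original
  ; ClassCastException would fail this test immediately.
  (is (= :burned  (:tastiness (bake {:flavor :chocolate} 375 10.25))))
  (is (= :perfect (:tastiness (bake {:flavor :chocolate} 450 0.8))))
  ; No clause matches a product of 300 or less, so condp throws.
  (is (thrown? IllegalArgumentException (bake {:flavor :chocolate} 100 1))))

(run-tests)
</code></pre>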
<h2><a href="#higher-order-stacktraces" id="higher-order-stacktraces">Higher order stacktraces</a></h2>
<p>The stacktrace shows us a <em>path</em> through the program, moving downwards through functions. However, that path may not be straightforward. When data is handed off from one part of the program to another, the stacktrace may not show the <em>origin</em> of an error. When <em>functions</em> are handed off from one part of the program to another, the resulting traces can be tricky to interpret indeed.</p>
<p>For instance, say we wanted to make some picture frames out of wood, but didn’t know how much wood to buy. We might sketch out a program like this:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">perimeter</span>
<span class="s">"Given a rectangle, returns a vector of its edge lengths."</span>
<span class="p">[</span><span class="nv">rect</span><span class="p">]</span>
<span class="p">[(</span><span class="ss">:x</span> <span class="nv">rect</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:y</span> <span class="nv">rect</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:x</span> <span class="nv">rect</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:y</span> <span class="nv">rect</span><span class="p">)])</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">frame</span>
<span class="s">"Given a mat width, and a photo rectangle, figure out the size of the frame</span>
<span class="s"> required by adding the mat width around all edges of the photo."</span>
<span class="p">[</span><span class="nv">mat-width</span> <span class="nv">rect</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">margin</span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">rect</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">margin</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">rect</span><span class="p">))</span>
<span class="ss">:y</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">margin</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">rect</span><span class="p">))}))</span>
<span class="p">(</span><span class="k">def </span><span class="nv">failure-rate</span>
<span class="s">"Sometimes the wood is knotty or we screw up a cut. We'll assume we need a</span>
<span class="s"> spare segment once every 8."</span>
<span class="mi">1</span><span class="nv">/8</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">spares</span>
<span class="s">"Given a list of segments, figure out roughly how many of each distinct size</span>
<span class="s"> will go bad, and emit a sequence of spare segments, assuming we screw up</span>
<span class="s"> `failure-rate` of them."</span>
<span class="p">[</span><span class="nv">segments</span><span class="p">]</span>
<span class="p">(</span><span class="nf">->></span> <span class="nv">segments</span>
<span class="c1">; Compute a map of each segment length to the number of</span>
<span class="c1">; segments we'll need of that size.</span>
<span class="nv">frequencies</span>
<span class="c1">; Make a list of spares for each segment length,</span>
<span class="c1">; based on how often we think we'll screw up.</span>
<span class="p">(</span><span class="nb">mapcat </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span> <span class="p">[</span><span class="nv">segment</span> <span class="nv">n</span><span class="p">]]</span>
<span class="p">(</span><span class="nb">repeat </span><span class="p">(</span><span class="nb">* </span><span class="nv">failure-rate</span> <span class="nv">n</span><span class="p">)</span>
<span class="nv">segment</span><span class="p">)))))</span>
<span class="p">(</span><span class="k">def </span><span class="nv">cut-size</span>
<span class="s">"How much extra wood do we need for each cut? Let's say a mitred cut for a</span>
<span class="s"> 1-inch frame needs a full inch."</span>
<span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">total-wood</span>
<span class="s">"Given a mat width and a collection of photos, compute the total linear</span>
<span class="s"> amount of wood we need to buy in order to make frames for each, given that</span>
<span class="s"> mat width."</span>
<span class="p">[</span><span class="nv">mat-width</span> <span class="nv">photos</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">segments</span> <span class="p">(</span><span class="nf">->></span> <span class="nv">photos</span>
<span class="c1">; Convert photos to frame dimensions</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="nb">partial </span><span class="nv">frame</span> <span class="nv">mat-width</span><span class="p">))</span>
<span class="c1">; Convert frames to segments</span>
<span class="p">(</span><span class="nb">mapcat </span><span class="nv">perimeter</span><span class="p">))]</span>
<span class="c1">; Now, take segments</span>
<span class="p">(</span><span class="nf">->></span> <span class="nv">segments</span>
<span class="c1">; Add the spares</span>
<span class="p">(</span><span class="nb">concat </span><span class="p">(</span><span class="nf">spares</span> <span class="nv">segments</span><span class="p">))</span>
<span class="c1">; Include a cut between each segment</span>
<span class="p">(</span><span class="nf">interpose</span> <span class="nv">cut-size</span><span class="p">)</span>
<span class="c1">; And sum the whole shebang.</span>
<span class="p">(</span><span class="nb">reduce </span><span class="nv">+</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">->></span> <span class="p">[{</span><span class="ss">:x</span> <span class="mi">8</span>
<span class="ss">:y</span> <span class="mi">10</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mi">10</span>
<span class="ss">:y</span> <span class="mi">8</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mi">20</span>
<span class="ss">:y</span> <span class="mi">30</span><span class="p">}]</span>
<span class="p">(</span><span class="nf">total-wood</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"total inches:"</span><span class="p">))</span>
</code></pre>
<p>Running this program yields a curious stacktrace. We’ll print the <em>full</em> trace (not the shortened one that comes with <code>pst</code>) for the last exception <code>*e</code> with the <code>.printStackTrace</code> method.</p>
<pre><code>user=> (.printStackTrace *e)
java.lang.ClassCastException: clojure.lang.PersistentArrayMap cannot be cast to java.lang.Number, compiling:(scratch/debugging.clj:73:23)
at clojure.lang.Compiler.load(Compiler.java:7142)
at clojure.lang.RT.loadResourceScript(RT.java:370)
at clojure.lang.RT.loadResourceScript(RT.java:361)
at clojure.lang.RT.load(RT.java:440)
at clojure.lang.RT.load(RT.java:411)
...
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassCastException: clojure.lang.PersistentArrayMap cannot be cast to java.lang.Number
at clojure.lang.Numbers.multiply(Numbers.java:146)
at clojure.lang.Numbers.multiply(Numbers.java:3659)
at scratch.debugging$frame.invoke(debugging.clj:26)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$partial$fn__4228.doInvoke(core.clj:2468)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$map$fn__4245.invoke(core.clj:2557)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$map$fn__4245.invoke(core.clj:2551)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$apply.invoke(core.clj:624)
at clojure.core$mapcat.doInvoke(core.clj:2586)
at clojure.lang.RestFn.invoke(RestFn.java:423)
at scratch.debugging$total_wood.invoke(debugging.clj:62)
...
</code></pre>
<p>First: this trace has <em>two parts</em>. The top-level error (a <code>CompilerException</code>) appears first, and is followed by the exception that <em>caused</em> the <code>CompilerException</code>: a <code>ClassCastException</code>. This makes the stacktrace read somewhat out of order, since the deepest part of the trace occurs in the <em>first</em> line of the <em>last</em> exception. We read <code>C B A</code> then <code>F E D</code>. This is an old convention in the Java language, and the cause of no end of frustration.</p>
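<p>Since the exception we care about is usually the last one in the chain, it can help to walk the causes directly. A minimal sketch, where <code>root-cause</code> is a hypothetical helper name:</p>
<pre><code>(defn root-cause
  "Hypothetical helper: follows .getCause until it reaches an exception with no
  cause, and returns that innermost exception."
  [^Throwable e]
  (if-let [cause (.getCause e)]
    (recur cause)
    e))
</code></pre>
<p>At the REPL, <code>(root-cause *e)</code> would return the innermost exception, and <code>(.printStackTrace (root-cause *e))</code> would print only its part of the trace.</p>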
<p>Notice that this representation of the stacktrace is less friendly than <code>(pst)</code>. We’re seeing the Java Virtual Machine (JVM)’s internal representation of Clojure functions, which look like <code>clojure.core$partial$fn__4228.doInvoke</code>. This corresponds to the namespace <code>clojure.core</code>, in which there is a function called <code>partial</code>, inside of which is an <em>anonymous</em> function, here named <code>fn__4228</code>. Calling a Clojure function is written, in the JVM, as <code>.invoke</code> or <code>.doInvoke</code>.</p>
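<p>The JVM names also differ slightly from the Clojure ones: characters like <code>-</code>, which aren’t legal in Java class names, get replaced. Here is a small sketch using <code>munge</code> from <code>clojure.core</code>; the <code>total-wood</code> defined here is an empty stand-in, used only to inspect its class name.</p>
<pre><code>; A stand-in function; only its name matters here.
(defn total-wood [] nil)

; munge applies the same character replacements the compiler uses.
(munge "total-wood")
; returns "total_wood"

; Each function compiles to a class named namespace$function.
(.getName (class total-wood))
; in the user namespace, returns "user$total_wood"
</code></pre>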
<p>So: the root cause was a <code>ClassCastException</code>, and it tells us that Clojure expected a <code>java.lang.Number</code>, but found a <code>PersistentArrayMap</code>. We might guess that <code>PersistentArrayMap</code> is something to do with the map data structure, which we used in this program:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span><span class="p">})</span>
<span class="nv">clojure.lang.PersistentArrayMap</span>
</code></pre>
<p>And we’d be right. We can also tell, by reading down the stacktrace looking for our <code>scratch.debugging</code> namespace, where the error took place: <code>scratch.debugging$frame</code>, on line <code>26</code>.</p>
<pre><code><span></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">margin</span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">rect</span><span class="p">)]</span>
</code></pre>
<p>There’s our multiplication operation <code>*</code>, which we might assume expands to <code>clojure.lang.Numbers.multiply</code>. But the <em>path</em> to the error is odd.</p>
<pre><code><span></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">photos</span>
<span class="c1">; Convert photos to frame dimensions</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="nb">partial </span><span class="nv">frame</span> <span class="nv">mat-width</span><span class="p">))</span>
</code></pre>
<p>In <code>total-wood</code>, we call <code>(map (partial frame mat-width) photos)</code> right away, so we’d expect the stacktrace to go from <code>total-wood</code> to <code>map</code> to <code>frame</code>. But this is <em>not</em> what happens. Instead, <code>total-wood</code> invokes something called <code>RestFn</code>–a piece of Clojure plumbing–which in turn calls <code>mapcat</code>.</p>
<pre><code> at clojure.core$mapcat.doInvoke(core.clj:2586)
at clojure.lang.RestFn.invoke(RestFn.java:423)
at scratch.debugging$total_wood.invoke(debugging.clj:62)
</code></pre>
<p>Why doesn’t <code>total-wood</code> call <code>map</code> first? Well, it <em>did</em>–but <code>map</code> doesn’t actually apply its function to anything in the <code>photos</code> vector when invoked. Instead, it returns a <em>lazy</em> sequence–one which applies <code>frame</code> only when elements are asked for.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">(</span><span class="nb">map inc </span><span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">)))</span>
<span class="nv">clojure.lang.LazySeq</span>
</code></pre>
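<p>To see this laziness directly, we can count how many times a mapped function actually runs. This is a sketch; <code>calls</code> and <code>doubled</code> are hypothetical names.</p>
<pre><code>; Counts invocations of the mapped function.
(def calls (atom 0))

; Building the lazy sequence doesn't call the function at all.
(def doubled
  (map (fn [x] (swap! calls inc) (* 2 x))
       [1 2 3]))

@calls
; returns 0

; Forcing the sequence with doall is what runs the function.
(doall doubled)
; returns (2 4 6)

@calls
; returns 3
</code></pre>
<p>Errors are deferred the same way: <code>(def bad (map inc [1 :a]))</code> succeeds, and the <code>ClassCastException</code> only appears once something, such as <code>(doall bad)</code>, asks for the elements.</p>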
<p>Inside each <code>LazySeq</code> is a box containing a function. When you ask a <code>LazySeq</code> for its first value, it calls that function to return a new sequence–and <em>that’s</em> when <code>frame</code> gets invoked. What we’re seeing in this stacktrace is the <code>LazySeq</code> internal machinery at work–<code>mapcat</code> asks it for a value, and the LazySeq asks <code>map</code> to generate that value.</p>
<pre><code> at clojure.core$partial$fn__4228.doInvoke(core.clj:2468)
at clojure.lang.RestFn.invoke(RestFn.java:408)
at clojure.core$map$fn__4245.invoke(core.clj:2557)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$map$fn__4245.invoke(core.clj:2551)
at clojure.lang.LazySeq.sval(LazySeq.java:40)
at clojure.lang.LazySeq.seq(LazySeq.java:49)
at clojure.lang.RT.seq(RT.java:484)
at clojure.core$seq.invoke(core.clj:133)
at clojure.core$apply.invoke(core.clj:624)
at clojure.core$mapcat.doInvoke(core.clj:2586)
at clojure.lang.RestFn.invoke(RestFn.java:423)
at scratch.debugging$total_wood.invoke(debugging.clj:62)
</code></pre>
<p>In fact we pass through <code>map</code>’s laziness <em>twice</em> here: a quick peek at <code>(source mapcat)</code> shows that it expands into a <code>map</code> call itself, and then there’s a <em>second</em> map: the one we created in <code>total-wood</code>. Then an odd thing happens–we hit something called <code>clojure.core$partial$fn__4228</code>.</p>
<pre><code><span></span> <span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="nb">partial </span><span class="nv">frame</span> <span class="nv">mat-width</span><span class="p">)</span> <span class="nv">photos</span><span class="p">)</span>
</code></pre>
<p>The <code>frame</code> function takes two arguments: a mat width and a photo. We wanted a function that takes just <em>one</em> argument: a photo. <code>(partial frame mat-width)</code> took <code>mat-width</code> and generated a <em>new function</em> which takes one arg–call it <code>photo</code>–and calls <code>(frame mat-width photo)</code>. That automatically generated function, returned by <code>partial</code>, is what <code>map</code> uses to generate new elements of its sequence on demand.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">partial + </span><span class="mi">1</span><span class="p">)</span>
<span class="o">#</span><span class="nv"><core$partial$fn__4228</span> <span class="nv">clojure.core$partial$fn__4228</span><span class="o">@</span><span class="mi">243634</span><span class="nv">f2></span>
<span class="nv">user=></span> <span class="p">((</span><span class="nb">partial + </span><span class="mi">1</span><span class="p">)</span> <span class="mi">4</span><span class="p">)</span>
<span class="mi">5</span>
</code></pre>
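<p>If it helps to see where that anonymous function comes from, here is a simplified sketch of how something like <code>partial</code> could be written. This is <em>not</em> the real source: <code>simple-partial</code> is a hypothetical name, and the version in <code>clojure.core</code> has several arities and differs in detail.</p>
<pre><code>(defn simple-partial
  "Returns a function which calls f with fixed-args, followed by
  whatever arguments it is given later."
  [f & fixed-args]
  ; This inner fn is the anonymous function that appears in stacktraces
  (fn [& more-args]
    (apply f (concat fixed-args more-args))))

((simple-partial + 1) 4) ; => 5
</code></pre>
<p>The inner <code>fn</code> has no name of its own, so the compiler generates one for it, and it reaches the wrapped function through <code>apply</code>.</p>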
<p>That’s why we see control flow through <code>clojure.core$partial$fn__4228</code> (an anonymous function defined inside <code>clojure.core/partial</code>) on the way to <code>frame</code>.</p>
<pre><code>Caused by: java.lang.ClassCastException: clojure.lang.PersistentArrayMap cannot be cast to java.lang.Number
at clojure.lang.Numbers.multiply(Numbers.java:146)
at clojure.lang.Numbers.multiply(Numbers.java:3659)
at scratch.debugging$frame.invoke(debugging.clj:26)
at clojure.lang.AFn.applyToHelper(AFn.java:156)
at clojure.lang.AFn.applyTo(AFn.java:144)
at clojure.core$apply.invoke(core.clj:626)
at clojure.core$partial$fn__4228.doInvoke(core.clj:2468)
</code></pre>
<p>And there’s our suspect! <code>scratch.debugging/frame</code>, at line <code>26</code>. To return to that line again:</p>
<pre><code><span></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">margin</span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">rect</span><span class="p">)]</span>
</code></pre>
<p><code>*</code> is a multiplication, and <code>2</code> is obviously a number, but <code>rect</code>… <code>rect</code> is a map here. Aha! We meant to multiply the <code>mat-width</code> by two, not the rectangle.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">frame</span>
<span class="s">"Given a mat width, and a photo rectangle, figure out the size of the frame</span>
<span class="s"> required by adding the mat width around all edges of the photo."</span>
<span class="p">[</span><span class="nv">mat-width</span> <span class="nv">rect</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">margin</span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">mat-width</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">margin</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">rect</span><span class="p">))</span>
<span class="ss">:y</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">margin</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">rect</span><span class="p">))}))</span>
</code></pre>
<p>I believe we’ve fixed the bug, then. Let’s give it a shot!</p>
<h2><a href="#the-unbearable-lightness-of-nil" id="the-unbearable-lightness-of-nil">The unbearable lightness of nil</a></h2>
<p>There’s one more bug lurking in this program. This one’s stacktrace is short.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.debugging</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="nv">CompilerException</span> <span class="nv">java.lang.NullPointerException</span>, <span class="nv">compiling</span><span class="err">:</span><span class="p">(</span><span class="nf">scratch/debugging.clj</span><span class="ss">:73:23</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">pst</span><span class="p">)</span>
<span class="nv">CompilerException</span> <span class="nv">java.lang.NullPointerException</span>, <span class="nv">compiling</span><span class="err">:</span><span class="p">(</span><span class="nf">scratch/debugging.clj</span><span class="ss">:73:23</span><span class="p">)</span>
<span class="nv">clojure.lang.Compiler.load</span> <span class="p">(</span><span class="nf">Compiler.java</span><span class="ss">:7142</span><span class="p">)</span>
<span class="nv">clojure.lang.RT.loadResourceScript</span> <span class="p">(</span><span class="nf">RT.java</span><span class="ss">:370</span><span class="p">)</span>
<span class="nv">clojure.lang.RT.loadResourceScript</span> <span class="p">(</span><span class="nf">RT.java</span><span class="ss">:361</span><span class="p">)</span>
<span class="nv">clojure.lang.RT.load</span> <span class="p">(</span><span class="nf">RT.java</span><span class="ss">:440</span><span class="p">)</span>
<span class="nv">clojure.lang.RT.load</span> <span class="p">(</span><span class="nf">RT.java</span><span class="ss">:411</span><span class="p">)</span>
<span class="nv">clojure.core/load/fn--5066</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:5641</span><span class="p">)</span>
<span class="nv">clojure.core/load</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:5640</span><span class="p">)</span>
<span class="nv">clojure.core/load-one</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:5446</span><span class="p">)</span>
<span class="nv">clojure.core/load-lib/fn--5015</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:5486</span><span class="p">)</span>
<span class="nv">clojure.core/load-lib</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:5485</span><span class="p">)</span>
<span class="nv">clojure.core/apply</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:626</span><span class="p">)</span>
<span class="nv">clojure.core/load-libs</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:5524</span><span class="p">)</span>
<span class="nv">Caused</span> <span class="nv">by</span><span class="err">:</span>
<span class="nv">NullPointerException</span>
<span class="nv">clojure.lang.Numbers.ops</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:961</span><span class="p">)</span>
<span class="nv">clojure.lang.Numbers.add</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:126</span><span class="p">)</span>
<span class="nv">clojure.core/+</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:951</span><span class="p">)</span>
<span class="nv">clojure.core.protocols/fn--6086</span> <span class="p">(</span><span class="nf">protocols.clj</span><span class="ss">:143</span><span class="p">)</span>
<span class="nv">clojure.core.protocols/fn--6057/G--6052--6066</span> <span class="p">(</span><span class="nf">protocols.clj</span><span class="ss">:19</span><span class="p">)</span>
<span class="nv">clojure.core.protocols/seq-reduce</span> <span class="p">(</span><span class="nf">protocols.clj</span><span class="ss">:27</span><span class="p">)</span>
<span class="nv">clojure.core.protocols/fn--6078</span> <span class="p">(</span><span class="nf">protocols.clj</span><span class="ss">:53</span><span class="p">)</span>
<span class="nv">clojure.core.protocols/fn--6031/G--6026--6044</span> <span class="p">(</span><span class="nf">protocols.clj</span><span class="ss">:13</span><span class="p">)</span>
<span class="nv">clojure.core/reduce</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:6287</span><span class="p">)</span>
<span class="nv">scratch.debugging/total-wood</span> <span class="p">(</span><span class="nf">debugging.clj</span><span class="ss">:69</span><span class="p">)</span>
<span class="nv">scratch.debugging/eval1560</span> <span class="p">(</span><span class="nf">debugging.clj</span><span class="ss">:81</span><span class="p">)</span>
<span class="nv">clojure.lang.Compiler.eval</span> <span class="p">(</span><span class="nf">Compiler.java</span><span class="ss">:6703</span><span class="p">)</span>
</code></pre>
<p>On line 69, <code>total-wood</code> calls <code>reduce</code>, which dives through a series of functions from <code>clojure.core.protocols</code> before emerging in <code>+</code>: the function we passed to <code>reduce</code>. Reduce is trying to combine two elements from its collection of wood segments using <code>+</code>, but one of them was <code>nil</code>. Clojure calls this a <code>NullPointerException</code>. In <code>total-wood</code>, we constructed the sequence of segments this way:</p>
<pre><code><span></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">segments</span> <span class="p">(</span><span class="nf">->></span> <span class="nv">photos</span>
<span class="c1">; Convert photos to frame dimensions</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="nb">partial </span><span class="nv">frame</span> <span class="nv">mat-width</span><span class="p">))</span>
<span class="c1">; Convert frames to segments</span>
<span class="p">(</span><span class="nb">mapcat </span><span class="nv">perimeter</span><span class="p">))]</span>
<span class="c1">; Now, take segments</span>
<span class="p">(</span><span class="nf">->></span> <span class="nv">segments</span>
<span class="c1">; Add the spares</span>
<span class="p">(</span><span class="nb">concat </span><span class="p">(</span><span class="nf">spares</span> <span class="nv">segments</span><span class="p">))</span>
<span class="c1">; Include a cut between each segment</span>
<span class="p">(</span><span class="nf">interpose</span> <span class="nv">cut-size</span><span class="p">)</span>
<span class="c1">; And sum the whole shebang.</span>
<span class="p">(</span><span class="nb">reduce </span><span class="nv">+</span><span class="p">))))</span>
</code></pre>
<p>Where did the <code>nil</code> value come from? The stacktrace <em>doesn’t say</em>, because the sequence <code>reduce</code> is traversing didn’t have any problem <em>producing</em> the <code>nil</code>. <code>reduce</code> asked for a value and the sequence happily produced a <code>nil</code>. We only had a problem when it came time to <em>combine</em> the <code>nil</code> with the next value, using <code>+</code>.</p>
<p>A stacktrace like this is something like a murder mystery: we know the program died in the reducer, that it was shot with a <code>+</code>, and the bullet was a <code>nil</code>–but we don’t know where the bullet came from. The trail runs cold. We need <em>more forensic information</em>–more hints about the <code>nil</code>’s origin–to find the culprit.</p>
<p>Again, this is a class of error largely preventable with static type systems. If you have worked with a statically typed language in the past, it may be interesting to consider that almost every Clojure function takes <code>Option[A]</code> and does something more-or-less sensible, returning <code>Option[B]</code>. Whether the error propagates as a <code>nil</code> or an <code>Option</code>, there can be similar difficulties in localizing the cause of the problem.</p>
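<p>A few REPL-style expressions show how quietly <code>nil</code> travels before anything complains. This is only an illustration: the <code>rect</code> map and its missing <code>:depth</code> key are invented for the example.</p>
<pre><code>(def rect {:x 12 :y 14})

; Looking up a key that isn't there is not an error
(:depth rect)      ; => nil

; Many core functions accept nil and return something sensible
(first nil)        ; => nil
(count nil)        ; => 0
(str "edge: " nil) ; => "edge: "

; Arithmetic is where the nil finally causes trouble
(try (+ 28 (:depth rect))
     (catch NullPointerException e :npe)) ; => :npe
</code></pre>
<p>The lookup that produced the <code>nil</code> and the addition that choked on it can be arbitrarily far apart, which is exactly the situation we’re in.</p>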
<p>Let’s try printing out the state as <code>reduce</code> goes along:</p>
<pre><code><span></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">segments</span>
<span class="c1">; Add the spares</span>
<span class="p">(</span><span class="nb">concat </span><span class="p">(</span><span class="nf">spares</span> <span class="nv">segments</span><span class="p">))</span>
<span class="c1">; Include a cut between each segment</span>
<span class="p">(</span><span class="nf">interpose</span> <span class="nv">cut-size</span><span class="p">)</span>
<span class="c1">; And sum the whole shebang.</span>
<span class="p">(</span><span class="nb">reduce </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">acc</span> <span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">prn </span><span class="nv">acc</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">acc</span> <span class="nv">x</span><span class="p">))))))</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.debugging</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="mi">12</span> <span class="mi">1</span>
<span class="mi">13</span> <span class="mi">14</span>
<span class="mi">27</span> <span class="mi">1</span>
<span class="mi">28</span> <span class="nv">nil</span>
<span class="nv">CompilerException</span> <span class="nv">java.lang.NullPointerException</span>, <span class="nv">compiling</span><span class="err">:</span><span class="p">(</span><span class="nf">scratch/debugging.clj</span><span class="ss">:73:56</span><span class="p">)</span>
</code></pre>
<p>Not every value is nil! There’s a <code>14</code> there which looks like a plausible segment for a frame, and two one-inch buffers from <code>cut-size</code>. We can rule out <code>interpose</code> because it inserts a <code>1</code> every time, and that <code>1</code> reduces correctly. But where’s that <code>nil</code> coming from? Is it from <code>segments</code> or <code>(spares segments)</code>?</p>
<pre><code><span></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">segments</span> <span class="p">(</span><span class="nf">->></span> <span class="nv">photos</span>
<span class="c1">; Convert photos to frame dimensions</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="nb">partial </span><span class="nv">frame</span> <span class="nv">mat-width</span><span class="p">))</span>
<span class="c1">; Convert frames to segments</span>
<span class="p">(</span><span class="nb">mapcat </span><span class="nv">perimeter</span><span class="p">))]</span>
<span class="p">(</span><span class="nb">prn </span><span class="ss">:segments</span> <span class="nv">segments</span><span class="p">)</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.debugging</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="ss">:segments</span> <span class="p">(</span><span class="mi">12</span> <span class="mi">14</span> <span class="nv">nil</span> <span class="mi">14</span> <span class="mi">14</span> <span class="mi">12</span> <span class="nv">nil</span> <span class="mi">12</span> <span class="mi">24</span> <span class="mi">34</span> <span class="nv">nil</span> <span class="mi">34</span><span class="p">)</span>
</code></pre>
<p>It is present in <code>segments</code>. Let’s trace it backwards through the sequence’s creation. It’d be handy to have a function like <code>prn</code> that <em>returned</em> its input, so we could spy on values as they flowed through the <code>->></code> macro.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">spy</span>
<span class="p">[</span><span class="o">&</span> <span class="nv">args</span><span class="p">]</span>
<span class="p">(</span><span class="nb">apply prn </span><span class="nv">args</span><span class="p">)</span>
<span class="p">(</span><span class="nb">last </span><span class="nv">args</span><span class="p">))</span>
</code></pre>
<pre><code><span></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">segments</span> <span class="p">(</span><span class="nf">->></span> <span class="nv">photos</span>
<span class="c1">; Convert photos to frame dimensions</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="nb">partial </span><span class="nv">frame</span> <span class="nv">mat-width</span><span class="p">))</span>
<span class="p">(</span><span class="nf">spy</span> <span class="ss">:frames</span><span class="p">)</span>
<span class="c1">; Convert frames to segments</span>
<span class="p">(</span><span class="nb">mapcat </span><span class="nv">perimeter</span><span class="p">))]</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.debugging</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="ss">:frames</span> <span class="p">({</span><span class="ss">:x</span> <span class="mi">12</span>, <span class="ss">:y</span> <span class="mi">14</span><span class="p">}</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">14</span>, <span class="ss">:y</span> <span class="mi">12</span><span class="p">}</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">24</span>, <span class="ss">:y</span> <span class="mi">34</span><span class="p">})</span>
<span class="ss">:segments</span> <span class="p">(</span><span class="mi">12</span> <span class="mi">14</span> <span class="nv">nil</span> <span class="mi">14</span> <span class="mi">14</span> <span class="mi">12</span> <span class="nv">nil</span> <span class="mi">12</span> <span class="mi">24</span> <span class="mi">34</span> <span class="nv">nil</span> <span class="mi">34</span><span class="p">)</span>
</code></pre>
<p>Ah! So the frames are intact, but the <em>perimeters</em> are bad. Let’s check the <code>perimeter</code> function:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">perimeter</span>
<span class="s">"Given a rectangle, returns a vector of its edge lengths."</span>
<span class="p">[</span><span class="nv">rect</span><span class="p">]</span>
<span class="p">[(</span><span class="ss">:x</span> <span class="nv">rect</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:y</span> <span class="nv">rect</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:z</span> <span class="nv">rect</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:y</span> <span class="nv">rect</span><span class="p">)])</span>
</code></pre>
<p>Spot the typo? We wrote <code>:z</code> instead of <code>:x</code>. Since the frame didn’t have a <code>:z</code> field, it returned <code>nil</code>! That’s the origin of our <code>NullPointerException</code>. With the bug fixed, we can re-run and find:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.debugging</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="nv">total</span> <span class="nv">inches</span><span class="err">:</span> <span class="mi">319</span>
</code></pre>
<p>Voilà!</p>
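<p>In hindsight, a check close to the source would have pointed at the culprit immediately. Here is one possible sketch, assuming we’re willing to pay for an assertion on every call. <code>checked-perimeter</code> is a hypothetical variant, not part of the original program.</p>
<pre><code>(defn checked-perimeter
  "Like perimeter, but refuses to return edges that aren't numbers."
  [rect]
  (let [edges [(:x rect) (:y rect) (:x rect) (:y rect)]]
    ; Fail here, where we still know which rect was at fault
    (assert (every? number? edges)
            (str "Bad edges " (pr-str edges) " for rect " (pr-str rect)))
    edges))

(checked-perimeter {:x 12 :y 14}) ; => [12 14 12 14]

; A rect missing :x now fails inside checked-perimeter itself
(try (checked-perimeter {:y 14})
     (catch AssertionError e :assertion-failed)) ; => :assertion-failed
</code></pre>
<p>The tradeoff is the one discussed in the recap: the function is a little more rigid, and in exchange its errors land much closer to their cause.</p>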
<h2><a href="#recap" id="recap">Recap</a></h2>
<p>As we solve more and more problems, we get faster at debugging–at skipping over irrelevant log data, figuring out exactly what input was at fault, knowing what terms to search for, and developing a network of peers and mentors to ask for help. But when we encounter unexpected bugs, it can help to fall back on a family of problem-solving tactics.</p>
<p>We explore the problem thoroughly, localizing it to a particular function, variable, or set of inputs. We identify the boundaries of the problem, carving away parts of the system that work as expected. We develop new notation, maps, and diagrams of the problem space, precisely characterizing it in a variety of modes.</p>
<p>With the problem identified, we search for extant solutions–or related problems others have solved in the past. We trawl through issue trackers, mailing list posts, blogs, and forums like Stack Overflow, or, for more theoretical problems, academic papers, Mathworld, and Wikipedia. If searching reveals nothing, we try rephrasing the problem, relaxing the constraints, adding debugging statements, and solving smaller subproblems. When all else fails, we ask for help from our peers, or from the community in IRC, mailing lists, and so on, or just take a break.</p>
<p>We learned to explore Clojure stacktraces as a trail into our programs, leading to the place where an error occurred. But not all paths are linear, and we saw how lazy operations and higher-order functions create inversions and intermediate layers in the stacktrace. Then we learned how to debug values that were <em>distant</em> from the trace, by adding logging statements and working our way closer to the origin.</p>
<p>Programming languages and we, their users, are engaged in a continual dialogue. We may speak more formally, verbosely, with many types and defensive assertions–or we may speak quickly, generally, in fuzzy terms. The more precise we are with the specifications of our program’s types, the more the program can assist us when things go wrong. Conversely, those specifications <em>harden</em> our programs into strong but <em>rigid</em> forms, and rigid structures are harder to bend into new shapes.</p>
<p>In Clojure we strike a more dynamic balance: we speak in generalities, but we pay for that flexibility. Our errors are harder to trace to their origins. While the Clojure compiler can warn us of some errors, like mis-spelled variable names, it cannot (without a library like <a href="https://github.com/clojure/core.typed">core.typed</a>) tell us when we have incorrectly assumed an object will be of a certain type. Even very rigid languages, like Haskell, cannot identify some errors, like reversing the arguments to a subtraction function. <em>Some</em> tests are always necessary, though types are a huge boon.</p>
<p>No matter what language we write in, we use a balance of types and tests to <em>validate</em> our assumptions, both when the program is compiled and when it is run.</p>
<p>The errors that arise in compilation or runtime aren’t <em>rebukes</em> so much as <em>hints</em>. Don’t despair! They point the way towards understanding one’s program in more detail–though the errors may be cryptic. Over time we get better at reading our language’s errors and making our programs more robust.</p>
<p>In the next chapter, we discuss <a href="https://aphyr.com/posts/352-clojure-from-the-ground-up-polymorphism">polymorphism</a>.</p>
https://aphyr.com/posts/318-clojure-from-the-ground-up-roadmapClojure from the ground up: roadmap2014-08-26T21:26:03-05:002014-08-26T21:26:03-05:00Aphyrhttps://aphyr.com/<p>With the language fundamentals in hand, here’s my thinking for the remainder of the Clojure from the ground up book chapters. I’m putting Jepsen on hold to work on this project for the rest of the year; hoping to get the source material complete by… January?</p>
<ul>
<li>Debugging and getting help</li>
<li>Polymorphism</li>
<li>Error Handling</li>
<li>Modularization and refactoring</li>
<li>It’s not at all obvious what an object is</li>
<li>JVM interop</li>
<li>The Clojure type system</li>
<li>Compiler at runtime</li>
<li>Build your own language</li>
<li>Performance analysis</li>
<li>Parsers and protocols</li>
<li>Storage and persistence</li>
<li>Networks and messaging</li>
<li>Concurrency and queues</li>
</ul>
https://aphyr.com/posts/312-clojure-from-the-ground-up-modelingClojure from the ground up: modeling2014-02-19T01:17:39-05:002014-02-19T01:17:39-05:00Aphyrhttps://aphyr.com/<p>Previously: <a href="http://aphyr.com/posts/311-clojure-from-the-ground-up-logistics">Logistics</a></p>
<p>Until this point in the book, we’ve dealt primarily in specific details: what an expression is, how math works, which functions apply to different data structures, and where code lives. But programming, like speaking a language, painting landscapes, or designing turbines, is about more than the <em>nuts and bolts</em> of the trade. It’s knowing how to <em>combine</em> those parts into a cohesive whole–and this is a skill which is difficult to describe formally. In this part of the book, I’d like to work with you on an integrative tour of one particular problem: modeling a rocket in flight.</p>
<p>We’re going to reinforce our concrete knowledge of the standard library by using maps, sequences, and math functions together. At the same time, we’re going to practice how to represent a complex system; decomposing a problem into smaller parts, naming functions and variables, and writing tests.</p>
<h2><a href="#so-you-want-to-go-to-space" id="so-you-want-to-go-to-space">So you want to go to space</a></h2>
<p>First, we need a representation of a craft. The obvious properties for a rocket are its dry mass (how much it weighs without fuel), fuel mass, position, velocity, and time. We’ll create a new file in our scratch project–<code>src/scratch/rocket.clj</code>–to talk about spacecraft.</p>
<p>For starters, let’s pattern our craft after an <a href="http://en.wikipedia.org/wiki/Atlas_V">Atlas V</a> launch vehicle. We’ll represent everything in SI units–kilograms, meters, newtons, etc. The Atlas V carries 627,105 lbs (about 284,450 kg) of LOX/RP-1 fuel; subtracting that from its total mass of 334,500 kg leaves only 50,050 kg of mass which <em>isn’t</em> fuel. It develops 4152 kilonewtons of thrust and runs for 253 seconds, with a <a href="http://en.wikipedia.org/wiki/Specific_impulse">specific impulse</a> (effectively, exhaust velocity) of 3.05 kilometers/sec. Real rockets develop varying amounts of thrust depending on the atmosphere, but we’ll pretend it’s constant in our simulation.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">atlas-v</span>
<span class="p">[]</span>
<span class="p">{</span><span class="ss">:dry-mass</span> <span class="mi">50050</span>
<span class="ss">:fuel-mass</span> <span class="mi">284450</span>
<span class="ss">:time</span> <span class="mi">0</span>
<span class="ss">:isp</span> <span class="mi">3050</span>
<span class="ss">:max-fuel-rate</span> <span class="p">(</span><span class="nb">/ </span><span class="mi">284450</span> <span class="mi">253</span><span class="p">)</span>
<span class="ss">:max-thrust</span> <span class="mf">4.152</span><span class="nv">e6</span><span class="p">})</span>
</code></pre>
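<p>The numbers in that map can be cross-checked at the REPL. This is just a sanity check on the arithmetic; <code>lbs->kg</code> is a name introduced here for the pound-to-kilogram conversion factor.</p>
<pre><code>; Kilograms per pound (exact by definition)
(def lbs->kg 0.45359237)

(* 627105 lbs->kg)      ; roughly 284450 kg of fuel
(- 334500 284450)       ; => 50050, the dry mass

; Dividing two integers yields an exact ratio, not a decimal
(/ 284450 253)          ; => 284450/253
(double (/ 284450 253)) ; roughly 1124.3 kg of fuel per second
</code></pre>
<p>Note that <code>:max-fuel-rate</code> is stored as a ratio. Clojure keeps it exact until it’s combined with a floating-point number.</p>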
<p>How heavy is the craft?</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">mass</span>
<span class="s">"The total mass of a craft."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="ss">:dry-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)))</span>
</code></pre>
<p>What about the position and velocity? We could represent them in Cartesian coordinates–x, y, and z–or we could choose spherical coordinates: a radius from the planet and angle from the pole and 0 degrees longitude. I’ve got a hunch that spherical coordinates will be easier for position, but accelerating the craft will be simplest in x, y, and z terms. The center of the planet is a natural choice for the coordinate system’s origin (0, 0, 0). We’ll choose z along the north pole, and x and y in the plane of the equator.</p>
<p>Let’s define a space center where we launch from–let’s say it’s initially on the equator at y=0. To figure out the x coordinate, we’ll need to know how far the space center is from the center of the earth. The earth’s <a href="http://en.wikipedia.org/wiki/Earth_radius#Equatorial_radius">equatorial radius</a> is ~6378 kilometers.</p>
<pre><code><span></span><span class="p">(</span><span class="k">def </span><span class="nv">earth-equatorial-radius</span>
<span class="s">"Radius of the earth, in meters"</span>
<span class="mi">6378137</span><span class="p">)</span>
</code></pre>
<p>How fast is the surface moving? Well the earth’s day is 86,400 seconds long,</p>
<pre><code><span></span><span class="p">(</span><span class="k">def </span><span class="nv">earth-day</span>
<span class="s">"Length of an earth day, in seconds."</span>
<span class="mi">86400</span><span class="p">)</span>
</code></pre>
<p>which means a given point on the equator has to go 2 * pi * equatorial radius meters in earth-day seconds:</p>
<pre><code><span></span><span class="p">(</span><span class="k">def </span><span class="nv">earth-equatorial-speed</span>
<span class="s">"How fast points on the equator move, relative to the center of the earth,</span>
<span class="s"> in meters/sec."</span>
<span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">Math/PI</span> <span class="nv">earth-equatorial-radius</span><span class="p">)</span>
<span class="nv">earth-day</span><span class="p">))</span>
</code></pre>
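<p>As a quick sanity check, we can evaluate that speed and convert it to more familiar units. The definitions are repeated here (without docstrings) so the snippet stands alone.</p>
<pre><code>(def earth-equatorial-radius 6378137) ; meters
(def earth-day 86400)                 ; seconds

(def earth-equatorial-speed
  (/ (* 2 Math/PI earth-equatorial-radius)
     earth-day))

earth-equatorial-speed         ; roughly 463.8 meters/sec
(* earth-equatorial-speed 3.6) ; roughly 1670 kilometers/hour
</code></pre>
<p>So a rocket sitting on an equatorial launchpad is already moving eastward at several hundred meters per second before its engines light.</p>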
<p>So our space center is on the equator (z=0), at y=0 by choice, which means x is the equatorial radius. Since the earth is spinning, the space center is moving at earth-equatorial-speed in the y direction–and not changing at all in x or z.</p>
<pre><code><span></span><span class="p">(</span><span class="k">def </span><span class="nv">initial-space-center</span>
<span class="s">"The initial position and velocity of the launch facility"</span>
<span class="p">{</span><span class="ss">:time</span> <span class="mi">0</span>
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="nv">earth-equatorial-radius</span>
<span class="ss">:y</span> <span class="mi">0</span>
<span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>
<span class="ss">:velocity</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">0</span>
<span class="ss">:y</span> <span class="nv">earth-equatorial-speed</span>
<span class="ss">:z</span> <span class="mi">0</span><span class="p">}})</span>
</code></pre>
<p><code>:position</code> and <code>:velocity</code> are both <a href="http://en.wikipedia.org/wiki/Euclidean_vector#Representations">vectors</a>, in the sense that they describe a position, or a direction, in terms of x, y, and z components. This is a <em>different</em> kind of vector than a Clojure vector, like <code>[1 2 3]</code>. We’re actually representing these logical vectors as Clojure <em>maps</em>, with <code>:x</code>, <code>:y</code>, and <code>:z</code> keys, corresponding to the distance along the x, y, and z directions, from the center of the earth. Throughout this chapter, I’ll mainly use the term <em>coordinates</em> to talk about these structures, to avoid confusion with Clojure vectors.</p>
<p>Now let’s create a function which positions our craft on the launchpad at time 0. We’ll just <em>merge</em> the spacecraft’s map with the initial space center, overwriting the craft’s time and space coordinates.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">prepare</span>
<span class="s">"Prepares a craft for launch from an equatorial space center."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">merge </span><span class="nv">craft</span> <span class="nv">initial-space-center</span><span class="p">))</span>
</code></pre>
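<p>Why the argument order matters, as a minimal sketch: when both maps share a key, <code>merge</code> keeps the value from the rightmost map. The craft below is a made-up stand-in, not the real <code>atlas-v</code>.</p>
<pre><code>; Hypothetical craft that already carries a stale :time
(def example-craft {:dry-mass 100 :time 42})

; Keys in the right-hand map win, so :time is reset to 0
(merge example-craft {:time 0 :position {:x 1 :y 0 :z 0}})
; => {:dry-mass 100, :time 0, :position {:x 1, :y 0, :z 0}}
</code></pre>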
<h2><a href="#forces" id="forces">Forces</a></h2>
<p>Gravity continually pulls the spacecraft towards the center of the Earth, accelerating it by 9.8 meters/second every second. To figure out what direction is towards the Earth, we’ll need the angles of a <a href="http://en.wikipedia.org/wiki/Spherical_coordinate_system">spherical coordinate system</a>. We’ll use the trigonometric functions from <a href="http://docs.oracle.com/javase/7/docs/api/java/lang/Math.html">java.lang.Math</a>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">magnitude</span>
<span class="s">"What's the radius of a given set of cartesian coordinates?"</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="c1">; By the Pythagorean theorem...</span>
<span class="p">(</span><span class="nf">Math/sqrt</span> <span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">cartesian->spherical</span>
<span class="s">"Converts a map of Cartesian coordinates :x, :y, and :z to spherical coordinates :r, :theta, and :phi."</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">r</span> <span class="p">(</span><span class="nf">magnitude</span> <span class="nv">c</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:r</span> <span class="nv">r</span>
<span class="ss">:theta</span> <span class="p">(</span><span class="nf">Math/acos</span> <span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">c</span><span class="p">)</span> <span class="nv">r</span><span class="p">))</span>
<span class="ss">:phi</span> <span class="p">(</span><span class="nf">Math/atan2</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c</span><span class="p">))}))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">spherical->cartesian</span>
<span class="s">"Converts spherical to Cartesian coordinates."</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:r</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">Math/sin</span> <span class="p">(</span><span class="ss">:theta</span> <span class="nv">c</span><span class="p">))</span> <span class="p">(</span><span class="nf">Math/cos</span> <span class="p">(</span><span class="ss">:phi</span> <span class="nv">c</span><span class="p">)))</span>
<span class="ss">:y</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:r</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">Math/sin</span> <span class="p">(</span><span class="ss">:theta</span> <span class="nv">c</span><span class="p">))</span> <span class="p">(</span><span class="nf">Math/sin</span> <span class="p">(</span><span class="ss">:phi</span> <span class="nv">c</span><span class="p">)))</span>
<span class="ss">:z</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:r</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">Math/cos</span> <span class="p">(</span><span class="ss">:phi</span> <span class="nv">c</span><span class="p">)))})</span>
</code></pre>
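<p>As a quick sanity check on the Pythagorean part, here’s a sketch using a 3-4-5 triangle lying in the x-y plane. <code>magnitude</code> is repeated from above only so the snippet runs on its own.</p>
<pre><code>(defn magnitude
  "Same definition as above: the length of an x/y/z coordinate map."
  [c]
  (Math/sqrt (+ (Math/pow (:x c) 2)
                (Math/pow (:y c) 2)
                (Math/pow (:z c) 2))))

; 3^2 + 4^2 + 0^2 = 25, and the square root of 25 is 5
(magnitude {:x 3 :y 4 :z 0})
; => 5.0
</code></pre>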
<p>With those angles in mind, computing the gravitational acceleration is easy. We take the spherical coordinates of the spacecraft and replace the radius with the total force due to gravity. Then we can transform that spherical force back into Cartesian coordinates.</p>
<pre><code><span></span><span class="p">(</span><span class="k">def </span><span class="nv">g</span> <span class="s">"Acceleration of gravity in meters/s^2"</span> <span class="mf">-9.8</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">gravity-force</span>
<span class="s">"The force vector, each component in Newtons, due to gravity."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="c1">; Since force is mass times acceleration...</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">total-force</span> <span class="p">(</span><span class="nb">* </span><span class="nv">g</span> <span class="p">(</span><span class="nf">mass</span> <span class="nv">craft</span><span class="p">))]</span>
<span class="p">(</span><span class="nb">-> </span><span class="nv">craft</span>
<span class="c1">; Now we'll take the craft's position</span>
<span class="ss">:position</span>
<span class="c1">; in spherical coordinates,</span>
<span class="nv">cartesian->spherical</span>
<span class="c1">; replace the radius with the gravitational force...</span>
<span class="p">(</span><span class="nb">assoc </span><span class="ss">:r</span> <span class="nv">total-force</span><span class="p">)</span>
<span class="c1">; and transform back to Cartesian-land</span>
<span class="nv">spherical->cartesian</span><span class="p">)))</span>
</code></pre>
<p>Rockets produce thrust by consuming fuel. Let’s say the fuel consumption is always the maximum, until we run out:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">fuel-rate</span>
<span class="s">"How fast is fuel, in kilograms/second, consumed by the craft?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">pos? </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:max-fuel-rate</span> <span class="nv">craft</span><span class="p">)</span>
<span class="mi">0</span><span class="p">))</span>
</code></pre>
<p>Now that we know how much fuel is being consumed, we can compute the force the rocket engine develops. That force is simply the mass consumed per second times the exhaust velocity–which is the specific impulse <code>:isp</code>. We’ll ignore atmospheric effects.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">thrust</span>
<span class="s">"How much force, in newtons, do the craft's rocket engines exert?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="nf">fuel-rate</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="ss">:isp</span> <span class="nv">craft</span><span class="p">)))</span>
</code></pre>
<p>Cool. What about the direction of thrust? Just for grins, let’s keep the rocket pointing entirely along the x axis.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">engine-force</span>
<span class="s">"The force vector, each component in Newtons, due to the rocket engine."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">t</span> <span class="p">(</span><span class="nf">thrust</span> <span class="nv">craft</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="nv">t</span>
<span class="ss">:y</span> <span class="mi">0</span>
<span class="ss">:z</span> <span class="mi">0</span><span class="p">}))</span>
</code></pre>
<p>The total force on a craft is just the sum of gravity and thrust. To sum these maps together, we’ll need a way to sum the x, y, and z components independently. Clojure’s <code>merge-with</code> function combines common fields in maps using a function, so this is surprisingly straightforward.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">total-force</span>
<span class="s">"Total force on a craft."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="nf">engine-force</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">gravity-force</span> <span class="nv">craft</span><span class="p">)))</span>
</code></pre>
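<p>To see what <code>merge-with</code> does on its own, here’s a small sketch with made-up numbers rather than real forces:</p>
<pre><code>; Values under the same key are combined with +
(merge-with + {:x 1 :y 2 :z 3} {:x 10 :y 20 :z 30})
; => {:x 11, :y 22, :z 33}

; Keys present in only one map are carried over untouched
(merge-with + {:x 1} {:y 2})
; => {:x 1, :y 2}
</code></pre>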
<p>The acceleration of a craft, by <a href="http://www.physicsclassroom.com/class/newtlaws/u2l3a.cfm">Newton’s second law</a>, is force divided by mass. This one’s a little trickier; given <code>{:x 1 :y 2 :z 4}</code> we want to apply a function (say, multiplication by a factor) to each number. Since maps can be treated as sequences of key/value pairs…</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">seq </span><span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span> <span class="ss">:y</span> <span class="mi">2</span> <span class="ss">:z</span> <span class="mi">3</span><span class="p">})</span>
<span class="p">([</span><span class="ss">:z</span> <span class="mi">3</span><span class="p">]</span> <span class="p">[</span><span class="ss">:y</span> <span class="mi">2</span><span class="p">]</span> <span class="p">[</span><span class="ss">:x</span> <span class="mi">1</span><span class="p">])</span>
</code></pre>
<p>… and we can build up new maps out of key/value pairs using <code>into</code>…</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">into </span><span class="p">{}</span> <span class="p">[[</span><span class="ss">:x</span> <span class="mi">4</span><span class="p">]</span> <span class="p">[</span><span class="ss">:y</span> <span class="mi">5</span><span class="p">]])</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mi">4</span>, <span class="ss">:y</span> <span class="mi">5</span><span class="p">}</span>
</code></pre>
<p>… we can write a function <code>map-values</code> which works like <code>map</code>, but affects the values of a map data structure.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">map-values</span>
<span class="s">"Applies f to every value in the map m."</span>
<span class="p">[</span><span class="nv">f</span> <span class="nv">m</span><span class="p">]</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{}</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">pair</span><span class="p">]</span>
<span class="p">[(</span><span class="nb">key </span><span class="nv">pair</span><span class="p">)</span> <span class="p">(</span><span class="nf">f</span> <span class="p">(</span><span class="nb">val </span><span class="nv">pair</span><span class="p">))])</span>
<span class="nv">m</span><span class="p">)))</span>
</code></pre>
<p>And that allows us to define a <code>scale</code> function which <em>scales</em> a set of coordinates by some factor:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">scale</span>
<span class="s">"Multiplies a map of x, y, and z coordinates by the given factor."</span>
<span class="p">[</span><span class="nv">factor</span> <span class="nv">coordinates</span><span class="p">]</span>
<span class="p">(</span><span class="nf">map-values</span> <span class="p">(</span><span class="nb">partial * </span><span class="nv">factor</span><span class="p">)</span> <span class="nv">coordinates</span><span class="p">))</span>
</code></pre>
<p>What’s that <code>partial</code> thing? It’s a function which <em>takes a function</em>, and some arguments, and <em>returns a new function</em>. What does the new function do? It calls the original function, with the arguments passed to <code>partial</code>, followed by any arguments passed to the new function. In short, <code>(partial * factor)</code> returns a function that takes any number, and multiplies it by <code>factor</code>.</p>
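<p>A few throwaway examples may make that concrete. The name <code>double-it</code> is just for illustration:</p>
<pre><code>; A function that multiplies its argument by 2
(def double-it (partial * 2))
(double-it 21)
; => 42

; Arguments given to partial come first, then the rest: (+ 1 2 3 4)
((partial + 1 2) 3 4)
; => 10

; Handy anywhere a one-argument function is expected
(map (partial * 10) [1 2 3])
; => (10 20 30)
</code></pre>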
<p>So to divide each component of the force vector by the mass of the craft:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">acceleration</span>
<span class="s">"Total acceleration of a craft."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">m</span> <span class="p">(</span><span class="nf">mass</span> <span class="nv">craft</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nb">/ </span><span class="nv">m</span><span class="p">)</span> <span class="p">(</span><span class="nf">total-force</span> <span class="nv">craft</span><span class="p">))))</span>
</code></pre>
<p>Note that <code>(/ m)</code> returns 1/m. Our scale function can do double-duty as both multiplication and division.</p>
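<p>A quick sketch of that reciprocal trick at the REPL, using arbitrary numbers:</p>
<pre><code>; With a single argument, / returns the reciprocal
(/ 4)
; => 1/4, an exact ratio

(/ 4.0)
; => 0.25

; Multiplying by the reciprocal is the same as dividing
(= (* (/ 4.0) 8.0) (/ 8.0 4.0))
; => true
</code></pre>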
<p>With the acceleration and fuel consumption all figured out, we’re ready to <em>apply those changes over time</em>. We’ll write a function which takes the rocket at a particular time, and returns a version of it <code>dt</code> seconds later.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">step</span>
<span class="p">[</span><span class="nv">craft</span> <span class="nv">dt</span><span class="p">]</span>
<span class="p">(</span><span class="nb">assoc </span><span class="nv">craft</span>
<span class="c1">; Time advances by dt seconds</span>
<span class="ss">:t</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">dt</span> <span class="p">(</span><span class="ss">:t</span> <span class="nv">craft</span><span class="p">))</span>
<span class="c1">; We burn some fuel</span>
<span class="ss">:fuel-mass</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">* </span><span class="nv">dt</span> <span class="p">(</span><span class="nf">fuel-rate</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="c1">; Our position changes based on our velocity</span>
<span class="ss">:position</span> <span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">scale</span> <span class="nv">dt</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="c1">; And our velocity changes based on our acceleration</span>
<span class="ss">:velocity</span> <span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">scale</span> <span class="nv">dt</span> <span class="p">(</span><span class="nf">acceleration</span> <span class="nv">craft</span><span class="p">)))))</span>
</code></pre>
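<p>One detail worth spelling out: every value passed to <code>assoc</code> is computed from the <em>original</em> craft, since Clojure maps are immutable. The new position therefore uses the old velocity, and the new velocity uses the old acceleration, which is what’s usually called a forward Euler step. Here’s a one-dimensional sketch of the same idea; <code>euler-step</code>, <code>:x</code>, and <code>:v</code> are hypothetical names, and the acceleration is assumed constant.</p>
<pre><code>(defn euler-step
  "Hypothetical 1-D analogue of step: advances state by dt seconds
  under a constant acceleration a."
  [state a dt]
  (assoc state
         ; Position changes based on the old velocity
         :x (+ (:x state) (* dt (:v state)))
         ; Velocity changes based on the acceleration
         :v (+ (:v state) (* dt a))))

(euler-step {:x 0.0 :v 10.0} -2.0 0.5)
; => {:x 5.0, :v 9.0}
</code></pre>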
<p>OK. Let’s save the <code>rocket.clj</code> file, load that code into the REPL, and fire it up.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.rocket</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="nv">nil</span>
</code></pre>
<p><code>use</code> is roughly shorthand for <code>(:require ... :refer :all)</code>. We’re passing <code>:reload</code> because we want the REPL to re-read the file. Notice that in <code>ns</code> declarations, the namespace name <code>scratch.rocket</code> is <em>unquoted</em>–but when we call <code>use</code> or <code>require</code> at the REPL, we quote the namespace name.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:dry-mass</span> <span class="mi">50050</span>, <span class="ss">:fuel-mass</span> <span class="mi">284450</span>, <span class="ss">:time</span> <span class="mi">0</span>, <span class="ss">:isp</span> <span class="mi">3050</span>, <span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>, <span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">}</span>
</code></pre>
<h2><a href="#launch" id="launch">Launch</a></h2>
<p>Let’s prepare the rocket. We’ll use <code>pprint</code> to print it in a more readable form.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span> <span class="nv">prepare</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:velocity</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">0</span>, <span class="ss">:y</span> <span class="mf">463.8312116386399</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">6378137</span>, <span class="ss">:y</span> <span class="mi">0</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:dry-mass</span> <span class="mi">50050</span>,
<span class="ss">:fuel-mass</span> <span class="mi">284450</span>,
<span class="ss">:time</span> <span class="mi">0</span>,
<span class="ss">:isp</span> <span class="mi">3050</span>,
<span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>,
<span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">}</span>
</code></pre>
<p>Great; there it is on the launchpad. Wow, even “standing still”, it’s moving at 463 meters/sec because of the earth’s rotation! That means <em>you and I</em> are flying through space at almost half a kilometer every second! Let’s step forward one second in time.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span> <span class="nv">prepare</span> <span class="p">(</span><span class="nf">step</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="nv">NullPointerException</span> <span class="nv">clojure.lang.Numbers.ops</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:942</span><span class="p">)</span>
</code></pre>
<p>In evaluating this expression, Clojure reached a point where it could not continue, and aborted execution. We call this error an <em>exception</em>, and the process of aborting <em>throwing</em> the exception. Clojure backs up to the function which <em>called</em> the function that threw, then the function which called <em>that</em> function, and so on, all the way to the top-level expression. The REPL finally intercepts the exception, prints an error to the console, and stashes the exception object in a special variable <code>*e</code>.</p>
<p>In this case, we know that the exception in question was a <code>NullPointerException</code>, which occurs when a function receives <code>nil</code> unexpectedly. This one came from <code>clojure.lang.Numbers.ops</code>, which suggests some sort of math was involved. Let’s use <code>pst</code> to find out where it came from.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">pst</span> <span class="nv">*e</span><span class="p">)</span>
<span class="nv">NullPointerException</span>
<span class="nv">clojure.lang.Numbers.ops</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:942</span><span class="p">)</span>
<span class="nv">clojure.lang.Numbers.add</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:126</span><span class="p">)</span>
<span class="nv">scratch.rocket/step</span> <span class="p">(</span><span class="nf">rocket.clj</span><span class="ss">:125</span><span class="p">)</span>
<span class="nv">user/eval1478</span> <span class="p">(</span><span class="nf">NO_SOURCE_FILE</span><span class="ss">:1</span><span class="p">)</span>
<span class="nv">clojure.lang.Compiler.eval</span> <span class="p">(</span><span class="nf">Compiler.java</span><span class="ss">:6619</span><span class="p">)</span>
<span class="nv">clojure.lang.Compiler.eval</span> <span class="p">(</span><span class="nf">Compiler.java</span><span class="ss">:6582</span><span class="p">)</span>
<span class="nv">clojure.core/eval</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:2852</span><span class="p">)</span>
<span class="nv">clojure.main/repl/read-eval-print--6588/fn--6591</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:259</span><span class="p">)</span>
<span class="nv">clojure.main/repl/read-eval-print--6588</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:259</span><span class="p">)</span>
<span class="nv">clojure.main/repl/fn--6597</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:277</span><span class="p">)</span>
<span class="nv">clojure.main/repl</span> <span class="p">(</span><span class="nf">main.clj</span><span class="ss">:277</span><span class="p">)</span>
<span class="nv">clojure.tools.nrepl.middleware.interruptible-eval/evaluate/fn--589</span> <span class="p">(</span><span class="nf">interruptible_eval.clj</span><span class="ss">:56</span><span class="p">)</span>
</code></pre>
<p>This is called a <em>stack trace</em>: the <em>stack</em> is the context of the program at each function call. It traces the path the computer took in evaluating the expression, from the bottom to the top. At the bottom are the REPL and the Clojure compiler. Our code begins at <code>user/eval1478</code>–that’s the compiler’s name for the expression we just typed. That function called <code>scratch.rocket/step</code>, which in turn called <code>Numbers.add</code>, and that called <code>Numbers.ops</code>. Let’s start by looking at the last function <em>we</em> wrote before calling into Clojure’s standard library: the <code>step</code> function, in <code>rocket.clj</code>, on line <code>125</code>.</p>
<pre><code><span></span><span class="mi">123</span> <span class="p">(</span><span class="nb">assoc </span><span class="nv">craft</span>
<span class="mi">124</span> <span class="c1">; Time advances by dt seconds</span>
<span class="mi">125</span> <span class="ss">:t</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">dt</span> <span class="p">(</span><span class="ss">:t</span> <span class="nv">craft</span><span class="p">))</span>
</code></pre>
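<p>One way to narrow this down is to evaluate the pieces of line 125 separately at the REPL. This is only a sketch: <code>example-craft</code> is a hypothetical stand-in for a prepared craft, keeping just the key that matters here.</p>
<pre><code>; Stand-in for a prepared craft; only the key name matters
(def example-craft {:time 0})

(:t example-craft)
; => nil, since the map has no :t key

(+ 1 (:t example-craft))
; throws a NullPointerException, just like our call to step
</code></pre>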
<p>Ah; we named the time field <code>:time</code> earlier, not <code>:t</code>. Let’s replace <code>:t</code> with <code>:time</code>, save the file, and reload.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.rocket</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span> <span class="nv">prepare</span> <span class="p">(</span><span class="nf">step</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:velocity</span> <span class="p">{</span><span class="ss">:x</span> <span class="mf">0.45154055666826215</span>, <span class="ss">:y</span> <span class="mf">463.8312116386399</span>, <span class="ss">:z</span> <span class="mf">-9.8</span><span class="p">}</span>,
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">6378137</span>, <span class="ss">:y</span> <span class="mf">463.8312116386399</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:dry-mass</span> <span class="mi">50050</span>,
<span class="ss">:fuel-mass</span> <span class="mi">71681400</span><span class="nv">/253</span>,
<span class="ss">:time</span> <span class="mi">1</span>,
<span class="ss">:isp</span> <span class="mi">3050</span>,
<span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>,
<span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">}</span>
</code></pre>
<p>Look at that! Our x position is unchanged (because our x velocity was zero) and we’ve drifted about 464 meters along y with the earth’s rotation, but our <em>velocity</em> has shifted too. We’re now moving… wait, -9.8 meters per second <em>south</em>? That can’t be right. Gravity points <em>down</em>, not sideways. Something must be wrong with our spherical coordinate system. Let’s write a test in <code>test/scratch/rocket_test.clj</code> to explore.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.rocket-test</span>
<span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">clojure.test</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]</span>
<span class="p">[</span><span class="nv">scratch.rocket</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]))</span>
<span class="p">(</span><span class="nf">deftest</span> <span class="nv">spherical-coordinate-test</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">pos</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span> <span class="ss">:y</span> <span class="mi">2</span> <span class="ss">:z</span> <span class="mi">3</span><span class="p">}]</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"roundtrip"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="nv">pos</span> <span class="p">(</span><span class="nb">-> </span><span class="nv">pos</span> <span class="nv">cartesian->spherical</span> <span class="nv">spherical->cartesian</span><span class="p">))))))</span>
</code></pre>
<pre><code>aphyr@waterhouse:~/scratch$ lein test
lein test scratch.core-test
lein test scratch.rocket-test
lein test :only scratch.rocket-test/spherical-coordinate-test
FAIL in (spherical-coordinate-test) (rocket_test.clj:8)
roundtrip
expected: (= pos (-> pos cartesian->spherical spherical->cartesian))
actual: (not (= {:z 3, :y 2, :x 1} {:x 1.0, :y 1.9999999999999996, :z 1.6733200530681513}))
Ran 2 tests containing 4 assertions.
1 failures, 0 errors.
Tests failed.
</code></pre>
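<p>A note on reading that output. Two of the mismatches are not the bug we’re chasing: <code>(= 1 1.0)</code> is false in Clojure, because <code>=</code> does not treat integers and doubles as equal, and <code>1.9999999999999996</code> is ordinary floating-point rounding. One way to look past both is to compare numbers within a tolerance. The helper <code>approx=</code> below is a hypothetical sketch, not part of <code>clojure.test</code> or of our rocket code.</p>
<pre><code>(defn approx=
  "Hypothetical helper: true if numbers a and b differ by less than
  tolerance."
  [a b tolerance]
  (> tolerance (Math/abs (- (double a) (double b)))))

; The y component is fine once rounding is allowed for
(approx= 2 1.9999999999999996 1e-9)
; => true

; The z component is not
(approx= 3 1.6733200530681513 1e-9)
; => false
</code></pre>
<p>By that measure x and y survive the round trip, while z is off by far more than any rounding error.</p>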
<p>Definitely wrong. Looks like something to do with the z coordinate, since x and y look OK. Let’s try testing a point on the north pole:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">deftest</span> <span class="nv">spherical-coordinate-test</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"spherical->cartesian"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nf">spherical->cartesian</span> <span class="p">{</span><span class="ss">:r</span> <span class="mi">2</span>
<span class="ss">:phi</span> <span class="mi">0</span>
<span class="ss">:theta</span> <span class="mi">0</span><span class="p">})</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mf">0.0</span> <span class="ss">:y</span> <span class="mf">0.0</span> <span class="ss">:z</span> <span class="mf">2.0</span><span class="p">})))</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"roundtrip"</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">pos</span> <span class="p">{</span><span class="ss">:x</span> <span class="mf">1.0</span> <span class="ss">:y</span> <span class="mf">2.0</span> <span class="ss">:z</span> <span class="mf">3.0</span><span class="p">}]</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="nv">pos</span> <span class="p">(</span><span class="nb">-> </span><span class="nv">pos</span> <span class="nv">cartesian->spherical</span> <span class="nv">spherical->cartesian</span><span class="p">))))))</span>
</code></pre>
<p>That checks out OK. Let’s try some values in the REPL.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">cartesian->spherical</span> <span class="p">{</span><span class="ss">:x</span> <span class="mf">0.00001</span> <span class="ss">:y</span> <span class="mf">0.00001</span> <span class="ss">:z</span> <span class="mf">2.0</span><span class="p">})</span>
<span class="p">{</span><span class="ss">:r</span> <span class="mf">2.00000000005</span>, <span class="ss">:theta</span> <span class="mf">7.071068104411588</span><span class="nv">E-6</span>, <span class="ss">:phi</span> <span class="mf">0.7853981633974483</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">cartesian->spherical</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span> <span class="ss">:y</span> <span class="mi">2</span> <span class="ss">:z</span> <span class="mi">3</span><span class="p">})</span>
<span class="p">{</span><span class="ss">:r</span> <span class="mf">3.7416573867739413</span>, <span class="ss">:theta</span> <span class="mf">0.6405223126794245</span>, <span class="ss">:phi</span> <span class="mf">1.1071487177940904</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">spherical->cartesian</span> <span class="p">(</span><span class="nf">cartesian->spherical</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span> <span class="ss">:y</span> <span class="mi">2</span> <span class="ss">:z</span> <span class="mi">3</span><span class="p">}))</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mf">1.0</span>, <span class="ss">:y</span> <span class="mf">1.9999999999999996</span>, <span class="ss">:z</span> <span class="mf">1.6733200530681513</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">cartesian->spherical</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span> <span class="ss">:y</span> <span class="mi">2</span> <span class="ss">:z</span> <span class="mi">0</span><span class="p">})</span>
<span class="p">{</span><span class="ss">:r</span> <span class="mf">2.23606797749979</span>, <span class="ss">:theta</span> <span class="mf">1.5707963267948966</span>, <span class="ss">:phi</span> <span class="mf">1.1071487177940904</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">cartesian->spherical</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span> <span class="ss">:y</span> <span class="mi">1</span> <span class="ss">:z</span> <span class="mi">0</span><span class="p">})</span>
<span class="p">{</span><span class="ss">:r</span> <span class="mf">1.4142135623730951</span>, <span class="ss">:theta</span> <span class="mf">1.5707963267948966</span>, <span class="ss">:phi</span> <span class="mf">0.7853981633974483</span><span class="p">}</span>
</code></pre>
<p>Oh, wait, that looks odd. <code>{:x 1 :y 1 :z 0}</code> is on the equator: phi–the angle from the pole–should be pi/2 or ~1.57, and theta–the angle around the equator–should be pi/4 or ~0.79. Those coordinates are reversed! Double-checking our formulas with <a href="http://mathworld.wolfram.com/SphericalCoordinates.html">Wolfram MathWorld</a> shows that we mixed up phi and theta. Let’s redefine <code>cartesian->spherical</code> correctly.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">cartesian->spherical</span>
<span class="s">"Converts a map of Cartesian coordinates :x, :y, and :z to spherical</span>
<span class="s"> coordinates :r, :theta, and :phi."</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">r</span> <span class="p">(</span><span class="nf">Math/sqrt</span> <span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)))]</span>
<span class="p">{</span><span class="ss">:r</span> <span class="nv">r</span>
<span class="ss">:phi</span> <span class="p">(</span><span class="nf">Math/acos</span> <span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">c</span><span class="p">)</span> <span class="nv">r</span><span class="p">))</span>
<span class="ss">:theta</span> <span class="p">(</span><span class="nf">Math/atan2</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c</span><span class="p">))}))</span>
</code></pre>
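<p>As a quick sanity check, here is a self-contained sketch that repeats the corrected definition and evaluates the equatorial point discussed above. The expected angles follow from the convention just described (phi measured from the pole, theta around the equator). The name <code>equatorial</code> is made up for this illustration and is not part of the project code.</p>
<pre><code>(defn cartesian->spherical
  "Converts a map of Cartesian coordinates :x, :y, and :z to spherical
  coordinates :r, :theta, and :phi."
  [c]
  (let [r (Math/sqrt (+ (Math/pow (:x c) 2)
                        (Math/pow (:y c) 2)
                        (Math/pow (:z c) 2)))]
    {:r     r
     :phi   (Math/acos (/ (:z c) r))
     :theta (Math/atan2 (:y c) (:x c))}))

; A point on the equator, halfway between the +x and +y axes.
(def equatorial (cartesian->spherical {:x 1 :y 1 :z 0}))

; phi should be pi/2: a quarter turn down from the pole.
(:phi equatorial)   ; ~1.5708

; theta should be pi/4: an eighth of a turn around the equator.
(:theta equatorial) ; ~0.7854
</code></pre>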
<pre><code>aphyr@waterhouse:~/scratch$ lein test
lein test scratch.core-test
lein test scratch.rocket-test
Ran 2 tests containing 5 assertions.
0 failures, 0 errors.
</code></pre>
<p>Great. Now let’s check the rocket trajectory again.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span> <span class="nv">prepare</span> <span class="p">(</span><span class="nf">step</span> <span class="mi">1</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:velocity</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mf">0.45154055666826204</span>,
<span class="ss">:y</span> <span class="mf">463.8312116386399</span>,
<span class="ss">:z</span> <span class="mf">-6.000769315822031</span><span class="nv">E-16</span><span class="p">}</span>,
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">6378137</span>, <span class="ss">:y</span> <span class="mf">463.8312116386399</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:dry-mass</span> <span class="mi">50050</span>,
<span class="ss">:fuel-mass</span> <span class="mi">71681400</span><span class="nv">/253</span>,
<span class="ss">:time</span> <span class="mi">1</span>,
<span class="ss">:isp</span> <span class="mi">3050</span>,
<span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>,
<span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">}</span>
</code></pre>
<p>This time, our velocity in the +x direction has grown to about half a meter per second after one second of flight. We have liftoff!</p>
<h2><a href="#flight" id="flight">Flight</a></h2>
<p>We have a function that can move the rocket forward by one small step of time, but we’d like to understand the rocket’s trajectory as a <em>whole</em>; to see <em>all</em> positions it will take. We’ll use <em>iterate</em> to construct a lazy, infinite sequence of rocket states, each one constructed by stepping forward from the last.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">trajectory</span>
<span class="s">"Returns all future states of the craft, at dt-second intervals."</span>
<span class="p">[</span><span class="nv">dt</span> <span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">iterate </span><span class="o">#</span><span class="p">(</span><span class="nf">step</span> <span class="nv">%</span> <span class="nv">dt</span><span class="p">)</span> <span class="nv">craft</span><span class="p">))</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span> <span class="nv">prepare</span> <span class="p">(</span><span class="nf">trajectory</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">({</span><span class="ss">:velocity</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">0</span>, <span class="ss">:y</span> <span class="mf">463.8312116386399</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">6378137</span>, <span class="ss">:y</span> <span class="mi">0</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:dry-mass</span> <span class="mi">50050</span>,
<span class="ss">:fuel-mass</span> <span class="mi">284450</span>,
<span class="ss">:time</span> <span class="mi">0</span>,
<span class="ss">:isp</span> <span class="mi">3050</span>,
<span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>,
<span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:velocity</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mf">0.45154055666826204</span>,
<span class="ss">:y</span> <span class="mf">463.8312116386399</span>,
<span class="ss">:z</span> <span class="mf">-6.000769315822031</span><span class="nv">E-16</span><span class="p">}</span>,
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">6378137</span>, <span class="ss">:y</span> <span class="mf">463.8312116386399</span>, <span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>,
<span class="ss">:dry-mass</span> <span class="mi">50050</span>,
<span class="ss">:fuel-mass</span> <span class="mi">71681400</span><span class="nv">/253</span>,
<span class="ss">:time</span> <span class="mi">1</span>,
<span class="ss">:isp</span> <span class="mi">3050</span>,
<span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>,
<span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:velocity</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mf">0.9376544222659078</span>,
<span class="ss">:y</span> <span class="mf">463.83049896253056</span>,
<span class="ss">:z</span> <span class="mf">-1.200153863164406</span><span class="nv">E-15</span><span class="p">}</span>,
<span class="ss">:position</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mf">6378137.451540557</span>,
<span class="ss">:y</span> <span class="mf">927.6624232772798</span>,
<span class="ss">:z</span> <span class="mf">-6.000769315822031</span><span class="nv">E-16</span><span class="p">}</span>,
<span class="ss">:dry-mass</span> <span class="mi">50050</span>,
<span class="ss">:fuel-mass</span> <span class="mi">71396950</span><span class="nv">/253</span>,
<span class="ss">:time</span> <span class="mi">2</span>,
<span class="ss">:isp</span> <span class="mi">3050</span>,
<span class="ss">:max-fuel-rate</span> <span class="mi">284450</span><span class="nv">/253</span>,
<span class="ss">:max-thrust</span> <span class="mf">4152000.0</span><span class="p">})</span>
</code></pre>
<p>Notice that each map is like a frame of a movie, playing at one frame per second. We can make the simulation more or less accurate by raising or lowering the framerate–adjusting the parameter fed to <code>trajectory</code>. For now, though, we’ll stick with one-second intervals.</p>
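<p>To see how the interval changes the “framerate”, here is a toy sketch. It assumes a much simpler craft than our rocket: <code>toy-step</code>, <code>toy-trajectory</code>, and <code>toy-craft</code> are hypothetical stand-ins that model constant velocity in one dimension, with a step function that honours whatever <code>dt</code> it is given.</p>
<pre><code>; Hypothetical stand-in for the real step function: constant velocity only.
(defn toy-step
  [craft dt]
  (assoc craft
         :time     (+ dt (:time craft))
         :position (+ (:position craft) (* dt (:velocity craft)))))

(defn toy-trajectory
  "Returns all future states of the toy craft, at dt-second intervals."
  [dt craft]
  (iterate #(toy-step % dt) craft))

(def toy-craft {:time 0 :position 0 :velocity 10})

; One frame per second.
(->> toy-craft (toy-trajectory 1) (take 3) (map :time))
; => (0 1 2)

; Two frames per second.
(->> toy-craft (toy-trajectory 0.5) (take 3) (map :time))
; => (0 0.5 1.0)
</code></pre>
<p>Because <code>iterate</code> is lazy, neither sequence is computed beyond the three states we <code>take</code>.</p>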
<p>How high above the surface is the rocket?</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">altitude</span>
<span class="s">"The height above the surface of the equator, in meters."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">-> </span><span class="nv">craft</span>
<span class="ss">:position</span>
<span class="nv">cartesian->spherical</span>
<span class="ss">:r</span>
<span class="p">(</span><span class="nb">- </span><span class="nv">earth-equatorial-radius</span><span class="p">)))</span>
</code></pre>
<p>Now we can explore the rocket’s path as a series of altitudes over time:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span> <span class="nv">prepare</span> <span class="p">(</span><span class="nf">trajectory</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="nb">map </span><span class="nv">altitude</span><span class="p">)</span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">(</span><span class="mf">0.0</span>
<span class="mf">0.016865378245711327</span>
<span class="mf">0.519002066925168</span>
<span class="mf">1.540983198210597</span>
<span class="mf">3.117615718394518</span>
<span class="mf">5.283942770212889</span>
<span class="mf">8.075246102176607</span>
<span class="mf">11.52704851794988</span>
<span class="mf">15.675116359256208</span>
<span class="mf">20.555462017655373</span><span class="p">)</span>
</code></pre>
<p>The million dollar question, though, is whether the rocket makes orbit.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">above-ground?</span>
<span class="s">"Is the craft at or above the surface?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb"><= </span><span class="mi">0</span> <span class="p">(</span><span class="nf">altitude</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">flight</span>
<span class="s">"The above-ground portion of a trajectory."</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="nb">take-while </span><span class="nv">above-ground?</span> <span class="nv">trajectory</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">crashed?</span>
<span class="s">"Does this trajectory crash into the surface before 100 hours are up?"</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">time-limit</span> <span class="p">(</span><span class="nb">* </span><span class="mi">100</span> <span class="mi">3600</span><span class="p">)]</span> <span class="c1">; 100 hours, in seconds</span>
<span class="p">(</span><span class="nb">not </span><span class="p">(</span><span class="nb">every? </span><span class="nv">above-ground?</span>
<span class="p">(</span><span class="nb">take-while </span><span class="o">#</span><span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="ss">:time</span> <span class="nv">%</span><span class="p">)</span> <span class="nv">time-limit</span><span class="p">)</span> <span class="nv">trajectory</span><span class="p">)))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">crash-time</span>
<span class="s">"Given a trajectory, returns the time the rocket impacted the ground."</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="ss">:time</span> <span class="p">(</span><span class="nb">last </span><span class="p">(</span><span class="nf">flight</span> <span class="nv">trajectory</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">apoapsis</span>
<span class="s">"The highest altitude achieved during a trajectory."</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="nb">apply max </span><span class="p">(</span><span class="nb">map </span><span class="nv">altitude</span> <span class="p">(</span><span class="nf">flight</span> <span class="nv">trajectory</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">apoapsis-time</span>
<span class="s">"The time of apoapsis"</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="ss">:time</span> <span class="p">(</span><span class="nb">apply max-key </span><span class="nv">altitude</span> <span class="p">(</span><span class="nf">flight</span> <span class="nv">trajectory</span><span class="p">))))</span>
</code></pre>
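<p>These functions only work if every question we ask of the trajectory is confined to a finite prefix of it. The sketch below illustrates the idea on made-up data: <code>toy-states</code> is a hypothetical infinite sequence whose altitude follows a parabola, and the other <code>toy-</code> names are likewise invented for this example.</p>
<pre><code>; Hypothetical toy data: an infinite sequence of states whose altitude
; rises until t = 5 and dips below zero after t = 10.
(def toy-states
  (map (fn [t] {:time t :alt (- (* 10 t) (* t t))}) (range)))

(defn toy-above-ground? [state] (<= 0 (:alt state)))

; take-while stops at the first below-ground state, so this is finite.
(def toy-flight (take-while toy-above-ground? toy-states))

(:time (last toy-flight))               ; => 10
(apply max (map :alt toy-flight))       ; => 25
(:time (apply max-key :alt toy-flight)) ; => 5

; Bounding the sequence by time first keeps every? from running forever
; when the craft never comes down.
(defn toy-crashed?
  [states time-limit]
  (not (every? toy-above-ground?
               (take-while #(<= (:time %) time-limit) states))))

(toy-crashed? toy-states 100) ; => true
(toy-crashed? toy-states 8)   ; => false
</code></pre>
<p>Calling <code>last</code> or <code>apply max</code> directly on <code>toy-states</code> would never return, which is why each of those calls goes through the finite <code>toy-flight</code> instead.</p>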
<p>If the rocket goes below ground, we know it crashed. If the rocket stays in orbit, the trajectory will never end. That makes it a bit tricky to tell whether the rocket is in a stable orbit or not, because we can’t ask about every element, or the last element, of an infinite sequence: it’ll take infinite time to evaluate. Instead, we’ll assume that the rocket <em>should</em> crash within the first, say, 100 hours; if it makes it past that point, we’ll assume it made orbit successfully. With these functions in hand, we’ll write a test in <code>test/scratch/rocket_test.clj</code> to see whether or not the launch is successful:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">deftest</span> <span class="nv">makes-orbit</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">trajectory</span> <span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nf">atlas-v</span><span class="p">)</span>
<span class="nv">prepare</span>
<span class="p">(</span><span class="nf">trajectory</span> <span class="mi">1</span><span class="p">))]</span>
<span class="p">(</span><span class="nb">when </span><span class="p">(</span><span class="nf">crashed?</span> <span class="nv">trajectory</span><span class="p">)</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"Crashed at"</span> <span class="p">(</span><span class="nf">crash-time</span> <span class="nv">trajectory</span><span class="p">)</span> <span class="s">"seconds"</span><span class="p">)</span>
<span class="p">(</span><span class="nb">println </span><span class="s">"Maximum altitude"</span> <span class="p">(</span><span class="nf">apoapsis</span> <span class="nv">trajectory</span><span class="p">)</span>
<span class="s">"meters at"</span> <span class="p">(</span><span class="nf">apoapsis-time</span> <span class="nv">trajectory</span><span class="p">)</span> <span class="s">"seconds"</span><span class="p">))</span>
<span class="c1">; Assert that the rocket eventually made it to orbit.</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">not </span><span class="p">(</span><span class="nf">crashed?</span> <span class="nv">trajectory</span><span class="p">)))))</span>
</code></pre>
<pre><code><span></span><span class="nv">aphyr</span><span class="o">@</span><span class="nv">waterhouse</span><span class="err">:</span><span class="o">~</span><span class="nv">/scratch$</span> <span class="nv">lein</span> <span class="nb">test </span><span class="nv">scratch.rocket-test</span>
<span class="nv">lein</span> <span class="nb">test </span><span class="nv">scratch.rocket-test</span>
<span class="nv">Crashed</span> <span class="nv">at</span> <span class="mi">982</span> <span class="nv">seconds</span>
<span class="nv">Maximum</span> <span class="nv">altitude</span> <span class="mf">753838.039645385</span> <span class="nv">meters</span> <span class="nv">at</span> <span class="mi">532</span> <span class="nv">seconds</span>
<span class="nv">lein</span> <span class="nb">test </span><span class="ss">:only</span> <span class="nv">scratch.rocket-test/makes-orbit</span>
<span class="nv">FAIL</span> <span class="nv">in</span> <span class="p">(</span><span class="nf">makes-orbit</span><span class="p">)</span> <span class="p">(</span><span class="nf">rocket_test.clj</span><span class="ss">:26</span><span class="p">)</span>
<span class="nv">expected</span><span class="err">:</span> <span class="p">(</span><span class="nb">not </span><span class="p">(</span><span class="nf">crashed?</span> <span class="nv">trajectory</span><span class="p">))</span>
<span class="nv">actual</span><span class="err">:</span> <span class="p">(</span><span class="nb">not </span><span class="p">(</span><span class="nb">not </span><span class="nv">true</span><span class="p">))</span>
<span class="nv">Ran</span> <span class="mi">2</span> <span class="nv">tests</span> <span class="nv">containing</span> <span class="mi">3</span> <span class="nv">assertions.</span>
<span class="mi">1</span> <span class="nv">failures</span>, <span class="mi">0</span> <span class="nv">errors.</span>
<span class="nv">Tests</span> <span class="nv">failed.</span>
</code></pre>
<p>We made it to an altitude of 750 kilometers, and crashed 982 seconds after launch. We’re gonna need a bigger boat.</p>
<h2><a href="#stage-ii" id="stage-ii">Stage II</a></h2>
<p>The Atlas V isn’t big enough to make it into orbit on its own. It carries a second stage, the <a href="http://en.wikipedia.org/wiki/Centaur_(rocket_stage)">Centaur</a>, which is much smaller and uses <a href="http://www.astronautix.com/stages/cenaurde.htm">more efficient engines</a>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">centaur</span>
<span class="s">"The upper rocket stage.</span>
<span class="s"> http://en.wikipedia.org/wiki/Centaur_(rocket_stage)</span>
<span class="s"> http://www.astronautix.com/stages/cenaurde.htm"</span>
<span class="p">[]</span>
<span class="p">{</span><span class="ss">:dry-mass</span> <span class="mi">2361</span>
<span class="ss">:fuel-mass</span> <span class="mi">13897</span>
<span class="ss">:isp</span> <span class="mi">4354</span>
<span class="ss">:max-fuel-rate</span> <span class="p">(</span><span class="nb">/ </span><span class="mi">13897</span> <span class="mi">470</span><span class="p">)})</span>
</code></pre>
<p>The Centaur lives inside the Atlas V main stage. We’ll re-write <code>atlas-v</code> to take an <em>argument</em>: its next stage.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">atlas-v</span>
<span class="s">"The full launch vehicle. http://en.wikipedia.org/wiki/Atlas_V"</span>
<span class="p">[</span><span class="nv">next-stage</span><span class="p">]</span>
<span class="p">{</span><span class="ss">:dry-mass</span> <span class="mi">50050</span>
<span class="ss">:fuel-mass</span> <span class="mi">284450</span>
<span class="ss">:isp</span> <span class="mi">3050</span>
<span class="ss">:max-fuel-rate</span> <span class="p">(</span><span class="nb">/ </span><span class="mi">284450</span> <span class="mi">253</span><span class="p">)</span>
<span class="ss">:next-stage</span> <span class="nv">next-stage</span><span class="p">})</span>
</code></pre>
<p>Now, in our tests, we’ll construct the rocket like so:</p>
<pre><code><span></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">trajectory</span> <span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nf">atlas-v</span> <span class="p">(</span><span class="nf">centaur</span><span class="p">))</span>
<span class="nv">prepare</span>
<span class="p">(</span><span class="nf">trajectory</span> <span class="mi">1</span><span class="p">))]</span>
</code></pre>
<p>When we exhaust the fuel reserves of the primary stage, we’ll de-couple the main booster from the Centaur. In terms of our simulation, the Atlas V will be <em>replaced</em> by its next stage, the Centaur. We’ll write a function <code>stage</code> which separates the vehicles when ready:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">stage</span>
<span class="s">"When fuel reserves are exhausted, separate stages. Otherwise, return craft</span>
<span class="s"> unchanged."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">cond</span>
<span class="c1">; Still fuel left</span>
<span class="p">(</span><span class="nb">pos? </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">))</span>
<span class="nv">craft</span>
<span class="c1">; No remaining stages</span>
<span class="p">(</span><span class="nb">nil? </span><span class="p">(</span><span class="ss">:next-stage</span> <span class="nv">craft</span><span class="p">))</span>
<span class="nv">craft</span>
<span class="c1">; Stage!</span>
<span class="ss">:else</span>
<span class="p">(</span><span class="nb">merge </span><span class="p">(</span><span class="ss">:next-stage</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nb">select-keys </span><span class="nv">craft</span> <span class="p">[</span><span class="ss">:time</span> <span class="ss">:position</span> <span class="ss">:velocity</span><span class="p">]))))</span>
</code></pre>
<p>We’re using <code>cond</code> to handle three distinct cases: there’s fuel remaining in the craft, there’s no next stage to separate, or we’re ready for stage separation. Separation is easy: we simply return the next stage of the current craft, with the current craft’s time, position, and velocity merged in.</p>
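<p>Here is a small, self-contained sketch of separation in action. It repeats the <code>stage</code> function so it can run on its own; <code>spent-booster</code> and all of its numbers are made up for illustration and do not describe a real vehicle.</p>
<pre><code>(defn stage
  "When fuel reserves are exhausted, separate stages. Otherwise, return craft
  unchanged."
  [craft]
  (cond
    ; Still fuel left
    (pos? (:fuel-mass craft)) craft
    ; No remaining stages
    (nil? (:next-stage craft)) craft
    ; Stage!
    :else (merge (:next-stage craft)
                 (select-keys craft [:time :position :velocity]))))

; A made-up booster that has just run dry, carrying a made-up upper stage.
(def spent-booster
  {:dry-mass   100
   :fuel-mass  0
   :time       253
   :position   {:x 1 :y 2 :z 3}
   :velocity   {:x 4 :y 5 :z 6}
   :next-stage {:dry-mass 10 :fuel-mass 50}})

; The upper stage takes over, inheriting time, position, and velocity.
(stage spent-booster)
; => {:dry-mass 10, :fuel-mass 50, :time 253,
;     :position {:x 1, :y 2, :z 3}, :velocity {:x 4, :y 5, :z 6}}

; With no next stage, an empty craft comes back unchanged.
(stage {:fuel-mass 0})
; => {:fuel-mass 0}
</code></pre>
<p>Note that the booster’s own <code>:dry-mass</code> and <code>:next-stage</code> are gone from the result: <code>select-keys</code> carries over only the three keys that describe where and when the craft is.</p>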
<p>Finally, we’ll have to update our <code>step</code> function to take into account the possibility of stage separation.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">step</span>
<span class="p">[</span><span class="nv">craft</span> <span class="nv">dt</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">craft</span> <span class="p">(</span><span class="nf">stage</span> <span class="nv">craft</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">assoc </span><span class="nv">craft</span>
<span class="c1">; Time advances by dt seconds</span>
<span class="ss">:time</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">dt</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">))</span>
<span class="c1">; We burn some fuel</span>
<span class="ss">:fuel-mass</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">* </span><span class="nv">dt</span> <span class="p">(</span><span class="nf">fuel-rate</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="c1">; Our position changes based on our velocity</span>
<span class="ss">:position</span> <span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">scale</span> <span class="nv">dt</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="c1">; And our velocity changes based on our acceleration</span>
<span class="ss">:velocity</span> <span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">scale</span> <span class="nv">dt</span> <span class="p">(</span><span class="nf">acceleration</span> <span class="nv">craft</span><span class="p">))))))</span>
</code></pre>
<p>Same as before, only now we call <code>stage</code> prior to the physics simulation. Let’s try a launch.</p>
<pre><code>aphyr@waterhouse:~/scratch$ lein test scratch.rocket-test
lein test scratch.rocket-test
Crashed at 2415 seconds
Maximum altitude 4598444.289945109 meters at 1446 seconds
lein test :only scratch.rocket-test/makes-orbit
FAIL in (makes-orbit) (rocket_test.clj:27)
expected: (not (crashed? trajectory))
actual: (not (not true))
Ran 2 tests containing 3 assertions.
1 failures, 0 errors.
Tests failed.
</code></pre>
<p>Still crashed–but we increased our apoapsis from 750 kilometers to 4,598 kilometers. That’s plenty high, but we’re still not making orbit. Why? Because we’re going straight up, then straight back down. To orbit, we need to move <em>sideways</em>, around the earth.</p>
<h2><a href="#orbital-insertion" id="orbital-insertion">Orbital insertion</a></h2>
<p>Our spacecraft is shooting upwards, but in order to remain in orbit around the earth, it has to execute a <em>second</em> burn: an orbital injection maneuver. That injection maneuver is also called a <em>circularization burn</em> because it turns the orbit from an ascending parabola into a circle (or something roughly like it). We don’t need to be precise about circularization–any trajectory that doesn’t hit the planet will suffice. All we have to do is burn towards the horizon, once we get high enough.</p>
<p>To do that, we’ll need to enhance the rocket’s control software. We assumed that the rocket would always thrust in the +x direction, but now we’ll need to thrust in multiple directions. We’ll break up the engine force into two parts: <code>thrust</code> (how hard the rocket motor pushes) and <code>orientation</code> (which determines the direction the rocket is pointing).</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">unit-vector</span>
<span class="s">"Scales coordinates to magnitude 1."</span>
<span class="p">[</span><span class="nv">coordinates</span><span class="p">]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="nf">magnitude</span> <span class="nv">coordinates</span><span class="p">))</span> <span class="nv">coordinates</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">engine-force</span>
<span class="s">"The force vector, each component in Newtons, due to the rocket engine."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nf">thrust</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nf">unit-vector</span> <span class="p">(</span><span class="nf">orientation</span> <span class="nv">craft</span><span class="p">))))</span>
</code></pre>
<p>We’re taking the orientation of the craft–some coordinates–and scaling it to be of length one with <code>unit-vector</code>. Then we’re scaling the orientation vector by the thrust, returning a <em>thrust vector</em>.</p>
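<p>To make that concrete, here is a sketch that can be pasted into a fresh REPL. The versions of <code>scale</code> and <code>magnitude</code> below are assumptions written for this example: they mirror how those functions are called here, but the real definitions in the namespace may differ. The 1000-newton thrust is likewise an arbitrary number.</p>
<pre><code>; Assumed helper: multiplies every coordinate in a map by a factor.
(defn scale
  [factor coordinates]
  (into {} (map (fn [[k v]] [k (* factor v)]) coordinates)))

; Assumed helper: the Euclidean length of an :x/:y/:z coordinate map.
(defn magnitude
  [c]
  (Math/sqrt (+ (Math/pow (:x c) 2)
                (Math/pow (:y c) 2)
                (Math/pow (:z c) 2))))

(defn unit-vector
  "Scales coordinates to magnitude 1."
  [coordinates]
  (scale (/ (magnitude coordinates)) coordinates))

; An orientation pointing along +z, with length 2, shrinks to length 1.
(unit-vector {:x 0 :y 0 :z 2})
; => {:x 0.0, :y 0.0, :z 1.0}

; Scaling that unit vector by an arbitrary 1000-newton thrust gives the
; thrust vector.
(scale 1000 (unit-vector {:x 0 :y 0 :z 2}))
; => {:x 0.0, :y 0.0, :z 1000.0}
</code></pre>
<p>The single-argument form <code>(/ n)</code> returns the reciprocal of <code>n</code>, so <code>unit-vector</code> divides each coordinate by the vector’s length.</p>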
<p>As we go back and redefine parts of the program, you might see an error like</p>
<pre><code>Exception in thread "main" java.lang.RuntimeException: Unable to resolve symbol: unit-vector in this context, compiling:(scratch/rocket.clj:69:11)
at clojure.lang.Compiler.analyze(Compiler.java:6380)
at clojure.lang.Compiler.analyze(Compiler.java:6322)
</code></pre>
<p>This is a stack trace from the Clojure compiler. It indicates that in <code>scratch/rocket.clj</code>, on line <code>69</code>, column <code>11</code>, we used the symbol <code>unit-vector</code>–but it didn’t have a meaning at that point in the program. Perhaps <code>unit-vector</code> is defined <em>below</em> that line. There are two ways to solve this.</p>
<ol>
<li>
<p>Organize your functions so that the simple ones come first, and those that depend on them come later. Read this way, namespaces tell a story, progressing from smaller to bigger, more complex problems.</p>
</li>
<li>
<p>Sometimes, ordering functions this way is impossible, or would put related ideas too far apart. In this case you can <code>(declare unit-vector)</code> near the top of the namespace. This tells Clojure that <code>unit-vector</code> isn’t defined <em>yet</em>, but it’ll come later.</p>
</li>
</ol>
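<p>A minimal sketch of the second approach, using the hypothetical functions <code>double-it</code> and <code>quadruple</code> rather than anything from the rocket namespace:</p>
<pre><code>; Tell the compiler that double-it exists, even though its definition
; comes later in the file.
(declare double-it)

(defn quadruple
  "Uses double-it before it has been defined."
  [x]
  (double-it (double-it x)))

(defn double-it
  [x]
  (* 2 x))

(quadruple 3) ; => 12
</code></pre>
<p>Without the <code>declare</code>, compiling <code>quadruple</code> would fail with the same “Unable to resolve symbol” error shown above.</p>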
<p>Now that we’ve broken up <code>engine-force</code> into <code>thrust</code> and <code>orientation</code>, we have to control the thrust properly for our two burns. We’ll start by defining the times for the initial ascent and circularization burn, expressed as vectors of start and end times, in seconds.</p>
<pre><code><span></span><span class="p">(</span><span class="k">def </span><span class="nv">ascent</span>
<span class="s">"The start and end times for the ascent burn."</span>
<span class="p">[</span><span class="mi">0</span> <span class="mi">3000</span><span class="p">])</span>
<span class="p">(</span><span class="k">def </span><span class="nv">circularization</span>
<span class="s">"The start and end times for the circularization burn."</span>
<span class="p">[</span><span class="mi">4000</span> <span class="mi">1000</span><span class="p">])</span>
</code></pre>
<p>Now we’ll change the thrust by adjusting the rate of fuel consumption. Instead of burning at maximum until running out of fuel, we’ll execute two distinct burns.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">fuel-rate</span>
<span class="s">"How fast is fuel, in kilograms/second, consumed by the craft?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">cond</span>
<span class="c1">; Out of fuel</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="mi">0</span><span class="p">)</span>
<span class="mi">0</span>
<span class="c1">; Ascent burn</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">ascent</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">ascent</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:max-fuel-rate</span> <span class="nv">craft</span><span class="p">)</span>
<span class="c1">; Circularization burn</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">circularization</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">circularization</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:max-fuel-rate</span> <span class="nv">craft</span><span class="p">)</span>
<span class="c1">; Shut down engines otherwise</span>
<span class="ss">:else</span> <span class="mi">0</span><span class="p">))</span>
</code></pre>
<p>We’re using <code>cond</code> to express four distinct possibilities: we’ve run out of fuel, we’re executing one of the two burns, or we’re resting with the engines shut down. Because the comparison function <code><=</code> takes any number of arguments and returns true only when they are in non-decreasing order, expressing intervals like “the time is between the first and last times in the ascent” is easy.</p>
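<p>To see the interval checks in action, here is a small self-contained sketch. It assumes burn windows of 0 to 300 and 400 to 1000 seconds, restates <code>fuel-rate</code> so it can run alone, and uses a made-up <code>sample-craft</code> map rather than one of our real rockets.</p>
<pre><code>; Assumed burn windows, in seconds.
(def ascent          [0 300])
(def circularization [400 1000])

(defn fuel-rate
  "How fast is fuel, in kilograms/second, consumed by the craft?"
  [craft]
  (cond
    ; Out of fuel
    (<= (:fuel-mass craft) 0)
    0
    ; Ascent burn
    (<= (first ascent) (:time craft) (last ascent))
    (:max-fuel-rate craft)
    ; Circularization burn
    (<= (first circularization) (:time craft) (last circularization))
    (:max-fuel-rate craft)
    ; Shut down engines otherwise
    :else 0))

; A hypothetical craft with round numbers, purely for illustration.
(def sample-craft {:fuel-mass 1000 :max-fuel-rate 10 :time 0})

(<= 0 150 300) ; true: 150 lies inside the ascent window
(<= 0 350 300) ; false: 350 is past the end of the ascent

(fuel-rate (assoc sample-craft :time 150))              ; 10, ascent burn
(fuel-rate (assoc sample-craft :time 350))              ; 0, coasting between burns
(fuel-rate (assoc sample-craft :time 500))              ; 10, circularization burn
(fuel-rate (assoc sample-craft :time 500 :fuel-mass 0)) ; 0, tank is empty
</code></pre>
<p>Note that the order of the <code>cond</code> clauses matters: the out-of-fuel check comes first, so an empty tank wins even when the time falls inside a burn window.</p>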
<p>Finally, we need to determine the <em>direction</em> to burn in. This one’s gonna require some tricky linear algebra. You don’t need to worry about the specifics here–the goal is to find out what direction the rocket should burn to fly towards the horizon, in a circle around the planet. We’re doing that by taking the rocket’s velocity vector, and <em>flattening out</em> the part of the velocity that points towards or away from the planet. All that’s left is the direction the rocket is flying <em>around</em> the earth.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">dot-product</span>
<span class="s">"Finds the inner product of two x, y, z coordinate maps.</span>
<span class="s"> See http://en.wikipedia.org/wiki/Dot_product."</span>
<span class="p">[</span><span class="nv">c1</span> <span class="nv">c2</span><span class="p">]</span>
<span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:x</span> <span class="nv">c1</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c2</span><span class="p">))</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:y</span> <span class="nv">c1</span><span class="p">)</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c2</span><span class="p">))</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">c1</span><span class="p">)</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">c2</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">projection</span>
<span class="s">"The component of coordinate map a in the direction of coordinate map b.</span>
<span class="s"> See http://en.wikipedia.org/wiki/Vector_projection."</span>
<span class="p">[</span><span class="nv">a</span> <span class="nv">b</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">b</span> <span class="p">(</span><span class="nf">unit-vector</span> <span class="nv">b</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nf">dot-product</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="nv">b</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">rejection</span>
<span class="s">"The component of coordinate map a *not* in the direction of coordinate map</span>
<span class="s"> b."</span>
<span class="p">[</span><span class="nv">a</span> <span class="nv">b</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">a</span><span class="o">'</span> <span class="p">(</span><span class="nf">projection</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:x</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">a</span><span class="o">'</span><span class="p">))</span>
<span class="ss">:y</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:y</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">a</span><span class="o">'</span><span class="p">))</span>
<span class="ss">:z</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">a</span><span class="o">'</span><span class="p">))}))</span>
</code></pre>
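<p>If the linear algebra feels abstract, a worked example with small numbers may help. The sketch below restates the vector helpers so it runs by itself, then splits a made-up velocity into its vertical and horizontal parts. The <code>position</code> and <code>velocity</code> values are invented for illustration and have nothing to do with a real trajectory.</p>
<pre><code>(defn map-values
  "Applies f to every value in the map m."
  [f m]
  (into {} (map (fn [pair] [(key pair) (f (val pair))]) m)))

(defn magnitude
  [c]
  (Math/sqrt (+ (Math/pow (:x c) 2)
                (Math/pow (:y c) 2)
                (Math/pow (:z c) 2))))

(defn scale
  [factor coordinates]
  (map-values (partial * factor) coordinates))

(defn unit-vector
  [coordinates]
  (scale (/ (magnitude coordinates)) coordinates))

(defn dot-product
  [c1 c2]
  (+ (* (:x c1) (:x c2))
     (* (:y c1) (:y c2))
     (* (:z c1) (:z c2))))

(defn projection
  [a b]
  (let [b (unit-vector b)]
    (scale (dot-product a b) b)))

(defn rejection
  [a b]
  (let [a' (projection a b)]
    {:x (- (:x a) (:x a'))
     :y (- (:y a) (:y a'))
     :z (- (:z a) (:z a'))}))

; Invented values: the craft sits out along the x axis, so "up" is +x.
(def position {:x 2 :y 0 :z 0})
; It moves 3 units/sec upward and 4 units/sec sideways.
(def velocity {:x 3 :y 4 :z 0})

(projection velocity position) ; {:x 3.0, :y 0.0, :z 0.0}, the vertical part
(rejection  velocity position) ; {:x 0.0, :y 4.0, :z 0.0}, the horizontal part
</code></pre>
<p>The projection and the rejection add back up to the original velocity, which is a handy sanity check when experimenting at the REPL.</p>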
<p>With the mathematical underpinnings ready, we’ll define the orientation control software:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">orientation</span>
<span class="s">"What direction is the craft pointing?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">cond</span>
<span class="c1">; Initially, point along the *position* vector of the craft--that is</span>
<span class="c1">; to say, straight up, away from the earth.</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">ascent</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">ascent</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">)</span>
<span class="c1">; During the circularization burn, we want to burn *sideways*, in the</span>
<span class="c1">; direction of the orbit. We'll find the component of our velocity</span>
<span class="c1">; which is aligned with our position vector (that is to say, the vertical</span>
<span class="c1">; velocity), and subtract the vertical component. All that's left is the</span>
<span class="c1">; *horizontal* part of our velocity.</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">circularization</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">circularization</span><span class="p">))</span>
<span class="p">(</span><span class="nf">rejection</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">))</span>
<span class="c1">; Otherwise, just point straight ahead.</span>
<span class="ss">:else</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)))</span>
</code></pre>
<p>For the ascent burn, we’ll push straight away from the planet. For circularization, we use the <code>rejection</code> function to find the part of the velocity around the planet, and thrust in that direction. By default, we’ll leave the rocket pointing in the direction of travel.</p>
<p>With these changes made, the rocket should execute two burns. Let’s re-run the tests and see.</p>
<pre><code>aphyr@waterhouse:~/scratch$ lein test scratch.rocket-test
lein test scratch.rocket-test
Ran 2 tests containing 3 assertions.
0 failures, 0 errors.
</code></pre>
<p>We finally did it! We’re <em>rocket scientists</em>!</p>
<h2><a href="#review" id="review">Review</a></h2>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.rocket</span><span class="p">)</span>
<span class="c1">;; Linear algebra for {:x 1 :y 2 :z 3} coordinate vectors.</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">map-values</span>
<span class="s">"Applies f to every value in the map m."</span>
<span class="p">[</span><span class="nv">f</span> <span class="nv">m</span><span class="p">]</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{}</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">pair</span><span class="p">]</span>
<span class="p">[(</span><span class="nb">key </span><span class="nv">pair</span><span class="p">)</span> <span class="p">(</span><span class="nf">f</span> <span class="p">(</span><span class="nb">val </span><span class="nv">pair</span><span class="p">))])</span>
<span class="nv">m</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">magnitude</span>
<span class="s">"What's the radius of a given set of cartesian coordinates?"</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="c1">; By the Pythagorean theorem...</span>
<span class="p">(</span><span class="nf">Math/sqrt</span> <span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nf">Math/pow</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">c</span><span class="p">)</span> <span class="mi">2</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">scale</span>
<span class="s">"Multiplies a map of x, y, and z coordinates by the given factor."</span>
<span class="p">[</span><span class="nv">factor</span> <span class="nv">coordinates</span><span class="p">]</span>
<span class="p">(</span><span class="nf">map-values</span> <span class="p">(</span><span class="nb">partial * </span><span class="nv">factor</span><span class="p">)</span> <span class="nv">coordinates</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">unit-vector</span>
<span class="s">"Scales coordinates to magnitude 1."</span>
<span class="p">[</span><span class="nv">coordinates</span><span class="p">]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="nf">magnitude</span> <span class="nv">coordinates</span><span class="p">))</span> <span class="nv">coordinates</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">dot-product</span>
<span class="s">"Finds the inner product of two x, y, z coordinate maps. See</span>
<span class="s"> http://en.wikipedia.org/wiki/Dot_product"</span>
<span class="p">[</span><span class="nv">c1</span> <span class="nv">c2</span><span class="p">]</span>
<span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:x</span> <span class="nv">c1</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c2</span><span class="p">))</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:y</span> <span class="nv">c1</span><span class="p">)</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c2</span><span class="p">))</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">c1</span><span class="p">)</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">c2</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">projection</span>
<span class="s">"The component of coordinate map a in the direction of coordinate map b.</span>
<span class="s"> See http://en.wikipedia.org/wiki/Vector_projection."</span>
<span class="p">[</span><span class="nv">a</span> <span class="nv">b</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">b</span> <span class="p">(</span><span class="nf">unit-vector</span> <span class="nv">b</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nf">dot-product</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)</span> <span class="nv">b</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">rejection</span>
<span class="s">"The component of coordinate map a *not* in the direction of coordinate map</span>
<span class="s"> b."</span>
<span class="p">[</span><span class="nv">a</span> <span class="nv">b</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">a</span><span class="o">'</span> <span class="p">(</span><span class="nf">projection</span> <span class="nv">a</span> <span class="nv">b</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:x</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">a</span><span class="o">'</span><span class="p">))</span>
<span class="ss">:y</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:y</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">a</span><span class="o">'</span><span class="p">))</span>
<span class="ss">:z</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">a</span><span class="p">)</span> <span class="p">(</span><span class="ss">:z</span> <span class="nv">a</span><span class="o">'</span><span class="p">))}))</span>
<span class="c1">;; Coordinate conversion</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">cartesian->spherical</span>
<span class="s">"Converts a map of Cartesian coordinates :x, :y, and :z to spherical</span>
<span class="s"> coordinates :r, :theta, and :phi."</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">r</span> <span class="p">(</span><span class="nf">magnitude</span> <span class="nv">c</span><span class="p">)]</span>
<span class="p">{</span><span class="ss">:r</span> <span class="nv">r</span>
<span class="ss">:phi</span> <span class="p">(</span><span class="nf">Math/acos</span> <span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="ss">:z</span> <span class="nv">c</span><span class="p">)</span> <span class="nv">r</span><span class="p">))</span>
<span class="ss">:theta</span> <span class="p">(</span><span class="nf">Math/atan2</span> <span class="p">(</span><span class="ss">:y</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="ss">:x</span> <span class="nv">c</span><span class="p">))}))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">spherical->cartesian</span>
<span class="s">"Converts spherical to Cartesian coordinates."</span>
<span class="p">[</span><span class="nv">c</span><span class="p">]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:r</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">Math/cos</span> <span class="p">(</span><span class="ss">:theta</span> <span class="nv">c</span><span class="p">))</span> <span class="p">(</span><span class="nf">Math/sin</span> <span class="p">(</span><span class="ss">:phi</span> <span class="nv">c</span><span class="p">)))</span>
<span class="ss">:y</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:r</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">Math/sin</span> <span class="p">(</span><span class="ss">:theta</span> <span class="nv">c</span><span class="p">))</span> <span class="p">(</span><span class="nf">Math/sin</span> <span class="p">(</span><span class="ss">:phi</span> <span class="nv">c</span><span class="p">)))</span>
<span class="ss">:z</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="ss">:r</span> <span class="nv">c</span><span class="p">)</span> <span class="p">(</span><span class="nf">Math/cos</span> <span class="p">(</span><span class="ss">:phi</span> <span class="nv">c</span><span class="p">)))})</span>
<span class="c1">;; The earth</span>
<span class="p">(</span><span class="k">def </span><span class="nv">earth-equatorial-radius</span>
<span class="s">"Radius of the earth, in meters"</span>
<span class="mi">6378137</span><span class="p">)</span>
<span class="p">(</span><span class="k">def </span><span class="nv">earth-day</span>
<span class="s">"Length of an earth day, in seconds."</span>
<span class="mi">86400</span><span class="p">)</span>
<span class="p">(</span><span class="k">def </span><span class="nv">earth-equatorial-speed</span>
<span class="s">"How fast points on the equator move, relative to the center of the earth, in</span>
<span class="s"> meters/sec."</span>
<span class="p">(</span><span class="nb">/ </span><span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">Math/PI</span> <span class="nv">earth-equatorial-radius</span><span class="p">)</span>
<span class="nv">earth-day</span><span class="p">))</span>
<span class="p">(</span><span class="k">def </span><span class="nv">g</span> <span class="s">"Acceleration of gravity in meters/s^2"</span> <span class="mf">-9.8</span><span class="p">)</span>
<span class="p">(</span><span class="k">def </span><span class="nv">initial-space-center</span>
<span class="s">"The initial position and velocity of the launch facility"</span>
<span class="p">{</span><span class="ss">:time</span> <span class="mi">0</span>
<span class="ss">:position</span> <span class="p">{</span><span class="ss">:x</span> <span class="nv">earth-equatorial-radius</span>
<span class="ss">:y</span> <span class="mi">0</span>
<span class="ss">:z</span> <span class="mi">0</span><span class="p">}</span>
<span class="ss">:velocity</span> <span class="p">{</span><span class="ss">:x</span> <span class="mi">0</span>
<span class="ss">:y</span> <span class="nv">earth-equatorial-speed</span>
<span class="ss">:z</span> <span class="mi">0</span><span class="p">}})</span>
<span class="c1">;; Craft</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">centaur</span>
<span class="s">"The upper rocket stage.</span>
<span class="s"> http://en.wikipedia.org/wiki/Centaur_(rocket_stage)</span>
<span class="s"> http://www.astronautix.com/stages/cenaurde.htm"</span>
<span class="p">[]</span>
<span class="p">{</span><span class="ss">:dry-mass</span> <span class="mi">2361</span>
<span class="ss">:fuel-mass</span> <span class="mi">13897</span>
<span class="ss">:isp</span> <span class="mi">4354</span>
<span class="ss">:max-fuel-rate</span> <span class="p">(</span><span class="nb">/ </span><span class="mi">13897</span> <span class="mi">470</span><span class="p">)})</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">atlas-v</span>
<span class="s">"The full launch vehicle. http://en.wikipedia.org/wiki/Atlas_V"</span>
<span class="p">[</span><span class="nv">next-stage</span><span class="p">]</span>
<span class="p">{</span><span class="ss">:dry-mass</span> <span class="mi">50050</span>
<span class="ss">:fuel-mass</span> <span class="mi">284450</span>
<span class="ss">:isp</span> <span class="mi">3050</span>
<span class="ss">:max-fuel-rate</span> <span class="p">(</span><span class="nb">/ </span><span class="mi">284450</span> <span class="mi">253</span><span class="p">)</span>
<span class="ss">:next-stage</span> <span class="nv">next-stage</span><span class="p">})</span>
<span class="c1">;; Flight control</span>
<span class="p">(</span><span class="k">def </span><span class="nv">ascent</span>
<span class="s">"The start and end times for the ascent burn."</span>
<span class="p">[</span><span class="mi">0</span> <span class="mi">300</span><span class="p">])</span>
<span class="p">(</span><span class="k">def </span><span class="nv">circularization</span>
<span class="s">"The start and end times for the circularization burn."</span>
<span class="p">[</span><span class="mi">400</span> <span class="mi">1000</span><span class="p">])</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">orientation</span>
<span class="s">"What direction is the craft pointing?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">cond</span>
<span class="c1">; Initially, point along the *position* vector of the craft--that is</span>
<span class="c1">; to say, straight up, away from the earth.</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">ascent</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">ascent</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">)</span>
<span class="c1">; During the circularization burn, we want to burn *sideways*, in the</span>
<span class="c1">; direction of the orbit. We'll find the component of our velocity</span>
<span class="c1">; which is aligned with our position vector (that is to say, the vertical</span>
<span class="c1">; velocity), and subtract the vertical component. All that's left is the</span>
<span class="c1">; *horizontal* part of our velocity.</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">circularization</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">circularization</span><span class="p">))</span>
<span class="p">(</span><span class="nf">rejection</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">))</span>
<span class="c1">; Otherwise, just point straight ahead.</span>
<span class="ss">:else</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">fuel-rate</span>
<span class="s">"How fast is fuel, in kilograms/second, consumed by the craft?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">cond</span>
<span class="c1">; Out of fuel</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="mi">0</span><span class="p">)</span>
<span class="mi">0</span>
<span class="c1">; Ascent burn</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">ascent</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">ascent</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:max-fuel-rate</span> <span class="nv">craft</span><span class="p">)</span>
<span class="c1">; Circularization burn</span>
<span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="nb">first </span><span class="nv">circularization</span><span class="p">)</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">last </span><span class="nv">circularization</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:max-fuel-rate</span> <span class="nv">craft</span><span class="p">)</span>
<span class="c1">; Shut down engines otherwise</span>
<span class="ss">:else</span> <span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">stage</span>
<span class="s">"When fuel reserves are exhausted, separate stages. Otherwise, return craft</span>
<span class="s"> unchanged."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nf">cond</span>
<span class="c1">; Still fuel left</span>
<span class="p">(</span><span class="nb">pos? </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">))</span>
<span class="nv">craft</span>
<span class="c1">; No remaining stages</span>
<span class="p">(</span><span class="nb">nil? </span><span class="p">(</span><span class="ss">:next-stage</span> <span class="nv">craft</span><span class="p">))</span>
<span class="nv">craft</span>
<span class="c1">; Stage!</span>
<span class="ss">:else</span>
<span class="p">(</span><span class="nb">merge </span><span class="p">(</span><span class="ss">:next-stage</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nb">select-keys </span><span class="nv">craft</span> <span class="p">[</span><span class="ss">:time</span> <span class="ss">:position</span> <span class="ss">:velocity</span><span class="p">]))))</span>
<span class="c1">;; Dynamics</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">thrust</span>
<span class="s">"How much force, in newtons, do the craft's rocket engines exert?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="nf">fuel-rate</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="ss">:isp</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">mass</span>
<span class="s">"The total mass of a craft."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="ss">:dry-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">gravity-force</span>
<span class="s">"The force vector, each component in Newtons, due to gravity."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="c1">; Since force is mass times acceleration...</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">total-force</span> <span class="p">(</span><span class="nb">* </span><span class="nv">g</span> <span class="p">(</span><span class="nf">mass</span> <span class="nv">craft</span><span class="p">))]</span>
<span class="p">(</span><span class="nb">-> </span><span class="nv">craft</span>
<span class="c1">; Now we'll take the craft's position</span>
<span class="ss">:position</span>
<span class="c1">; in spherical coordinates,</span>
<span class="nv">cartesian->spherical</span>
<span class="c1">; replace the radius with the gravitational force...</span>
<span class="p">(</span><span class="nb">assoc </span><span class="ss">:r</span> <span class="nv">total-force</span><span class="p">)</span>
<span class="c1">; and transform back to Cartesian-land</span>
<span class="nv">spherical->cartesian</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">declare </span><span class="nv">altitude</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">engine-force</span>
<span class="s">"The force vector, each component in Newtons, due to the rocket engine."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="c1">; Debugging; useful for working through trajectories in detail.</span>
<span class="c1">; (println craft)</span>
<span class="c1">; (println "t " (:time craft) "alt" (altitude craft) "thrust" (thrust craft))</span>
<span class="c1">; (println "fuel" (:fuel-mass craft))</span>
<span class="c1">; (println "vel " (:velocity craft))</span>
<span class="c1">; (println "ori " (unit-vector (orientation craft)))</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nf">thrust</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nf">unit-vector</span> <span class="p">(</span><span class="nf">orientation</span> <span class="nv">craft</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">total-force</span>
<span class="s">"Total force on a craft."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="nf">engine-force</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">gravity-force</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">acceleration</span>
<span class="s">"Total acceleration of a craft."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">m</span> <span class="p">(</span><span class="nf">mass</span> <span class="nv">craft</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">scale</span> <span class="p">(</span><span class="nb">/ </span><span class="nv">m</span><span class="p">)</span> <span class="p">(</span><span class="nf">total-force</span> <span class="nv">craft</span><span class="p">))))</span>
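<span class="p">(</span><span class="kd">defn </span><span class="nv">step</span>
<span class="s">"Returns the state of the craft dt seconds later, computed with one Euler step."</span>
<span class="p">[</span><span class="nv">craft</span> <span class="nv">dt</span><span class="p">]</span>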
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">craft</span> <span class="p">(</span><span class="nf">stage</span> <span class="nv">craft</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">assoc </span><span class="nv">craft</span>
<span class="c1">; Time advances by dt seconds</span>
<span class="ss">:time</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">dt</span> <span class="p">(</span><span class="ss">:time</span> <span class="nv">craft</span><span class="p">))</span>
<span class="c1">; We burn some fuel</span>
<span class="ss">:fuel-mass</span> <span class="p">(</span><span class="nb">- </span><span class="p">(</span><span class="ss">:fuel-mass</span> <span class="nv">craft</span><span class="p">)</span> <span class="p">(</span><span class="nb">* </span><span class="nv">dt</span> <span class="p">(</span><span class="nf">fuel-rate</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="c1">; Our position changes based on our velocity</span>
<span class="ss">:position</span> <span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="ss">:position</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">scale</span> <span class="nv">dt</span> <span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="c1">; And our velocity changes based on our acceleration</span>
<span class="ss">:velocity</span> <span class="p">(</span><span class="nb">merge-with + </span><span class="p">(</span><span class="ss">:velocity</span> <span class="nv">craft</span><span class="p">)</span>
<span class="p">(</span><span class="nf">scale</span> <span class="nv">dt</span> <span class="p">(</span><span class="nf">acceleration</span> <span class="nv">craft</span><span class="p">))))))</span>
<span class="c1">;; Launch and flight</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">prepare</span>
<span class="s">"Prepares a craft for launch from an equatorial space center."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">merge </span><span class="nv">craft</span> <span class="nv">initial-space-center</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">trajectory</span>
<span class="s">"Returns all future states of the craft, at dt-second intervals."</span>
<span class="p">[</span><span class="nv">dt</span> <span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">iterate </span><span class="o">#</span><span class="p">(</span><span class="nf">step</span> <span class="nv">%</span> <span class="nv">dt</span><span class="p">)</span> <span class="nv">craft</span><span class="p">))</span>
<span class="c1">;; Analyzing trajectories</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">altitude</span>
<span class="s">"The height above the surface of the equator, in meters."</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb">-> </span><span class="nv">craft</span>
<span class="ss">:position</span>
<span class="nv">cartesian->spherical</span>
<span class="ss">:r</span>
<span class="p">(</span><span class="nb">- </span><span class="nv">earth-equatorial-radius</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">above-ground?</span>
<span class="s">"Is the craft at or above the surface?"</span>
<span class="p">[</span><span class="nv">craft</span><span class="p">]</span>
<span class="p">(</span><span class="nb"><= </span><span class="mi">0</span> <span class="p">(</span><span class="nf">altitude</span> <span class="nv">craft</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">flight</span>
<span class="s">"The above-ground portion of a trajectory."</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="nb">take-while </span><span class="nv">above-ground?</span> <span class="nv">trajectory</span><span class="p">))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">crashed?</span>
<span class="s">"Does this trajectory crash into the surface before 10 hours are up?"</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">time-limit</span> <span class="p">(</span><span class="nb">* </span><span class="mi">10</span> <span class="mi">3600</span><span class="p">)]</span> <span class="c1">; 10 hours</span>
<span class="p">(</span><span class="nb">not </span><span class="p">(</span><span class="nb">every? </span><span class="nv">above-ground?</span>
<span class="p">(</span><span class="nb">take-while </span><span class="o">#</span><span class="p">(</span><span class="nb"><= </span><span class="p">(</span><span class="ss">:time</span> <span class="nv">%</span><span class="p">)</span> <span class="nv">time-limit</span><span class="p">)</span> <span class="nv">trajectory</span><span class="p">)))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">crash-time</span>
<span class="s">"Given a trajectory, returns the time the rocket impacted the ground."</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="ss">:time</span> <span class="p">(</span><span class="nb">last </span><span class="p">(</span><span class="nf">flight</span> <span class="nv">trajectory</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">apoapsis</span>
<span class="s">"The highest altitude achieved during a trajectory."</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="nb">apply max </span><span class="p">(</span><span class="nb">map </span><span class="nv">altitude</span> <span class="p">(</span><span class="nf">flight</span> <span class="nv">trajectory</span><span class="p">))))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">apoapsis-time</span>
<span class="s">"The time of apoapsis"</span>
<span class="p">[</span><span class="nv">trajectory</span><span class="p">]</span>
<span class="p">(</span><span class="ss">:time</span> <span class="p">(</span><span class="nb">apply max-key </span><span class="nv">altitude</span> <span class="p">(</span><span class="nf">flight</span> <span class="nv">trajectory</span><span class="p">))))</span>
</code></pre>
<p>As written here, our first non-trivial program tells a story–though a <em>different</em> one than the process of exploration and refinement that brought the rocket to orbit. It builds from small, abstract ideas: linear algebra and coordinates; physical constants describing the universe for the simulation; and the basic outline of the spacecraft. Then we define the software controlling the rocket; the times for the burns, how much to thrust, in what direction, and when to separate stages. Using those control functions, we build a <em>physics engine</em> including gravity and thrust forces, and use Newton’s second law to build a basic <a href="http://en.wikipedia.org/wiki/Euler_method">Euler Method</a> solver. Finally, we analyze the trajectories the solver produces to answer key questions: how high, how long, and did it explode?</p>
<p>We used Clojure’s immutable data structures–mostly maps–to represent the state of the universe, and defined <em>pure functions</em> to interpret those states and construct new ones. Using <code>iterate</code>, we projected a single state forward into an infinite timeline of the future–evaluated as demanded by the analysis functions. Though we pay a performance penalty, immutable data structures, pure functions, and lazy evaluation make simulating complex systems easier to reason about.</p>
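<p>To make that pattern concrete outside the rocket code, here is a minimal sketch of the same idea in one dimension: a ball dropped from 100 meters under constant gravity, advanced with explicit Euler steps. The names <code>euler-step</code>, <code>ball</code>, <code>timeline</code>, and <code>airborne</code> are made up for this illustration and are not part of the simulation above.</p>
<pre><code>(defn euler-step
  "Advances a 1-D state map by dt seconds under a constant acceleration a,
  using one explicit Euler step. Position uses the old velocity."
  [a dt state]
  (assoc state
         :time     (+ (:time state) dt)
         :position (+ (:position state) (* dt (:velocity state)))
         :velocity (+ (:velocity state) (* dt a))))

; Hypothetical initial state: 100 meters up, at rest.
(def ball {:time 0 :position 100.0 :velocity 0.0})

; An infinite, lazy timeline of future states, one second apart.
(def timeline (iterate (partial euler-step -9.8 1) ball))

; The analysis decides how much of the timeline is actually computed.
(def airborne (take-while #(pos? (:position %)) timeline))

(count airborne)
; returns 6
(:time (last airborne))
; returns 5
</code></pre>
<p>Because <code>iterate</code> is lazy, <code>timeline</code> is infinite, but only the handful of states that <code>take-while</code> inspects are ever computed. Each state is an ordinary immutable map, so earlier states remain valid after later ones are built.</p>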
<p>Had we written this simulation in a different language, different techniques might have come into play. In Java, C++, or Ruby, we would have defined a hierarchy of datatypes called <em>classes</em>, each one representing a small piece of state. We might define a <code>Craft</code> type, and create subtypes <code>Atlas</code> and <code>Centaur</code>. We’d create a <code>Coordinate</code> type, subdivided into <code>Cartesian</code> and <code>Spherical</code>, and so on. The types add complexity and rigidity, but also prevent mis-spellings, and can prevent us from interpreting, say, coordinates as craft or vice-versa.</p>
<p>To move the system forward in a language emphasizing <em>mutable</em> data structures, we would have updated the time and coordinates of a single craft in-place. This introduces additional complexity, because many of the changes we made depended on the current values of the craft. To ensure the correct ordering of calculations, we’d scatter temporary variables and explicit copies throughout the code, ensuring that functions didn’t see inconsistent pictures of the craft state. The mutable approach would likely be faster, but would still demand some copying of data, and sacrifice clarity.</p>
<p>More <em>imperative</em> languages place less emphasis on laziness, and make it harder to express ideas like <code>map</code> and <code>take</code>. We might have simulated the trajectory for some fixed time, constructing a history of all the intermediate results we needed, then analyzed it by moving explicitly from slot to slot in that history, checking if the craft had crashed, and so on.</p>
<p>Across all these languages, though, some ideas remain the same. We solve big problems by breaking them up into smaller ones. We use data structures to represent the state of the system, and functions to alter that state. Comments and docstrings clarify the <em>story</em> of the code, making it readable to others. Tests ensure the software is correct, and allow us to work piecewise towards a solution.</p>
<h2><a href="#exercises" id="exercises">Exercises</a></h2>
<ol>
<li>
<p>We know the spacecraft reached orbit, but we have no idea what that orbit <em>looks</em> like. Since the trajectory is infinite in length, we can’t ask about the <em>entire</em> history using <code>max</code>–but we know that all orbits have a high and low point. At the highest point, the difference between successive altitudes changes from positive to negative, and at the lowest point, it changes from negative to positive. Using this technique, refine the <code>apoapsis</code> function to find the highest point using that <em>turning point</em> in altitudes–and write a corresponding <code>periapsis</code> function that finds the lowest point in the orbit. Display both periapsis and apoapsis in the test suite.</p>
</li>
<li>
<p>We assumed the force of gravity resulted in a constant 9.8 meter/second/second acceleration towards the earth, but in the real world, gravity falls off with the <a href="http://en.wikipedia.org/wiki/Newton's_law_of_universal_gravitation">inverse square law</a>. Using the mass of the earth, mass of the spacecraft, and Newton’s constant, refine the gravitational force used in this simulation to take Newton’s law into account. How does this affect the apoapsis?</p>
</li>
<li>
<p>We ignored the atmosphere, which exerts <a href="http://en.wikipedia.org/wiki/Drag_(physics)">drag</a> on the craft as it moves through the air. Write a basic air-density function which falls off with altitude. Make some educated guesses as to how much drag a real rocket experiences, and assume that the drag force is proportional to the square of the rocket’s velocity. Can your rocket still reach orbit?</p>
</li>
<li>
<p>Notice that the periapsis and apoapsis of the rocket are <em>different</em>. By executing the circularization burn carefully, can you make them the same–achieving a perfectly circular orbit? One way to do this is to pick an orbital altitude and velocity of a known satellite–say, the International Space Station–and write the control software to match that velocity at that altitude.</p>
</li>
</ol>
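<p>As a hint for the first exercise, here is one possible sketch of finding a local maximum in an infinite sequence of plain numbers, without ever calling <code>max</code> on the whole thing. The function name <code>first-peak</code> is hypothetical, and this works on bare numbers rather than craft states; adapting it to trajectories is left as part of the exercise.</p>
<pre><code>(defn first-peak
  "Returns the first number in xs that is larger than both of its neighbors,
  or nil if a finite xs has no such number."
  [xs]
  (some (fn [[a b c]]
          (when (and (> b a) (> b c))
            b))
        ; Sliding windows of three successive values: (x0 x1 x2), (x1 x2 x3), ...
        (partition 3 1 xs)))

(first-peak [1 3 5 4 2 6 1])
; returns 5

; Works on an infinite sequence, because only a prefix is examined.
(first-peak (concat [1 4 9 7] (repeat 0)))
; returns 9
</code></pre>
<p>Since <code>partition</code> and <code>some</code> are both lazy, the search stops at the first window that contains a peak.</p>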
<p>In the next chapter, we talk about <a href="https://aphyr.com/posts/319-clojure-from-the-ground-up-debugging">debugging</a>.</p>
https://aphyr.com/posts/311-clojure-from-the-ground-up-logisticsClojure from the ground up: logistics2014-02-15T13:20:40-05:002014-02-15T13:20:40-05:00Aphyrhttps://aphyr.com/<p><em>Previously, we covered <a href="http://aphyr.com/posts/306-clojure-from-the-ground-up-state">state and mutability</a>.</em></p>
<p>Up until now, we’ve been programming primarily at the REPL. However, the
REPL is a limited tool. While it lets us explore a problem interactively, that
interactivity comes at a cost: changing an expression requires retyping the
entire thing, editing multi-line expressions is awkward, and our work vanishes
when we restart the REPL–so we can’t share our programs with others, or run
them again later. Moreover, programs in the REPL are hard to organize. To solve
large problems, we need a way of writing programs <em>durably</em>–so they can be read
and evaluated later.</p>
<p>In addition to the code itself, we often want to store <em>ancillary</em>
information. <em>Tests</em> verify the correctness of the program. <em>Resources</em> like
precomputed databases, lookup tables, images, and text files provide other data
the program needs to run. There may be <em>documentation</em>: instructions for how to use and understand the software. A program may also depend on code from <em>other</em> programs, which we call <em>libraries</em>, <em>packages</em>, or <em>dependencies</em>. In Clojure, we have a standardized way to bind together all these parts into a single directory, called a <em>project</em>.</p>
<h2><a href="#project-structure" id="project-structure">Project structure</a></h2>
<p>We created a project at the start of this book by using Leiningen, the Clojure
project tool.</p>
<pre><code><span></span>$ lein new scratch
</code></pre>
<p><code>scratch</code> is the name of the project, and also the name of the directory where the project’s files live. Inside the project are a few files.</p>
<pre><code><span></span>$ <span class="nb">cd</span> scratch<span class="p">;</span> ls
doc project.clj README.md resources src target <span class="nb">test</span>
</code></pre>
<p><code>project.clj</code> defines the project: its name, its version, dependencies, and so
on. Notice the name of the project (<code>scratch</code>) comes first, followed by the
version (<code>0.1.0-SNAPSHOT</code>). <code>-SNAPSHOT</code> versions are for development; you can
change them at any time, and any projects which depend on the snapshot will
pick up the most recent changes. A version which does <em>not</em> end in <code>-SNAPSHOT</code>
is fixed: once published, it always points to the same version of the project.
This allows projects to specify precisely which projects they depend on. For
example, scratch’s <code>project.clj</code> says scratch depends on <code>org.clojure/clojure</code>
version <code>1.5.1</code>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defproject </span><span class="nv">scratch</span> <span class="s">"0.1.0-SNAPSHOT"</span>
<span class="ss">:description</span> <span class="s">"FIXME: write description"</span>
<span class="ss">:url</span> <span class="s">"http://example.com/FIXME"</span>
<span class="ss">:license</span> <span class="p">{</span><span class="ss">:name</span> <span class="s">"Eclipse Public License"</span>
<span class="ss">:url</span> <span class="s">"http://www.eclipse.org/legal/epl-v10.html"</span><span class="p">}</span>
<span class="ss">:dependencies</span> <span class="p">[[</span><span class="nv">org.clojure/clojure</span> <span class="s">"1.5.1"</span><span class="p">]])</span>
</code></pre>
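<p>To illustrate the difference between fixed and snapshot versions, here is a sketch of what the dependency list might look like with two more entries. Both <code>com.example/parser</code> and <code>com.example/renderer</code> are hypothetical libraries invented for this example; they do not exist, so this file will not resolve as written.</p>
<pre><code>(defproject scratch "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 ; Hypothetical fixed release: always the same code.
                 [com.example/parser "1.2.0"]
                 ; Hypothetical snapshot: may change from one build to the next.
                 [com.example/renderer "0.3.0-SNAPSHOT"]])
</code></pre>
<p>Each dependency is a vector of a name and a version string, in the same form as the <code>org.clojure/clojure</code> entry.</p>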
<p><code>README.md</code> is the first file most people open when they look at a new project,
and Lein generates a generic readme for you to fill in later. We call this kind
of scaffolding or example a “stub”; it’s just there to remind you what sort of
things to write yourself. You’ll notice the readme includes the name of the
project, some notes on what it does and how to use it, a copyright notice where
your name should go, and a license, which sets the legal terms for the use of
the project. By default, Leiningen suggests the Eclipse Public License, which
allows everyone to use and modify the software, so long as they preserve the
license information.</p>
<p>The <code>doc</code> directory is for documentation; sometimes hand-written, sometimes
automatically generated from the source code. <code>resources</code> is for additional
files, like images. <code>src</code> is where Clojure code lives, and <code>test</code> contains the
corresponding tests. Finally, <code>target</code> is where Leiningen stores compiled code,
built packages, and so on.</p>
<h2><a href="#namespaces" id="namespaces">Namespaces</a></h2>
<p>Every lein project starts out with a stub namespace containing a simple function. Let’s take a look at that namespace now–it lives in <code>src/scratch/core.clj</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.core</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">foo</span>
<span class="s">"I don't do a whole lot."</span>
<span class="p">[</span><span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">println </span><span class="nv">x</span> <span class="s">"Hello, World!"</span><span class="p">))</span>
</code></pre>
<p>The first part of this file defines the <em>namespace</em> we’ll be working in. The <code>ns</code> macro lets the Clojure compiler know that all following code belongs in the <code>scratch.core</code> namespace. Remember, <code>scratch</code> is the name of our project. <code>scratch.core</code> is for the core functions and definitions of the scratch project. As projects expand, we might add new namespaces to <em>separate</em> our work into smaller, more understandable pieces. For instance, Clojure’s primary functions live in <code>clojure.core</code>, but there are auxiliary functions for string processing in <code>clojure.string</code>, functions for interoperating with Java’s input-output system in <code>clojure.java.io</code>, for printing values in <code>clojure.pprint</code>, and so on.</p>
<p><code>def</code>, <code>defn</code>, and peers always work in the scope of a particular <em>namespace</em>. The function <code>foo</code> in <code>scratch.core</code> is <em>different</em> from the function <code>foo</code> in <code>scratch.pad</code>.</p>
<pre><code><span></span><span class="nv">scratch.foo=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.core</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">scratch.core=></span> <span class="p">(</span><span class="k">def </span><span class="nv">foo</span> <span class="s">"I'm in core"</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'scratch.core/foo</span>
<span class="nv">scratch.core=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.pad</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">scratch.pad=></span> <span class="p">(</span><span class="k">def </span><span class="nv">foo</span> <span class="s">"I'm in pad!"</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'scratch.pad/foo</span>
</code></pre>
<p>Notice the full names of these vars are different: <code>scratch.core/foo</code> vs <code>scratch.pad/foo</code>. You can always refer to a var by its fully qualified name: the namespace, followed by a slash <code>/</code>, followed by the short name.</p>
<p>Inside a namespace, symbols resolve to vars which are defined in that namespace. So in <code>scratch.pad</code>, <code>foo</code> refers to <code>scratch.pad/foo</code>.</p>
<pre><code><span></span><span class="nv">scratch.pad=></span> <span class="nv">foo</span>
<span class="s">"I'm in pad!"</span>
</code></pre>
<p>Namespaces include <code>clojure.core</code> by default, which is where all the standard functions, macros, and special forms come from. <code>let</code>, <code>defn</code>, <code>filter</code>, <code>vector</code>, and so on all live in <code>clojure.core</code>, but are automatically <em>included</em> in new namespaces so we can refer to them by their short names.</p>
<p>Notice that the values for <code>foo</code> we defined in <code>scratch.pad</code> and <code>scratch.core</code> aren’t available in other namespaces, like <code>user</code>.</p>
<pre><code><span></span><span class="nv">scratch.pad=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">user</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">foo</span>
<span class="nv">CompilerException</span> <span class="nv">java.lang.RuntimeException</span><span class="err">:</span> <span class="nv">Unable</span> <span class="nv">to</span> <span class="nb">resolve </span><span class="nv">symbol</span><span class="err">:</span> <span class="nv">foo</span> <span class="nv">in</span> <span class="nv">this</span> <span class="nv">context</span>, <span class="nv">compiling</span><span class="err">:</span><span class="p">(</span><span class="nf">NO_SOURCE_PATH</span><span class="ss">:1:602</span><span class="p">)</span>
</code></pre>
<p>To access things from other namespaces, we have to <em>require</em> them in the namespace definition.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">user</span> <span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">scratch.core</span><span class="p">]))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">scratch.core/foo</span>
<span class="s">"I'm in core"</span>
</code></pre>
<p>The <code>:require</code> part of the <code>ns</code> declaration told the compiler that the <code>user</code> namespace needed access to <code>scratch.core</code>. We could then refer to the fully qualified name <code>scratch.core/foo</code>.</p>
<p>Often, writing out the full namespace is cumbersome–so you can give a short alias for a namespace like so:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">user</span> <span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">scratch.core</span> <span class="ss">:as</span> <span class="nv">c</span><span class="p">]))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">c/foo</span>
<span class="s">"I'm in core"</span>
</code></pre>
<p>The <code>:as</code> directive indicates that anywhere we write <code>c/something</code>, the compiler should expand that to <code>scratch.core/something</code>. If you plan on using a var from another namespace often, you can <em>refer</em> it to the local namespace–which means you may omit the namespace qualifier entirely.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">user</span> <span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">scratch.core</span> <span class="ss">:refer</span> <span class="p">[</span><span class="nv">foo</span><span class="p">]]))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">foo</span>
<span class="s">"I'm in core"</span>
</code></pre>
<p>You can refer functions into the current namespace by listing them: <code>[foo bar ...]</code>. Alternatively, you can suck in <em>every</em> function from another namespace by saying <code>:refer :all</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">ns </span><span class="nv">user</span> <span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">scratch.core</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">foo</span>
<span class="s">"I'm in core"</span>
</code></pre>
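<p>In a source file, these forms are usually combined in a single <code>ns</code> declaration at the top. The following is a small sketch using two namespaces that ship with Clojure, <code>clojure.string</code> and <code>clojure.set</code>; the namespace <code>scratch.report</code> and the function <code>shout-all</code> are hypothetical names chosen for illustration.</p>
<pre><code>(ns scratch.report
  (:require [clojure.string :as str]        ; alias: str/join, str/upper-case
            [clojure.set :refer [union]]))  ; refer: union, with no qualifier

(defn shout-all
  "Upper-cases every distinct word in either set, sorts them, and joins them
  with commas."
  [a b]
  (str/join ", " (sort (map str/upper-case (union a b)))))

(shout-all #{"core"} #{"pad" "core"})
; returns "CORE, PAD"
</code></pre>
<p>The alias keeps it obvious where <code>join</code> and <code>upper-case</code> come from, while <code>union</code> reads like a local function. Which style to prefer is a judgment call; aliases tend to scale better as the list of required namespaces grows.</p>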
<p>Namespaces <em>control complexity</em> by isolating code into more understandable, related pieces. They make it easier to read code by keeping similar things together, and unrelated things apart. By making dependencies between namespaces explicit, they make it clear how groups of functions relate to one another.</p>
<p>If you’ve worked with Erlang, Modula-2, Haskell, Perl, or ML, you’ll find namespaces analogous to <em>modules</em> or <em>packages</em>. Namespaces are often large, encompassing hundreds of functions; and most projects use only a handful of namespaces.</p>
<p>By contrast, object-oriented programming languages like Java, Scala, Ruby, and Objective-C organize code in <em>classes</em>, which combine <em>names</em> and <em>state</em> in a single construct. Because all functions in a class operate on the same state, object-oriented languages tend to have <em>many</em> classes with <em>fewer</em> functions in each. It’s not uncommon for a typical Java project to define hundreds or thousands of classes containing only one or two functions each. If you come from an object-oriented language, it can feel a bit unusual to combine so many functions in a single scope–but because functional programs isolate state differently, this is <em>normal</em>. If, on the other hand, you move <em>to</em> an object-oriented language after Clojure, remember that OO languages compose differently. Objects with hundreds of functions are usually considered unwieldy and should be split into smaller pieces.</p>
<h2><a href="#code-and-tests" id="code-and-tests">Code and tests</a></h2>
<p>It’s perfectly fine to test small programs in the REPL. We’ve written
and refined hundreds of functions that way: by calling the function and seeing
what happens. However, as programs grow in scope and complexity, testing them
by hand becomes harder and harder. If you change the behavior of a function
which ten other functions rely on, you may have to re-test <em>all ten</em> by hand. In real programs, a small change can alter thousands of distinct behaviors, all of which should be verified.</p>
<p>Wherever possible, we want to <em>automate</em> software tests–making the test itself
<em>another program</em>. If the test suite runs in a matter of seconds, we can make
changes freely–re-running the tests continuously to verify that everything
still works.</p>
<p>As a simple example, let’s write and test a single function in <code>src/scratch/core.clj</code>. How about exponentiation–raising a number to the given power?</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.core</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">pow</span>
<span class="s">"Raises base to the given power. For instance, (pow 3 2) returns three squared, or nine."</span>
<span class="p">[</span><span class="nv">base</span> <span class="nv">power</span><span class="p">]</span>
<span class="p">(</span><span class="nb">apply * </span><span class="p">(</span><span class="nb">repeat </span><span class="nv">base</span> <span class="nv">power</span><span class="p">)))</span>
</code></pre>
<p>So we <em>repeat</em> the base <em>power</em> times, then call <code>*</code> with that sequence of bases to multiply them all together. Seems straightforward enough. Now we need to test it.</p>
<p>By default, all lein projects come with a simple test stub. Let’s see it in action by running <code>lein test</code>.</p>
<pre><code><span></span>aphyr@waterhouse:~/scratch$ lein <span class="nb">test</span>
lein <span class="nb">test</span> scratch.core-test
lein <span class="nb">test</span> :only scratch.core-test/a-test
FAIL in <span class="o">(</span>a-test<span class="o">)</span> <span class="o">(</span>core_test.clj:7<span class="o">)</span>
FIXME, I fail.
expected: <span class="o">(=</span> <span class="m">0</span> <span class="m">1</span><span class="o">)</span>
actual: <span class="o">(</span>not <span class="o">(=</span> <span class="m">0</span> <span class="m">1</span><span class="o">))</span>
Ran <span class="m">1</span> tests containing <span class="m">1</span> assertions.
<span class="m">1</span> failures, <span class="m">0</span> errors.
Tests failed.
</code></pre>
<p>A <em>failure</em> is when a test returns the wrong value. An <em>error</em> is when a test throws an exception. In this case, the test named <code>a-test</code>, in the file <code>core_test.clj</code>, on line 7, failed. That test expected zero to be equal to one–but found that 0 and 1 were (in point of fact) not equal. Let’s take a look at that test, in <code>test/scratch/core_test.clj</code>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.core-test</span>
<span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">clojure.test</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]</span>
<span class="p">[</span><span class="nv">scratch.core</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]))</span>
<span class="p">(</span><span class="nf">deftest</span> <span class="nv">a-test</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"FIXME, I fail."</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">0</span> <span class="mi">1</span><span class="p">))))</span>
</code></pre>
<p>These tests live in a namespace too! Notice that namespaces and file names match up: <code>scratch.core</code> lives in <code>src/scratch/core.clj</code>, and <code>scratch.core-test</code> lives in <code>test/scratch/core_test.clj</code>. Dashes (<code>-</code>) in namespaces correspond to underscores (<code>_</code>) in filenames, and dots (<code>.</code>) correspond to directory separators (<code>/</code>).</p>
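<p>The naming rule can be written down as a small function. This is only a sketch of the convention described above, not the code Clojure itself uses to locate files; <code>scratch.paths</code> and <code>ns->path</code> are hypothetical names.</p>
<pre><code>(ns scratch.paths
  (:require [clojure.string :as string]))

(defn ns->path
  "Converts a namespace name, given as a string, to the relative path of the
  file it should live in."
  [namespace-name]
  (str (-> namespace-name
           (string/replace "-" "_")   ; dashes become underscores
           (string/replace "." "/"))  ; dots become directory separators
       ".clj"))

(ns->path "scratch.core")
; returns "scratch/core.clj"
(ns->path "scratch.core-test")
; returns "scratch/core_test.clj"
</code></pre>
<p>The resulting path is relative to a source root such as <code>src</code> or <code>test</code>.</p>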
<p>The <code>scratch.core-test</code> namespace is responsible for testing things in <code>scratch.core</code>. Notice that it requires two namespaces: <code>clojure.test</code>, which provides testing functions and macros, and <code>scratch.core</code>, which is the namespace we want to test.</p>
<p>Then we define a test using <code>deftest</code>. <code>deftest</code> takes a symbol as a test name, and then any number of expressions to evaluate. We can use <code>testing</code> to split up tests into smaller pieces. If a test fails, <code>lein test</code> will print out the enclosing <code>deftest</code> and <code>testing</code> names, to make it easier to figure out what went wrong.</p>
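<p>To make the distinction between failures and errors concrete, here is a sketch of one test with two <code>testing</code> blocks: one whose assertion returns the wrong value, and one whose assertion throws. The namespace and test names are hypothetical. It calls <code>clojure.test/run-tests</code> directly instead of going through <code>lein test</code>, which returns a summary map we can inspect.</p>
<pre><code>(ns scratch.outcomes-test
  (:require [clojure.test :refer [deftest is testing run-tests]]))

(deftest arithmetic-test
  (testing "a failure: the expression returns the wrong value"
    (is (= 5 (+ 2 2))))
  (testing "an error: the expression throws an exception"
    (is (= 1 (/ 1 0)))))

; Runs every test in the named namespace, prints a report, and returns a
; summary map with :test, :pass, :fail, and :error counts.
(def summary (run-tests 'scratch.outcomes-test))

(:fail summary)
; returns 1
(:error summary)
; returns 1
</code></pre>
<p>The printed report names both the <code>deftest</code> and the enclosing <code>testing</code> string for each problem, which is what makes descriptive <code>testing</code> labels worth writing.</p>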
<p>Let’s change this test so that it passes. 0 should equal 0.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">deftest</span> <span class="nv">a-test</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"Numbers are equal to themselves, right?"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">0</span> <span class="mi">0</span><span class="p">))))</span>
</code></pre>
<pre><code><span></span>aphyr@waterhouse:~/scratch$ lein <span class="nb">test</span>
lein <span class="nb">test</span> scratch.core-test
Ran <span class="m">1</span> tests containing <span class="m">1</span> assertions.
<span class="m">0</span> failures, <span class="m">0</span> errors.
</code></pre>
<p>Wonderful! Now let’s test the <code>pow</code> function. I like to start with a really basic case and work my way up to more complicated ones. 1^1 is 1, so:</p>
<pre><code><span></span><span class="p">(</span><span class="nf">deftest</span> <span class="nv">pow-test</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"unity"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">1</span> <span class="p">(</span><span class="nf">pow</span> <span class="mi">1</span> <span class="mi">1</span><span class="p">)))))</span>
</code></pre>
<pre><code><span></span>aphyr@waterhouse:~/scratch$ lein <span class="nb">test</span>
lein <span class="nb">test</span> scratch.core-test
Ran <span class="m">1</span> tests containing <span class="m">1</span> assertions.
<span class="m">0</span> failures, <span class="m">0</span> errors.
</code></pre>
<p>Excellent. How about something harder?</p>
<pre><code><span></span><span class="p">(</span><span class="nf">deftest</span> <span class="nv">pow-test</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"unity"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">1</span> <span class="p">(</span><span class="nf">pow</span> <span class="mi">1</span> <span class="mi">1</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"square integers"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">9</span> <span class="p">(</span><span class="nf">pow</span> <span class="mi">3</span> <span class="mi">2</span><span class="p">)))))</span>
</code></pre>
<pre><code><span></span>aphyr@waterhouse:~/scratch$ lein <span class="nb">test</span>
lein <span class="nb">test</span> scratch.core-test
lein <span class="nb">test</span> :only scratch.core-test/pow-test
FAIL in <span class="o">(</span>pow-test<span class="o">)</span> <span class="o">(</span>core_test.clj:10<span class="o">)</span>
square integers
expected: <span class="o">(=</span> <span class="m">9</span> <span class="o">(</span>pow <span class="m">3</span> <span class="m">2</span><span class="o">))</span>
actual: <span class="o">(</span>not <span class="o">(=</span> <span class="m">9</span> <span class="m">8</span><span class="o">))</span>
Ran <span class="m">1</span> tests containing <span class="m">2</span> assertions.
<span class="m">1</span> failures, <span class="m">0</span> errors.
Tests failed.
</code></pre>
<p>That’s odd. 3^2 should be 9, not 8. Let’s double-check our code in the REPL. <code>base</code> was 3, and <code>power</code> was 2, so…</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">repeat </span><span class="mi">3</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">8</span>
</code></pre>
<p>Ah, there’s the problem. We’re mis-using <code>repeat</code>. Instead of repeating 3 twice, we repeated 2 thrice.</p>
<pre><code>user=> (doc repeat)
-------------------------
clojure.core/repeat
([x] [n x])
Returns a lazy (infinite!, or length n if supplied) sequence of xs.
</code></pre>
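<p>To make the argument order concrete, here’s a quick sketch of <code>repeat</code> both ways around. The count comes first, and the thing to repeat comes second.</p>
<pre><code>(repeat 3 2)            ; => (2 2 2), three copies of 2
(repeat 2 3)            ; => (3 3), two copies of 3
(apply * (repeat 2 3))  ; => 9

;; The one-argument form is infinite, so take only what you need
(take 4 (repeat :x))    ; => (:x :x :x :x)
</code></pre>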
<p>Let’s redefine <code>pow</code> with the correct arguments to <code>repeat</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">pow</span>
<span class="s">"Raises base to the given power. For instance, (pow 3 2) returns three</span>
<span class="s"> squared, or nine."</span>
<span class="p">[</span><span class="nv">base</span> <span class="nv">power</span><span class="p">]</span>
<span class="p">(</span><span class="nb">apply * </span><span class="p">(</span><span class="nb">repeat </span><span class="nv">power</span> <span class="nv">base</span><span class="p">)))</span>
</code></pre>
<p>How about 0^0? By convention, mathematicians define 0^0 as 1.</p>
<pre><code><span></span><span class="p">(</span><span class="nf">deftest</span> <span class="nv">pow-test</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"unity"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">1</span> <span class="p">(</span><span class="nf">pow</span> <span class="mi">1</span> <span class="mi">1</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"square integers"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">9</span> <span class="p">(</span><span class="nf">pow</span> <span class="mi">3</span> <span class="mi">2</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">testing</span> <span class="s">"0^0"</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="mi">1</span> <span class="p">(</span><span class="nf">pow</span> <span class="mi">0</span> <span class="mi">0</span><span class="p">)))))</span>
</code></pre>
<pre><code>aphyr@waterhouse:~/scratch$ lein test
lein test scratch.core-test
Ran 1 tests containing 3 assertions.
0 failures, 0 errors.
</code></pre>
<p>Hey, what do you know? It works! But <em>why</em>?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">repeat </span><span class="mi">0</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">()</span>
</code></pre>
<p>What happens when we call <code>*</code> with an <em>empty</em> list of arguments?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">*</span><span class="p">)</span>
<span class="mi">1</span>
</code></pre>
<p>Remember when we talked about how the zero-argument forms of <code>+</code> and <code>*</code> made some definitions simpler? This is one of those times. We didn’t have to define a special case for zero powers because <code>(*)</code> returns the multiplicative identity 1, by convention.</p>
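<p>A short sketch of those identity values in action. <code>pow</code> here is the corrected definition from earlier in this section; the inputs are just examples.</p>
<pre><code>(+)            ; => 0, the additive identity
(*)            ; => 1, the multiplicative identity
(apply + [])   ; => 0
(apply * [])   ; => 1
(reduce * [])  ; => 1, since reduce calls (*) when the collection is empty

(defn pow
  "Raises base to the given power."
  [base power]
  (apply * (repeat power base)))

(pow 5 0)   ; => 1, because (repeat 0 5) is empty
(pow 2 10)  ; => 1024
</code></pre>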
<h2><a href="#exploring-data" id="exploring-data">Exploring data</a></h2>
<p>The last bit of logistics we need to talk about is <em>working with other people’s code</em>. Clojure projects, like most modern programming environments, are built to work together. We can use libraries to parse data, solve mathematical problems, render graphics, perform simulations, talk to robots, or predict the weather. As a quick example, I’d like to imagine that you and I are public-health researchers trying to identify the best location for an ad campaign to reduce drunk driving.</p>
<p>The FBI’s <a href="http://www.fbi.gov/about-us/cjis/ucr/ucr">Uniform Crime Reporting</a> database tracks the annual tally of different types of arrests, broken down by county–but the data files provided by the FBI are a mess to work with. Helpfully, <a href="http://emdasheveryone.wordpress.com/">Matt Aliabadi</a> has organized the UCR’s somewhat complex format into nice, normalized files in a data format called JSON, and made them available <a href="https://github.com/maliabadi/ucr-json">on GitHub</a>. Let’s download the most recent year’s <a href="https://raw2.github.com/maliabadi/ucr-json/master/data/parsed/normalized/2008.json">normalized data</a>, and save it in the <code>scratch</code> directory.</p>
<p>What’s <em>in</em> this file, anyway? Let’s take a look at the first few lines using <code>head</code>:</p>
<pre><code><span></span><span class="err">aphyr@waterhouse:~/scratch$</span> <span class="err">head</span> <span class="mi">2008</span><span class="err">.json</span>
<span class="p">[</span>
<span class="p">{</span>
<span class="nt">"icpsr_study_number"</span><span class="p">:</span> <span class="kc">null</span><span class="p">,</span>
<span class="nt">"icpsr_edition_number"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
<span class="nt">"icpsr_part_number"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
<span class="nt">"icpsr_sequential_case_id_number"</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span>
<span class="nt">"fips_state_code"</span><span class="p">:</span> <span class="s2">"01"</span><span class="p">,</span>
<span class="nt">"fips_county_code"</span><span class="p">:</span> <span class="s2">"001"</span><span class="p">,</span>
<span class="nt">"county_population"</span><span class="p">:</span> <span class="mi">52417</span><span class="p">,</span>
<span class="nt">"number_of_agencies_in_county"</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span>
</code></pre>
<p>This is a data format called <a href="http://json.org/">JSON</a>, and it looks a lot like Clojure’s data structures. That’s the start of a vector on the first line, and the second line starts a map. Then we’ve got string keys like <code>"icpsr_study_number"</code>, and values which look like <code>null</code> (<code>nil</code>), numbers, or strings. But in order to <em>work</em> with this file, we’ll need to <em>parse</em> it into Clojure data structures. For that, we can use a JSON parsing library, like <a href="https://github.com/dakrone/cheshire">Cheshire</a>.</p>
<p>Cheshire, like most Clojure libraries, is published on an internet repository called <a href="http://clojars.org">Clojars</a>. To include it in our scratch project, all we have to do is open <code>project.clj</code> in a text editor, and add a line specifying that we want to use a particular version of Cheshire.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defproject </span><span class="nv">scratch</span> <span class="s">"0.1.0-SNAPSHOT"</span>
<span class="ss">:description</span> <span class="s">"Just playing around"</span>
<span class="ss">:url</span> <span class="s">"http://example.com/FIXME"</span>
<span class="ss">:license</span> <span class="p">{</span><span class="ss">:name</span> <span class="s">"Eclipse Public License"</span>
<span class="ss">:url</span> <span class="s">"http://www.eclipse.org/legal/epl-v10.html"</span><span class="p">}</span>
<span class="ss">:dependencies</span> <span class="p">[[</span><span class="nv">org.clojure/clojure</span> <span class="s">"1.5.1"</span><span class="p">]</span>
<span class="p">[</span><span class="nv">cheshire</span> <span class="s">"5.3.1"</span><span class="p">]])</span>
</code></pre>
<p>Now we’ll exit the REPL with Control+D (^D), and restart it with <code>lein repl</code>. Leiningen, the Clojure package manager, will automatically download Cheshire from Clojars and make it available in the new REPL session.</p>
<p>Now let’s figure out how to parse the JSON file. Looking at <a href="https://github.com/dakrone/cheshire">Cheshire’s README</a> shows an example that looks helpful:</p>
<pre><code><span></span><span class="c1">;; parse some json and get keywords back</span>
<span class="p">(</span><span class="nf">parse-string</span> <span class="s">"{\"foo\":\"bar\"}"</span> <span class="nv">true</span><span class="p">)</span>
<span class="c1">;; => {:foo "bar"}</span>
</code></pre>
<p>So Cheshire includes a <code>parse-string</code> function which can take a string and return a data structure. How can we get a string out of a file? Using <code>slurp</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'cheshire.core</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="s">"2008.json"</span><span class="p">))</span>
<span class="nv">...</span>
</code></pre>
<p>Woooow, that’s a lot of data! Let’s chop it down to something more manageable. How about the first entry?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">first </span><span class="p">(</span><span class="nf">parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="s">"2008.json"</span><span class="p">)))</span>
<span class="p">{</span><span class="s">"syntheticdrug_salemanufacture"</span> <span class="mi">1</span>, <span class="s">"all_other_offenses_except_traffic"</span> <span class="mi">900</span>, <span class="s">"arson"</span> <span class="mi">3</span>, <span class="nv">...</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">-> </span><span class="s">"2008.json"</span> <span class="nb">slurp </span><span class="nv">parse-string</span> <span class="nv">first</span><span class="p">)</span>
</code></pre>
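<p>That last form uses the <code>-></code> threading macro, which passes each result along as the first argument to the next function. Here’s a small sketch of the same idea on a string, using functions from <code>clojure.string</code> so it runs without any extra libraries:</p>
<pre><code>(require '[clojure.string :as str])

;; Nested calls read from the inside out...
(str/upper-case (str/trim "  hello  "))   ; => "HELLO"

;; ...while -> reads left to right: trim first, then upper-case
(-> "  hello  " str/trim str/upper-case)  ; => "HELLO"
</code></pre>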
<p>It’d be nicer if this data used keywords instead of strings for its keys. Let’s use the second argument to Cheshire’s <code>parse-string</code> to convert all the keys in maps to keywords.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">first </span><span class="p">(</span><span class="nf">parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="s">"2008.json"</span><span class="p">)</span> <span class="nv">true</span><span class="p">))</span>
<span class="p">{</span><span class="ss">:other_assaults</span> <span class="mi">288</span>, <span class="ss">:gambling_all_other</span> <span class="mi">0</span>, <span class="ss">:arson</span> <span class="mi">3</span>, <span class="nv">...</span> <span class="ss">:drunkenness</span> <span class="mi">108</span><span class="p">}</span>
</code></pre>
<p>Since we’re going to be working with this dataset over and over again, let’s bind it to a variable for easy re-use.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">data</span> <span class="p">(</span><span class="nf">parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="s">"2008.json"</span><span class="p">)</span> <span class="nv">true</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/data</span>
</code></pre>
<p>Now we’ve got a big long vector of counties, each represented by a map–but we’re just interested in the <em>DUIs</em> of each one. What does that look like? Let’s <em>map</em> each county to its <code>:driving_under_influence</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">map </span><span class="ss">:driving_under_influence</span><span class="p">))</span>
<span class="p">(</span><span class="mi">198</span> <span class="mi">1095</span> <span class="mi">114</span> <span class="mi">98</span> <span class="mi">135</span> <span class="mi">4</span> <span class="mi">122</span> <span class="mi">587</span> <span class="mi">204</span> <span class="mi">53</span> <span class="mi">177</span> <span class="nv">...</span>
</code></pre>
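<p>This works because keywords act as functions that look themselves up in a map. A tiny sketch, using invented records in place of the real county data:</p>
<pre><code>;; Made-up records standing in for the real county maps
(def counties
  [{:name "A" :driving_under_influence 198}
   {:name "B" :driving_under_influence 1095}
   {:name "C" :driving_under_influence 114}])

;; A keyword called on a map returns the value stored under that key
(:driving_under_influence (first counties))    ; => 198

;; These two forms are equivalent; ->> passes counties as the last argument
(map :driving_under_influence counties)        ; => (198 1095 114)
(->> counties (map :driving_under_influence))  ; => (198 1095 114)
</code></pre>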
<p>What’s the most any county reported?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">map </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="p">(</span><span class="nb">apply </span><span class="nv">max</span><span class="p">))</span>
<span class="mi">45056</span>
</code></pre>
<p>45056 counts in one year? Wow! What about the second-worst county? The easiest way to find the <em>top n</em> counties is to <em>sort</em> the list, then look at the final elements.</p>
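<p>On a handful of made-up numbers, the sort-then-take idea looks like this. Sorting in descending order and taking from the front is an alternative that gives the same elements in the opposite order.</p>
<pre><code>;; Made-up counts, just for illustration
(def counts [198 1095 114 98 135 4])

(sort counts)                    ; => (4 98 114 135 198 1095)
(->> counts sort (take-last 2))  ; => (198 1095)

;; Alternative: sort descending with > as the comparator, then take the first two
(->> counts (sort >) (take 2))   ; => (1095 198)
</code></pre>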
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">map </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="nb">sort </span><span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">))</span>
<span class="p">(</span><span class="mi">8589</span> <span class="mi">10432</span> <span class="mi">10443</span> <span class="mi">10814</span> <span class="mi">11439</span> <span class="mi">13983</span> <span class="mi">17572</span> <span class="mi">18562</span> <span class="mi">26235</span> <span class="mi">45056</span><span class="p">)</span>
</code></pre>
<p>So the top 10 counties range from 8589 counts to 45056 counts. What’s the <em>most common</em> count? Clojure comes with a built-in function called <code>frequencies</code> which takes a sequence of elements, and returns a map from each element to how many times it appeared in the sequence.</p>
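<p>A quick sketch of <code>frequencies</code> on a short, invented sequence. The last line is one possible way to pull out the most common element, using <code>max-key</code> to find the entry with the largest value.</p>
<pre><code>;; Made-up input: 0 appears three times, 3 twice, 1 once
(def freqs (frequencies [0 3 0 1 3 0]))

(= freqs {0 3, 3 2, 1 1})    ; => true (the printed key order isn't guaranteed)
(get freqs 0)                ; => 3

;; The [element count] pair with the highest count
(apply max-key val freqs)    ; => [0 3]
</code></pre>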
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">map </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="nv">frequencies</span><span class="p">)</span>
<span class="p">{</span><span class="mi">0</span> <span class="mi">227</span>, <span class="mi">1024</span> <span class="mi">1</span>, <span class="mi">45056</span> <span class="mi">1</span>, <span class="mi">32</span> <span class="mi">15</span>, <span class="mi">2080</span> <span class="mi">1</span>, <span class="mi">64</span> <span class="mi">12</span> <span class="nv">...</span>
</code></pre>
<p>Now let’s take those [drunk-driving, frequency] pairs and sort them by key to produce a <em>histogram</em>. <code>sort-by</code> takes a function and a collection. It applies that function to each element (in this case, a key-value pair) to get something that can be compared, like a number, and sorts the elements by those results. We’ll choose the <code>key</code> function to extract the key from each key-value pair, effectively sorting the pairs by number of reported incidents.</p>
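<p>Here’s the same idea on a tiny hand-written frequency map. Sorting by <code>key</code> orders the pairs by the thing being counted; sorting by <code>val</code> would order them by how often it occurred.</p>
<pre><code>;; Made-up frequency map: element -> number of occurrences
(def small-freqs {3 2, 0 3, 1 1})

(sort-by key small-freqs)  ; => ([0 3] [1 1] [3 2])
(sort-by val small-freqs)  ; => ([1 1] [3 2] [0 3])
</code></pre>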
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">map </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="nv">frequencies</span> <span class="p">(</span><span class="nb">sort-by </span><span class="nv">key</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">([</span><span class="mi">0</span> <span class="mi">227</span><span class="p">]</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">24</span><span class="p">]</span>
<span class="p">[</span><span class="mi">2</span> <span class="mi">17</span><span class="p">]</span>
<span class="p">[</span><span class="mi">3</span> <span class="mi">20</span><span class="p">]</span>
<span class="p">[</span><span class="mi">4</span> <span class="mi">17</span><span class="p">]</span>
<span class="p">[</span><span class="mi">5</span> <span class="mi">24</span><span class="p">]</span>
<span class="p">[</span><span class="mi">6</span> <span class="mi">23</span><span class="p">]</span>
<span class="p">[</span><span class="mi">7</span> <span class="mi">23</span><span class="p">]</span>
<span class="p">[</span><span class="mi">8</span> <span class="mi">17</span><span class="p">]</span>
<span class="p">[</span><span class="mi">9</span> <span class="mi">19</span><span class="p">]</span>
<span class="p">[</span><span class="mi">10</span> <span class="mi">29</span><span class="p">]</span>
<span class="p">[</span><span class="mi">11</span> <span class="mi">20</span><span class="p">]</span>
<span class="p">[</span><span class="mi">12</span> <span class="mi">18</span><span class="p">]</span>
<span class="p">[</span><span class="mi">13</span> <span class="mi">21</span><span class="p">]</span>
<span class="p">[</span><span class="mi">14</span> <span class="mi">25</span><span class="p">]</span>
<span class="p">[</span><span class="mi">15</span> <span class="mi">13</span><span class="p">]</span>
<span class="p">[</span><span class="mi">16</span> <span class="mi">18</span><span class="p">]</span>
<span class="p">[</span><span class="mi">17</span> <span class="mi">16</span><span class="p">]</span>
<span class="p">[</span><span class="mi">18</span> <span class="mi">17</span><span class="p">]</span>
<span class="p">[</span><span class="mi">19</span> <span class="mi">11</span><span class="p">]</span>
<span class="p">[</span><span class="mi">20</span> <span class="mi">8</span><span class="p">]</span>
<span class="nv">...</span>
</code></pre>
<p>So a ton of counties (227 out of 3172 total) report no drunk driving; roughly one to two dozen counties report each count from 1 through the high teens, and it falls off from there. This is a common sort of shape in statistics, often a hallmark of an exponential distribution.</p>
<p>How about the 10 worst counties, all the way out on the end of the curve?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">map </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="nv">frequencies</span> <span class="p">(</span><span class="nb">sort-by </span><span class="nv">key</span><span class="p">)</span> <span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">([</span><span class="mi">8589</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">10432</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">10443</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">10814</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">11439</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">13983</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">17572</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">18562</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">26235</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">[</span><span class="mi">45056</span> <span class="mi">1</span><span class="p">])</span>
</code></pre>
<p>So it looks like 45056 is high, but there are 8 other counties with tens of thousands of reports too. Let’s back up to the original dataset, and sort it by DUIs:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">sort-by </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">({</span><span class="ss">:other_assaults</span> <span class="mi">3096</span>,
<span class="ss">:gambling_all_other</span> <span class="mi">3</span>,
<span class="ss">:arson</span> <span class="mi">106</span>,
<span class="ss">:have_stolen_property</span> <span class="mi">698</span>,
<span class="ss">:syntheticdrug_salemanufacture</span> <span class="mi">0</span>,
<span class="ss">:icpsr_sequential_case_id_number</span> <span class="mi">220</span>,
<span class="ss">:drug_abuse_salemanufacture</span> <span class="mi">1761</span>,
<span class="nv">...</span>
</code></pre>
<p>What we’re looking for is the county names, but it’s a little hard to read these enormous maps. Let’s take a look at just the keys that define each county, and see which ones might be useful. We’ll take this list of counties, map each one to a list of its keys, and concatenate those lists together into one big long list. <code>mapcat</code> maps and concatenates in a single step. We expect the same keys to show up over and over again, so we’ll remove duplicates by merging all those keys <code>into</code> a <code>sorted-set</code>.</p>
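<p>Before running this on the real data, here’s a sketch of the same steps on two invented maps, so each stage is easy to see:</p>
<pre><code>;; Made-up records with overlapping keys
(def records [{:b 1 :a 2} {:c 3 :a 4}])

;; map gives one list of keys per record; mapcat flattens them into one list.
;; The order of keys within each map isn't something to rely on.
(map keys records)     ; e.g. ((:b :a) (:c :a))
(mapcat keys records)  ; e.g. (:b :a :c :a)

;; Pouring the keys into a sorted set removes duplicates and orders them
(->> records (mapcat keys) (into (sorted-set)))  ; => #{:a :b :c}
</code></pre>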
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">sort-by </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span> <span class="p">(</span><span class="nb">mapcat </span><span class="nv">keys</span><span class="p">)</span> <span class="p">(</span><span class="nb">into </span><span class="p">(</span><span class="nf">sorted-set</span><span class="p">))</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:aggravated_assaults</span> <span class="ss">:all_other_offenses_except_traffic</span> <span class="ss">:arson</span>
<span class="ss">:auto_thefts</span> <span class="ss">:bookmaking_horsesport</span> <span class="ss">:burglary</span> <span class="ss">:county_population</span>
<span class="ss">:coverage_indicator</span> <span class="ss">:curfew_loitering_laws</span> <span class="ss">:disorderly_conduct</span>
<span class="ss">:driving_under_influence</span> <span class="ss">:drug_abuse_salemanufacture</span>
<span class="ss">:drug_abuse_violationstotal</span> <span class="ss">:drug_possession_other</span>
<span class="ss">:drug_possession_subtotal</span> <span class="ss">:drunkenness</span> <span class="ss">:embezzlement</span>
<span class="ss">:fips_county_code</span> <span class="ss">:fips_state_code</span> <span class="ss">:forgerycounterfeiting</span> <span class="ss">:fraud</span>
<span class="ss">:gambling_all_other</span> <span class="ss">:gambling_total</span> <span class="ss">:grand_total</span>
<span class="ss">:have_stolen_property</span> <span class="ss">:icpsr_edition_number</span> <span class="ss">:icpsr_part_number</span>
<span class="ss">:icpsr_sequential_case_id_number</span> <span class="ss">:icpsr_study_number</span> <span class="ss">:larceny</span>
<span class="ss">:liquor_law_violations</span> <span class="ss">:marijuana_possession</span>
<span class="ss">:marijuanasalemanufacture</span> <span class="ss">:multicounty_jurisdiction_flag</span> <span class="ss">:murder</span>
<span class="ss">:number_of_agencies_in_county</span> <span class="ss">:numbers_lottery</span>
<span class="ss">:offenses_against_family_child</span> <span class="ss">:opiumcocaine_possession</span>
<span class="ss">:opiumcocainesalemanufacture</span> <span class="ss">:other_assaults</span> <span class="ss">:otherdang_nonnarcotics</span>
<span class="ss">:part_1_total</span> <span class="ss">:property_crimes</span> <span class="ss">:prostitutioncomm_vice</span> <span class="ss">:rape</span> <span class="ss">:robbery</span>
<span class="ss">:runaways</span> <span class="ss">:sex_offenses</span> <span class="ss">:suspicion</span> <span class="ss">:synthetic_narcoticspossession</span>
<span class="ss">:syntheticdrug_salemanufacture</span> <span class="ss">:vagrancy</span> <span class="ss">:vandalism</span> <span class="ss">:violent_crimes</span>
<span class="ss">:weapons_violations</span><span class="p">}</span>
</code></pre>
<p>Ah, <code>:fips_county_code</code> and <code>:fips_state_code</code> look promising. Let’s compact the dataset to just drunk driving and those codes, using <code>select-keys</code>.</p>
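<p>As a quick sketch of what <code>select-keys</code> does, here it is on a small invented map. Keys that aren’t present are simply left out of the result.</p>
<pre><code>;; Made-up record, just for illustration
(def record {:id 7 :name "example" :arson 3 :fraud 12})

(select-keys record [:id :name])     ; => {:id 7, :name "example"}
(select-keys record [:id :missing])  ; => {:id 7}

;; Applied to every map in a collection
(map #(select-keys % [:id]) [record {:id 8 :arson 0}])  ; => ({:id 7} {:id 8})
</code></pre>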
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">data</span> <span class="p">(</span><span class="nb">sort-by </span><span class="ss">:driving_under_influence</span><span class="p">)</span> <span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span> <span class="p">(</span><span class="nb">map </span><span class="o">#</span><span class="p">(</span><span class="nb">select-keys </span><span class="nv">%</span> <span class="p">[</span><span class="ss">:driving_under_influence</span> <span class="ss">:fips_county_code</span> <span class="ss">:fips_state_code</span><span class="p">]))</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">({</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"067"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">8589</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"48"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"201"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">10432</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"32"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"003"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">10443</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"065"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">10814</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"53"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"033"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">11439</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"071"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">13983</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"059"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">17572</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"073"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">18562</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"04"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"013"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">26235</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"037"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">45056</span><span class="p">})</span>
</code></pre>
<p>Now we’re getting somewhere–but we need a way to interpret these state and county codes. Googling for “FIPS” led me to Wikipedia’s account of the <a href="http://en.wikipedia.org/wiki/FIPS_county_code">FIPS county code system</a>, and the NOAA’s ERDDAP service, which has a table <a href="http://coastwatch.pfeg.noaa.gov/erddap/convert/fipscounty.html">mapping FIPS codes to state and county names</a>. Let’s save that file as <a href="http://coastwatch.pfeg.noaa.gov/erddap/convert/fipscounty.json">fips.json</a>.</p>
<p>Now we’ll slurp that file into the REPL and parse it, just like we did with the crime dataset.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">fips</span> <span class="p">(</span><span class="nf">parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="s">"fips.json"</span><span class="p">)</span> <span class="nv">true</span><span class="p">))</span>
</code></pre>
<p>Let’s take a quick look at the structure of this data:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">keys </span><span class="nv">fips</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:table</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">keys </span><span class="p">(</span><span class="ss">:table</span> <span class="nv">fips</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:columnNames</span> <span class="ss">:columnTypes</span> <span class="ss">:rows</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">fips</span> <span class="ss">:table</span> <span class="ss">:columnNames</span><span class="p">)</span>
<span class="p">[</span><span class="s">"FIPS"</span> <span class="s">"Name"</span><span class="p">]</span>
</code></pre>
<p>Great, so we expect each row to be a pair: a FIPS code, followed by a name.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="nv">fips</span> <span class="ss">:table</span> <span class="ss">:rows</span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span><span class="p">)</span> <span class="nv">pprint</span><span class="p">)</span>
<span class="p">([</span><span class="s">"02000"</span> <span class="s">"AK"</span><span class="p">]</span>
<span class="p">[</span><span class="s">"02013"</span> <span class="s">"AK, Aleutians East"</span><span class="p">]</span>
<span class="p">[</span><span class="s">"02016"</span> <span class="s">"AK, Aleutians West"</span><span class="p">])</span>
</code></pre>
<p>Perfect. Now that we’ve done some exploratory work in the REPL, let’s shift back to an editor. Create a new file in <code>src/scratch/crime.clj</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.crime</span>
<span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">cheshire.core</span> <span class="ss">:as</span> <span class="nv">json</span><span class="p">]))</span>
<span class="p">(</span><span class="k">def </span><span class="nv">fips</span>
<span class="s">"A map of FIPS codes to their county names."</span>
<span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nf">json/parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="s">"fips.json"</span><span class="p">)</span> <span class="nv">true</span><span class="p">)</span>
<span class="ss">:table</span>
<span class="ss">:rows</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{})))</span>
</code></pre>
<p>We’re just taking a snippet we wrote in the REPL–parsing the FIPS dataset–and writing it down for use as a part of a bigger program. We use <code>(into {})</code> to convert the sequence of <code>[fips, name]</code> pairs into a map, just like we used <code>into (sorted-set)</code> to merge a list of keywords into a set earlier. <code>into</code> works just like <code>conj</code>, repeated over and over again, and is an incredibly useful tool for building up collections of things.</p>
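<p>To see that equivalence concretely, here’s a small sketch we can try at the REPL. The <code>sample-rows</code> vector is a made-up stand-in for the real <code>:rows</code> data, not something from the FIPS file itself:</p>
<pre><code>;; Hypothetical sample rows, shaped like the [fips, name] pairs above.
(def sample-rows [["02013" "AK, Aleutians East"]
                  ["10001" "DE, Kent"]])

;; into adds each pair to the empty map, one at a time...
(into {} sample-rows)
;; => {"02013" "AK, Aleutians East", "10001" "DE, Kent"}

;; ...which gives the same result as reducing over the rows with conj.
(= (into {} sample-rows)
   (reduce conj {} sample-rows))
;; => true
</code></pre>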
<p>Back in the REPL, let’s check if that worked:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.crime</span> <span class="ss">:reload</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">fips</span> <span class="s">"10001"</span><span class="p">)</span>
<span class="s">"DE, Kent"</span>
</code></pre>
<p>Remember, maps act like functions in Clojure, so we can use the <code>fips</code> map to look up the names of counties efficiently.</p>
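<p>It’s worth knowing what happens when a key is missing, since not every code in a dataset is guaranteed to appear in the table. A quick sketch, using a hypothetical two-entry <code>names</code> map in place of the full <code>fips</code> map:</p>
<pre><code>;; A hypothetical two-entry map standing in for the full fips table.
(def names {"10001" "DE, Kent"
            "06037" "CA, Los Angeles"})

(names "10001")            ; => "DE, Kent"
(names "99999")            ; => nil, since there is no such key
(names "99999" "unknown")  ; => "unknown": an optional second argument is the default
(get names "10001")        ; => "DE, Kent", the same lookup written with get
</code></pre>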
<p>We also have to load the UCR crime file–so let’s split that load-and-parse code into its own function:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">load-json</span>
<span class="s">"Given a filename, reads a JSON file and returns it, parsed, with keywords."</span>
<span class="p">[</span><span class="nv">file</span><span class="p">]</span>
<span class="p">(</span><span class="nf">json/parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="nv">file</span><span class="p">)</span> <span class="nv">true</span><span class="p">))</span>
<span class="p">(</span><span class="k">def </span><span class="nv">fips</span>
<span class="s">"A map of FIPS codes to their county names."</span>
<span class="p">(</span><span class="nf">->></span> <span class="s">"fips.json"</span>
<span class="nv">load-json</span>
<span class="ss">:table</span>
<span class="ss">:rows</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{})))</span>
</code></pre>
<p>Now we can re-use <code>load-json</code> to load the UCR crime file.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">most-duis</span>
<span class="s">"Given a JSON filename of UCR crime data for a particular year, finds the</span>
<span class="s"> counties with the most DUIs."</span>
<span class="p">[</span><span class="nv">file</span><span class="p">]</span>
<span class="p">(</span><span class="nf">->></span> <span class="nv">file</span>
<span class="nv">load-json</span>
<span class="p">(</span><span class="nb">sort-by </span><span class="ss">:driving_under_influence</span><span class="p">)</span>
<span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span>
<span class="p">(</span><span class="nb">map </span><span class="o">#</span><span class="p">(</span><span class="nb">select-keys </span><span class="nv">%</span> <span class="p">[</span><span class="ss">:driving_under_influence</span>
<span class="ss">:fips_county_code</span>
<span class="ss">:fips_state_code</span><span class="p">]))))</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.crime</span> <span class="ss">:reload</span><span class="p">)</span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nf">most-duis</span> <span class="s">"2008.json"</span><span class="p">))</span>
<span class="nv">nil</span>
<span class="p">({</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"067"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">8589</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"48"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"201"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">10432</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"32"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"003"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">10443</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"065"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">10814</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"53"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"033"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">11439</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"071"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">13983</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"059"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">17572</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"073"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">18562</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"04"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"013"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">26235</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"06"</span>,
<span class="ss">:fips_county_code</span> <span class="s">"037"</span>,
<span class="ss">:driving_under_influence</span> <span class="mi">45056</span><span class="p">})</span>
</code></pre>
<p>Almost there. We need to join together the state and county FIPS codes into a single string, because that’s how <code>fips</code> represents the county code:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">fips-code</span>
<span class="s">"Given a county (a map with :fips_state_code and :fips_county_code keys),</span>
<span class="s"> returns the five-digit FIPS code for the county, as a string."</span>
<span class="p">[</span><span class="nv">county</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="p">(</span><span class="ss">:fips_state_code</span> <span class="nv">county</span><span class="p">)</span> <span class="p">(</span><span class="ss">:fips_county_code</span> <span class="nv">county</span><span class="p">)))</span>
</code></pre>
<p>Let’s write a quick test in <code>test/scratch/crime_test.clj</code> to verify that function works correctly:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.crime-test</span>
<span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">clojure.test</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]</span>
<span class="p">[</span><span class="nv">scratch.crime</span> <span class="ss">:refer</span> <span class="ss">:all</span><span class="p">]))</span>
<span class="p">(</span><span class="nf">deftest</span> <span class="nv">fips-code-test</span>
<span class="p">(</span><span class="nf">is</span> <span class="p">(</span><span class="nb">= </span><span class="s">"12345"</span> <span class="p">(</span><span class="nf">fips-code</span> <span class="p">{</span><span class="ss">:fips_state_code</span> <span class="s">"12"</span> <span class="ss">:fips_county_code</span> <span class="s">"345"</span><span class="p">}))))</span>
</code></pre>
<pre><code>aphyr@waterhouse:~/scratch$ lein test scratch.crime-test
lein test scratch.crime-test
Ran 1 tests containing 1 assertions.
0 failures, 0 errors.
</code></pre>
<p>Great. Note that <code>lein test some-namespace</code> runs only the tests in that particular namespace. For the last step, let’s take the <code>most-duis</code> function and, using <code>fips</code> and <code>fips-code</code>, construct a map of county names to DUI reports.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">most-duis</span>
<span class="s">"Given a JSON filename of UCR crime data for a particular year, finds the</span>
<span class="s"> counties with the most DUIs."</span>
<span class="p">[</span><span class="nv">file</span><span class="p">]</span>
<span class="p">(</span><span class="nf">->></span> <span class="nv">file</span>
<span class="nv">load-json</span>
<span class="p">(</span><span class="nb">sort-by </span><span class="ss">:driving_under_influence</span><span class="p">)</span>
<span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">county</span><span class="p">]</span>
<span class="p">[(</span><span class="nf">fips</span> <span class="p">(</span><span class="nf">fips-code</span> <span class="nv">county</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:driving_under_influence</span> <span class="nv">county</span><span class="p">)]))</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{})))</span>
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">use</span> <span class="ss">'scratch.crime</span> <span class="ss">:reload</span><span class="p">)</span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nf">most-duis</span> <span class="s">"2008.json"</span><span class="p">))</span>
<span class="nv">nil</span>
<span class="p">{</span><span class="s">"CA, Orange"</span> <span class="mi">17572</span>,
<span class="s">"CA, San Bernardino"</span> <span class="mi">13983</span>,
<span class="s">"CA, Los Angeles"</span> <span class="mi">45056</span>,
<span class="s">"CA, Riverside"</span> <span class="mi">10814</span>,
<span class="s">"NV, Clark"</span> <span class="mi">10443</span>,
<span class="s">"WA, King"</span> <span class="mi">11439</span>,
<span class="s">"AZ, Maricopa"</span> <span class="mi">26235</span>,
<span class="s">"CA, San Diego"</span> <span class="mi">18562</span>,
<span class="s">"TX, Harris"</span> <span class="mi">10432</span>,
<span class="s">"CA, Sacramento"</span> <span class="mi">8589</span><span class="p">}</span>
</code></pre>
<p>Our question is, at least in part, answered: Los Angeles and Maricopa counties, in California and Arizona, have the most reports of drunk driving out of any counties in the 2008 Uniform Crime Reporting database. These might be good counties for a PSA campaign. California has either lots of drunk drivers, or aggressive enforcement, or both! Remember, this only tells us about <em>reports</em> of crimes; not the crimes themselves. Numbers vary based on how the state enforces the laws!</p>
<pre><code><span></span><span class="p">(</span><span class="kd">ns </span><span class="nv">scratch.crime</span>
<span class="p">(</span><span class="ss">:require</span> <span class="p">[</span><span class="nv">cheshire.core</span> <span class="ss">:as</span> <span class="nv">json</span><span class="p">]))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">load-json</span>
<span class="s">"Given a filename, reads a JSON file and returns it, parsed, with keywords."</span>
<span class="p">[</span><span class="nv">file</span><span class="p">]</span>
<span class="p">(</span><span class="nf">json/parse-string</span> <span class="p">(</span><span class="nb">slurp </span><span class="nv">file</span><span class="p">)</span> <span class="nv">true</span><span class="p">))</span>
<span class="p">(</span><span class="k">def </span><span class="nv">fips</span>
<span class="s">"A map of FIPS codes to their county names."</span>
<span class="p">(</span><span class="nf">->></span> <span class="s">"fips.json"</span>
<span class="nv">load-json</span>
<span class="ss">:table</span>
<span class="ss">:rows</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{})))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">fips-code</span>
<span class="s">"Given a county (a map with :fips_state_code and :fips_county_code keys),</span>
<span class="s"> returns the five-digit FIPS code for the county, as a string."</span>
<span class="p">[</span><span class="nv">county</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="p">(</span><span class="ss">:fips_state_code</span> <span class="nv">county</span><span class="p">)</span> <span class="p">(</span><span class="ss">:fips_county_code</span> <span class="nv">county</span><span class="p">)))</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">most-duis</span>
<span class="s">"Given a JSON filename of UCR crime data for a particular year, finds the</span>
<span class="s"> counties with the most DUIs."</span>
<span class="p">[</span><span class="nv">file</span><span class="p">]</span>
<span class="p">(</span><span class="nf">->></span> <span class="nv">file</span>
<span class="nv">load-json</span>
<span class="p">(</span><span class="nb">sort-by </span><span class="ss">:driving_under_influence</span><span class="p">)</span>
<span class="p">(</span><span class="nf">take-last</span> <span class="mi">10</span><span class="p">)</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">county</span><span class="p">]</span>
<span class="p">[(</span><span class="nf">fips</span> <span class="p">(</span><span class="nf">fips-code</span> <span class="nv">county</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:driving_under_influence</span> <span class="nv">county</span><span class="p">)]))</span>
<span class="p">(</span><span class="nb">into </span><span class="p">{})))</span>
</code></pre>
<h2><a href="#recap" id="recap">Recap</a></h2>
<p>In this chapter, we expanded beyond transient programs written in the REPL. We learned how <em>projects</em> combine static resources, code, and tests into a single package, and how projects can relate to one another through <em>dependencies</em>. We learned the basics of Clojure’s namespace system, which isolates distinct chunks of code from one another, and how to include definitions from one namespace in another via <code>require</code> and <code>use</code>. We learned how to write and run <em>tests</em> to verify our code’s correctness, and how to move seamlessly between the REPL and code in <code>.clj</code> files. We made use of Cheshire, a Clojure library published on Clojars, to parse JSON–a common data format. Finally, we brought together our knowledge of Clojure’s basic grammar, immutable data structures, core functions, sequences, threading macros, and vars to explore a real-world problem.</p>
<h2><a href="#exercises" id="exercises">Exercises</a></h2>
<ol>
<li>
<p><code>most-duis</code> tells us about the raw number of reports, but doesn’t account for differences in county population. One would naturally expect counties with more people to have more crime! Divide the <code>:driving_under_influence</code> of each county by its <code>:county_population</code> to find a <em>prevalence</em> of DUIs, and take the top ten counties based on prevalence. How should you handle counties with a population of zero?</p>
</li>
<li>
<p>How do the prevalence counties compare to the original counties? Expand <code>most-duis</code> to return vectors of <code>[county-name, prevalence, report-count, population]</code>. What are the populations of the high-prevalence counties? Why do you suppose the data looks this way? If you were leading a public-health campaign to reduce drunk driving, would you target your intervention based on <em>report count</em> or <em>prevalence</em>? Why?</p>
</li>
<li>
<p>We can <em>generalize</em> the <code>most-duis</code> function to handle <em>any</em> type of crime. Write a function <code>most-prevalent</code> which takes a file and a field name, like <code>:arson</code>, and finds the counties where that field is most often reported, per capita.</p>
</li>
<li>
<p>Write a test to verify that <code>most-prevalent</code> is correct.</p>
</li>
</ol>
<p>Next up: <a href="https://aphyr.com/posts/312-clojure-from-the-ground-up-modeling">modeling</a>.</p>
https://aphyr.com/posts/306-clojure-from-the-ground-up-stateClojure from the ground up: state2013-12-01T02:14:12-05:002013-12-01T02:14:12-05:00Aphyrhttps://aphyr.com/<p><em>Previously: <a href="http://aphyr.com/posts/305-clojure-from-the-ground-up-macros">Macros</a>.</em></p>
<p>Most programs encompass <em>change</em>. People grow up, leave town, fall in love, and take new names. Engines burn through fuel while their parts wear out, and new ones are swapped in. Forests burn down and their logs become nurseries for new trees. Despite these changes, we say “She’s still Nguyen”, “That’s my motorcycle”, “The same woods I hiked through as a child.”</p>
<p>Identity is a skein we lay across the world of immutable facts; a single entity which encompasses change. In programming, identities unify different values over time. Identity types are <em>mutable references</em> to <em>immutable values</em>.</p>
<p>In this chapter, we’ll move from immutable references to complex concurrent transactions. In the process we’ll get a taste of <em>concurrency</em> and <em>parallelism</em>, which will motivate the use of more sophisticated identity types. These are not easy concepts, so don’t get discouraged. You don’t have to understand this chapter fully to be a productive programmer, but I do want to hint at <em>why</em> things work this way. As you work with state more, these concepts will solidify.</p>
<h2><a href="#immutability" id="immutability">Immutability</a></h2>
<p>The references we’ve used in <code>let</code> bindings and function arguments are <em>immutable</em>: they never change.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span> <span class="mi">1</span><span class="p">]</span>
<span class="p">(</span><span class="nb">prn </span><span class="p">(</span><span class="nb">inc </span><span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nb">prn </span><span class="p">(</span><span class="nb">inc </span><span class="nv">x</span><span class="p">)))</span>
<span class="mi">2</span>
<span class="mi">2</span>
</code></pre>
<p>The expression <code>(inc x)</code> did not <em>alter</em> <code>x</code>: <code>x</code> remained <code>1</code>. The same applies to strings, lists, vectors, maps, sets, and most everything else in Clojure:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]]</span>
<span class="p">(</span><span class="nb">prn </span><span class="p">(</span><span class="nb">conj </span><span class="nv">x</span> <span class="ss">:a</span><span class="p">))</span>
<span class="p">(</span><span class="nb">prn </span><span class="p">(</span><span class="nb">conj </span><span class="nv">x</span> <span class="ss">:b</span><span class="p">)))</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="ss">:a</span><span class="p">]</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="ss">:b</span><span class="p">]</span>
</code></pre>
<p>Immutability also extends to <code>let</code> bindings, function arguments, and other symbols. Functions <em>remember</em> the values of those symbols at the time the function was constructed.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">present</span>
<span class="p">[</span><span class="nv">gift</span><span class="p">]</span>
<span class="p">(</span><span class="k">fn </span><span class="p">[]</span> <span class="nv">gift</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">green-box</span> <span class="p">(</span><span class="nf">present</span> <span class="s">"clockwork beetle"</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/green-box</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">red-box</span> <span class="p">(</span><span class="nf">present</span> <span class="s">"plush tiger"</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/red-box</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">red-box</span><span class="p">)</span>
<span class="s">"plush tiger"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">green-box</span><span class="p">)</span>
<span class="s">"clockwork beetle"</span>
</code></pre>
<p>The <code>present</code> function <em>creates a new function</em>. That function takes no arguments, and always returns the gift. Which gift? Because <code>gift</code> is not an argument to the inner function, it refers to the value from the <em>outer function body</em>. When we packaged up the red and green boxes, the functions we created carried with them a memory of the <code>gift</code> symbol’s value.</p>
<p>This is called <em>closing over</em> the <code>gift</code> variable; the inner function is sometimes called <em>a closure</em>. In Clojure, new functions close over <em>all</em> variables except their arguments–the arguments, of course, will be provided when the function is invoked.</p>
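<p>Here’s one more sketch of the same idea, this time where the inner function <em>does</em> take an argument. The <code>adder</code> function is a hypothetical helper invented for illustration: <code>n</code> is closed over when the function is built, while <code>x</code> is filled in at call time.</p>
<pre><code>;; adder is a hypothetical helper, not part of the chapter's code.
(defn adder
  "Returns a function which adds n to its argument."
  [n]
  (fn [x] (+ n x)))   ; n is closed over; x is supplied when the function is called

(def add-five (adder 5))

(add-five 1)    ; => 6
(add-five 10)   ; => 15
((adder 2) 2)   ; => 4
</code></pre>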
<h2><a href="#delays" id="delays">Delays</a></h2>
<p>Because functions <em>close over</em> the variables around them, they can be used to <em>defer</em> evaluation of expressions. That’s how we introduced functions originally–like <code>let</code> expressions, but with a number (maybe zero!) of symbols <em>missing</em>, to be filled in at a later time.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">do </span><span class="p">(</span><span class="nb">prn </span><span class="s">"Adding"</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">))</span>
<span class="s">"Adding"</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">later</span> <span class="p">(</span><span class="k">fn </span><span class="p">[]</span> <span class="p">(</span><span class="nb">prn </span><span class="s">"Adding"</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)))</span>
<span class="o">#</span><span class="ss">'user/later</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">later</span><span class="p">)</span>
<span class="s">"Adding"</span>
<span class="mi">3</span>
</code></pre>
<p>Evaluating <code>(def later ...)</code> did <em>not</em> evaluate the expressions in the function body. Only when we invoked the function <code>later</code> did Clojure print <code>"Adding"</code> to the screen, and return <code>3</code>. This is the basis of <em>concurrency</em>: evaluating expressions outside their normal, sequential order.</p>
<p>This pattern of deferring evaluation is so common that there’s a standard macro for it, called <code>delay</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">later</span> <span class="p">(</span><span class="nf">delay</span> <span class="p">(</span><span class="nb">prn </span><span class="s">"Adding"</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)))</span>
<span class="o">#</span><span class="ss">'user/later</span>
<span class="nv">user=></span> <span class="nv">later</span>
<span class="o">#</span><span class="nv"><Delay</span><span class="o">@</span><span class="mi">2</span><span class="nv">dd31aac</span><span class="err">:</span> <span class="ss">:pending></span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">later</span><span class="p">)</span>
<span class="s">"Adding"</span>
<span class="mi">3</span>
</code></pre>
<p>Instead of a function, <code>delay</code> creates a special type of Delay object: an identity which <em>refers</em> to expressions which should be evaluated later. We extract, or <em>dereference</em>, the value of that identity with <code>deref</code>. Delays follow the same rules as functions, closing over lexical scope–because <code>delay</code> actually macroexpands into an anonymous function.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">source</span> <span class="nv">delay</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defmacro </span><span class="nv">delay</span>
<span class="s">"Takes a body of expressions and yields a Delay object that will</span>
<span class="s"> invoke the body only the first time it is forced (with force or deref/@), and</span>
<span class="s"> will cache the result and return it on all subsequent force</span>
<span class="s"> calls. See also - realized?"</span>
<span class="p">{</span><span class="ss">:added</span> <span class="s">"1.0"</span><span class="p">}</span>
<span class="p">[</span><span class="o">&</span> <span class="nv">body</span><span class="p">]</span>
<span class="p">(</span><span class="nb">list </span><span class="ss">'new</span> <span class="ss">'clojure.lang.Delay</span> <span class="p">(</span><span class="nb">list* </span><span class="o">`^</span><span class="p">{</span><span class="ss">:once</span> <span class="nv">true</span><span class="p">}</span> <span class="nv">fn*</span> <span class="p">[]</span> <span class="nv">body</span><span class="p">)))</span>
</code></pre>
<p>Why the <code>Delay</code> object instead of a plain old function? Because unlike function invocation, delays only evaluate their expressions <em>once</em>. They remember their value, after the first evaluation, and return it for every successive <code>deref</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">later</span><span class="p">)</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">later</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
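<p>The docstring above mentions <code>realized?</code>, which asks a delay whether it has been evaluated yet. A minimal sketch, using a fresh delay named <code>answer</code> (a name chosen just for this example):</p>
<pre><code>;; answer is a hypothetical delay, defined fresh so it starts out unevaluated.
(def answer (delay (prn "Computing") (* 6 7)))

(realized? answer)   ; => false; nothing has been evaluated yet
(deref answer)       ; prints "Computing" and returns 42
(realized? answer)   ; => true
(deref answer)       ; => 42, with no printing: the cached value is returned
</code></pre>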
<p>By the way, there’s a shortcut for <code>(deref something)</code>: the wormhole operator <code>@</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">@</span><span class="nv">later</span> <span class="c1">; Interpreted as (deref later)</span>
<span class="mi">3</span>
</code></pre>
<p>Remember how <code>map</code> returned a sequence immediately, but didn’t actually perform any computation until we asked for elements? That’s called <em>lazy</em> evaluation. Because delays are lazy, we can avoid doing expensive operations until they’re really needed. Like an IOU, we use delays when we aren’t ready to do something just yet, but when someone calls in the favor, we’ll make sure it happens.</p>
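<p>To watch that laziness happen, here’s a small sketch using <code>realized?</code>, which the docstring above mentions. The name <code>answer</code> and the printed message are made up for illustration.</p>
<pre><code>;; Illustrative name: `answer` is not defined anywhere else in this chapter.
(def answer
  (delay
    (prn "computing...") ; side effect, so we can see when evaluation happens
    (* 6 7)))

(realized? answer) ; false: the body has not run yet
@answer            ; prints "computing..." and returns 42
(realized? answer) ; true: the value is now cached
@answer            ; returns 42 again, without printing
</code></pre>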
<h2><a href="#futures" id="futures">Futures</a></h2>
<p>What if we wanted to <em>opportunistically</em> defer computation? Modern computers have multiple cores, and operating systems let us share a core between two tasks. It would be great if we could use that multitasking ability to say, “I don’t need the result of evaluating these expressions <em>yet</em>, but I’d like it <em>later</em>. Could you start working on it in the meantime?”</p>
<p>Enter the <em>future</em>: a delay which is evaluated <em>in parallel</em>. Like delays, futures return immediately, and give us an <em>identity</em> which will point to the value of the last expression in the future–in this case, the value of <code>(+ 1 2)</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nf">future</span> <span class="p">(</span><span class="nb">prn </span><span class="s">"hi"</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)))</span>
<span class="s">"hi"</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">x</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
<p>Notice how the future printed “hi” right away. That’s because futures are evaluated in a new <em>thread</em>. On multicore computers, two threads can run in <em>parallel</em>, on different cores at the same time. When there are more threads than cores, the cores <em>trade off</em> running different threads. Both parallel and non-parallel evaluation of threads are <em>concurrent</em> because expressions from different threads can be evaluated out of order.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">dotimes </span><span class="p">[</span><span class="nv">i</span> <span class="mi">5</span><span class="p">]</span> <span class="p">(</span><span class="nf">future</span> <span class="p">(</span><span class="nb">prn </span><span class="nv">i</span><span class="p">)))</span>
<span class="mi">14</span>
<span class="mi">3</span>
<span class="mi">0</span>
<span class="mi">2</span>
<span class="nv">nil</span>
</code></pre>
<p>Five threads running at once. Notice that the thread printing <code>1</code> didn’t even get to move to a new line before <code>4</code> showed up–then both threads wrote new lines at the same time. There are techniques to control this concurrent execution so that things happen in some well-defined sequence, like agents and locks, but we’ll discuss those later.</p>
<p>Just like delays, we can deref a future as many times as we want, and the expressions are only evaluated once.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nf">future</span> <span class="p">(</span><span class="nb">prn </span><span class="s">"hi"</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)))</span>
<span class="o">#</span><span class="ss">'user/x</span><span class="s">"hi"</span>
<span class="nv">user=></span> <span class="o">@</span><span class="nv">x</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="o">@</span><span class="nv">x</span>
<span class="mi">3</span>
</code></pre>
<p>Futures are the most generic parallel construct in Clojure. You can use futures to do CPU-intensive computation faster, to wait for multiple network requests to complete at once, or to run housekeeping code periodically.</p>
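<p>As a rough sketch of the “wait for several things at once” case: <code>slow-square</code> below is a hypothetical stand-in for a slow computation or network request, and the timings in the comments are approximate.</p>
<pre><code>(defn slow-square
  "Hypothetical stand-in for a slow computation or network request."
  [n]
  (Thread/sleep 200)
  (* n n))

;; mapv is eager, so all four futures start right away.
(def tasks (mapv (fn [n] (future (slow-square n))) [1 2 3 4]))

;; Dereferencing blocks until each result is ready. The sleeps overlap,
;; so this takes roughly 200 ms in total rather than 800 ms.
(mapv deref tasks) ; [1 4 9 16]

;; deref also accepts a timeout in milliseconds and a fallback value.
(deref (future (Thread/sleep 1000) :done) 50 :still-working) ; :still-working
</code></pre>
<p>The three-argument form of <code>deref</code> is handy when you’d rather give up than wait forever.</p>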
<h2><a href="#promises" id="promises">Promises</a></h2>
<p>Delays <em>defer</em> evaluation, and futures <em>parallelize</em> it. What if we wanted to defer something we <em>don’t even have yet</em>? To hand someone an empty box and, later, before they open it, sneak in and replace its contents with an actual gift? Surely I’m not the only one who does birthday presents this way.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">box</span> <span class="p">(</span><span class="nf">promise</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/box</span>
<span class="nv">user=></span> <span class="nv">box</span>
<span class="o">#</span><span class="nv"><core$promise$reify__6310</span><span class="o">@</span><span class="mi">1</span><span class="nv">d7762e</span><span class="err">:</span> <span class="ss">:pending></span>
</code></pre>
<p>This box is <em>pending</em> a value. Like futures and delays, if we try to open it, we’ll get <em>stuck</em> and have to wait for something to appear inside:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">box</span><span class="p">)</span>
</code></pre>
<p>But unlike futures and delays, this box won’t be filled automatically. Hold the <code>Control</code> key and hit <code>c</code> to give up on trying to open that package. Nobody else is in this REPL, so we’ll have to buy our own presents.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">deliver</span> <span class="nv">box</span> <span class="ss">:live-scorpions!</span><span class="p">)</span>
<span class="o">#</span><span class="nv"><core$promise$reify__6310</span><span class="o">@</span><span class="mi">1</span><span class="nv">d7762e</span><span class="err">:</span> <span class="ss">:live-scorpions!></span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">box</span><span class="p">)</span>
<span class="ss">:live-scorpions!</span>
</code></pre>
<p>Wow, that’s a <em>terrible</em> gift. But at least there’s something there: when we dereference the box, it opens immediately and live scorpions skitter out. Can we get a do-over? Let’s try a nicer gift.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">deliver</span> <span class="nv">box</span> <span class="ss">:puppy</span><span class="p">)</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">box</span><span class="p">)</span>
<span class="ss">:live-scorpions!</span>
</code></pre>
<p>Like delays and futures, there’s no going back on our promises. Once delivered, a promise <em>always</em> refers to the same value. This is a simple identity type: we can set it to a value once, and read it as many times as we want. <code>promise</code> is also a <em>concurrency primitive</em>: it guarantees that any attempt to read the value will <em>wait</em> until the value has been written. We can use promises to <em>synchronize</em> a program which is being evaluated concurrently–for instance, this simple card game:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">card</span> <span class="p">(</span><span class="nf">promise</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/card</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">dealer</span> <span class="p">(</span><span class="nf">future</span>
<span class="p">(</span><span class="nf">Thread/sleep</span> <span class="mi">5000</span><span class="p">)</span>
<span class="p">(</span><span class="nf">deliver</span> <span class="nv">card</span> <span class="p">[(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">rand-int </span><span class="mi">13</span><span class="p">))</span>
<span class="p">(</span><span class="nf">rand-nth</span> <span class="p">[</span><span class="ss">:clubs</span> <span class="ss">:spades</span> <span class="ss">:hearts</span> <span class="ss">:diamonds</span><span class="p">])])))</span>
<span class="o">#</span><span class="ss">'user/dealer</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">card</span><span class="p">)</span>
<span class="p">[</span><span class="mi">5</span> <span class="ss">:diamonds</span><span class="p">]</span>
</code></pre>
<p>In this program, we set up a <code>dealer</code> thread which waits for five seconds (5000 milliseconds), then delivers a random card. While the dealer is sleeping, we try to deref our card–and have to wait until the five seconds are up. Synchronization and identity in one package.</p>
<p>Where delays are lazy, and futures are parallel, promises are concurrent <em>without specifying how the evaluation occurs</em>. We control exactly when and how the value is delivered. You can think of both delays and futures as being built atop promises, in a way.</p>
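<p>To make “built atop promises” a little more concrete, here’s a toy version of a future written with a promise and a plain Java thread. This is only a sketch: <code>naive-future</code> is a hypothetical name, and it is not how Clojure’s real <code>future</code> is implemented.</p>
<pre><code>(defn naive-future
  "Hypothetical helper: runs the zero-argument function f on a new thread
  and returns a promise which will receive f's return value."
  [f]
  (let [result (promise)]
    (.start (Thread. (fn [] (deliver result (f)))))
    result))

(def sum (naive-future (fn [] (Thread/sleep 100) (+ 1 2))))
@sum ; blocks for about 100 ms, then returns 3
</code></pre>
<p>Unlike a real future, this sketch does nothing about errors: if <code>f</code> throws, the promise is never delivered, and anyone dereferencing it waits forever.</p>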
<h2><a href="#vars" id="vars">Vars</a></h2>
<p>So far the identities we’ve discussed have referred (eventually) to a <em>single</em> value, but the real world needs names that refer to <em>different</em> values at different points in time. For this, we use <em>vars</em>.</p>
<p>We’ve touched on vars before–they’re transparent mutable references. Each var has a value associated with it, and that value can change over time. When a var is evaluated, it is replaced by its <em>present</em> value transparently–everywhere in the program.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="ss">:mouse</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">box</span> <span class="p">(</span><span class="k">fn </span><span class="p">[]</span> <span class="nv">x</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/box</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">box</span><span class="p">)</span>
<span class="ss">:mouse</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="ss">:cat</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">box</span><span class="p">)</span>
<span class="ss">:cat</span>
</code></pre>
<p>The <code>box</code> function closed over <code>x</code>–but calling <code>(box)</code> returned <em>different</em> results depending on the current value of <code>x</code>. Even though the <em>var</em> <code>x</code> remained unchanged throughout this example, the <em>value associated with that var</em> did change!</p>
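<p>We can check that claim directly by grabbing hold of the var itself, rather than its value. This is a small sketch; <code>the-var</code> is just an illustrative name.</p>
<pre><code>(def x :mouse)
(def the-var (var x))  ; (var x), also written #'x, yields the var itself
(deref the-var)        ; :mouse

(def x :cat)           ; re-running def changes the value, not the var
(identical? the-var (var x)) ; true: still the same var
(deref the-var)              ; :cat
</code></pre>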
<p>Using mutable vars allows us to write programs which we can redefine as we go along.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">decouple</span> <span class="p">[</span><span class="nv">glider</span><span class="p">]</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">prn </span><span class="s">"bolts released"</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/decouple</span>
<span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">launch</span> <span class="p">[</span><span class="nv">glider</span><span class="p">]</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nf">decouple</span> <span class="nv">glider</span><span class="p">)</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">prn </span><span class="nv">glider</span> <span class="s">"away!"</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/launch</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">launch</span> <span class="s">"albatross"</span><span class="p">)</span>
<span class="s">"bolts released"</span>
<span class="s">"albatross"</span> <span class="s">"away!"</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">decouple</span> <span class="p">[</span><span class="nv">glider</span><span class="p">]</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">prn </span><span class="s">"tether released"</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/decouple</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">launch</span> <span class="s">"albatross"</span><span class="p">)</span>
<span class="s">"tether released"</span>
<span class="s">"albatross"</span> <span class="s">"away!"</span>
</code></pre>
<p>A reference which is the same everywhere is called a <em>global variable</em>, or simply a <em>global</em>. But vars have an additional trick up their sleeve: with a <em>dynamic</em> var, we can override its value only within the scope of a particular function call, and nowhere else.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="o">^</span><span class="ss">:dynamic</span> <span class="nv">*board*</span> <span class="ss">:maple</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/*board*</span>
</code></pre>
<p><code>^:dynamic</code> tells Clojure that this var can be overridden in one particular scope. By convention, dynamic variables are named with asterisks around them–this reminds us, as programmers, that they are likely to change. Next, we define a function that uses that dynamic var:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">cut</span> <span class="p">[]</span> <span class="p">(</span><span class="nb">prn </span><span class="s">"sawing through"</span> <span class="nv">*board*</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/cut</span>
</code></pre>
<p>Note that <code>cut</code> closes over the var <code>*board*</code>, but not the <em>value</em> <code>:maple</code>. Every time the function is invoked, it looks up the <em>current</em> value of <code>*board*</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">cut</span><span class="p">)</span>
<span class="s">"sawing through"</span> <span class="ss">:maple</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">binding </span><span class="p">[</span><span class="nv">*board*</span> <span class="ss">:cedar</span><span class="p">]</span> <span class="p">(</span><span class="nf">cut</span><span class="p">))</span>
<span class="s">"sawing through"</span> <span class="ss">:cedar</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">cut</span><span class="p">)</span>
<span class="s">"sawing through"</span> <span class="ss">:maple</span>
</code></pre>
<p>Like <code>let</code>, the <code>binding</code> macro assigns a value to a name–but where <code>fn</code> and <code>let</code> create immutable <em>lexical scope</em>, <code>binding</code> creates <em>dynamic scope</em>. The difference? Lexical scope is constrained to the literal text of the <code>fn</code> or <code>let</code> expression–but dynamic scope propagates <em>through function calls</em>.</p>
<p>Within the <code>binding</code> expression, and in every function called from that expression, and every function called from <em>those</em> functions, and so on, <code>*board*</code> has the value <code>:cedar</code>. Outside the <code>binding</code> expression, the value is still <code>:maple</code>. This safety property holds even when the program is executed in multiple threads: only the thread which evaluated the <code>binding</code> expression uses that value. Other threads are unaffected.</p>
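<p>Here’s a sketch of that per-thread behavior, using a plain Java thread as the “other” thread. The names <code>current-board</code>, <code>seen-here</code>, and <code>seen-elsewhere</code> are invented for this example.</p>
<pre><code>(def ^:dynamic *board* :maple)

(defn current-board [] *board*)

(def seen-elsewhere (promise))

(def seen-here
  (binding [*board* :cedar]
    ;; A plain java.lang.Thread starts without this thread's bindings.
    (doto (Thread. (fn [] (deliver seen-elsewhere (current-board))))
      (.start)
      (.join))
    (current-board)))

seen-here       ; :cedar
@seen-elsewhere ; :maple
</code></pre>
<p>One wrinkle worth knowing: <code>future</code> deliberately carries the calling thread’s bindings along with it, so a <code>future</code> started inside the <code>binding</code> expression would see <code>:cedar</code>. That’s why this sketch uses a raw thread instead.</p>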
<p>While we use <code>def</code> all the time in the REPL, in real programs you should only mutate vars sparingly. They’re intended for naming functions, important bits of global data, and for tracking the <em>environment</em> of a program–like where to print messages with <code>prn</code>, which database to talk to, and so on. Using vars for mutable program state is a recipe for disaster, as we’re about to see.</p>
<h2><a href="#atoms" id="atoms">Atoms</a></h2>
<p>Vars can be read, set, and dynamically bound–but they aren’t easy to <em>evolve</em>. Imagine building up a set of integers:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="o">#</span><span class="p">{})</span>
<span class="o">#</span><span class="ss">'user/xs</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">dotimes </span><span class="p">[</span><span class="nv">i</span> <span class="mi">10</span><span class="p">]</span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">xs</span> <span class="nv">i</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="nv">xs</span>
<span class="o">#</span><span class="p">{</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">}</span>
</code></pre>
<p>For each number from 0 to 9, we take the current set of numbers <code>xs</code>, add a particular number <code>i</code> to that set, and redefine <code>xs</code> as the result. This is a common idiom in imperative languages like C, Ruby, Javascript, or Java–all variables are mutable by default.</p>
<pre><code><span></span><span class="n">ImmutableSet</span> <span class="n">xs</span> <span class="o">=</span> <span class="k">new</span> <span class="n">ImmutableSet</span><span class="p">();</span>
<span class="k">for</span> <span class="p">(</span><span class="kt">int</span> <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="mi">10</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="n">xs</span> <span class="o">=</span> <span class="n">xs</span><span class="p">.</span><span class="na">add</span><span class="p">(</span><span class="n">i</span><span class="p">);</span>
<span class="p">}</span>
</code></pre>
<p>It seems straightforward enough, but there are serious problems lurking here. Specifically, this program is not <em>thread safe</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="o">#</span><span class="p">{})</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">dotimes </span><span class="p">[</span><span class="nv">i</span> <span class="mi">10</span><span class="p">]</span> <span class="p">(</span><span class="nf">future</span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">xs</span> <span class="nv">i</span><span class="p">))))</span>
<span class="o">#</span><span class="ss">'user/xs</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">xs</span>
<span class="o">#</span><span class="p">{</span><span class="mi">1</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">7</span><span class="p">}</span>
</code></pre>
<p>This program runs 10 threads in parallel, and each reads the current value of <code>xs</code>, adds its particular number, and defines <code>xs</code> to be that new set of numbers. This read-modify-update process assumed that all updates would be <em>consecutive</em>–not <em>concurrent</em>. When we allowed the program to do two read-modify-updates at the same time, updates were lost.</p>
<ol>
<li>Thread 2 read <code>#{0 1}</code></li>
<li>Thread 3 read <code>#{0 1}</code></li>
<li>Thread 2 wrote <code>#{0 1 2}</code></li>
<li>Thread 3 wrote <code>#{0 1 3}</code></li>
</ol>
<p>This interleaving of operations allowed the number <code>2</code> to slip through the cracks. We need something stronger–an identity which supports safe transformation from one state to another. Enter atoms.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="p">(</span><span class="nf">atom</span> <span class="o">#</span><span class="p">{}))</span>
<span class="o">#</span><span class="ss">'user/xs</span>
<span class="nv">user=></span> <span class="nv">xs</span>
<span class="o">#</span><span class="nv"><Atom</span><span class="o">@</span><span class="mi">30</span><span class="nv">bb8cc9</span><span class="err">:</span> <span class="o">#</span><span class="p">{}</span><span class="nv">></span>
</code></pre>
<p>The initial value of this atom is <code>#{}</code>. Unlike vars, atoms are not transparent. When evaluated, they don’t return their underlying values–but notice that when printed, the current value is hiding inside. To get the current value out of an atom, we have to use <code>deref</code> or <code>@</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">xs</span><span class="p">)</span>
<span class="o">#</span><span class="p">{}</span>
<span class="nv">user=></span> <span class="o">@</span><span class="nv">xs</span>
<span class="o">#</span><span class="p">{}</span>
</code></pre>
<p>Like vars, atoms can be set to a particular value–but instead of <code>def</code>, we use <code>reset!</code>. The exclamation point (sometimes called a <em>bang</em>) is there to remind us that this function <em>modifies</em> the state of its arguments–in this case, changing the value of the atom.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">reset!</span> <span class="nv">xs</span> <span class="ss">:foo</span><span class="p">)</span>
<span class="ss">:foo</span>
<span class="nv">user=></span> <span class="nv">xs</span>
<span class="o">#</span><span class="nv"><Atom</span><span class="o">@</span><span class="mi">30</span><span class="nv">bb8cc9</span><span class="err">:</span> <span class="ss">:foo></span>
</code></pre>
<p>Unlike vars, atoms can be safely <em>updated</em> using <code>swap!</code>. <code>swap!</code> uses a pure function which takes the current value of the atom and returns a <em>new</em> value. Under the hood, Clojure does some tricks to ensure that these updates are <em>linearizable</em>, which means:</p>
<ol>
<li>All updates with <code>swap!</code> complete in what <em>appears</em> to be a single consecutive order.</li>
<li>The effect of a <code>swap!</code> never takes place before calling <code>swap!</code>.</li>
<li>The effect of a <code>swap!</code> is visible to everyone once <code>swap!</code> returns.</li>
</ol>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nf">atom</span> <span class="mi">0</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">swap!</span> <span class="nv">x</span> <span class="nv">inc</span><span class="p">)</span>
<span class="mi">1</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">swap!</span> <span class="nv">x</span> <span class="nv">inc</span><span class="p">)</span>
<span class="mi">2</span>
</code></pre>
<p>The first <code>swap!</code> reads the value <code>0</code>, calls <code>(inc 0)</code> to obtain <code>1</code>, and writes <code>1</code> back to the atom. Each call to <code>swap!</code> returns the value that was just written.</p>
<p>We can pass additional arguments to the function <code>swap!</code> calls. For instance, <code>(swap! x + 5 6)</code> will call <code>+</code> with the atom’s current value followed by <code>5</code> and <code>6</code>; with <code>x</code> at <code>2</code>, that’s <code>(+ 2 5 6)</code>. Now we have the tools to correct our parallel program from earlier:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="p">(</span><span class="nf">atom</span> <span class="o">#</span><span class="p">{}))</span>
<span class="o">#</span><span class="ss">'user/xs</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">dotimes </span><span class="p">[</span><span class="nv">i</span> <span class="mi">10</span><span class="p">]</span> <span class="p">(</span><span class="nf">future</span> <span class="p">(</span><span class="nf">swap!</span> <span class="nv">xs</span> <span class="nb">conj </span><span class="nv">i</span><span class="p">)))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="o">@</span><span class="nv">xs</span>
<span class="o">#</span><span class="p">{</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">}</span>
</code></pre>
<p>Note that the function we use to update an atom must be <em>pure</em>–must not mutate any state–because when resolving conflicts between multiple threads, Clojure might need to call the update function more than once. Clojure’s reliance on immutable datatypes, immutable variables, and pure functions <em>enables</em> this approach to linearizable mutability. Languages which emphasize mutable datatypes need to use other constructs.</p>
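<p>To see why the update function might run more than once, here’s a simplified retry loop in the spirit of <code>swap!</code>, written with <code>compare-and-set!</code>. Treat it as a sketch of the idea, not Clojure’s actual implementation; <code>cas-swap!</code> is a hypothetical name.</p>
<pre><code>(defn cas-swap!
  "Hypothetical, simplified stand-in for swap!, written with compare-and-set!."
  [a f & args]
  (loop []
    (let [old-value @a
          new-value (apply f old-value args)]
      ;; compare-and-set! only writes if the atom still holds old-value.
      (if (compare-and-set! a old-value new-value)
        new-value
        (recur))))) ; another thread got there first: read again and retry

(def counter (atom 0))
(def workers (mapv (fn [_] (future (cas-swap! counter inc))) (range 100)))
(run! deref workers) ; wait for every future to finish
@counter ; 100
</code></pre>
<p>Every retry calls <code>f</code> again with a fresh value. If <code>f</code> printed a message or sent an email, that side effect could happen several times for a single logical update.</p>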
<p>Atoms are the workhorse of Clojure state. They’re lightweight, safe, fast, and flexible. You can use atoms with any immutable datatype–for instance, a map to track complex state. Reach for an atom whenever you want to update a single thing over time.</p>
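<p>For example, here’s a sketch of an atom holding a map. The <code>game</code> state and its keys are invented for illustration.</p>
<pre><code>;; Hypothetical game state, held as one immutable map inside one atom.
(def game (atom {:score 0, :players #{}}))

(swap! game update :score + 10)           ; add 10 to :score
(swap! game update :players conj "ada")   ; add a player to the set
(swap! game assoc :level 2)               ; add a new key

@game ; {:score 10, :players #{"ada"}, :level 2}
</code></pre>
<p>Because the whole map lives in one atom, each <code>swap!</code> moves the entire state from one consistent value to the next.</p>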
<h2><a href="#refs" id="refs">Refs</a></h2>
<p>Atoms are a great way to represent state, but they are only linearizable <em>individually</em>. Updates to an atom aren’t well-ordered with respect to other atoms, so if we try to update more than one atom at once, we could see the same kinds of bugs that we did with vars.</p>
<p>For multi-identity updates, we need a stronger safety property than single-atom linearizability. We want <em>serializability</em>: a global order. For this, Clojure has an identity type called a <em>Ref</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nb">ref </span><span class="mi">0</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="nv">x</span>
<span class="o">#</span><span class="nv"><Ref</span><span class="o">@</span><span class="mi">1835</span><span class="nv">d850</span><span class="err">:</span> <span class="mi">0</span><span class="nv">></span>
</code></pre>
<p>Like all identity types, refs are dereferencable:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">@</span><span class="nv">x</span>
<span class="mi">0</span>
</code></pre>
<p>But where atoms are updated individually with <code>swap!</code>, refs are updated in <em>groups</em> using <code>dosync</code> transactions. Just as we <code>reset!</code> an atom, we can set refs to new values using <code>ref-set</code>–but unlike atoms, we can change more than one ref at once.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nb">ref </span><span class="mi">0</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">y</span> <span class="p">(</span><span class="nb">ref </span><span class="mi">0</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">dosync</span>
<span class="p">(</span><span class="nb">ref-set </span><span class="nv">x</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">ref-set </span><span class="nv">y</span> <span class="mi">2</span><span class="p">))</span>
<span class="mi">2</span>
<span class="nv">user=></span> <span class="p">[</span><span class="o">@</span><span class="nv">x</span> <span class="o">@</span><span class="nv">y</span><span class="p">]</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">]</span>
</code></pre>
<p>The equivalent of <code>swap!</code>, for a ref, is <code>alter</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nb">ref </span><span class="mi">1</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">y</span> <span class="p">(</span><span class="nb">ref </span><span class="mi">2</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">dosync</span>
<span class="p">(</span><span class="nb">alter </span><span class="nv">x</span> <span class="nb">+ </span><span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">alter </span><span class="nv">y</span> <span class="nv">inc</span><span class="p">))</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">[</span><span class="o">@</span><span class="nv">x</span> <span class="o">@</span><span class="nv">y</span><span class="p">]</span>
<span class="p">[</span><span class="mi">3</span> <span class="mi">3</span><span class="p">]</span>
</code></pre>
<p>All <code>alter</code> operations within a <code>dosync</code> take place atomically–their effects are never interleaved with other transactions. If it’s OK for an operation to take place out of order, you can use <code>commute</code> instead of <code>alter</code> for a performance boost:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">dosync</span>
<span class="p">(</span><span class="nb">commute </span><span class="nv">x</span> <span class="nb">+ </span><span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="nb">commute </span><span class="nv">y</span> <span class="nv">inc</span><span class="p">))</span>
</code></pre>
<p>These updates are <em>not</em> guaranteed to take place in the same order–but if all our transactions are equivalent, we can <em>relax</em> the ordering constraints. x + 2 + 3 is equal to x + 3 + 2, so we can do the additions in either order. That’s what <em>commutative</em> means: the same result from all orders. It’s a weaker, but faster kind of safety property.</p>
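<p>A counter is the classic case where order doesn’t matter. In this sketch, <code>hits</code> and <code>workers</code> are illustrative names; a hundred transactions each add one, and the final count is the same whichever order they commit in.</p>
<pre><code>(def hits (ref 0))

;; One hundred transactions, each adding 1 on its own thread.
(def workers (mapv (fn [_] (future (dosync (commute hits inc)))) (range 100)))
(run! deref workers) ; wait for all of them to commit

@hits ; 100
</code></pre>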
<p>Finally, if you want to read a value from one ref and use it to update another, use <code>ensure</code> instead of <code>deref</code> to perform a <em>strongly consistent read</em>–one which is guaranteed to take place in the same logical order as the <code>dosync</code> transaction itself. To add <code>y</code>’s current value to <code>x</code>, use:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">dosync</span>
<span class="p">(</span><span class="nb">alter </span><span class="nv">x</span> <span class="nb">+ </span><span class="p">(</span><span class="nb">ensure </span><span class="nv">y</span><span class="p">)))</span>
</code></pre>
<p>Refs are a powerful construct, and make it easier to write complex transactional logic safely. However, that safety comes at a cost: refs are typically an order of magnitude slower to update than atoms.</p>
<p>Use refs only where you need to update multiple pieces of state independently–specifically, where different transactions need to work with distinct but <em>partly overlapping</em> pieces of state. If there’s no overlap between updates, use distinct atoms. If all operations update the same identities, use a single atom to hold a map of the system’s state. If a system requires complex interlocking state spread throughout the program–that’s when to reach for refs.</p>
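<p>To make the tradeoff concrete, here is a rough sketch of the same transfer written both ways. The account names, the <code>transfer!</code> function, and the amounts are all invented for this example. With refs, each account is its own identity and the transaction coordinates them; with a single atom, the whole state lives in one map and is updated in one step.</p>

```clojure
;; Option 1: separate refs, coordinated by a transaction.
(def checking (ref 100))
(def savings  (ref 0))

(defn transfer!
  "Moves amount from one ref to another in a single transaction."
  [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))

(transfer! checking savings 30)
[@checking @savings]
;=> [70 30]

;; Option 2: one atom holding a map of the whole system's state.
(def accounts (atom {:checking 100 :savings 0}))

(swap! accounts
       (fn [m]
         (-> m
             (update :checking - 30)
             (update :savings + 30))))
;=> {:checking 70, :savings 30}
```

<p>The atom version is simpler when every operation touches the same state; the ref version lets unrelated transfers between other accounts proceed without contending on one shared map.</p>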
<h2><a href="#summary" id="summary">Summary</a></h2>
<p>We moved beyond immutable programs into the world of <em>changing state</em>–and discovered the challenges of concurrency and parallelism. Where symbols provide immutable and transparent names for values, Vars provide <em>mutable</em> transparent names. We also saw a host of anonymous identity types for different purposes: delays for lazy evaluation, futures for parallel evaluation, and promises for arbitrary handoff of a value. Updates to vars are unsafe, so atoms and refs provide linearizable and serializable identities where transformations are <em>safe</em>.</p>
<p>Where reading a symbol or var is <em>transparent</em>–they evaluate directly to their current values–reading these new identity types requires the use of <code>deref</code>. Delays, futures, and promises <em>block</em>: deref must wait until the value is ready. This allows synchronization of concurrent threads. Atoms and refs, by contrast, can be read immediately at any time–but <em>updating</em> their values should occur within a <code>swap!</code> or <code>dosync</code> transaction, respectively.</p>
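<p>A small sketch of the difference between blocking and nonblocking reads, using made-up names (<code>answer</code>, <code>counter</code>). Dereferencing a promise waits until a value is delivered, so here we use the three-argument form of <code>deref</code>, which gives up after a timeout. Dereferencing an atom returns right away.</p>

```clojure
;; A promise has no value until someone delivers one.
(def answer (promise))

(realized? answer)
;=> false

;; Wait at most 100 ms, then return :timeout instead of blocking forever.
(deref answer 100 :timeout)
;=> :timeout

;; Deliver the value from another thread after a short pause.
(future
  (Thread/sleep 50)
  (deliver answer 42))

;; This deref blocks until the future above delivers.
@answer
;=> 42

;; Atoms never block on read: we get the current value immediately.
(def counter (atom 0))
(swap! counter inc)
@counter
;=> 1
```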
<table>
<thead>
<tr><th>Type</th>
<th>Mutability</th>
<th>Reads</th>
<th>Updates</th>
<th>Evaluation</th>
<th>Scope</th>
</tr></thead>
<tbody>
<tr>
<td>Symbol</td>
<td>Immutable</td>
<td>Transparent</td>
<td></td>
<td></td>
<td>Lexical</td>
</tr>
<tr>
<td>Var</td>
<td>Mutable</td>
<td>Transparent</td>
<td>Unrestricted</td>
<td></td>
<td>Global/Dynamic</td>
</tr>
<tr>
<td>Delay</td>
<td>Mutable</td>
<td>Blocking</td>
<td>Once only</td>
<td>Lazy</td>
<td></td>
</tr>
<tr>
<td>Future</td>
<td>Mutable</td>
<td>Blocking</td>
<td>Once only</td>
<td>Parallel</td>
<td></td>
</tr>
<tr>
<td>Promise</td>
<td>Mutable</td>
<td>Blocking</td>
<td>Once only</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Atom</td>
<td>Mutable</td>
<td>Nonblocking</td>
<td>Linearizable</td>
<td></td>
<td></td>
</tr>
<tr>
<td>Ref</td>
<td>Mutable</td>
<td>Nonblocking</td>
<td>Serializable</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>
<p>State is undoubtedly the hardest part of programming, and this chapter probably felt overwhelming! On the other hand, we’re now equipped to solve serious problems. We’ll take a break to apply what we’ve learned through practical examples, in Chapter Seven: <a href="http://aphyr.com/posts/311-clojure-from-the-ground-up-logistics">Logistics</a>.</p>
<h2><a href="#exercises" id="exercises">Exercises</a></h2>
<p>Finding the sum of the first 10000000 numbers takes about 1 second on my machine:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">sum</span> <span class="p">[</span><span class="nv">start</span> <span class="nv">end</span><span class="p">]</span> <span class="p">(</span><span class="nb">reduce + </span><span class="p">(</span><span class="nb">range </span><span class="nv">start</span> <span class="nv">end</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">time </span><span class="p">(</span><span class="nf">sum</span> <span class="mi">0</span> <span class="mi">1</span><span class="nv">e7</span><span class="p">))</span>
<span class="s">"Elapsed time: 1001.295323 msecs"</span>
<span class="mi">49999995000000</span>
</code></pre>
<ol>
<li>
<p>Use <code>delay</code> to compute this sum lazily; show that it takes no time to return the delay, but roughly 1 second to <code>deref</code>.</p>
</li>
<li>
<p>We can do the computation in a new thread directly, using <code>(.start (Thread. (fn [] (sum 0 1e7))))</code>–but this simply runs the <code>(sum)</code> function and discards the results. Use a promise to hand the result back out of the thread. Use this technique to write your own version of the <code>future</code> macro.</p>
</li>
<li>
<p>If your computer has two cores, you can do this expensive computation twice as fast by splitting it into two parts: <code>(sum 0 (/ 1e7 2))</code>, and <code>(sum (/ 1e7 2) 1e7)</code>, then adding those parts together. Use <code>future</code> to do both parts at once, and show that this strategy gets the same answer as the single-threaded version, but takes roughly half the time.</p>
</li>
<li>
<p>Instead of using <code>reduce</code>, store the sum in an atom and use two futures to add each number from the lower and upper range to that atom. Wait for both futures to complete using <code>deref</code>, then check that the atom contains the right number. Is this technique faster or slower than <code>reduce</code>? Why do you think that might be?</p>
</li>
<li>
<p>Instead of using a lazy list, imagine two threads are removing tasks from a pile of work. Our work pile will be the list of all integers from 0 to 99999:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">work</span> <span class="p">(</span><span class="nb">ref </span><span class="p">(</span><span class="nb">apply list </span><span class="p">(</span><span class="nb">range </span><span class="mi">1</span><span class="nv">e5</span><span class="p">))))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="o">@</span><span class="nv">work</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
</code></pre>
<p>And the sum will be a ref as well:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">sum</span> <span class="p">(</span><span class="nb">ref </span><span class="mi">0</span><span class="p">))</span>
</code></pre>
<p>Write a function which, in a <code>dosync</code> transaction, removes the first number in <code>work</code> and adds it to <code>sum</code>.<br>
Then, in two futures, call that function over and over again until there’s no work left. Verify that <code>@sum</code>
is <code>4999950000</code>. Experiment with different combinations of <code>alter</code> and <code>commute</code>–if both are correct, is
one faster? Does using <code>deref</code> instead of <code>ensure</code> change the result?</p>
</li>
</ol>
https://aphyr.com/posts/305-clojure-from-the-ground-up-macrosClojure from the ground up: macros2013-11-26T02:03:55-05:002013-11-26T02:03:55-05:00Aphyrhttps://aphyr.com/<p>In <a href="/posts/301-clojure-from-the-ground-up-welcome">Chapter 1</a>, I asserted that the grammar of Lisp is uniform: every expression is a list, beginning with a verb, and followed by some arguments. Evaluation proceeds from left to right, and every element of the list must be evaluated <em>before</em> evaluating the list itself. Yet we just saw, at the end of <a href="/posts/304-clojure-from-the-ground-up-sequences">Sequences</a>, an expression which seemed to <em>violate</em> these rules.</p>
<p>Clearly, this is not the whole story.</p>
<h2><a href="#macroexpansion" id="macroexpansion">Macroexpansion</a></h2>
<p>There is another phase to evaluating an expression; one which takes place before the rules we’ve followed so far. That process is called <em>macro-expansion</em>. During macro-expansion, the <em>code itself</em> is restructured according to some set of rules–rules which you, the programmer, can define.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmacro </span><span class="nv">ignore</span>
<span class="s">"Cancels the evaluation of an expression, returning nil instead."</span>
<span class="p">[</span><span class="nv">expr</span><span class="p">]</span>
<span class="nv">nil</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">ignore</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">))</span>
<span class="nv">nil</span>
</code></pre>
<p><code>defmacro</code> looks a lot like <code>defn</code>: it has a name, an optional documentation string, an argument vector, and a body–in this case, just <code>nil</code>. It looks like the macro simply ignored the expr <code>(+ 1 2)</code> and returned <code>nil</code>–but it’s actually deeper than that. <code>(+ 1 2)</code> was <em>never evaluated at all</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="mi">1</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="nv">x</span>
<span class="mi">1</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">ignore</span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="mi">2</span><span class="p">))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="nv">x</span>
<span class="mi">1</span>
</code></pre>
<p><code>def</code> should have defined <code>x</code> to be <code>2</code> <em>no matter what</em>–but that never happened. At macroexpansion time, the expression <code>(ignore (def x 2))</code> was <em>replaced</em> by the expression <code>nil</code>, which was then evaluated to <code>nil</code>. Where functions rewrite <em>values</em>, macros rewrite <em>code</em>.</p>
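<p>One way to check this claim is to ask Clojure for the expansion directly, and to hand the macro an expression with a visible side effect. This is only a sketch: the <code>launched</code> atom is a name invented for the demonstration, and <code>ignore</code> is redefined so the snippet stands on its own.</p>

```clojure
(defmacro ignore
  "Cancels the evaluation of an expression, returning nil instead."
  [expr]
  nil)

;; The whole form is replaced by nil before evaluation begins.
(macroexpand '(ignore (def x 2)))
;=> nil

;; A side effect we could observe if the argument were ever evaluated.
(def launched (atom false))

(ignore (reset! launched true))
;=> nil

@launched
;=> false
```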
<p>To see these different layers in play, let’s try a macro which reverses the order of arguments to a function.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmacro </span><span class="nv">rev</span> <span class="p">[</span><span class="nv">fun</span> <span class="o">&</span> <span class="nv">args</span><span class="p">]</span>
<span class="p">(</span><span class="nb">cons </span><span class="nv">fun</span> <span class="p">(</span><span class="nb">reverse </span><span class="nv">args</span><span class="p">)))</span>
</code></pre>
<p>This macro, named <code>rev</code>, takes one mandatory argument: a function. Then it takes any number of arguments, which are collected in the list <code>args</code>. It constructs a new list, starting with the function, and followed by the arguments, in reverse order.</p>
<p>First, we macro-expand:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">macroexpand </span><span class="o">'</span><span class="p">(</span><span class="nf">rev</span> <span class="nb">str </span><span class="s">"hi"</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">str </span><span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="s">"hi"</span><span class="p">)</span>
</code></pre>
<p>So the <code>rev</code> macro took <code>str</code> as the function, and <code>"hi"</code> and <code>(+ 1 2)</code> as the arguments; then constructed a new list with the same function, but the arguments reversed. When we <em>evaluate</em> that expression, we get:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">eval </span><span class="p">(</span><span class="nb">macroexpand </span><span class="o">'</span><span class="p">(</span><span class="nf">rev</span> <span class="nb">str </span><span class="s">"hi"</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">))))</span>
<span class="s">"3hi"</span>
</code></pre>
<p><code>macroexpand</code> takes an expression and returns that expression with all macros expanded. <code>eval</code> takes an expression and evaluates it. When you type an unquoted expression into the REPL, Clojure macroexpands, then evaluates. Two stages–the first transforming <em>code</em>, the second transforming <em>values</em>.</p>
<h2><a href="#across-languages" id="across-languages">Across languages</a></h2>
<p>Some languages have a <em>metalanguage</em>: a language for extending the language itself. In C, for example, macros are implemented by the <a href="http://www.rt-embedded.com/blog/archives/macros-in-the-c-programming-language/">C preprocessor</a>, which has its own syntax for defining expressions, matching patterns in the source code’s text, and replacing that text with other text. But that preprocessor is <em>not</em> C–it is a separate language entirely, with special limitations. In Clojure, the metalanguage is <em>Clojure itself</em>–the full power of the language is available to restructure programs. This is called a <em>procedural</em> macro system. Some Lisps, like Scheme, use a macro system based on templating expressions, and still others use more powerful models like <em>f-expressions</em>–but that’s a discussion for a later time.</p>
<p>There is another key difference between Lisp macros and many other macro systems: in Lisp, the macros operate on <em>expressions</em>: the data structure of the code itself. Because Lisp code is <em>written</em> explicitly as a data structure, a tree made out of lists, this transformation is natural. You can <em>see</em> the structure of the code, which makes it easy to reason about its transformation. In the C preprocessor, macros operate only on <em>text</em>: there is no understanding of the underlying syntax. Even in languages like Scala which have syntactic macros, the fact that the code looks <em>nothing like</em> the syntax tree makes it <a href="http://docs.scala-lang.org/overviews/macros/overview.html">cumbersome</a> to truly restructure expressions.</p>
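<p>To illustrate what “code as a data structure” means in practice, here is a short sketch that treats a quoted expression as an ordinary list: we inspect it, evaluate it, and build a variant of it with plain list functions. The name <code>expr</code> is just for this example.</p>

```clojure
;; Quoting gives us the code itself, as a list of a symbol and two numbers.
(def expr '(+ 1 2))

(list? expr)
;=> true

(first expr)
;=> +

(eval expr)
;=> 3

;; Swap the verb for * using ordinary sequence functions, then evaluate.
(eval (cons '* (rest expr)))
;=> 2
```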
<p>When people say that Lisp’s syntax is “more elegant”, or “more beautiful”, or “simpler”, this is part of what they mean. By choosing to represent the program directly as a data structure, we make it much easier to define complex transformations of code itself.</p>
<h2><a href="#defining-new-syntax" id="defining-new-syntax">Defining new syntax</a></h2>
<p>What kind of transformations are best expressed with macros?</p>
<p>Most languages encode special syntactic forms–things like “define a function”, “call a function”, “define a local variable”, “if this, then that”, and so on. In Clojure, these are called <em>special forms</em>. <code>if</code> is a special form, for instance. Its definition is built into the language core itself; it cannot be reduced into smaller parts.</p>
<pre><code><span></span><span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">< </span><span class="mi">3</span> <span class="nv">x</span><span class="p">)</span>
<span class="s">"big"</span>
<span class="s">"small"</span><span class="p">)</span>
</code></pre>
<p>Or in Javascript:</p>
<pre><code><span></span><span class="k">if</span> <span class="p">(</span><span class="mi">3</span> <span class="o"><</span> <span class="nx">x</span><span class="p">)</span> <span class="p">{</span>
<span class="k">return</span> <span class="s2">"big"</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="k">return</span> <span class="s2">"small"</span><span class="p">;</span>
<span class="p">}</span>
</code></pre>
<p>In Javascript, Ruby, and many other languages, these special forms are <em>fixed</em>. You cannot define your own syntax. For instance, one cannot define <code>or</code> in a language like JS or Ruby: it must be defined <em>for</em> you by the language author.</p>
<p>In Clojure, <code>or</code> is just a macro.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">source</span> <span class="nv">or</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defmacro </span><span class="nv">or</span>
<span class="s">"Evaluates exprs one at a time, from left to right. If a form</span>
<span class="s"> returns a logical true value, or returns that value and doesn't</span>
<span class="s"> evaluate any of the other expressions, otherwise it returns the</span>
<span class="s"> value of the last expression. (or) returns nil."</span>
<span class="p">{</span><span class="ss">:added</span> <span class="s">"1.0"</span><span class="p">}</span>
<span class="p">([]</span> <span class="nv">nil</span><span class="p">)</span>
<span class="p">([</span><span class="nv">x</span><span class="p">]</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">([</span><span class="nv">x</span> <span class="o">&</span> <span class="nv">next</span><span class="p">]</span>
<span class="o">`</span><span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">or#</span> <span class="o">~</span><span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="nv">or#</span> <span class="nv">or#</span> <span class="p">(</span><span class="nb">or </span><span class="o">~@</span><span class="nv">next</span><span class="p">)))))</span>
<span class="nv">nil</span>
</code></pre>
<p>That <code>`</code> operator–that’s called <em>syntax-quote</em>. It works just like regular quote–preventing evaluation of the following list–but with a twist: we can escape the quoting rule and substitute in regularly evaluated expressions using <em>unquote</em> (<code>~</code>), and <em>unquote-splice</em> (<code>~@</code>). Think of a syntax-quoted expression like a <em>template</em> for code, with some parts filled in by evaluated forms.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span> <span class="mi">2</span><span class="p">]</span> <span class="o">`</span><span class="p">(</span><span class="nb">inc </span><span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nf">clojure.core/inc</span> <span class="nv">user/x</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span> <span class="mi">2</span><span class="p">]</span> <span class="o">`</span><span class="p">(</span><span class="nb">inc </span><span class="o">~</span><span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nf">clojure.core/inc</span> <span class="mi">2</span><span class="p">)</span>
</code></pre>
<p>See the difference? <code>~x</code> <em>substitutes</em> the value of x, instead of using <code>x</code> as an unevaluated symbol. This code is essentially just shorthand for something like</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span> <span class="mi">2</span><span class="p">]</span> <span class="p">(</span><span class="nb">list </span><span class="ss">'clojure.core/inc</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="nf">clojure.core/inc</span> <span class="mi">2</span><span class="p">)</span>
</code></pre>
<p>… where we explicitly constructed a new list with the quoted symbol <code>'clojure.core/inc</code> and the current value of <code>x</code>. Syntax quote just makes it easier to read the code, since the quoted and expanded expressions have similar shapes.</p>
<p>The <code>~@</code> unquote splice works just like <code>~</code>, except it explodes a list into <em>multiple</em> expressions in the resulting form:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">`</span><span class="p">(</span><span class="nf">foo</span> <span class="o">~</span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="nf">user/foo</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="nv">user=></span> <span class="o">`</span><span class="p">(</span><span class="nf">foo</span> <span class="o">~@</span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="nf">user/foo</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p><code>~@</code> is particularly useful when a function or macro takes an <em>arbitrary</em> number of arguments. In the definition of <code>or</code>, it’s used to expand <code>(or a b c)</code> <em>recursively</em>.</p>
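<p>As a smaller illustration of the same idea, here is a hypothetical <code>unless</code> macro (not part of Clojure; the built-in <code>when-not</code> plays this role) which uses <code>~@</code> to splice any number of body expressions into a <code>do</code>.</p>

```clojure
(defmacro unless
  "Evaluates body, in order, only when test is falsey. Illustrative only."
  [test & body]
  `(if ~test
     nil
     (do ~@body)))

;; The body expressions b and c are spliced into the do form.
(macroexpand-1 '(unless a b c))
;=> (if a nil (do b c))

(unless false :one :two)
;=> :two

(unless true :one :two)
;=> nil
```

<p>With a plain <code>~body</code> instead, the expansion would contain <code>(do (b c))</code>, which tries to call <code>b</code> as a function.</p>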
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nb">macroexpand </span><span class="o">'</span><span class="p">(</span><span class="nb">or </span><span class="nv">a</span> <span class="nv">b</span> <span class="nv">c</span> <span class="nv">d</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">let*</span>
<span class="p">[</span><span class="nv">or__3943__auto__</span> <span class="nv">a</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="nv">or__3943__auto__</span> <span class="nv">or__3943__auto__</span> <span class="p">(</span><span class="nf">clojure.core/or</span> <span class="nv">b</span> <span class="nv">c</span> <span class="nv">d</span><span class="p">)))</span>
</code></pre>
<p>We’re using <code>pprint</code> (for “pretty print”) to make this expression easier to read. <code>(or a b c d)</code> is defined in terms of <em>if</em>: if the first element is truthy we return it; otherwise we evaluate <code>(or b c d)</code> instead, and so on.</p>
<p>The final piece of the puzzle here is that weirdly named symbol: <code>or__3943__auto__</code>. That variable was <em>automatically generated</em> by Clojure, to prevent <em>conflicts</em> with an existing variable name. Because macros rewrite code, they have to be careful not to interfere with local variables, or it could get very confusing. Whenever we need a new variable in a macro, we use <code>gensym</code> to <em>generate a new symbol</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">gensym </span><span class="s">"hi"</span><span class="p">)</span>
<span class="nv">hi326</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">gensym </span><span class="s">"hi"</span><span class="p">)</span>
<span class="nv">hi329</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">gensym </span><span class="s">"hi"</span><span class="p">)</span>
<span class="nv">hi332</span>
</code></pre>
<p>Each symbol is different! If we tack on a <code>#</code> to the end of a symbol in a syntax-quoted expression, it’ll be expanded to a particular gensym:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">`</span><span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x#</span> <span class="mi">2</span><span class="p">]</span> <span class="nv">x#</span><span class="p">)</span>
<span class="p">(</span><span class="nf">clojure.core/let</span> <span class="p">[</span><span class="nv">x__339__auto__</span> <span class="mi">2</span><span class="p">]</span> <span class="nv">x__339__auto__</span><span class="p">)</span>
</code></pre>
<p>Note that you can always escape this safety feature if you <em>want</em> to override local variables. That’s called <em>symbol capture</em>, or an <em>anaphoric</em> or <em>unhygienic</em> macro. To override local symbols, just use <code>~'foo</code> instead of <code>foo#</code>.</p>
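<p>A classic use of deliberate symbol capture is an “anaphoric if”, sketched below. The macro name <code>aif</code> and the captured symbol <code>it</code> are conventions borrowed from other Lisps, not something Clojure defines. The macro binds the test’s value to <code>it</code>, so that both branches can refer to that name.</p>

```clojure
(defmacro aif
  "Like if, but binds the value of test to the symbol `it` in both branches.
  Illustrative only: `it` is intentionally captured via ~'it."
  [test then else]
  `(let [~'it ~test]
     (if ~'it ~then ~else)))

;; `it` refers to the result of (+ 1 2) inside the branch.
(aif (+ 1 2)
  (* it 10)
  :nothing)
;=> 30

(aif nil
  (* it 10)
  :nothing)
;=> :nothing
```

<p>Had we written <code>it#</code> instead of <code>~'it</code>, the binding would get a generated name, and the caller’s <code>it</code> would not resolve to it.</p>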
<p>With all the pieces on the board, let’s compare the <code>or</code> macro and its expansion:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defmacro </span><span class="nv">or</span>
<span class="s">"Evaluates exprs one at a time, from left to right. If a form</span>
<span class="s"> returns a logical true value, or returns that value and doesn't</span>
<span class="s"> evaluate any of the other expressions, otherwise it returns the</span>
<span class="s"> value of the last expression. (or) returns nil."</span>
<span class="p">{</span><span class="ss">:added</span> <span class="s">"1.0"</span><span class="p">}</span>
<span class="p">([]</span> <span class="nv">nil</span><span class="p">)</span>
<span class="p">([</span><span class="nv">x</span><span class="p">]</span> <span class="nv">x</span><span class="p">)</span>
<span class="p">([</span><span class="nv">x</span> <span class="o">&</span> <span class="nv">next</span><span class="p">]</span>
<span class="o">`</span><span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">or#</span> <span class="o">~</span><span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="nv">or#</span> <span class="nv">or#</span> <span class="p">(</span><span class="nb">or </span><span class="o">~@</span><span class="nv">next</span><span class="p">)))))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nf">clojure.walk/macroexpand-all</span>
<span class="o">'</span><span class="p">(</span><span class="nb">or </span><span class="p">(</span><span class="nf">mossy?</span> <span class="nv">stone</span><span class="p">)</span> <span class="p">(</span><span class="nf">cool?</span> <span class="nv">stone</span><span class="p">)</span> <span class="p">(</span><span class="nf">wet?</span> <span class="nv">stone</span><span class="p">))))</span>
<span class="p">(</span><span class="nf">let*</span>
<span class="p">[</span><span class="nv">or__3943__auto__</span> <span class="p">(</span><span class="nf">mossy?</span> <span class="nv">stone</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">if</span>
<span class="nv">or__3943__auto__</span>
<span class="nv">or__3943__auto__</span>
<span class="p">(</span><span class="nf">let*</span>
<span class="p">[</span><span class="nv">or__3943__auto__</span> <span class="p">(</span><span class="nf">cool?</span> <span class="nv">stone</span><span class="p">)]</span>
<span class="p">(</span><span class="k">if </span><span class="nv">or__3943__auto__</span> <span class="nv">or__3943__auto__</span> <span class="p">(</span><span class="nf">wet?</span> <span class="nv">stone</span><span class="p">)))))</span>
</code></pre>
<p>See how the macro’s syntax-quoted <code>(let ...</code> has the same shape as the resulting code? <code>or#</code> is expanded to a variable named <code>or__3943__auto__</code>, which is bound to the expression <code>(mossy? stone)</code>. If that variable is truthy, we return it. Otherwise, we (and here’s the recursive part) rebind <code>or__3943__auto__</code> to <code>(cool? stone)</code> and try again. If <em>that</em> fails, we fall back to evaluating <code>(wet? stone)</code>–thanks to the base case, the single-argument form of the <code>or</code> macro.</p>
<h2><a href="#control-flow" id="control-flow">Control flow</a></h2>
<p>We’ve seen that <code>or</code> is a macro written in terms of the special form <code>if</code>–and because of the way the macro is structured, it does <em>not</em> obey the normal execution order. In <code>(or a b c)</code>, only <code>a</code> is evaluated first–then, only if it is <code>false</code> or <code>nil</code>, do we evaluate <code>b</code>. This is called <em>short-circuiting</em>, and it works for <code>and</code> as well.</p>
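<p>One way to watch short-circuiting happen is to record which arguments actually get evaluated. In this sketch, <code>note</code> and <code>evaluated</code> are helper names invented for the demonstration: <code>note</code> logs a label to an atom and then returns the given value.</p>

```clojure
;; Log of which arguments were evaluated, in order.
(def evaluated (atom []))

(defn note
  "Records label in the log, then returns value."
  [label value]
  (swap! evaluated conj label)
  value)

;; or stops at the first truthy value, so :c is never evaluated.
(def or-result
  (or (note :a nil) (note :b 2) (note :c 3)))

or-result
;=> 2
@evaluated
;=> [:a :b]

(reset! evaluated [])

;; and stops at the first falsey value, so again :c is never evaluated.
(def and-result
  (and (note :a 1) (note :b false) (note :c 3)))

and-result
;=> false
@evaluated
;=> [:a :b]
```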
<p>Changing the order of evaluation in a language is called <em>control flow</em>, and lets programs make decisions based on varying circumstances. We’ve already seen <code>if</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">= </span><span class="mi">2</span> <span class="mi">2</span><span class="p">)</span> <span class="ss">:a</span> <span class="ss">:b</span><span class="p">)</span>
<span class="ss">:a</span>
</code></pre>
<p><code>if</code> takes a predicate and two expressions, and only evaluates one of them, depending on whether the predicate evaluates to a truthy or falsey value. Sometimes you want to evaluate <em>more than one</em> expression in order. For this, we have <code>do</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">pos? </span><span class="mi">-5</span><span class="p">)</span>
<span class="p">(</span><span class="nb">prn </span><span class="s">"-5 is positive"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">do</span>
<span class="p">(</span><span class="nb">prn </span><span class="s">"-5 is negative"</span><span class="p">)</span>
<span class="p">(</span><span class="nb">prn </span><span class="s">"Who would have thought?"</span><span class="p">)))</span>
<span class="s">"-5 is negative"</span>
<span class="s">"Who would have thought?"</span>
<span class="nv">nil</span>
</code></pre>
<p><code>prn</code> is a function which has a <em>side effect</em>: it prints a message to the screen, and returns <code>nil</code>. We wanted to print <em>two</em> messages, but <code>if</code> only takes a single expression per branch–so in our false branch, we used <code>do</code> to wrap up two <code>prn</code>s into a single expression, and evaluate them in order. <code>do</code> returns the value of the final expression, which happens to be <code>nil</code> here.</p>
<p>When you only want to take one branch of an <code>if</code>, you can use <code>when</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">when </span><span class="nv">false</span>
<span class="p">(</span><span class="nb">prn </span><span class="ss">:hi</span><span class="p">)</span>
<span class="p">(</span><span class="nb">prn </span><span class="ss">:there</span><span class="p">))</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">when </span><span class="nv">true</span>
<span class="p">(</span><span class="nb">prn </span><span class="ss">:hi</span><span class="p">)</span>
<span class="p">(</span><span class="nb">prn </span><span class="ss">:there</span><span class="p">))</span>
<span class="ss">:hi</span>
<span class="ss">:there</span>
<span class="nv">nil</span>
</code></pre>
<p>Because there is only one path to take, <code>when</code> takes any number of expressions, and evaluates them only when the predicate is truthy. If the predicate evaluates to <code>nil</code> or <code>false</code>, <code>when</code> does not evaluate its body, and returns <code>nil</code>.</p>
<p>Both <code>when</code> and <code>if</code> have complementary forms, <code>when-not</code> and <code>if-not</code>, which simply invert the sense of their predicate.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">when-not </span><span class="p">(</span><span class="nf">number?</span> <span class="s">"a string"</span><span class="p">)</span>
<span class="ss">:here</span><span class="p">)</span>
<span class="ss">:here</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">if-not </span><span class="p">(</span><span class="nb">vector? </span><span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="ss">:a</span>
<span class="ss">:b</span><span class="p">)</span>
<span class="ss">:a</span>
</code></pre>
<p>Often, you want to perform some operation, and if its result is truthy, re-use that value without recomputing it. For this, we have <code>when-let</code> and <code>if-let</code>. These work just like <code>when</code> and <code>if</code>, combined with <code>let</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">when-let </span><span class="p">[</span><span class="nv">x</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">str </span><span class="nv">x</span><span class="p">))</span>
<span class="s">"10"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">when-let </span><span class="p">[</span><span class="nv">x</span> <span class="p">(</span><span class="nb">first </span><span class="p">[])]</span>
<span class="p">(</span><span class="nb">str </span><span class="nv">x</span><span class="p">))</span>
<span class="nv">nil</span>
</code></pre>
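<p>The examples above only show <code>when-let</code>. As a minimal sketch of its sibling <code>if-let</code>, here is a hypothetical helper, <code>describe-first</code> (a name invented for this illustration), which binds the first element of a collection and falls through to its second branch when there is none:</p>
<pre><code>(defn describe-first
  "Hypothetical helper: describes the first element of coll, if any."
  [coll]
  (if-let [x (first coll)]
    (str "starts with " x)   ; x is bound only in this branch
    "nothing here"))         ; taken when (first coll) is nil or false

(describe-first [1 2 3]) ; returns "starts with 1"
(describe-first [])      ; returns "nothing here"
</code></pre>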
<p><code>while</code> evaluates an expression so long as its predicate is truthy. This is generally useful only for side effects, like <code>prn</code> or <code>def</code>; things that change the state of the world.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="mi">0</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/x</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">while</span> <span class="p">(</span><span class="nb">< </span><span class="nv">x</span> <span class="mi">5</span><span class="p">)</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">prn </span><span class="nv">x</span><span class="p">)</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nb">inc </span><span class="nv">x</span><span class="p">)))</span>
<span class="mi">0</span>
<span class="mi">1</span>
<span class="mi">2</span>
<span class="mi">3</span>
<span class="mi">4</span>
<span class="nv">nil</span>
</code></pre>
<p><code>cond</code> (for “conditional”) is like a multiheaded <code>if</code>: it takes <em>any number</em> of test/expression pairs, and tries each test in turn. The first test which evaluates truthy causes the following expression to be evaluated; then <code>cond</code> returns that expression’s value.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">cond</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">= </span><span class="mi">2</span> <span class="mi">5</span><span class="p">)</span> <span class="ss">:nope</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">= </span><span class="mi">3</span> <span class="mi">3</span><span class="p">)</span> <span class="ss">:yep</span>
<span class="o">#</span><span class="nv">_=></span> <span class="p">(</span><span class="nb">= </span><span class="mi">5</span> <span class="mi">5</span><span class="p">)</span> <span class="ss">:cant-get-here</span>
<span class="o">#</span><span class="nv">_=></span> <span class="ss">:else</span> <span class="ss">:a-default-value</span><span class="p">)</span>
<span class="ss">:yep</span>
</code></pre>
<p>If you find yourself making several similar decisions based on a value, try <code>condp</code>, for “cond with predicate”. For instance, we might categorize a number based on some ranges:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">category</span>
<span class="s">"Determines the Saffir-Simpson category of a hurricane, by wind speed in meters/sec"</span>
<span class="p">[</span><span class="nv">wind-speed</span><span class="p">]</span>
<span class="p">(</span><span class="nf">condp</span> <span class="nb"><= </span><span class="nv">wind-speed</span>
<span class="mi">70</span> <span class="ss">:F5</span>
<span class="mi">58</span> <span class="ss">:F4</span>
<span class="mi">49</span> <span class="ss">:F3</span>
<span class="mi">42</span> <span class="ss">:F2</span>
<span class="ss">:F1</span><span class="p">))</span> <span class="c1">; Default value</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">category</span> <span class="mi">10</span><span class="p">)</span>
<span class="ss">:F1</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">category</span> <span class="mi">50</span><span class="p">)</span>
<span class="ss">:F3</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">category</span> <span class="mi">100</span><span class="p">)</span>
<span class="ss">:F5</span>
</code></pre>
<p><code>condp</code> generates code which combines the predicate <code><=</code> with each number, and the value of <code>wind-speed</code>, like so:</p>
<pre><code><span></span><span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb"><= </span><span class="mi">70</span> <span class="nv">wind-speed</span><span class="p">)</span> <span class="ss">:F5</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb"><= </span><span class="mi">58</span> <span class="nv">wind-speed</span><span class="p">)</span> <span class="ss">:F4</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb"><= </span><span class="mi">49</span> <span class="nv">wind-speed</span><span class="p">)</span> <span class="ss">:F3</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb"><= </span><span class="mi">42</span> <span class="nv">wind-speed</span><span class="p">)</span> <span class="ss">:F2</span>
<span class="ss">:F1</span><span class="p">))))</span>
</code></pre>
<p>Specialized macros like <code>condp</code> are less commonly used than <code>if</code> or <code>when</code>, but they still play an important role in simplifying repeated code. They clarify the meaning of complex expressions, making them easier to read and maintain.</p>
<p>Finally, there’s <code>case</code>, which works a little bit like a map of keys to values–only the values are <em>code</em>, to be evaluated. You can think of <code>case</code> like <code>(condp = ...)</code>, trying to match an expression to a particular branch for which it is equal.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">with-tax</span>
<span class="s">"Computes the total cost, with tax, of a purchase in the given state."</span>
<span class="p">[</span><span class="nv">state</span> <span class="nv">subtotal</span><span class="p">]</span>
<span class="p">(</span><span class="nf">case</span> <span class="nv">state</span>
<span class="ss">:WA</span> <span class="p">(</span><span class="nb">* </span><span class="mf">1.065</span> <span class="nv">subtotal</span><span class="p">)</span>
<span class="ss">:OR</span> <span class="nv">subtotal</span>
<span class="ss">:CA</span> <span class="p">(</span><span class="nb">* </span><span class="mf">1.075</span> <span class="nv">subtotal</span><span class="p">)</span>
<span class="c1">; ... 48 other states ...</span>
<span class="nv">subtotal</span><span class="p">))</span> <span class="c1">; a default case</span>
</code></pre>
<p>Unlike <code>cond</code> and <code>condp</code>, <code>case</code> does <em>not</em> evaluate its tests in order. It jumps <em>immediately</em> to the matching expression. This makes <code>case</code> much faster when there are many branches to take–at the cost of reduced generality: the values to match must be literal constants known at compile time, not arbitrary expressions.</p>
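<p>As a minimal sketch of what that trade-off looks like in practice: the tests in a <code>case</code> are read as literal constants and never evaluated, so a parenthesized list in test position is not a function call but a group of alternatives. The function <code>shape-sides</code> and its keywords are hypothetical, chosen only for this illustration.</p>
<pre><code>(defn shape-sides
  "Hypothetical example: number of sides for a few shape keywords."
  [shape]
  (case shape
    :triangle            3
    (:square :rectangle) 4   ; a list of constants: matches either keyword
    :unknown))               ; default, used when nothing matches

(shape-sides :rectangle) ; returns 4
(shape-sides :circle)    ; returns :unknown
</code></pre>
<p>Without the trailing default expression, <code>(shape-sides :circle)</code> would throw an <code>IllegalArgumentException</code> rather than return <code>nil</code>.</p>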
<h2><a href="#recursion" id="recursion">Recursion</a></h2>
<p>Previously, we defined recursive functions by having those functions call themselves explicitly.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">sum</span> <span class="p">[</span><span class="nv">numbers</span><span class="p">]</span>
<span class="p">(</span><span class="nb">if-let </span><span class="p">[</span><span class="nv">n</span> <span class="p">(</span><span class="nb">first </span><span class="nv">numbers</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">+ </span><span class="nv">n</span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nb">rest </span><span class="nv">numbers</span><span class="p">)))</span>
<span class="mi">0</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">))</span>
<span class="mi">45</span>
</code></pre>
<p>But this approach breaks down when we have the function call itself <em>deeply</em>, over and over again.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nb">range </span><span class="mi">100000</span><span class="p">))</span>
<span class="nv">StackOverflowError</span> <span class="nv">clojure.core/range/fn--4269</span> <span class="p">(</span><span class="nf">core.clj</span><span class="ss">:2664</span><span class="p">)</span>
</code></pre>
<p>Every time you call a function, the arguments for that function are stored in memory, in a region called <em>the stack</em>. They remain there for as long as the function is being called–including any deeper function calls.</p>
<pre><code><span></span> <span class="p">(</span><span class="nb">+ </span><span class="nv">n</span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nb">rest </span><span class="nv">numbers</span><span class="p">)))</span>
</code></pre>
<p>In order to add <code>n</code> and <code>(sum (rest numbers))</code>, we have to call <code>sum</code> <em>first</em>–while holding onto the memory for <code>n</code> and <code>numbers</code>. We can’t re-use that memory until <em>every single recursive call</em> has completed. Clojure complains, after tens of thousands of stack frames are in use, that it has run out of space in the stack and can allocate no more.</p>
<p>But consider this variation on <code>sum</code>:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">sum</span>
<span class="p">([</span><span class="nv">numbers</span><span class="p">]</span>
<span class="p">(</span><span class="nf">sum</span> <span class="mi">0</span> <span class="nv">numbers</span><span class="p">))</span>
<span class="p">([</span><span class="nv">subtotal</span> <span class="nv">numbers</span><span class="p">]</span>
<span class="p">(</span><span class="nb">if-let </span><span class="p">[</span><span class="nv">n</span> <span class="p">(</span><span class="nb">first </span><span class="nv">numbers</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">recur</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">subtotal</span> <span class="nv">n</span><span class="p">)</span> <span class="p">(</span><span class="nb">rest </span><span class="nv">numbers</span><span class="p">))</span>
<span class="nv">subtotal</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">sum</span> <span class="p">(</span><span class="nb">range </span><span class="mi">100000</span><span class="p">))</span>
<span class="mi">4999950000</span>
</code></pre>
<p>We’ve added an additional parameter to the function. In its two-argument form, <code>sum</code> now takes an accumulator, <code>subtotal</code>, which represents the sum so far. In addition, <code>recur</code> has taken the place of <code>sum</code>. Notice, however, that the final expression to be evaluated is not <code>+</code>, but <code>sum</code> (viz <code>recur</code>) itself. We don’t need to hang on to any of the variables in this function any more, because the final return value won’t depend on them. <code>recur</code> tells the Clojure compiler that we <em>don’t need</em> to hold on to the stack, and can re-use that space for the next call; the compiler rejects any <code>recur</code> which is not the final expression. This is called a <em>tail-recursive</em> function, and it requires only a single stack frame no matter how deep the recursive calls go.</p>
<p>Use <code>recur</code> wherever possible. It requires much less memory and is much faster than explicit recursion.</p>
<p>You can also use <code>recur</code> within the context of the <code>loop</code> macro, where it acts just like an unnamed recursive function with initial values provided. Think of it, perhaps, like a recursive <code>let</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">loop </span><span class="p">[</span><span class="nv">i</span> <span class="mi">0</span>
<span class="nv">nums</span> <span class="p">[]]</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">< </span><span class="mi">10</span> <span class="nv">i</span><span class="p">)</span>
<span class="nv">nums</span>
<span class="p">(</span><span class="nf">recur</span> <span class="p">(</span><span class="nb">inc </span><span class="nv">i</span><span class="p">)</span> <span class="p">(</span><span class="nb">conj </span><span class="nv">nums</span> <span class="nv">i</span><span class="p">))))</span>
<span class="p">[</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span> <span class="mi">10</span><span class="p">]</span>
</code></pre>
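<p>To connect <code>loop</code> back to the earlier <code>sum</code> function, here is a sketch of the same accumulator pattern written with <code>loop</code> instead of a second arity. The names <code>sum-loop</code> and <code>remaining</code> are chosen for this illustration.</p>
<pre><code>(defn sum-loop
  "Sums a collection of numbers using loop/recur and an accumulator."
  [numbers]
  (loop [subtotal  0
         remaining numbers]
    (if (empty? remaining)
      subtotal                                  ; nothing left: return the total
      (recur (+ subtotal (first remaining))     ; rebind subtotal...
             (rest remaining)))))               ; ...and remaining, then jump back

(sum-loop (range 100000)) ; returns 4999950000
</code></pre>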
<h2><a href="#laziness" id="laziness">Laziness</a></h2>
<p>In Chapter 4 we mentioned that most of the sequences in Clojure, like those returned by <code>map</code>, <code>filter</code>, <code>iterate</code>, <code>repeatedly</code>, and so on, were <em>lazy</em>: they did not evaluate any of their elements until required. This too is provided by a macro, called <code>lazy-seq</code>.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">integers</span>
<span class="p">[</span><span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nf">lazy-seq</span>
<span class="p">(</span><span class="nb">cons </span><span class="nv">x</span> <span class="p">(</span><span class="nf">integers</span> <span class="p">(</span><span class="nb">inc </span><span class="nv">x</span><span class="p">)))))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">xs</span> <span class="p">(</span><span class="nf">integers</span> <span class="mi">0</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/xs</span>
</code></pre>
<p>This sequence does not terminate; it is <em>infinitely</em> recursive. Yet it returned instantaneously. <code>lazy-seq</code> interrupted that recursion and restructured it into a sequence which constructs elements only when they are requested.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="nv">xs</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
</code></pre>
<p>When using <code>lazy-seq</code> and its partner <code>lazy-cat</code>, you don’t have to use <code>recur</code>–or even be tail-recursive. The macros interrupt each level of recursion, preventing stack overflows.</p>
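<p>The text above demonstrates <code>lazy-seq</code> but not <code>lazy-cat</code>. As a small sketch: <code>lazy-cat</code> concatenates several collections, evaluating each one only when it is needed, so a sequence can even be defined in terms of itself. The var name <code>fibs</code> is an assumption made for this example.</p>
<pre><code>;; The Fibonacci numbers: start with 0 and 1, then add the sequence
;; to itself shifted by one element. The second argument to lazy-cat
;; is not evaluated until the first two elements are used up.
(def fibs
  (lazy-cat [0 1] (map + fibs (rest fibs))))

(take 10 fibs) ; returns (0 1 1 2 3 5 8 13 21 34)
</code></pre>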
<p>You can also delay evaluation of some expressions until later, using <code>delay</code> and <code>deref</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">x</span> <span class="p">(</span><span class="nf">delay</span>
<span class="p">(</span><span class="nb">prn </span><span class="s">"computing a really big number!"</span><span class="p">)</span>
<span class="p">(</span><span class="nb">last </span><span class="p">(</span><span class="nb">take </span><span class="mi">10000000</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">)))))</span>
<span class="o">#</span><span class="ss">'user/x</span> <span class="c1">; Did nothing, returned immediately</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">deref </span><span class="nv">x</span><span class="p">)</span>
<span class="s">"computing a really big number!"</span> <span class="c1">; Now we have to wait!</span>
<span class="mi">9999999</span>
</code></pre>
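<p>A delay evaluates its body at most once and remembers the result. The sketch below, using a hypothetical var named <code>answer</code>, shows this with <code>realized?</code>, and with <code>@</code>, which is shorthand for <code>deref</code>:</p>
<pre><code>(def answer (delay (prn "computing") (+ 40 2)))

(realized? answer) ; returns false: the body has not run yet
(deref answer)     ; prints "computing", returns 42
@answer            ; returns 42 without printing: the result was cached
(realized? answer) ; returns true
</code></pre>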
<h2><a href="#list-comprehensions" id="list-comprehensions">List comprehensions</a></h2>
<p>Combining recursion and laziness is the <em>list comprehension</em> macro, <code>for</code>. In its simplest form, <code>for</code> works like <code>map</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">for </span><span class="p">[</span><span class="nv">x</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">)]</span> <span class="p">(</span><span class="nb">- </span><span class="nv">x</span><span class="p">))</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">-1</span> <span class="mi">-2</span> <span class="mi">-3</span> <span class="mi">-4</span> <span class="mi">-5</span> <span class="mi">-6</span> <span class="mi">-7</span> <span class="mi">-8</span> <span class="mi">-9</span><span class="p">)</span>
</code></pre>
<p>Like <code>let</code>, <code>for</code> takes a vector of <em>bindings</em>. Unlike <code>let</code>, however, <code>for</code> binds its variables to <em>each possible combination of elements in their corresponding sequences</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">for </span><span class="p">[</span><span class="nv">x</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="nv">y</span> <span class="p">[</span><span class="ss">:a</span> <span class="ss">:b</span><span class="p">]]</span>
<span class="p">[</span><span class="nv">x</span> <span class="nv">y</span><span class="p">])</span>
<span class="p">([</span><span class="mi">1</span> <span class="ss">:a</span><span class="p">]</span> <span class="p">[</span><span class="mi">1</span> <span class="ss">:b</span><span class="p">]</span> <span class="p">[</span><span class="mi">2</span> <span class="ss">:a</span><span class="p">]</span> <span class="p">[</span><span class="mi">2</span> <span class="ss">:b</span><span class="p">]</span> <span class="p">[</span><span class="mi">3</span> <span class="ss">:a</span><span class="p">]</span> <span class="p">[</span><span class="mi">3</span> <span class="ss">:b</span><span class="p">])</span>
</code></pre>
<p>“For each <code>x</code> in the sequence <code>[1 2 3]</code>, and for each <code>y</code> in the sequence <code>[:a :b]</code>, find all <code>[x y]</code> pairs.” Note that the rightmost variable <code>y</code> iterates the fastest.</p>
<p>Like most sequence functions, the <code>for</code> macro yields lazy sequences. You can trim or filter them with <code>take</code>, <code>filter</code>, et al. like any other sequence. Or you can use <code>:while</code> to tell <code>for</code> when to stop, or <code>:when</code> to filter out combinations of elements.</p>
<pre><code><span></span><span class="p">(</span><span class="nb">for </span><span class="p">[</span><span class="nv">x</span> <span class="p">(</span><span class="nb">range </span><span class="mi">5</span><span class="p">)</span>
<span class="nv">y</span> <span class="p">(</span><span class="nb">range </span><span class="mi">5</span><span class="p">)</span>
<span class="ss">:when</span> <span class="p">(</span><span class="nb">and </span><span class="p">(</span><span class="nf">even?</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nf">odd?</span> <span class="nv">y</span><span class="p">))]</span>
<span class="p">[</span><span class="nv">x</span> <span class="nv">y</span><span class="p">])</span>
<span class="p">([</span><span class="mi">0</span> <span class="mi">1</span><span class="p">]</span> <span class="p">[</span><span class="mi">0</span> <span class="mi">3</span><span class="p">]</span> <span class="p">[</span><span class="mi">2</span> <span class="mi">1</span><span class="p">]</span> <span class="p">[</span><span class="mi">2</span> <span class="mi">3</span><span class="p">]</span> <span class="p">[</span><span class="mi">4</span> <span class="mi">1</span><span class="p">]</span> <span class="p">[</span><span class="mi">4</span> <span class="mi">3</span><span class="p">])</span>
</code></pre>
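<p>The example above uses <code>:when</code>. As a sketch of how <code>:while</code> differs, the two comprehensions below run the same test over the same (arbitrarily chosen) vector: <code>:while</code> stops at the first element which fails the test, whereas <code>:when</code> skips that element and carries on.</p>
<pre><code>(for [x [1 3 5 6 7 9]
      :while (odd? x)]   ; stops for good once it reaches 6
  x)
; returns (1 3 5)

(for [x [1 3 5 6 7 9]
      :when (odd? x)]    ; skips 6, keeps going
  x)
; returns (1 3 5 7 9)
</code></pre>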
<p>Clojure includes a rich smörgåsbord of control-flow constructs; we’ll meet new ones throughout the book.</p>
<h2><a href="#the-threading-macros" id="the-threading-macros">The threading macros</a></h2>
<p>Sometimes you want to <em>thread</em> a computation through several expressions, like a chain. Object-oriented languages like Ruby or Java are well-suited to this style:</p>
<pre><code><span></span><span class="mi">1</span><span class="o">.</span><span class="mi">9</span><span class="o">.</span><span class="mi">3</span><span class="n">p385</span> <span class="p">:</span><span class="mo">004</span> <span class="o">></span> <span class="p">(</span><span class="mi">0</span><span class="o">..</span><span class="mi">10</span><span class="p">)</span><span class="o">.</span><span class="n">select</span><span class="p">(</span><span class="o">&</span><span class="ss">:odd?</span><span class="p">)</span><span class="o">.</span><span class="n">reduce</span><span class="p">(</span><span class="o">&</span><span class="ss">:+</span><span class="p">)</span>
<span class="mi">25</span>
</code></pre>
<p>Start with the range <code>0</code> to <code>10</code>, then call <code>select</code> on that range, with the function <code>odd?</code>. Finally, take <em>that</em> sequence of numbers, and reduce it with the <code>+</code> function.</p>
<p>The Clojure threading macros do the same by restructuring a sequence of expressions, inserting each expression as the first (or final) argument in the next expression.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nf">clojure.walk/macroexpand-all</span>
<span class="o">'</span><span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">)</span> <span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span><span class="p">)</span> <span class="p">(</span><span class="nb">reduce </span><span class="nv">+</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">reduce + </span><span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">)</span> <span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span><span class="p">)</span> <span class="p">(</span><span class="nb">reduce </span><span class="nv">+</span><span class="p">))</span>
<span class="mi">25</span>
</code></pre>
<p><code>->></code> took <code>(range 10)</code> and inserted it at the end of <code>(filter odd?)</code>, forming <code>(filter odd? (range 10))</code>. Then it took <em>that</em> expression and inserted it at the end of <code>(reduce +)</code>. In essence, <code>->></code> <em>flattens and reverses</em> a nested chain of operations.</p>
<p><code>-></code>, by contrast, inserts each form as the <em>first</em> argument in the following expression.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nf">clojure.walk/macroexpand-all</span>
<span class="o">'</span><span class="p">(</span><span class="nb">-> </span><span class="p">{</span><span class="ss">:proton</span> <span class="ss">:fermion</span><span class="p">}</span> <span class="p">(</span><span class="nb">assoc </span><span class="ss">:photon</span> <span class="ss">:boson</span><span class="p">)</span> <span class="p">(</span><span class="nb">assoc </span><span class="ss">:neutrino</span> <span class="ss">:fermion</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">assoc </span><span class="p">(</span><span class="nb">assoc </span><span class="p">{</span><span class="ss">:proton</span> <span class="ss">:fermion</span><span class="p">}</span> <span class="ss">:photon</span> <span class="ss">:boson</span><span class="p">)</span> <span class="ss">:neutrino</span> <span class="ss">:fermion</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">-> </span><span class="p">{</span><span class="ss">:proton</span> <span class="ss">:fermion</span><span class="p">}</span>
<span class="p">(</span><span class="nb">assoc </span><span class="ss">:photon</span> <span class="ss">:boson</span><span class="p">)</span>
<span class="p">(</span><span class="nb">assoc </span><span class="ss">:neutrino</span> <span class="ss">:fermion</span><span class="p">))</span>
<span class="p">{</span><span class="ss">:neutrino</span> <span class="ss">:fermion</span>, <span class="ss">:photon</span> <span class="ss">:boson</span>, <span class="ss">:proton</span> <span class="ss">:fermion</span><span class="p">}</span>
</code></pre>
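<p>As a compact sketch of the difference, the two forms below thread the same starting value through the same steps, but place it in different argument positions. A bare symbol like <code>inc</code> is treated as a call with no other arguments, <code>(inc)</code>.</p>
<pre><code>(-> 5 inc (- 10) str)  ; same as (str (- (inc 5) 10)), returns "-4"
(->> 5 inc (- 10) str) ; same as (str (- 10 (inc 5))), returns "4"
</code></pre>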
<p>Clojure isn’t just function-oriented in its syntax; it can be object-oriented, and stack-oriented, and array-oriented, and so on–and <em>mix all of these styles freely, in a controlled way</em>. If you don’t like the way the language fits a certain problem, you can write a macro which defines a <em>new</em> language, specifically for that subproblem.</p>
<p><code>cond</code>, <code>condp</code> and <code>case</code>, for example, express a language for branching based on predicates. <code>-></code>, <code>->></code>, and <code>doto</code> express object-oriented and other expression-chaining languages.</p>
<ul>
<li><a href="https://github.com/clojure/core.match">core.match</a> is a set of macros which express powerful <em>pattern-matching</em> and substitution languages.</li>
<li><a href="https://github.com/clojure/core.logic">core.logic</a> expresses syntax for <em>logic programming</em>, for finding values which satisfy complex constraints.</li>
<li><a href="http://clojure.com/blog/2013/06/28/clojure-core-async-channels.html">core.async</a> restructures Clojure code into <em>asynchronous</em> forms so they can do many things at once.</li>
<li>For those with a twisted sense of humor, <a href="https://github.com/rplevy/swiss-arrows">Swiss Arrows</a> extends the threading macros into evil–but delightfully concise!–forms.</li>
</ul>
<p>We’ll see a plethora of macros, from simple to complex, through the course of this book. Each one shares the common pattern of <em>simplifying code</em>; reducing tangled or verbose expressions into something more concise, more meaningful, better suited to the problem at hand.</p>
<h2><a href="#when-to-use-macros" id="when-to-use-macros">When to use macros</a></h2>
<p>While it’s important to be aware of the purpose and behavior of the macro system, you don’t need to write your own macros to be productive with Clojure. For now, you’ll be just fine writing code which uses the existing macros in the language. When you <em>do</em> need to delve deeper, come back to this guide and experiment. It’ll take some time to sink in.</p>
<p>First, know that writing macros is <em>tricky</em>, even for experts. It requires you to think at two levels simultaneously, and to be mindful of the distinction between <em>expression</em> and underlying <em>evaluation</em>. Writing a macro is essentially extending the language, the compiler, the syntax and evaluation model of Clojure, by restructuring <em>arbitrary</em> expressions into ones the evaluation system understands. This is hard, and it’ll take practice to get used to.</p>
<p>In addition, Clojure macros come with some important restrictions. Because they’re expanded prior to evaluation, macros are invisible to functions. They can’t be composed functionally–you can’t <code>(map or ...)</code>, for instance.</p>
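<p>A sketch of that restriction, and the usual workaround: wrapping the macro in an anonymous function gives <code>map</code> an ordinary value to call. One caveat worth flagging is that the wrapper evaluates both of its arguments before <code>or</code> ever sees them, so the short-circuiting behavior is lost.</p>
<pre><code>;; (map or [nil 1] [2 3]) does not compile: or is a macro, not a value.

;; Wrapping it in a function works, pairing elements from each vector:
(map (fn [a b] (or a b)) [nil 1] [2 3]) ; returns (2 1)
</code></pre>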
<p>So in general, if you <em>can</em> solve a problem without writing a macro, <em>don’t write one</em>. It’ll be easier to debug, easier to understand, and easier to compose later. Only reach for macros when you need <em>new syntax</em>, or when performance demands the code be transformed at compile time.</p>
<p>When you do write a macro, consider its scope carefully. Keep the transformation simple; and do as much in normal functions as possible. Provide an escape hatch where possible, by doing most of the work in a function, and writing a small wrapper macro which calls that function. Finally, remember the distinction between <em>code</em> and what that code <em>evaluates to</em>. Use <code>let</code> whenever a value is to be re-used, to prevent it being evaluated twice by accident.</p>
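<p>To illustrate the last point, here is a minimal sketch with two hypothetical macros, <code>square-twice</code> and <code>square-once</code> (both names invented for this example). The first splices its argument into the expansion twice, so any side effects happen twice; the second binds it once with <code>let</code> and an auto-generated symbol.</p>
<pre><code>(defmacro square-twice
  "Buggy on purpose: the expression x appears twice in the expansion."
  [x]
  `(* ~x ~x))

(defmacro square-once
  "Evaluates x a single time, then re-uses the bound value."
  [x]
  `(let [v# ~x]
     (* v# v#)))

(square-twice (do (prn :evaluated) 3)) ; prints :evaluated twice, returns 9
(square-once  (do (prn :evaluated) 3)) ; prints :evaluated once, returns 9
</code></pre>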
<p>For a deeper exploration of Clojure macros in a real-world application, try <a href="http://aphyr.com/posts/268-language-power">Language Power</a>.</p>
<h2><a href="#review" id="review">Review</a></h2>
<p>In Chapter 4, deeply nested expressions led to the desire for a <em>simpler</em>, <em>more direct</em> expression of a chain of sequence operations. We learned that the Clojure compiler first <em>expands</em> expressions before evaluating them, using macros–special functions which take code and return other code. We used macros to define the short-circuiting <code>or</code> operator, and followed that with a tour of basic control flow, recursion, laziness, list comprehensions, and chained expressions. Finally, we learned a bit about when and how to write our own macros.</p>
<p>Throughout this chapter we’ve brushed against the idea of <em>side effects</em>: things which change the outside world. We might change a var with <code>def</code>, or print a message to the screen with <code>prn</code>. Real languages must model a continually shifting universe, which leads us to <a href="http://aphyr.com/posts/306-clojure-from-the-ground-up-state">Chapter Six: Side effects and state</a>.</p>
<h2><a href="#problems" id="problems">Problems</a></h2>
<ol>
<li>
<p>Using the control flow constructs we’ve learned, write a <code>schedule</code> function which, given an hour of the day, returns what you’ll be doing at that time. <code>(schedule 18)</code>, for me, returns <code>:dinner</code>.</p>
</li>
<li>
<p>Using the threading macros, find how many numbers from 0 to 9999 are palindromes: identical when written forwards and backwards. <code>121</code> is a palindrome, as are <code>7447</code> and <code>5</code>, but not <code>12</code> or <code>953</code>.</p>
</li>
<li>
<p>Write a macro <code>id</code> which takes a function and a list of args: <code>(id f a b c)</code>, and returns an expression which calls that function with the given args: <code>(f a b c)</code>.</p>
</li>
<li>
<p>Write a macro <code>log</code> which uses a var, <code>logging-enabled</code>, to determine whether or not to print an expression to the console at compile time. If <code>logging-enabled</code> is false, <code>(log :hi)</code> should macroexpand to <code>nil</code>. If <code>logging-enabled</code> is true, <code>(log :hi)</code> should macroexpand to <code>(prn :hi)</code>. Why would you want to do this check during <em>compilation</em>, instead of when running the program? What might you <em>lose</em>?</p>
</li>
<li>
<p>(Advanced) Using the <code>rationalize</code> function, write a macro <code>exact</code> which rewrites any use of <code>+</code>, <code>-</code>, <code>*</code>, or <code>/</code> to force the use of <em>ratios</em> instead of <a href="http://erlang.org/pipermail/erlang-questions/2013-November/076114.html">floating-point numbers</a>. <code>(* 2452.45 100)</code> returns <code>245244.99999999997</code>, but <code>(exact (* 2452.45 100))</code> should return <code>245245N</code>.</p>
</li>
</ol>
https://aphyr.com/posts/304-clojure-from-the-ground-up-sequencesClojure from the ground up: sequences2013-11-18T10:57:30-05:002013-11-18T10:57:30-05:00Aphyrhttps://aphyr.com/<p>In <a href="/posts/303-clojure-from-the-ground-up-functions">Chapter 3</a>, we discovered functions as a way to <em>abstract</em> expressions; to
rephrase a particular computation with some parts missing. We used functions to
transform a single value. But what if we want to apply a function to <em>more than
one</em> value at once? What about sequences?</p>
<p>For example, we know that <code>(inc 2)</code> increments the number 2. What if we wanted to increment <em>every number</em> in the vector <code>[1 2 3]</code>, producing <code>[2 3 4]</code>?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="nv">ClassCastException</span> <span class="nv">clojure.lang.PersistentVector</span> <span class="nv">cannot</span> <span class="nv">be</span> <span class="nb">cast </span><span class="nv">to</span> <span class="nv">java.lang.Number</span> <span class="nv">clojure.lang.Numbers.inc</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:110</span><span class="p">)</span>
</code></pre>
<p>Clearly <code>inc</code> can only work on numbers, not on vectors. We need a different
kind of tool.</p>
<h2><a href="#a-direct-approach" id="a-direct-approach">A direct approach</a></h2>
<p>Let’s think about the problem in concrete terms. We want to increment each of
three elements: the first, second, and third. We know how to get an element
from a sequence by using <code>nth</code>, so let’s start with the first number, at index 0:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">numbers</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="o">#</span><span class="ss">'user/numbers</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">0</span><span class="p">)</span>
<span class="mi">1</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">0</span><span class="p">))</span>
<span class="mi">2</span>
</code></pre>
<p>So there’s the first element incremented. Now we can do the second and third:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">1</span><span class="p">))</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">2</span><span class="p">))</span>
<span class="mi">4</span>
</code></pre>
<p>And it should be straightforward to combine these into a vector…</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">[(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">0</span><span class="p">))</span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">1</span><span class="p">))</span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">2</span><span class="p">))]</span>
<span class="p">[</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">]</span>
</code></pre>
<p>Success! We’ve incremented each of the numbers in the vector! How about a vector with only two elements?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">numbers</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span><span class="p">])</span>
<span class="o">#</span><span class="ss">'user/numbers</span>
<span class="nv">user=></span> <span class="p">[(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">0</span><span class="p">))</span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">1</span><span class="p">))</span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">nth </span><span class="nv">numbers</span> <span class="mi">2</span><span class="p">))]</span>
<span class="nv">IndexOutOfBoundsException</span> <span class="nv">clojure.lang.PersistentVector.arrayFor</span> <span class="p">(</span><span class="nf">PersistentVector.java</span><span class="ss">:107</span><span class="p">)</span>
</code></pre>
<p>Shoot. We tried to get the element at index 2, but <em>couldn’t</em>, because
<code>numbers</code> only has indices 0 and 1. Clojure calls that “index out of bounds”.</p>
<p>We could just leave off the third expression in the vector; taking only
elements 0 and 1. But the problem actually gets much worse, because we’d need
to make this change <em>every</em> time we wanted to use a different sized vector. And
what of a vector with 1000 elements? We’d need 1000 <code>(inc (nth numbers ...))</code>
expressions! Down this path lies madness.</p>
<p>Let’s back up a bit, and try a slightly smaller problem.</p>
<h2><a href="#recursion" id="recursion">Recursion</a></h2>
<p>What if we just incremented the <em>first</em> number in the vector? How would that
work? We know that <code>first</code> finds the first element in a sequence, and <code>rest</code>
finds all the remaining ones.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">first </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="mi">1</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">rest </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p>So there’s the <em>pieces</em> we’d need. To glue them back together, we can use a
function called <code>cons</code>, which says “make a list beginning with the first
argument, followed by all the elements in the second argument”.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">cons </span><span class="mi">1</span> <span class="p">[</span><span class="mi">2</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">cons </span><span class="mi">1</span> <span class="p">[</span><span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">cons </span><span class="mi">1</span> <span class="p">[</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">)</span>
</code></pre>
<p>OK so we can split up a sequence, increment the first part, and join them back together. Not so hard, right?</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">inc-first</span> <span class="p">[</span><span class="nv">nums</span><span class="p">]</span>
<span class="p">(</span><span class="nb">cons </span><span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">first </span><span class="nv">nums</span><span class="p">))</span>
<span class="p">(</span><span class="nb">rest </span><span class="nv">nums</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-first</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">)</span>
</code></pre>
<p>Hey, there we go! First element changed. Will it work with any length list?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-first</span> <span class="p">[</span><span class="mi">5</span><span class="p">])</span>
<span class="p">(</span><span class="mi">6</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-first</span> <span class="p">[])</span>
<span class="nv">NullPointerException</span> <span class="nv">clojure.lang.Numbers.ops</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:942</span><span class="p">)</span>
</code></pre>
<p>Shoot. We can’t increment the first element of this empty vector, because it
doesn’t <em>have</em> a first element.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">first </span><span class="p">[])</span>
<span class="nv">nil</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="nv">nil</span><span class="p">)</span>
<span class="nv">NullPointerException</span> <span class="nv">clojure.lang.Numbers.ops</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:942</span><span class="p">)</span>
</code></pre>
<p>So there are really <em>two</em> cases for this function. If there is a first element
in <code>nums</code>, we’ll increment it as normal. If there’s <em>no</em> such element, we’ll
return an empty list. To express this kind of conditional behavior, we’ll use a
Clojure special form called <code>if</code>:</p>
<pre><code>user=> (doc if)
-------------------------
if
(if test then else?)
Special Form
Evaluates test. If not the singular values nil or false,
evaluates and yields then, otherwise, evaluates and yields else. If
else is not supplied it defaults to nil.
Please see http://clojure.org/special_forms#if
</code></pre>
<p>To confirm our intuition:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">if </span><span class="nv">true</span> <span class="ss">:a</span> <span class="ss">:b</span><span class="p">)</span>
<span class="ss">:a</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">if </span><span class="nv">false</span> <span class="ss">:a</span> <span class="ss">:b</span><span class="p">)</span>
<span class="ss">:b</span>
</code></pre>
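<p>The docstring’s “nil or false” is worth a few more probes. In this sketch, only those two values select the <code>else</code> branch; everything else, including zero and empty collections, counts as true.</p>
<pre><code>(if nil :a :b)        ; :b
(if 0 :a :b)          ; :a, since zero is neither nil nor false
(if [] :a :b)         ; :a, an empty vector is not false either
(if (first []) :a :b) ; :b, because (first []) is nil
</code></pre>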
<p>Seems straightforward enough.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">inc-first</span> <span class="p">[</span><span class="nv">nums</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">first </span><span class="nv">nums</span><span class="p">)</span>
<span class="c1">; If there's a first number, build a new list with cons</span>
<span class="p">(</span><span class="nb">cons </span><span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">first </span><span class="nv">nums</span><span class="p">))</span>
<span class="p">(</span><span class="nb">rest </span><span class="nv">nums</span><span class="p">))</span>
<span class="c1">; If there's no first number, just return an empty list</span>
<span class="p">(</span><span class="nf">list</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-first</span> <span class="p">[])</span>
<span class="p">()</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-first</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p>Success! Now we can handle <em>both</em> cases: empty sequences, and sequences with
things in them. Now how about incrementing that <em>second</em> number? Let’s stare at
that code for a bit.</p>
<pre><code><span></span><span class="p">(</span><span class="nb">rest </span><span class="nv">nums</span><span class="p">)</span>
</code></pre>
<p>Hang on. That list–<code>(rest nums)</code>–that’s a list of numbers too. What if we…
used our <code>inc-first</code> function on <em>that</em> list, to increment <em>its</em> first number?
Then we’d have incremented both the first <em>and</em> the second element.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">inc-more</span> <span class="p">[</span><span class="nv">nums</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">first </span><span class="nv">nums</span><span class="p">)</span>
<span class="p">(</span><span class="nb">cons </span><span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">first </span><span class="nv">nums</span><span class="p">))</span>
<span class="p">(</span><span class="nf">inc-more</span> <span class="p">(</span><span class="nb">rest </span><span class="nv">nums</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">list</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-more</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)</span>
</code></pre>
<p>Odd. That didn’t just increment the first two numbers. It incremented <em>all</em> the
numbers. We fell into the <em>complete</em> solution entirely by accident. What
happened here?</p>
<p>Well first we… yes, we got the number one, and incremented it. Then we stuck
that onto <code>(inc-more [2 3 4])</code>, which got two, and incremented it. Then we stuck
that onto <code>(inc-more [3 4])</code>, which got three, and then we did the same for
four. Only <em>that</em> time around, at the very end of the list, <code>(rest [4])</code> would
have been <em>empty</em>. So when we went to get the first number of the empty list,
we took the <em>second</em> branch of the <code>if</code>, and returned the empty list.</p>
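<p>One way to picture that descent is as a series of equivalent expressions, each replacing a call to <code>inc-more</code> with the <code>cons</code> it produces. This is a hand-written sketch of the evaluation, not REPL output; every expression after the definition evaluates to the same list, <code>(2 3 4 5)</code>.</p>
<pre><code>(defn inc-more [nums]
  (if (first nums)
    (cons (inc (first nums))
          (inc-more (rest nums)))
    (list)))

; Each step peels one number off the front and recurs on the rest.
(inc-more [1 2 3 4])
(cons 2 (inc-more '(2 3 4)))
(cons 2 (cons 3 (inc-more '(3 4))))
(cons 2 (cons 3 (cons 4 (inc-more '(4)))))
(cons 2 (cons 3 (cons 4 (cons 5 (inc-more '())))))
</code></pre>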
<p>Having reached the <em>bottom</em> of the function calls, so to speak, we zip back up
the chain. We can imagine this function turning into a long string of <code>cons</code>
calls, like so:</p>
<pre><code><span></span><span class="p">(</span><span class="nb">cons </span><span class="mi">2</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">3</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">4</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">5</span> <span class="o">'</span><span class="p">()))))</span>
<span class="p">(</span><span class="nb">cons </span><span class="mi">2</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">3</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">4</span> <span class="o">'</span><span class="p">(</span><span class="mi">5</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">cons </span><span class="mi">2</span> <span class="p">(</span><span class="nb">cons </span><span class="mi">3</span> <span class="o">'</span><span class="p">(</span><span class="mi">4</span> <span class="mi">5</span><span class="p">)))</span>
<span class="p">(</span><span class="nb">cons </span><span class="mi">2</span> <span class="o">'</span><span class="p">(</span><span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">))</span>
<span class="o">'</span><span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)</span>
</code></pre>
<p>This technique is called <em>recursion</em>, and it is a fundamental principle in
working with collections, sequences, trees, or graphs… any problem which has
small parts linked together. There are two key elements in a recursive program:</p>
<ol>
<li>Some part of the problem which has a known solution</li>
<li>A relationship which connects one part of the problem to the next</li>
</ol>
<p>Incrementing the elements of an empty list returns the empty list. This is our
<em>base case</em>: the ground to build on. Our <em>inductive</em> case, also called the
<em>recurrence relation</em>, is how we broke the problem up into incrementing the
<em>first</em> number in the sequence, and incrementing all the numbers in the
<em>rest</em> of the sequence. The <code>if</code> expression bound these two cases together into
a single function; a function <em>defined in terms of itself</em>.</p>
<p>Once the initial step has been taken, <em>every</em> step can be taken.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">inc-more</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span> <span class="mi">10</span> <span class="mi">11</span> <span class="mi">12</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span> <span class="mi">10</span> <span class="mi">11</span> <span class="mi">12</span> <span class="mi">13</span><span class="p">)</span>
</code></pre>
<p>This is the beauty of a recursive function; folding an unbounded stream of
computation over and over, onto itself, until only a single step remains.</p>
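<p>The same two ingredients, a base case and an inductive case, show up in other recursive functions. As a sketch, here is a hypothetical <code>sum-all</code> (a name invented for this example, not part of Clojure), which adds up a sequence of numbers. Its base case is the empty sequence, whose sum is 0; its inductive case adds the first number to the sum of the rest.</p>
<pre><code>(defn sum-all [nums]
  (if (first nums)
    ; Inductive case: the first number plus the sum of the rest
    (+ (first nums) (sum-all (rest nums)))
    ; Base case: the sum of no numbers is zero
    0))

(sum-all [1 2 3 4]) ; returns 10
(sum-all [])        ; returns 0
</code></pre>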
<h2><a href="#generalizing-from-inc" id="generalizing-from-inc">Generalizing from inc</a></h2>
<p>We set out to increment every number in a vector, but nothing in our solution
actually depended on <code>inc</code>. It just as well could have been <code>dec</code>, or <code>str</code>, or
<code>keyword</code>. Let’s <em>parameterize</em> our <code>inc-more</code> function to use <em>any</em>
transformation of its elements:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">transform-all</span> <span class="p">[</span><span class="nv">f</span> <span class="nv">xs</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nb">first </span><span class="nv">xs</span><span class="p">)</span>
<span class="p">(</span><span class="nb">cons </span><span class="p">(</span><span class="nf">f</span> <span class="p">(</span><span class="nb">first </span><span class="nv">xs</span><span class="p">))</span>
<span class="p">(</span><span class="nf">transform-all</span> <span class="nv">f</span> <span class="p">(</span><span class="nb">rest </span><span class="nv">xs</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">list</span><span class="p">)))</span>
</code></pre>
<p>Because we could be talking about <em>any</em> kind of sequence, not just numbers,
we’ve named the sequence <code>xs</code>, and we’ll call each of its elements <code>x</code>. We also take a
function <code>f</code> as an argument, and that function will be applied to each <code>x</code> in
turn. So not only can we increment numbers…</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">transform-all</span> <span class="nb">inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)</span>
</code></pre>
<p>…but we can turn strings to keywords…</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">transform-all</span> <span class="nb">keyword </span><span class="p">[</span><span class="s">"bell"</span> <span class="s">"hooks"</span><span class="p">])</span>
<span class="p">(</span><span class="ss">:bell</span> <span class="ss">:hooks</span><span class="p">)</span>
</code></pre>
<p>…or wrap every element in a list:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">transform-all</span> <span class="nb">list </span><span class="p">[</span><span class="ss">:codex</span> <span class="ss">:book</span> <span class="ss">:manuscript</span><span class="p">])</span>
<span class="p">((</span><span class="ss">:codex</span><span class="p">)</span> <span class="p">(</span><span class="ss">:book</span><span class="p">)</span> <span class="p">(</span><span class="ss">:manuscript</span><span class="p">))</span>
</code></pre>
<p>In short, this function expresses a sequence in which each element is some
function applied to the corresponding element in the underlying sequence.
This idea is so important that it has its own name, in mathematics, Clojure, and other languages. We call it <code>map</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">map inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)</span>
</code></pre>
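<p>One caveat about our own version, sketched below: because <code>transform-all</code> tests <code>(first xs)</code>, it treats a <code>nil</code> or <code>false</code> element as the end of the sequence. The variant <code>transform-every</code> is a hypothetical name used only here; it asks <code>(seq xs)</code> instead, which returns <code>nil</code> only when the collection is empty. The built-in <code>map</code> copes with such elements as well.</p>
<pre><code>(defn transform-all [f xs]
  (if (first xs)
    (cons (f (first xs))
          (transform-all f (rest xs)))
    (list)))

; Hypothetical variant: stop when xs is empty, not when its first
; element happens to be nil or false.
(defn transform-every [f xs]
  (if (seq xs)
    (cons (f (first xs))
          (transform-every f (rest xs)))
    (list)))

(transform-all str [1 nil 3])   ; returns ("1"), stopping at the nil
(transform-every str [1 nil 3]) ; returns ("1" "" "3")
(map str [1 nil 3])             ; returns ("1" "" "3")
</code></pre>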
<p>You might remember maps as a datatype in Clojure, too–they’re dictionaries
that relate keys to values.</p>
<pre><code><span></span><span class="p">{</span><span class="ss">:year</span> <span class="mi">1969</span>
<span class="ss">:event</span> <span class="s">"moon landing"</span><span class="p">}</span>
</code></pre>
<p>The <em>function</em> <code>map</code> relates one sequence to another. The <em>type</em> map relates
keys to values. There is a deep symmetry between the two: maps are usually
sparse, and the relationships between keys and values may be arbitrarily
complex. The map function, on the other hand, usually expresses the <em>same</em> type
of relationship, applied to a series of elements in <em>fixed order</em>.</p>
<h2><a href="#building-sequences" id="building-sequences">Building sequences</a></h2>
<p>Recursion can do more than just <code>map</code>. We can use it to expand a single value
into a sequence of values, each related by some function. For
instance:</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">expand</span> <span class="p">[</span><span class="nv">f</span> <span class="nv">x</span> <span class="nv">count</span><span class="p">]</span>
<span class="p">(</span><span class="nb">when </span><span class="p">(</span><span class="nb">pos? </span><span class="nv">count</span><span class="p">)</span>
<span class="p">(</span><span class="nb">cons </span><span class="nv">x</span> <span class="p">(</span><span class="nf">expand</span> <span class="nv">f</span> <span class="p">(</span><span class="nf">f</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">dec </span><span class="nv">count</span><span class="p">)))))</span>
</code></pre>
<p>Our base case is <code>nil</code>, returned when <code>count</code> is zero, and <code>(pos? count)</code> fails. Our inductive case returns a list of <code>x</code>, followed by the expansion starting with <code>(f x)</code>, and a count <em>one smaller</em>. This means the first element of our list will be <code>x</code>, the second <code>(f x)</code>, the third <code>(f (f x))</code>, and so on.
Each time we call <code>expand</code>, we count down by one using <code>dec</code>. Once the count is zero,
the <code>when</code> returns <code>nil</code>, and evaluation stops. If we start with the number 0 and
use <code>inc</code> as our function:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">expand</span> <span class="nb">inc </span><span class="mi">0</span> <span class="mi">10</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
</code></pre>
<p>Clojure has a more general form of this function, called <code>iterate</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
</code></pre>
<p>Since this sequence is <em>infinitely</em> long, we’re using <code>take</code> to select only the
first 10 elements. We can construct more complex sequences by using more
complex functions:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="p">(</span><span class="nb">iterate </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nf">odd?</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="nv">x</span><span class="p">)</span> <span class="p">(</span><span class="nb">/ </span><span class="nv">x</span> <span class="mi">2</span><span class="p">)))</span> <span class="mi">10</span><span class="p">))</span>
<span class="p">(</span><span class="mi">10</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">2</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
</code></pre>
<p>Or build up strings:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">5</span> <span class="p">(</span><span class="nb">iterate </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">str </span><span class="nv">x</span> <span class="s">"o"</span><span class="p">))</span> <span class="s">"y"</span><span class="p">))</span>
<span class="p">(</span><span class="s">"y"</span> <span class="s">"yo"</span> <span class="s">"yoo"</span> <span class="s">"yooo"</span> <span class="s">"yoooo"</span><span class="p">)</span>
</code></pre>
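<p>To see how our <code>expand</code> relates to <code>iterate</code>, the sketch below builds the first eight powers of two both ways. The doubling function is just an example; any one-argument function would do.</p>
<pre><code>(defn expand [f x count]
  (when (pos? count)
    (cons x (expand f (f x) (dec count)))))

; expand takes the count as an argument...
(expand (fn [x] (* 2 x)) 1 8)
; returns (1 2 4 8 16 32 64 128)

; ...while iterate is unbounded, and take decides where to stop.
(take 8 (iterate (fn [x] (* 2 x)) 1))
; returns (1 2 4 8 16 32 64 128)
</code></pre>
<p>One small difference: with a count of zero, <code>expand</code> returns <code>nil</code>, whereas <code>(take 0 ...)</code> returns an empty sequence.</p>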
<p><code>iterate</code> is extremely handy for working with infinite sequences, and has some
partners in crime. <code>repeat</code>, for instance, constructs a sequence where every
element is the same.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="p">(</span><span class="nb">repeat </span><span class="ss">:hi</span><span class="p">))</span>
<span class="p">(</span><span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span> <span class="ss">:hi</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">repeat </span><span class="mi">3</span> <span class="ss">:echo</span><span class="p">)</span>
<span class="p">(</span><span class="ss">:echo</span> <span class="ss">:echo</span> <span class="ss">:echo</span><span class="p">)</span>
</code></pre>
<p>And its close relative <code>repeatedly</code> simply calls a function <code>(f)</code> to generate an
infinite sequence of values, over and over again, without any relationship
between elements. For an infinite sequence of random numbers:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">rand</span><span class="p">)</span>
<span class="mf">0.9002678382322784</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">rand</span><span class="p">)</span>
<span class="mf">0.12375594203332863</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span> <span class="p">(</span><span class="nf">repeatedly</span> <span class="nv">rand</span><span class="p">))</span>
<span class="p">(</span><span class="mf">0.44442397843046755</span> <span class="mf">0.33668691162169784</span> <span class="mf">0.18244875487846746</span><span class="p">)</span>
</code></pre>
<p>Notice that calling <code>(rand)</code> returns a different number each time. We say that
<code>rand</code> is an <em>impure</em> function, because it cannot simply be replaced by the
same value every time. It does something different each time it’s called.</p>
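<p>One way to see the difference between <code>repeat</code> and <code>repeatedly</code> is to hand each of them <code>rand</code>. In this small sketch, <code>(repeat 3 (rand))</code> evaluates <code>(rand)</code> once and repeats that single value, while <code>(repeatedly 3 rand)</code> calls <code>rand</code> afresh for every element. The names <code>same</code> and <code>fresh</code> are made up for illustration, and the two-argument form of <code>repeatedly</code> simply limits the sequence to that many elements. The actual numbers will differ on every run.</p>
<pre><code>;; (rand) is evaluated once, before repeat ever sees it, so all
;; three elements are the same number.
(def same (repeat 3 (rand)))
(apply = same)
;; => true

;; repeatedly calls rand once per element, so the three values
;; will almost certainly differ from one another.
(def fresh (repeatedly 3 rand))
(count fresh)
;; => 3
</code></pre>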
<p>There’s another very handy sequence function specifically for numbers: <code>range</code>,
which generates a sequence of numbers between two points. <code>(range n)</code> gives n
successive integers starting at 0. <code>(range n m)</code> returns integers from n to
m-1. <code>(range n m step)</code> returns integers from n up to (but not including) m, separated by <code>step</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">range </span><span class="mi">5</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">range </span><span class="mi">2</span> <span class="mi">10</span><span class="p">)</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">range </span><span class="mi">0</span> <span class="mi">100</span> <span class="mi">5</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">5</span> <span class="mi">10</span> <span class="mi">15</span> <span class="mi">20</span> <span class="mi">25</span> <span class="mi">30</span> <span class="mi">35</span> <span class="mi">40</span> <span class="mi">45</span> <span class="mi">50</span> <span class="mi">55</span> <span class="mi">60</span> <span class="mi">65</span> <span class="mi">70</span> <span class="mi">75</span> <span class="mi">80</span> <span class="mi">85</span> <span class="mi">90</span> <span class="mi">95</span><span class="p">)</span>
</code></pre>
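<p>A few more cases may help pin down the endpoints. In this sketch, the upper bound is always excluded, even when a step lands exactly on it, and a negative step counts downward.</p>
<pre><code>;; The end point is never included, even when a step would land on it.
(range 0 10 5)
;; => (0 5)

;; A step that does not divide the interval evenly just stops early.
(range 0 10 3)
;; => (0 3 6 9)

;; A negative step counts down toward (but not including) the end.
(range 10 0 -2)
;; => (10 8 6 4 2)
</code></pre>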
<p>To extend a sequence by repeating it forever, use <code>cycle</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="p">(</span><span class="nb">cycle </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]))</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">1</span><span class="p">)</span>
</code></pre>
<h2><a href="#transforming-sequences" id="transforming-sequences">Transforming sequences</a></h2>
<p>Given a sequence, we often want to find a <em>related</em> sequence. <code>map</code>, for
instance, applies a function to each element–but has a few more tricks up its
sleeve.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">n</span> <span class="nv">vehicle</span><span class="p">]</span> <span class="p">(</span><span class="nb">str </span><span class="s">"I've got "</span> <span class="nv">n</span> <span class="s">" "</span> <span class="nv">vehicle</span> <span class="s">"s"</span><span class="p">))</span>
<span class="p">[</span><span class="mi">0</span> <span class="mi">200</span> <span class="mi">9</span><span class="p">]</span>
<span class="p">[</span><span class="s">"car"</span> <span class="s">"train"</span> <span class="s">"kiteboard"</span><span class="p">])</span>
<span class="p">(</span><span class="s">"I've got 0 cars"</span> <span class="s">"I've got 200 trains"</span> <span class="s">"I've got 9 kiteboards"</span><span class="p">)</span>
</code></pre>
<p>If given multiple sequences, <code>map</code> calls its function with one element from each sequence in turn. So the first value will be <code>(f 0 "car")</code>, the second <code>(f 200 "train")</code>, and so on. Like a zipper, map folds together corresponding elements
from multiple collections. To sum three vectors, column-wise:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">map + </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="p">[</span><span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span><span class="p">]</span>
<span class="p">[</span><span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">])</span>
<span class="p">(</span><span class="mi">12</span> <span class="mi">15</span> <span class="mi">18</span><span class="p">)</span>
</code></pre>
<p>If one sequence is longer than another, map stops at the end of the shorter
one. We can exploit this to combine finite and infinite sequences. For example, to number the elements in a vector:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nb">index </span><span class="nv">element</span><span class="p">]</span> <span class="p">(</span><span class="nb">str index </span><span class="s">". "</span> <span class="nv">element</span><span class="p">))</span>
<span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">)</span>
<span class="p">[</span><span class="s">"erlang"</span> <span class="s">"ruby"</span> <span class="s">"haskell"</span><span class="p">])</span>
<span class="p">(</span><span class="s">"0. erlang"</span> <span class="s">"1. ruby"</span> <span class="s">"2. haskell"</span><span class="p">)</span>
</code></pre>
<p>Transforming elements together with their indices is so common that Clojure has
a special function for it: <code>map-indexed</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">map-indexed</span> <span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nb">index </span><span class="nv">element</span><span class="p">]</span> <span class="p">(</span><span class="nb">str index </span><span class="s">". "</span> <span class="nv">element</span><span class="p">))</span>
<span class="p">[</span><span class="s">"erlang"</span> <span class="s">"ruby"</span> <span class="s">"haskell"</span><span class="p">])</span>
<span class="p">(</span><span class="s">"0. erlang"</span> <span class="s">"1. ruby"</span> <span class="s">"2. haskell"</span><span class="p">)</span>
</code></pre>
<p>You can also tack one sequence onto the end of another, like so:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">concat </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span> <span class="p">[</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">]</span> <span class="p">[</span><span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span><span class="p">)</span>
</code></pre>
<p>Another way to combine two sequences is to riffle them together, using
<code>interleave</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">interleave </span><span class="p">[</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">]</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="ss">:a</span> <span class="mi">1</span> <span class="ss">:b</span> <span class="mi">2</span> <span class="ss">:c</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
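<p>Like <code>map</code>, <code>interleave</code> stops as soon as its shortest input runs out, which means it can also mix finite and infinite sequences. A quick sketch:</p>
<pre><code>;; The third element of the first vector has no partner, so it is dropped.
(interleave [:a :b :c] [1 2])
;; => (:a 1 :b 2)

;; An infinite sequence is cut off by the finite one.
(interleave (repeat :x) [1 2 3])
;; => (:x 1 :x 2 :x 3)
</code></pre>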
<p>And if you want to insert a specific element between each successive pair in a
sequence, try <code>interpose</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">interpose</span> <span class="ss">:and</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="ss">:and</span> <span class="mi">2</span> <span class="ss">:and</span> <span class="mi">3</span> <span class="ss">:and</span> <span class="mi">4</span><span class="p">)</span>
</code></pre>
<p>To reverse a sequence, use <code>reverse</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">reverse </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">reverse </span><span class="s">"woolf"</span><span class="p">)</span>
<span class="p">(</span><span class="sc">\f</span> <span class="sc">\l</span> <span class="sc">\o</span> <span class="sc">\o</span> <span class="sc">\w</span><span class="p">)</span>
</code></pre>
<p>Strings are sequences too! Each element of a string is a <em>character</em>, written <code>\f</code>. You can rejoin those characters into a string with <code>apply str</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">apply str </span><span class="p">(</span><span class="nb">reverse </span><span class="s">"woolf"</span><span class="p">))</span>
<span class="s">"floow"</span>
</code></pre>
<p>…and break strings up into sequences of chars with <code>seq</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">seq </span><span class="s">"sato"</span><span class="p">)</span>
<span class="p">(</span><span class="sc">\s</span> <span class="sc">\a</span> <span class="sc">\t</span> <span class="sc">\o</span><span class="p">)</span>
</code></pre>
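<p>Putting a few of these pieces together: since <code>apply str</code> glues any sequence of values into one string, we can combine it with <code>interpose</code> to join words with a separator. This is only a sketch of the idea, using the language names from earlier as sample data.</p>
<pre><code>;; interpose slips ", " between the elements; apply str glues the
;; resulting sequence into a single string.
(apply str (interpose ", " ["erlang" "ruby" "haskell"]))
;; => "erlang, ruby, haskell"

;; Breaking a string into characters and rejoining them gives the
;; original string back.
(= "sato" (apply str (seq "sato")))
;; => true
</code></pre>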
<p>To randomize the order of a sequence, use <code>shuffle</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">shuffle</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">[</span><span class="mi">3</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">4</span><span class="p">]</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">apply str </span><span class="p">(</span><span class="nf">shuffle</span> <span class="p">(</span><span class="nb">seq </span><span class="s">"abracadabra"</span><span class="p">)))</span>
<span class="s">"acaadabrrab"</span>
</code></pre>
<h2><a href="#subsequences" id="subsequences">Subsequences</a></h2>
<p>We’ve already seen <code>take</code>, which selects the first n elements. There’s also
<code>drop</code>, which removes the first n elements.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">)</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">))</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">drop </span><span class="mi">3</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">))</span>
<span class="p">(</span><span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
</code></pre>
<p>And for slicing apart the other end of the sequence, we have <code>take-last</code> and <code>drop-last</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">take-last</span> <span class="mi">3</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">))</span>
<span class="p">(</span><span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">drop-last</span> <span class="mi">3</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">))</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span><span class="p">)</span>
</code></pre>
<p><code>take-while</code> and <code>drop-while</code> work just like <code>take</code> and <code>drop</code>, but use a function to decide when to cut.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take-while pos? </span><span class="p">[</span><span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span> <span class="mi">0</span> <span class="mi">-1</span> <span class="mi">-2</span> <span class="mi">10</span><span class="p">])</span>
<span class="p">(</span><span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">)</span>
</code></pre>
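<p>For comparison, here is a sketch of <code>drop-while</code> on the same input (<code>readings</code> is just a made-up name for it). Both functions make a single cut at the first element which fails the test, so the trailing <code>10</code> is left out by <code>take-while</code> and kept by <code>drop-while</code>, even though it is positive.</p>
<pre><code>(def readings [3 2 1 0 -1 -2 10])

;; Everything before the first non-positive number...
(take-while pos? readings)
;; => (3 2 1)

;; ...and everything from that number onward.
(drop-while pos? readings)
;; => (0 -1 -2 10)
</code></pre>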
<p>In general, one can cut a sequence in twain by using <code>split-at</code>, and giving it
a particular index. There’s also <code>split-with</code>, which uses a function to decide
when to cut.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">split-at </span><span class="mi">4</span> <span class="p">(</span><span class="nb">range </span><span class="mi">10</span><span class="p">))</span>
<span class="p">[(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)]</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">split-with </span><span class="nv">number?</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="ss">:mark</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="ss">:mark</span> <span class="mi">7</span><span class="p">])</span>
<span class="p">[(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="ss">:mark</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="ss">:mark</span> <span class="mi">7</span><span class="p">)]</span>
</code></pre>
<p>Notice that because indexes start at zero, sequence functions tend to have
predictable numbers of elements. <code>(split-at 4)</code> yields <em>four</em> elements in the
first collection, and ensures the second collection <em>begins at index four</em>.
<code>(range 10)</code> has ten elements, corresponding to the first ten indices in a
sequence. <code>(range 3 5)</code> has two elements (since 5 - 3 is two). These choices simplify the
definition of recursive functions as well.</p>
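<p>We can check these counts directly. The sketch below uses <code>count</code>, <code>first</code>, and <code>second</code> to poke at the results, and also shows that <code>split-at</code> gives the same two pieces as <code>take</code> and <code>drop</code> with the same index.</p>
<pre><code>(count (range 10))
;; => 10

(count (range 3 5))
;; => 2

;; The first piece of (split-at 4 ...) has four elements...
(count (first (split-at 4 (range 10))))
;; => 4

;; ...and the second piece begins with the element at index four.
(first (second (split-at 4 (range 10))))
;; => 4

;; split-at is take and drop, side by side.
(= (split-at 4 (range 10))
   [(take 4 (range 10)) (drop 4 (range 10))])
;; => true
</code></pre>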
<p>We can select particular elements from a sequence by applying a function. To
find all positive numbers in a list, use <code>filter</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">filter pos? </span><span class="p">[</span><span class="mi">1</span> <span class="mi">5</span> <span class="mi">-4</span> <span class="mi">-7</span> <span class="mi">3</span> <span class="mi">0</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">5</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p><code>filter</code> looks at each element in turn, and includes it in the resulting
sequence <em>only</em> if <code>(f element)</code> returns a truthy value. Its complement is
<code>remove</code>, which only includes those elements where <code>(f element)</code> is <code>false</code> or <code>nil</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">remove string? </span><span class="p">[</span><span class="mi">1</span> <span class="s">"turing"</span> <span class="ss">:apple</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="ss">:apple</span><span class="p">)</span>
</code></pre>
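<p>Put another way, <code>remove</code> behaves like <code>filter</code> with the test flipped. The sketch below uses <code>complement</code>, a core function we haven't met yet, which takes a function and returns one that gives the opposite truth value. The name <code>things</code> is only for illustration.</p>
<pre><code>(def things [1 "turing" :apple])

(remove string? things)
;; => (1 :apple)

;; The same selection, written as a filter with an inverted test.
(filter (complement string?) things)
;; => (1 :apple)

;; Every element lands in exactly one of the two results.
(+ (count (filter string? things))
   (count (remove string? things)))
;; => 3
</code></pre>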
<p>Finally, one can group a sequence into chunks using <code>partition</code>,
<code>partition-all</code>, or <code>partition-by</code>. For instance, one might group alternating
values into pairs:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">partition</span> <span class="mi">2</span> <span class="p">[</span><span class="ss">:cats</span> <span class="mi">5</span> <span class="ss">:bats</span> <span class="mi">27</span> <span class="ss">:crocodiles</span> <span class="mi">0</span><span class="p">])</span>
<span class="p">((</span><span class="ss">:cats</span> <span class="mi">5</span><span class="p">)</span> <span class="p">(</span><span class="ss">:bats</span> <span class="mi">27</span><span class="p">)</span> <span class="p">(</span><span class="ss">:crocodiles</span> <span class="mi">0</span><span class="p">))</span>
</code></pre>
<p>Or separate a series of numbers into negative and positive runs:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">partition-by</span> <span class="nb">neg? </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span> <span class="mi">-1</span> <span class="mi">-2</span> <span class="mi">-3</span> <span class="mi">-2</span> <span class="mi">-1</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">])</span>
<span class="p">((</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">)</span> <span class="p">(</span><span class="mi">-1</span> <span class="mi">-2</span> <span class="mi">-3</span> <span class="mi">-2</span> <span class="mi">-1</span><span class="p">)</span> <span class="p">(</span><span class="mi">1</span> <span class="mi">2</span><span class="p">))</span>
</code></pre>
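<p><code>partition</code> and <code>partition-all</code> differ in what they do with leftovers. As a quick sketch: when the elements don't divide evenly into chunks, <code>partition</code> drops the incomplete final chunk, while <code>partition-all</code> keeps it.</p>
<pre><code>;; Five elements in chunks of two: the lone 5 is dropped.
(partition 2 [1 2 3 4 5])
;; => ((1 2) (3 4))

;; partition-all keeps the short final chunk.
(partition-all 2 [1 2 3 4 5])
;; => ((1 2) (3 4) (5))
</code></pre>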
<h2><a href="#collapsing-sequences" id="collapsing-sequences">Collapsing sequences</a></h2>
<p>After transforming a sequence, we often want to collapse it in some way, deriving
some smaller value. For instance, we might want the number of times each element appears in a sequence:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">frequencies</span> <span class="p">[</span><span class="ss">:meow</span> <span class="ss">:mrrrow</span> <span class="ss">:meow</span> <span class="ss">:meow</span><span class="p">])</span>
<span class="p">{</span><span class="ss">:meow</span> <span class="mi">3</span>, <span class="ss">:mrrrow</span> <span class="mi">1</span><span class="p">}</span>
</code></pre>
<p>Or to group elements by some function:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">pprint</span> <span class="p">(</span><span class="nf">group-by</span> <span class="ss">:first</span> <span class="p">[{</span><span class="ss">:first</span> <span class="s">"Li"</span> <span class="ss">:last</span> <span class="s">"Zhou"</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:first</span> <span class="s">"Sarah"</span> <span class="ss">:last</span> <span class="s">"Lee"</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:first</span> <span class="s">"Sarah"</span> <span class="ss">:last</span> <span class="s">"Dunn"</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:first</span> <span class="s">"Li"</span> <span class="ss">:last</span> <span class="s">"O'Toole"</span><span class="p">}]))</span>
<span class="p">{</span><span class="s">"Li"</span> <span class="p">[{</span><span class="ss">:last</span> <span class="s">"Zhou"</span>, <span class="ss">:first</span> <span class="s">"Li"</span><span class="p">}</span> <span class="p">{</span><span class="ss">:last</span> <span class="s">"O'Toole"</span>, <span class="ss">:first</span> <span class="s">"Li"</span><span class="p">}]</span>,
<span class="s">"Sarah"</span> <span class="p">[{</span><span class="ss">:last</span> <span class="s">"Lee"</span>, <span class="ss">:first</span> <span class="s">"Sarah"</span><span class="p">}</span> <span class="p">{</span><span class="ss">:last</span> <span class="s">"Dunn"</span>, <span class="ss">:first</span> <span class="s">"Sarah"</span><span class="p">}]}</span>
</code></pre>
<p>Here we’ve taken a sequence of people with first and last names, and used the
<code>:first</code> keyword (which can act as a function!) to look up those first names.
<code>group-by</code> used that function to produce a <em>map</em> of first names to vectors of
people–kind of like an index.</p>
<p>In general, we want to <em>combine</em> elements together in some way, using a
function. Where <code>map</code> treated each element independently, reducing a sequence
requires that we bring some information along. The most general way to collapse a sequence is <code>reduce</code>.</p>
<pre><code>user=> (doc reduce)
-------------------------
clojure.core/reduce
([f coll] [f val coll])
f should be a function of 2 arguments. If val is not supplied,
returns the result of applying f to the first 2 items in coll, then
applying f to that result and the 3rd item, etc. If coll contains no
items, f must accept no arguments as well, and reduce returns the
result of calling f with no arguments. If coll has only 1 item, it
is returned and f is not called. If val is supplied, returns the
result of applying f to val and the first item in coll, then
applying f to that result and the 2nd item, etc. If coll contains no
items, returns val and f is not called.
</code></pre>
<p>That’s a little complicated, so we’ll start small. We need a function, <code>f</code>,
which combines successive elements of the sequence. <code>(f state element)</code> will
return the state for the <em>next</em> invocation of <code>f</code>. As <code>f</code> moves along the
sequence, it carries some changing state with it. The final state is the return
value of <code>reduce</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">reduce + </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="mi">10</span>
</code></pre>
<p><code>reduce</code> begins by calling <code>(+ 1 2)</code>, which yields the state <code>3</code>. Then it calls
<code>(+ 3 3)</code>, which yields <code>6</code>. Then <code>(+ 6 4)</code>, which returns <code>10</code>. We’ve taken a
function over <em>two</em> elements, and used it to combine <em>all</em> the elements. Mathematically, we could write:</p>
<pre><code>1 + 2 + 3 + 4
3 + 3 + 4
6 + 4
10
</code></pre>
<p>So another way to look at <code>reduce</code> is like sticking a function <em>between</em> each
pair of elements. To see the reducing process in action, we can use
<code>reductions</code>, which returns a sequence of all the intermediate states.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">reductions</span> <span class="nb">+ </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">3</span> <span class="mi">6</span> <span class="mi">10</span><span class="p">)</span>
</code></pre>
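<p>Nothing here is specific to <code>+</code>. As a sketch, any function of two arguments can be stuck between the elements in the same way, and the last of the intermediate states is always the final result.</p>
<pre><code>;; 1 * 2 * 3 * 4
(reduce * [1 2 3 4])
;; => 24

;; Keep whichever of the two values is larger at each step.
(reduce max [5 2 9 1])
;; => 9

;; str glues the running string to the next element.
(reduce str ["y" "o" "o"])
;; => "yoo"

;; The final intermediate state is what reduce returns.
(reductions * [1 2 3 4])
;; => (1 2 6 24)
(= (last (reductions * [1 2 3 4]))
   (reduce * [1 2 3 4]))
;; => true
</code></pre>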
<p>Oftentimes we include a <em>default</em> state to start with. For instance, we could
start with an empty set, and add each element to it as we go along:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">reduce conj </span><span class="o">#</span><span class="p">{}</span> <span class="p">[</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:b</span> <span class="ss">:b</span> <span class="ss">:a</span> <span class="ss">:a</span><span class="p">])</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:b</span><span class="p">}</span>
</code></pre>
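<p>To watch a starting state at work, here is a sketch with numbers. The initial value shows up as the first intermediate state, and it is also what <code>reduce</code> returns when the collection is empty. Without an initial value, reducing an empty collection calls the function with no arguments, as the docstring above describes.</p>
<pre><code>;; Start counting from 100 instead of from the first element.
(reduce + 100 [1 2 3 4])
;; => 110

;; The initial state is the first thing reductions emits.
(reductions + 100 [1 2 3 4])
;; => (100 101 103 106 110)

;; With nothing to combine, the initial state comes straight back.
(reduce + 100 [])
;; => 100

;; No initial state and no elements: reduce calls (+), which is 0.
(reduce + [])
;; => 0
</code></pre>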
<p>Reducing elements into a collection has its own name: <code>into</code>. We can conj <code>[key value]</code> vectors into a map, for instance, or build up a list:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">into </span><span class="p">{}</span> <span class="p">[[</span><span class="ss">:a</span> <span class="mi">2</span><span class="p">]</span> <span class="p">[</span><span class="ss">:b</span> <span class="mi">3</span><span class="p">]])</span>
<span class="p">{</span><span class="ss">:a</span> <span class="mi">2</span>, <span class="ss">:b</span> <span class="mi">3</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">into </span><span class="p">(</span><span class="nf">list</span><span class="p">)</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">(</span><span class="mi">4</span> <span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">)</span>
</code></pre>
<p>Because elements added to a list appear at the <em>beginning</em>, not the end, this
expression reverses the sequence. Vectors <code>conj</code> onto the end, so to emit the
elements in order, using <code>reduce</code>, we might try:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">reduce conj </span><span class="p">[]</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">])</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">]</span>
</code></pre>
<p>Which brings up an interesting thought: this looks an awful lot like <code>map</code>. All that’s missing is some kind of transformation applied to each element.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">my-map</span> <span class="p">[</span><span class="nv">f</span> <span class="nv">coll</span><span class="p">]</span>
<span class="p">(</span><span class="nb">reduce </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">output</span> <span class="nv">element</span><span class="p">]</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">output</span> <span class="p">(</span><span class="nf">f</span> <span class="nv">element</span><span class="p">)))</span>
<span class="p">[]</span>
<span class="nv">coll</span><span class="p">))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">my-map</span> <span class="nb">inc </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">])</span>
<span class="p">[</span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">]</span>
</code></pre>
<p>Huh. <code>map</code> is just a special kind of <code>reduce</code>. What about, say, <code>take-while</code>?</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">my-take-while</span> <span class="p">[</span><span class="nv">f</span> <span class="nv">coll</span><span class="p">]</span>
<span class="p">(</span><span class="nb">reduce </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">out</span> <span class="nv">elem</span><span class="p">]</span>
<span class="p">(</span><span class="k">if </span><span class="p">(</span><span class="nf">f</span> <span class="nv">elem</span><span class="p">)</span>
<span class="p">(</span><span class="nb">conj </span><span class="nv">out</span> <span class="nv">elem</span><span class="p">)</span>
<span class="p">(</span><span class="nf">reduced</span> <span class="nv">out</span><span class="p">)))</span>
<span class="p">[]</span>
<span class="nv">coll</span><span class="p">))</span>
</code></pre>
<p>We’re using a special function here, <code>reduced</code>, to indicate that we’ve
completed our reduction <em>early</em> and can skip the rest of the sequence.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">my-take-while</span> <span class="nb">pos? </span><span class="p">[</span><span class="mi">2</span> <span class="mi">1</span> <span class="mi">0</span> <span class="mi">-1</span> <span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">])</span>
<span class="p">[</span><span class="mi">2</span> <span class="mi">1</span><span class="p">]</span>
</code></pre>
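<p>As a further sketch of early termination (the running-total function below is purely illustrative), <code>reduced</code> lets a reduction stop partway through even an infinite sequence:</p>
<pre><code>; Sum the integers 1, 2, 3, ... and stop as soon as the running total
; exceeds 100. Without reduced, this reduction would never finish.
(reduce (fn [total n]
          (if (> total 100)
            (reduced total)
            (+ total n)))
        0
        (iterate inc 1))
; => 105
</code></pre>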
<p><code>reduce</code> really is the uberfunction over sequences. Almost any
operation on a sequence can be expressed in terms of a reduce–though for
various reasons, many of the Clojure sequence functions are not written this
way. For instance, <code>take-while</code> is <em>actually</em> defined like so:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">source</span> <span class="nv">take-while</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">take-while</span>
<span class="s">"Returns a lazy sequence of successive items from coll while</span>
<span class="s"> (pred item) returns true. pred must be free of side-effects."</span>
<span class="p">{</span><span class="ss">:added</span> <span class="s">"1.0"</span>
<span class="ss">:static</span> <span class="nv">true</span><span class="p">}</span>
<span class="p">[</span><span class="nv">pred</span> <span class="nv">coll</span><span class="p">]</span>
<span class="p">(</span><span class="nf">lazy-seq</span>
<span class="p">(</span><span class="nb">when-let </span><span class="p">[</span><span class="nv">s</span> <span class="p">(</span><span class="nb">seq </span><span class="nv">coll</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">when </span><span class="p">(</span><span class="nf">pred</span> <span class="p">(</span><span class="nb">first </span><span class="nv">s</span><span class="p">))</span>
<span class="p">(</span><span class="nb">cons </span><span class="p">(</span><span class="nb">first </span><span class="nv">s</span><span class="p">)</span> <span class="p">(</span><span class="nb">take-while </span><span class="nv">pred</span> <span class="p">(</span><span class="nb">rest </span><span class="nv">s</span><span class="p">)))))))</span>
</code></pre>
<p>There’s a few new pieces here, but the structure is <em>essentially</em> the same as
our initial attempt at writing <code>map</code>. When the predicate matches the first
element, cons the first element onto <code>take-while</code>, applied to the rest of the
sequence. That <code>lazy-seq</code> construct allows Clojure to compute this sequence <em>as
required</em>, instead of right away. It defers execution to a later time.</p>
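<p>To see the same pattern in isolation, here is a minimal sketch of a lazy <code>map</code> over a single collection. The name <code>lazy-map</code> is hypothetical; Clojure’s real <code>map</code> handles more cases than this.</p>
<pre><code>; Same shape as take-while above: wrap the body in lazy-seq, and cons
; the first result onto a recursive call over the rest.
(defn lazy-map [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (f (first s))
            (lazy-map f (rest s))))))

; Safe on an infinite input, because nothing runs until take asks.
(take 3 (lazy-map inc (iterate inc 0)))
; => (1 2 3)
</code></pre>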
<p>Most of Clojure’s sequence functions are lazy. They don’t do anything until
needed. For instance, we can increment every number from zero to infinity:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">infseq</span> <span class="p">(</span><span class="nb">map inc </span><span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">)))</span>
<span class="o">#</span><span class="ss">'user/infseq</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">realized?</span> <span class="nv">infseq</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<p>That function returned <em>immediately</em>. Because it hasn’t done any work yet, we
say the sequence is <em>unrealized</em>. It doesn’t increment any numbers at all until
we ask for them:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="nv">infseq</span><span class="p">)</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span> <span class="mi">10</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">realized?</span> <span class="nv">infseq</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p>Lazy sequences also <em>remember</em> their contents, once evaluated, for faster
access.</p>
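<p>One way to observe that remembering, as a rough sketch: count how often the mapping function runs. This borrows <code>atom</code>, <code>swap!</code>, and <code>doall</code>, none of which this chapter covers; they appear here only to watch when work happens.</p>
<pre><code>; calls counts how many times the mapping function actually runs.
(def calls (atom 0))
(def squares (map (fn [x] (swap! calls inc) (* x x))
                  (iterate inc 0)))

(doall (take 3 squares))        ; forces the first three elements
(def calls-after-first @calls)

(doall (take 3 squares))        ; reads the remembered elements
(= calls-after-first @calls)    ; no additional calls were made
; => true
</code></pre>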
<h2><a href="#putting-it-all-together" id="putting-it-all-together">Putting it all together</a></h2>
<p>We’ve seen how recursion generalizes a function over <em>one</em> thing into a
function over <em>many</em> things, and discovered a rich landscape of recursive
functions over sequences. Now let’s use our knowledge of sequences to solve a
more complex problem: find the sum of the first 1000 products of consecutive
pairs of odd integers.</p>
<p>First, we’ll need the integers. We can start with 0, and work our way up to
infinity. To save time printing an infinite number of integers, we’ll start
with just the first 10.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">))</span>
<span class="p">(</span><span class="mi">0</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span> <span class="mi">6</span> <span class="mi">7</span> <span class="mi">8</span> <span class="mi">9</span><span class="p">)</span>
</code></pre>
<p>Now we need to find only the ones which are odd. Remember, <code>filter</code> pares down
a sequence to only those elements which pass a test.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">10</span> <span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">)))</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">3</span> <span class="mi">5</span> <span class="mi">7</span> <span class="mi">9</span> <span class="mi">11</span> <span class="mi">13</span> <span class="mi">15</span> <span class="mi">17</span> <span class="mi">19</span><span class="p">)</span>
</code></pre>
<p>For consecutive pairs, we want to take <code>[1 3 5 7 ...]</code> and find a sequence like <code>([1 3] [3 5] [5 7] ...)</code>. That sounds like a job for <code>partition</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span> <span class="p">(</span><span class="nf">partition</span> <span class="mi">2</span> <span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">))))</span>
<span class="p">((</span><span class="mi">1</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="mi">5</span> <span class="mi">7</span><span class="p">)</span> <span class="p">(</span><span class="mi">9</span> <span class="mi">11</span><span class="p">))</span>
</code></pre>
<p>Not quite right–this gave us non-overlapping pairs, but we wanted overlapping
ones too. A quick check of <code>(doc partition)</code> reveals the <code>step</code> parameter:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span> <span class="p">(</span><span class="nf">partition</span> <span class="mi">2</span> <span class="mi">1</span> <span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">))))</span>
<span class="p">((</span><span class="mi">1</span> <span class="mi">3</span><span class="p">)</span> <span class="p">(</span><span class="mi">3</span> <span class="mi">5</span><span class="p">)</span> <span class="p">(</span><span class="mi">5</span> <span class="mi">7</span><span class="p">))</span>
</code></pre>
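<p>A few small, finite inputs make the effect of the step argument easier to see. The vectors here are made up for illustration:</p>
<pre><code>; (partition n step coll): groups of n elements, starting a new group
; every step elements. Incomplete trailing groups are dropped.
(partition 2 2 [1 3 5 7 9 11])   ; step 2: the same as (partition 2 ...)
; => ((1 3) (5 7) (9 11))
(partition 2 1 [1 3 5 7])        ; step 1: overlapping pairs
; => ((1 3) (3 5) (5 7))
(partition 3 1 [1 3 5 7 9])      ; step 1: overlapping triples
; => ((1 3 5) (3 5 7) (5 7 9))
</code></pre>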
<p>Now we need to find the product for each pair. Given a pair, multiply the two
pieces together… yes, that sounds like <code>map</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">take </span><span class="mi">3</span> <span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">pair</span><span class="p">]</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="nb">first </span><span class="nv">pair</span><span class="p">)</span> <span class="p">(</span><span class="nb">second </span><span class="nv">pair</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">partition</span> <span class="mi">2</span> <span class="mi">1</span> <span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span> <span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">)))))</span>
<span class="p">(</span><span class="mi">3</span> <span class="mi">15</span> <span class="mi">35</span><span class="p">)</span>
</code></pre>
<p>Getting a bit unwieldy, isn’t it? Only one final step: sum all those products.
We’ll adjust the <code>take</code> to include the first 1000, not the first 3, elements.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">reduce </span><span class="nv">+</span>
<span class="p">(</span><span class="nb">take </span><span class="mi">1000</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">pair</span><span class="p">]</span> <span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="nb">first </span><span class="nv">pair</span><span class="p">)</span> <span class="p">(</span><span class="nb">second </span><span class="nv">pair</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">partition</span> <span class="mi">2</span> <span class="mi">1</span>
<span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span>
<span class="p">(</span><span class="nb">iterate inc </span><span class="mi">0</span><span class="p">)))))</span>
<span class="mi">1335333000</span>
</code></pre>
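<p>We can sanity-check that number with a little algebra. The k-th pair is (2k-1, 2k+1), so its product is 4k^2 - 1, and the sum of the first n products is 4 * n(n+1)(2n+1)/6 - n. A sketch of that closed form, using a hypothetical helper named <code>sum-of-products</code>:</p>
<pre><code>; Closed form for the sum of (2k-1)(2k+1) = 4k^2 - 1, for k = 1..n.
(defn sum-of-products [n]
  (- (* 4 (/ (* n (+ n 1) (+ (* 2 n) 1)) 6))
     n))

(sum-of-products 3)      ; 3 + 15 + 35
; => 53
(sum-of-products 1000)
; => 1335333000
</code></pre>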
<p>The sum of the first thousand products of consecutive pairs of the odd integers
starting at 0. See how each part leads to the next? This expression looks a lot
like the way we phrased the problem in English–but both English and Lisp
expressions are sort of backwards, in a way. The part that <em>happens first</em>
appears <em>deepest</em>, <em>last</em>, in the expression. In a chain of reasoning like
this, it’d be nicer to write it in order.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">->></span> <span class="mi">0</span>
<span class="p">(</span><span class="nb">iterate </span><span class="nv">inc</span><span class="p">)</span>
<span class="p">(</span><span class="nb">filter </span><span class="nv">odd?</span><span class="p">)</span>
<span class="p">(</span><span class="nf">partition</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">map </span><span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">pair</span><span class="p">]</span>
<span class="p">(</span><span class="nb">* </span><span class="p">(</span><span class="nb">first </span><span class="nv">pair</span><span class="p">)</span> <span class="p">(</span><span class="nb">second </span><span class="nv">pair</span><span class="p">))))</span>
<span class="p">(</span><span class="nb">take </span><span class="mi">1000</span><span class="p">)</span>
<span class="p">(</span><span class="nb">reduce </span><span class="nv">+</span><span class="p">))</span>
<span class="mi">1335333000</span>
</code></pre>
<p>Much easier to read: now everything flows in order, from top to bottom, and
we’ve flattened out the deeply nested expressions into a single level. This
resembles the method chaining common in object-oriented languages: a chain of
function invocations, each acting on the previous value.</p>
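<p>On a smaller scale, here is the same computation written both ways. This is only a side-by-side comparison of the two spellings, not an explanation of how the second one works:</p>
<pre><code>; Nested: the innermost form runs first.
(reduce + (take 3 (map inc (iterate inc 0))))
; => 6

; Threaded: each step reads in the order it happens.
(->> 0
     (iterate inc)
     (map inc)
     (take 3)
     (reduce +))
; => 6
</code></pre>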
<p>But how is this possible? Which expression gets evaluated first? <code>(take 1000)</code>
isn’t even a valid call–where’s its second argument? How are <em>any</em> of these
forms evaluated?</p>
<p>What kind of arcane function <em>is</em> <code>->></code>?</p>
<p>All these mysteries, and more, in <a href="/posts/305-clojure-from-the-ground-up-macros">Chapter 5: Macros</a>.</p>
<h2><a href="#problems" id="problems">Problems</a></h2>
<ol>
<li>Write a function to find out if a string is a palindrome–that is, if it
looks the same forwards and backwards.</li>
<li>Find the number of ’c’s in “abracadabra”.</li>
<li>Write your own version of <code>filter</code>.</li>
<li>Find the first 100 prime numbers: 2, 3, 5, 7, 11, 13, 17, ….</li>
</ol>
https://aphyr.com/posts/303-clojure-from-the-ground-up-functions | Clojure from the ground up: functions | 2013-11-03T00:41:53-05:00 | Aphyr | https://aphyr.com/
<p>We <a href="/posts/302-clojure-from-the-ground-up-basic-types">left off last chapter</a> with a question: what <em>are</em> verbs, anyway? When you evaluate <code>(type :mary-poppins)</code>, what really happens?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="ss">:mary-poppins</span><span class="p">)</span>
<span class="nv">clojure.lang.Keyword</span>
</code></pre>
<p>To understand how <code>type</code> works, we’ll need several new ideas. First, we’ll expand on the notion of symbols as references to other values. Then we’ll learn about functions: Clojure’s verbs. Finally, we’ll use the Var system to explore and change the definitions of those functions.</p>
<h2><a href="#let-bindings" id="let-bindings">Let bindings</a></h2>
<p>We know that symbols are names for things, and that when evaluated, Clojure replaces those symbols with their corresponding values. <code>+</code>, for instance, is a symbol which points to the verb <code>#<core$_PLUS_ clojure.core$_PLUS_@12992c></code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">+</span>
<span class="o">#</span><span class="nv"><core$_PLUS_</span> <span class="nv">clojure.core$_PLUS_</span><span class="o">@</span><span class="mi">12992</span><span class="nv">c></span>
</code></pre>
<p>When you try to use a symbol which has no defined meaning, Clojure refuses:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">cats</span>
<span class="nv">CompilerException</span> <span class="nv">java.lang.RuntimeException</span><span class="err">:</span> <span class="nv">Unable</span> <span class="nv">to</span> <span class="nb">resolve </span><span class="nv">symbol</span><span class="err">:</span> <span class="nv">cats</span> <span class="nv">in</span> <span class="nv">this</span> <span class="nv">context</span>, <span class="nv">compiling</span><span class="err">:</span><span class="p">(</span><span class="nf">NO_SOURCE_PATH</span><span class="ss">:0:0</span><span class="p">)</span>
</code></pre>
<p>But we can define a meaning for a symbol within a specific expression, using <code>let</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">cats</span> <span class="mi">5</span><span class="p">]</span> <span class="p">(</span><span class="nb">str </span><span class="s">"I have "</span> <span class="nv">cats</span> <span class="s">" cats."</span><span class="p">))</span>
<span class="s">"I have 5 cats."</span>
</code></pre>
<p>The <code>let</code> expression first takes a vector of <em>bindings</em>: alternating symbols and values that those symbols are <em>bound</em> to, within the remainder of the expression. “Let the symbol <code>cats</code> be 5, and construct a string composed of <code>"I have "</code>, <code>cats</code>, and <code>" cats."</code>.”</p>
<p>Let bindings apply only within the let expression itself. They also override any existing definitions for symbols at that point in the program. For instance, we can redefine addition to mean subtraction, for the duration of a <code>let</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nb">+ </span><span class="nv">-</span><span class="p">]</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="mi">-1</span>
</code></pre>
<p>But that definition doesn’t apply outside the let:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">+ </span><span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">5</span>
</code></pre>
<p>We can also provide <em>multiple</em> bindings. Since Clojure doesn’t care about spacing, alignment, or newlines, I’ll write this on multiple lines for clarity.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">person</span> <span class="s">"joseph"</span>
<span class="nv">num-cats</span> <span class="mi">186</span><span class="p">]</span>
<span class="p">(</span><span class="nb">str </span><span class="nv">person</span> <span class="s">" has "</span> <span class="nv">num-cats</span> <span class="s">" cats!"</span><span class="p">))</span>
<span class="s">"joseph has 186 cats!"</span>
</code></pre>
<p>When multiple bindings are given, they are evaluated in order. Later bindings can use previous bindings.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">cats</span> <span class="mi">3</span>
<span class="nv">legs</span> <span class="p">(</span><span class="nb">* </span><span class="mi">4</span> <span class="nv">cats</span><span class="p">)]</span>
<span class="p">(</span><span class="nb">str </span><span class="nv">legs</span> <span class="s">" legs all together"</span><span class="p">))</span>
<span class="s">"12 legs all together"</span>
</code></pre>
<p>So fundamentally, <code>let</code> defines the meaning of symbols within an expression. When Clojure evaluates a <code>let</code>, it replaces all occurrences of those symbols in the rest of the <code>let</code> expression with their corresponding values, then evaluates the rest of the expression.</p>
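<p>A small sketch of that scoping rule, with made-up values: an inner <code>let</code> can rebind a symbol, but only inside its own body.</p>
<pre><code>; The inner let rebinds x only inside its own body.
(let [x 1]
  (list (let [x 2] (+ x 10))   ; inner x is 2
        (+ x 10)))             ; outer x is still 1
; => (12 11)
</code></pre>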
<h2><a href="#functions" id="functions">Functions</a></h2>
<p>We saw in <a href="http://aphyr.com/posts/301-clojure-from-the-ground-up-first-principles">chapter one</a> that Clojure evaluates lists by <em>substituting</em> some other value in their place:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="mi">1</span><span class="p">)</span>
<span class="mi">2</span>
</code></pre>
<p><code>inc</code> takes any number, and is replaced by that number plus one. That sounds an awful lot like a let:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span> <span class="mi">1</span><span class="p">]</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">x</span> <span class="mi">1</span><span class="p">))</span>
<span class="mi">2</span>
</code></pre>
<p>If we bound <code>x</code> to <code>5</code> instead of <code>1</code>, this expression would evaluate to <code>6</code>. We can think about <code>inc</code> like a let expression, but without particular values provided for the symbols.</p>
<pre><code><span></span><span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">x</span> <span class="mi">1</span><span class="p">))</span>
</code></pre>
<p>We can’t actually evaluate this program, because there’s no value for <code>x</code> yet. It could be 1, or 4, or 1453. We say that <code>x</code> is <em>unbound</em>, because it has no binding to a particular value. This is the nature of the <em>function</em>: an expression with unbound symbols.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">x</span> <span class="mi">1</span><span class="p">))</span>
<span class="o">#</span><span class="nv"><user$eval293$fn__294</span> <span class="nv">user$eval293$fn__294</span><span class="o">@</span><span class="mi">663</span><span class="nv">fc37></span>
</code></pre>
<p>Does the name of that function remind you of anything?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">inc</span>
<span class="o">#</span><span class="nv"><core$inc</span> <span class="nv">clojure.core$inc</span><span class="o">@</span><span class="mi">16</span><span class="nv">bc0b3c></span>
</code></pre>
<p>Almost all verbs in Clojure are functions. Functions represent unrealized computation: expressions which are not yet evaluated, or incomplete. This particular function works just like <code>inc</code>: it’s an expression which has a single unbound symbol, <code>x</code>. When we <em>invoke</em> the function with a particular value, the expressions in the function are evaluated with <code>x</code> bound to that value.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="mi">2</span><span class="p">)</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">((</span><span class="k">fn </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">+ </span><span class="nv">x</span> <span class="mi">1</span><span class="p">))</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
<p>We say that <code>x</code> is this function’s <em>argument</em>, or <em>parameter</em>. When Clojure evaluates <code>(inc 2)</code>, we say that <code>inc</code> is <em>called</em> with <code>2</code>, or that <code>2</code> is <em>passed</em> to <code>inc</code>. The result of that <em>function invocation</em> is the function’s <em>return value</em>. We say that <code>(inc 2)</code> <em>returns</em> <code>3</code>.</p>
<p>Fundamentally, functions describe the relationship between arguments and return values: given <code>1</code>, return <code>2</code>. Given <code>2</code>, return <code>3</code>, and so on. Let bindings describe a similar relationship, but with a specific set of values for those arguments. <code>let</code> is evaluated immediately, whereas <code>fn</code> is evaluated <em>later</em>, when bindings are provided.</p>
<p>There’s a shorthand for writing functions, too: <code>#(+ % 1)</code> is equivalent to <code>(fn [x] (+ x 1))</code>. <code>%</code> takes the place of the first argument to the function. You’ll sometimes see <code>%1</code>, <code>%2</code>, etc. used for the first argument, second argument, and so on.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">burrito</span> <span class="o">#</span><span class="p">(</span><span class="nb">list </span><span class="s">"beans"</span> <span class="nv">%</span> <span class="s">"cheese"</span><span class="p">)]</span>
<span class="p">(</span><span class="nf">burrito</span> <span class="s">"carnitas"</span><span class="p">))</span>
<span class="p">(</span><span class="s">"beans"</span> <span class="s">"carnitas"</span> <span class="s">"cheese"</span><span class="p">)</span>
</code></pre>
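<p>For the numbered forms, a brief sketch with arbitrary arguments:</p>
<pre><code>; %1 and %2 stand for the first and second arguments.
(#(- %1 %2) 10 3)
; => 7

; The equivalent fn form, with named arguments.
((fn [a b] (- a b)) 10 3)
; => 7
</code></pre>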
<p>Since functions exist to <em>defer</em> evaluation, there’s no sense in creating and invoking them in the same expression as we’ve done here. What we want is to give <em>names</em> to our functions, so they can be recombined in different ways.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">let </span><span class="p">[</span><span class="nv">twice</span> <span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="nv">x</span><span class="p">))]</span>
<span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nf">twice</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nf">twice</span> <span class="mi">3</span><span class="p">)))</span>
<span class="mi">8</span>
</code></pre>
<p>Compare that expression to an equivalent, expanded form:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">+ </span><span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="mi">8</span>
</code></pre>
<p>The name <code>twice</code> is gone, and in its place is the same sort of computation–<code>(* 2 something)</code>–written twice. While we <em>could</em> represent our programs as a single massive expression, it’d be impossible to reason about. Instead, we use functions to compact redundant expressions, by isolating common patterns of computation. Symbols help us re-use those functions (and other values) in more than one place. By giving the symbols meaningful names, we make it easier to reason about the structure of the program as a whole, breaking it up into smaller, understandable parts.</p>
<p>This is the core pursuit of software engineering: organizing expressions. Almost every programming language is in search of the right tools to break apart, name, and recombine expressions to solve large problems. In Clojure we’ll see one particular set of tools for composing programs, but the underlying ideas will transfer to many other languages.</p>
<h2><a href="#vars" id="vars">Vars</a></h2>
<p>We’ve used <code>let</code> to define a symbol within an expression, but what about the default meanings of <code>+</code>, <code>conj</code>, and <code>type</code>? Are they also <code>let</code> bindings? Is the whole universe one giant <code>let</code>?</p>
<p>Well, not exactly. That’s one way to think about default bindings, but it’s brittle. We’d need to wrap our whole program in a new <code>let</code> expression every time we wanted to change the meaning of a symbol. And moreover, once a <code>let</code> is defined, there’s no way to change it. If we want to redefine symbols for <em>everyone</em>–even code that we didn’t write–we need a new construct: a <em>mutable</em> variable.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">cats</span> <span class="mi">5</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/cats</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="o">#</span><span class="ss">'user/cats</span><span class="p">)</span>
<span class="nv">clojure.lang.Var</span>
</code></pre>
<p><code>def</code> <em>defines</em> a type of value we haven’t seen before: a var. Vars, like symbols, are references to other values. When evaluated, a symbol pointing to a var is replaced by the var’s corresponding value:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">user/cats</span>
<span class="mi">5</span>
</code></pre>
<p><code>def</code> also <em>binds</em> the symbol <code>cats</code> (and its globally qualified equivalent <code>user/cats</code>) to that var.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">user/cats</span>
<span class="mi">5</span>
<span class="nv">user=></span> <span class="nv">cats</span>
<span class="mi">5</span>
</code></pre>
<p>When we said in chapter one that <code>inc</code>, <code>list</code>, and friends were symbols that pointed to functions, that wasn’t the whole story. The symbol <code>inc</code> points to the var <code>#'inc</code>, which in turn points to the function <code>#<core$inc clojure.core$inc@16bc0b3c></code>. We can see the intermediate var with <code>resolve</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="ss">'inc</span>
<span class="nb">inc </span><span class="c1">; the symbol</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">resolve </span><span class="ss">'inc</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'clojure.core/inc</span> <span class="c1">; the var</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">eval </span><span class="ss">'inc</span><span class="p">)</span>
<span class="o">#</span><span class="nv"><core$inc</span> <span class="nv">clojure.core$inc</span><span class="o">@</span><span class="mi">16</span><span class="nv">bc0b3c></span> <span class="c1">; the value</span>
</code></pre>
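<p>We can also walk those two steps by hand. This sketch uses <code>deref</code>, which hasn’t been introduced yet; it follows a var to its current value.</p>
<pre><code>; resolve turns the symbol into its var; deref follows the var to the
; value it currently points to.
(deref (resolve 'inc))            ; the function itself
((deref (resolve 'inc)) 1)        ; which we can call as usual
; => 2
(= inc (deref (resolve 'inc)))    ; the same function the symbol gives us
; => true
</code></pre>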
<p>Why two layers of indirection? Because unlike the symbol, we can <em>change</em> the meaning of a Var for everyone, globally, at any time.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">astronauts</span> <span class="p">[])</span>
<span class="o">#</span><span class="ss">'user/astronauts</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">count </span><span class="nv">astronauts</span><span class="p">)</span>
<span class="mi">0</span>
<span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">astronauts</span> <span class="p">[</span><span class="s">"Sally Ride"</span> <span class="s">"Guy Bluford"</span><span class="p">])</span>
<span class="o">#</span><span class="ss">'user/astronauts</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">count </span><span class="nv">astronauts</span><span class="p">)</span>
<span class="mi">2</span>
</code></pre>
<p>Notice that <code>astronauts</code> had <em>two</em> distinct meanings, depending on <em>when</em> we evaluated it. After the first <code>def</code>, astronauts was an empty vector. After the second <code>def</code>, it had two entries.</p>
<p>If this seems dangerous, you’re a smart cookie. Redefining names in this way changes the meaning of expressions <em>everywhere</em> in a program, without warning. Expressions which relied on the value of a Var could suddenly take on new, possibly incorrect, meanings. It’s a powerful tool for experimenting at the REPL, and for updating a running program, but it can have unexpected consequences. Good Clojurists use <code>def</code> to set up a program initially, and only change those definitions with careful thought.</p>
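<p>Here’s a sketch of that effect, using the hypothetical names <code>greeting</code> and <code>greet</code> and assuming Clojure’s default compiler settings. The function is never redefined, yet its result changes once the var it depends on does:</p>
<pre><code>(def greeting "hello")
(def greet (fn [] (str greeting ", world")))
(greet)
; => "hello, world"

(def greeting "goodbye")   ; greet itself is untouched
(greet)
; => "goodbye, world"
</code></pre>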
<p>Totally redefining a Var isn’t the only option. There are safer, controlled ways to change the meaning of a Var within a particular part of a program, which we’ll explore later.</p>
<h2><a href="#defining-functions" id="defining-functions">Defining functions</a></h2>
<p>Armed with <code>def</code>, we’re ready to create our own named functions in Clojure.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="k">def </span><span class="nv">half</span> <span class="p">(</span><span class="k">fn </span><span class="p">[</span><span class="nv">number</span><span class="p">]</span> <span class="p">(</span><span class="nb">/ </span><span class="nv">number</span> <span class="mi">2</span><span class="p">)))</span>
<span class="o">#</span><span class="ss">'user/half</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">half</span> <span class="mi">6</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
<p>Creating a function and binding it to a var is so common that it has its own form: <code>defn</code>, short for <code>def</code> <code>fn</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">half</span> <span class="p">[</span><span class="nv">number</span><span class="p">]</span> <span class="p">(</span><span class="nb">/ </span><span class="nv">number</span> <span class="mi">2</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/half</span>
</code></pre>
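<p>One way to check that claim, assuming you are willing to peek ahead at <code>macroexpand</code> (a core function this chapter has not introduced), is to ask Clojure what a quoted <code>defn</code> form turns into:</p>
<pre><code>;; The quote stops Clojure from evaluating the form, so we can
;; inspect how defn rewrites it.
(def expanded (macroexpand '(defn half [number] (/ number 2))))

(first expanded)  ; returns def
(second expanded) ; returns half
</code></pre>
<p>The remainder of the expansion is an <code>fn</code> form wrapping the argument vector and body.</p>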
<p>Functions don’t have to take an argument. We’ve seen functions which take zero arguments, like <code>(+)</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">half</span> <span class="p">[]</span> <span class="mi">1</span><span class="nv">/2</span><span class="p">)</span>
<span class="o">#</span><span class="ss">'user/half</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">half</span><span class="p">)</span>
<span class="mi">1</span><span class="nv">/2</span>
</code></pre>
<p>But if we try to use our earlier form with one argument, Clojure complains that the <em>arity</em>–the number of arguments to the function–is incorrect.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">half</span> <span class="mi">10</span><span class="p">)</span>
<span class="nv">ArityException</span> <span class="nv">Wrong</span> <span class="nv">number</span> <span class="nv">of</span> <span class="nv">args</span> <span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="nv">passed</span> <span class="nv">to</span><span class="err">:</span> <span class="nv">user$half</span> <span class="nv">clojure.lang.AFn.throwArity</span> <span class="p">(</span><span class="nf">AFn.java</span><span class="ss">:437</span><span class="p">)</span>
</code></pre>
<p>To handle <em>multiple</em> arities, functions have an alternate form. Instead of an argument vector and a body, one provides a series of lists, each of which starts with an argument vector, followed by the body.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">half</span>
<span class="p">([]</span> <span class="mi">1</span><span class="nv">/2</span><span class="p">)</span>
<span class="p">([</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">/ </span><span class="nv">x</span> <span class="mi">2</span><span class="p">)))</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">half</span><span class="p">)</span>
<span class="mi">1</span><span class="nv">/2</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">half</span> <span class="mi">10</span><span class="p">)</span>
<span class="mi">5</span>
</code></pre>
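<p>A common use of multiple arities is supplying a default. The sketch below uses a hypothetical <code>greet</code> function (not from the original text) whose zero-argument form simply calls the one-argument form:</p>
<pre><code>(defn greet
  ;; No arguments: fall back to a default name.
  ([] (greet "world"))
  ;; One argument: build the greeting.
  ([who] (str "Hello, " who "!")))

(greet)              ; returns "Hello, world!"
(greet "Sally Ride") ; returns "Hello, Sally Ride!"
</code></pre>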
<p>Multiple arguments work just like you expect. Just specify an argument vector of two, or three, or however many arguments the function takes.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">add</span>
<span class="p">[</span><span class="nv">x</span> <span class="nv">y</span><span class="p">]</span>
<span class="p">(</span><span class="nb">+ </span><span class="nv">x</span> <span class="nv">y</span><span class="p">))</span>
<span class="o">#</span><span class="ss">'user/add</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">add</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
<p>Some functions can take <em>any</em> number of arguments. For that, Clojure provides <code>&</code>, which slurps up all remaining arguments as a list:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="kd">defn </span><span class="nv">vargs</span>
<span class="p">[</span><span class="nv">x</span> <span class="nv">y</span> <span class="o">&</span> <span class="nv">more-args</span><span class="p">]</span>
<span class="p">{</span><span class="ss">:x</span> <span class="nv">x</span>
<span class="ss">:y</span> <span class="nv">y</span>
<span class="ss">:more</span> <span class="nv">more-args</span><span class="p">})</span>
<span class="o">#</span><span class="ss">'user/vargs</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">vargs</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">ArityException</span> <span class="nv">Wrong</span> <span class="nv">number</span> <span class="nv">of</span> <span class="nv">args</span> <span class="p">(</span><span class="mi">1</span><span class="p">)</span> <span class="nv">passed</span> <span class="nv">to</span><span class="err">:</span> <span class="nv">user$vargs</span> <span class="nv">clojure.lang.AFn.throwArity</span> <span class="p">(</span><span class="nf">AFn.java</span><span class="ss">:437</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">vargs</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span>, <span class="ss">:y</span> <span class="mi">2</span>, <span class="ss">:more</span> <span class="nv">nil</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">vargs</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:x</span> <span class="mi">1</span>, <span class="ss">:y</span> <span class="mi">2</span>, <span class="ss">:more</span> <span class="p">(</span><span class="mi">3</span> <span class="mi">4</span> <span class="mi">5</span><span class="p">)}</span>
</code></pre>
<p>Note that <code>x</code> and <code>y</code> are mandatory, though there don’t have to be any remaining arguments.</p>
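<p>Since the rest arguments arrive as an ordinary sequence, a function can hand them to other sequence functions. A small sketch, with <code>count-args</code> and <code>total</code> as made-up names and <code>apply</code> borrowed from a later chapter:</p>
<pre><code>;; No mandatory arguments at all: everything lands in args.
(defn count-args [& args]
  (count args))

(count-args)          ; returns 0
(count-args :a :b :c) ; returns 3

;; One mandatory number, plus any number of extras.
(defn total [x & more]
  (apply + x more))

(total 1)       ; returns 1
(total 1 2 3 4) ; returns 10
</code></pre>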
<p>To keep track of what arguments a function takes, why the function exists, and what it does, we usually include a <em>docstring</em>. Docstrings fill in the missing context around functions, explaining their assumptions and purpose to the world.</p>
<pre><code><span></span><span class="p">(</span><span class="kd">defn </span><span class="nv">launch</span>
<span class="s">"Launches a spacecraft into the given orbit by initiating a</span>
<span class="s"> controlled on-axis burn. Does not automatically stage, but</span>
<span class="s"> does vector thrust, if the craft supports it."</span>
<span class="p">[</span><span class="nv">craft</span> <span class="nv">target-orbit</span><span class="p">]</span>
<span class="s">"OK, we don't know how to control spacecraft yet."</span><span class="p">)</span>
</code></pre>
<p>Docstrings are used to automatically generate documentation for Clojure programs, but you can also access them from the REPL.</p>
<pre><code>user=> (doc launch)
-------------------------
user/launch
([craft target-orbit])
Launches a spacecraft into the given orbit by initiating a
controlled on-axis burn. Does not automatically stage, but
does vector thrust, if the craft supports it.
nil
</code></pre>
<p><code>doc</code> tells us the full name of the function, the arguments it accepts, and its docstring. This information comes from the <code>#'launch</code> var’s <em>metadata</em>, and is saved there by <code>defn</code>. We can inspect metadata directly with the <code>meta</code> function:</p>
<pre><code><span></span><span class="p">(</span><span class="nb">meta </span><span class="o">#</span><span class="ss">'launch</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:arglists</span> <span class="p">([</span><span class="nv">craft</span> <span class="nv">target-orbit</span><span class="p">])</span>, <span class="ss">:ns</span> <span class="o">#</span><span class="nv"><Namespace</span> <span class="nv">user></span>, <span class="ss">:name</span> <span class="nv">launch</span>, <span class="ss">:column</span> <span class="mi">1</span>, <span class="ss">:doc</span> <span class="s">"Launches a spacecraft into the given orbit."</span>, <span class="ss">:line</span> <span class="mi">1</span>, <span class="ss">:file</span> <span class="s">"NO_SOURCE_PATH"</span><span class="p">}</span>
</code></pre>
<p>There’s some other juicy information in there, like the file the function was defined in and which line and column it started at, but that’s not particularly useful since we’re in the REPL, not a file. However, this <em>does</em> hint at a way to answer our motivating question: how does the <code>type</code> function work?</p>
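<p>Before digging in, note that Var metadata is an ordinary map, so individual entries can be looked up by keyword. The sketch below uses hypothetical <code>dock</code> and <code>undock</code> functions, and also shows a pitfall worth knowing: a string placed <em>after</em> the argument vector is part of the body, not a docstring.</p>
<pre><code>(defn dock
  "Docks a craft with the given station."
  [craft station]
  :docked)

(:doc (meta #'dock))      ; returns "Docks a craft with the given station."
(:arglists (meta #'dock)) ; returns ([craft station])

;; Here the string comes after the arguments, so defn treats it
;; as an ordinary expression in the body and records no docstring.
(defn undock
  [craft]
  "Undocks a craft."
  :undocked)

(:doc (meta #'undock)) ; returns nil
</code></pre>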
<h2><a href="#how-does-type-work" id="how-does-type-work">How does type work?</a></h2>
<p>We know that <code>type</code> returns the type of an object:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">java.lang.Long</span>
</code></pre>
<p>And that <code>type</code>, like all functions, is a kind of object with its own unique type:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">type</span>
<span class="o">#</span><span class="nv"><core$type</span> <span class="nv">clojure.core$type</span><span class="o">@</span><span class="mi">39</span><span class="nv">bda9b9></span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="nv">type</span><span class="p">)</span>
<span class="nv">clojure.core$type</span>
</code></pre>
<p>This tells us that <code>type</code> is a particular <em>instance</em>, identified by the hash code <code>39bda9b9</code>, of the type <code>clojure.core$type</code>. <code>clojure.core</code> is a namespace which defines the fundamentals of the Clojure language, and <code>$type</code> tells us that it’s named <code>type</code> in that namespace. None of this is particularly helpful, though. Maybe we can find out more about <code>clojure.core$type</code> by asking what its <em>supertypes</em> are:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">supers</span> <span class="p">(</span><span class="nf">type</span> <span class="nv">type</span><span class="p">))</span>
<span class="o">#</span><span class="p">{</span><span class="nv">clojure.lang.AFunction</span> <span class="nv">clojure.lang.IMeta</span> <span class="nv">java.util.concurrent.Callable</span> <span class="nv">clojure.lang.Fn</span> <span class="nv">clojure.lang.AFn</span> <span class="nv">java.util.Comparator</span> <span class="nv">java.lang.Object</span> <span class="nv">clojure.lang.RestFn</span> <span class="nv">clojure.lang.IObj</span> <span class="nv">java.lang.Runnable</span> <span class="nv">java.io.Serializable</span> <span class="nv">clojure.lang.IFn</span><span class="p">}</span>
</code></pre>
<p>This is the set of all the types that include <code>type</code>. We say that <code>type</code> is an <em>instance</em> of <code>clojure.lang.AFunction</code>, or that it <em>implements</em> or <em>extends</em> <code>java.util.concurrent.Callable</code>, and so on. Since it’s a member of <code>clojure.lang.IMeta</code> it has metadata, and since it’s a member of <code>clojure.lang.AFn</code>, it’s a function. Just to double check, let’s confirm that <code>type</code> is indeed a function:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">fn?</span> <span class="nv">type</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p>What about its documentation?</p>
<pre><code>user=> (doc type)
-------------------------
clojure.core/type
([x])
Returns the :type metadata of x, or its Class if none
nil
</code></pre>
<p>Ah, that’s helpful. <code>type</code> can take a single argument, which it calls <code>x</code>. If it has <code>:type</code> metadata, that’s what it returns. Otherwise, it returns the class of <code>x</code>. Let’s take a deeper look at <code>type</code>’s metadata for more clues.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">meta </span><span class="o">#</span><span class="ss">'type</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:ns</span> <span class="o">#</span><span class="nv"><Namespace</span> <span class="nv">clojure.core></span>, <span class="ss">:name</span> <span class="nv">type</span>, <span class="ss">:arglists</span> <span class="p">([</span><span class="nv">x</span><span class="p">])</span>, <span class="ss">:column</span> <span class="mi">1</span>, <span class="ss">:added</span> <span class="s">"1.0"</span>, <span class="ss">:static</span> <span class="nv">true</span>, <span class="ss">:doc</span> <span class="s">"Returns the :type metadata of x, or its Class if none"</span>, <span class="ss">:line</span> <span class="mi">3109</span>, <span class="ss">:file</span> <span class="s">"clojure/core.clj"</span><span class="p">}</span>
</code></pre>
<p>Look at that! This function was first added to Clojure in version 1.0, and is defined in the file <code>clojure/core.clj</code>, on line 3109. We could go dig up the Clojure source code and read its definition there–or we could ask Clojure to do it for us:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">source</span> <span class="nv">type</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">type</span>
<span class="s">"Returns the :type metadata of x, or its Class if none"</span>
<span class="p">{</span><span class="ss">:added</span> <span class="s">"1.0"</span>
<span class="ss">:static</span> <span class="nv">true</span><span class="p">}</span>
<span class="p">[</span><span class="nv">x</span><span class="p">]</span>
<span class="p">(</span><span class="nb">or </span><span class="p">(</span><span class="nb">get </span><span class="p">(</span><span class="nb">meta </span><span class="nv">x</span><span class="p">)</span> <span class="ss">:type</span><span class="p">)</span> <span class="p">(</span><span class="nb">class </span><span class="nv">x</span><span class="p">)))</span>
<span class="nv">nil</span>
</code></pre>
<p>Aha! Here, at last, is how <code>type</code> works. It’s a function which takes a single argument <code>x</code>, and returns either <code>:type</code> from its metadata, or <code>(class x)</code>.</p>
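<p>To see both branches of that <code>or</code> in action, here is a short sketch. It assumes <code>with-meta</code>, a core function not covered in this chapter, to attach metadata to a vector; the <code>:crew-roster</code> keyword is an arbitrary label chosen for illustration.</p>
<pre><code>;; No :type metadata, so type falls back to (class x).
(type [1 2 3]) ; returns clojure.lang.PersistentVector

;; with-meta returns a copy of the vector carrying the given metadata map.
(def roster (with-meta ["Sally Ride"] {:type :crew-roster}))

(type roster)  ; returns :crew-roster
(class roster) ; returns clojure.lang.PersistentVector
</code></pre>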
<p>We can delve into any function in Clojure using these tools:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">source</span> <span class="nv">+</span><span class="p">)</span>
<span class="p">(</span><span class="kd">defn </span><span class="nv">+</span>
<span class="s">"Returns the sum of nums. (+) returns 0. Does not auto-promote</span>
<span class="s"> longs, will throw on overflow. See also: +'"</span>
<span class="p">{</span><span class="ss">:inline</span> <span class="p">(</span><span class="nf">nary-inline</span> <span class="ss">'add</span> <span class="ss">'unchecked_add</span><span class="p">)</span>
<span class="ss">:inline-arities</span> <span class="nv">>1?</span>
<span class="ss">:added</span> <span class="s">"1.2"</span><span class="p">}</span>
<span class="p">([]</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">([</span><span class="nv">x</span><span class="p">]</span> <span class="p">(</span><span class="nb">cast </span><span class="nv">Number</span> <span class="nv">x</span><span class="p">))</span>
<span class="p">([</span><span class="nv">x</span> <span class="nv">y</span><span class="p">]</span> <span class="p">(</span><span class="k">. </span><span class="nv">clojure.lang.Numbers</span> <span class="p">(</span><span class="nf">add</span> <span class="nv">x</span> <span class="nv">y</span><span class="p">)))</span>
<span class="p">([</span><span class="nv">x</span> <span class="nv">y</span> <span class="o">&</span> <span class="nv">more</span><span class="p">]</span>
<span class="p">(</span><span class="nf">reduce1</span> <span class="nb">+ </span><span class="p">(</span><span class="nb">+ </span><span class="nv">x</span> <span class="nv">y</span><span class="p">)</span> <span class="nv">more</span><span class="p">)))</span>
<span class="nv">nil</span>
</code></pre>
<p>Almost every function in a programming language is made up of other, simpler functions. <code>+</code>, for instance, is defined in terms of <code>cast</code>, <code>add</code>, and <code>reduce1</code>. Sometimes functions are defined in terms of themselves. <code>+</code> uses itself twice in this definition, a technique called <em>recursion</em>.</p>
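<p>As a minimal sketch of recursion, consider a hypothetical <code>sum-to</code> function that adds the integers from 1 up to <code>n</code>. It leans on <code>if</code>, <code>zero?</code>, and <code>dec</code>, none of which this chapter has covered yet.</p>
<pre><code>(defn sum-to
  "Sums the integers from 1 to n, for non-negative n."
  [n]
  (if (zero? n)
    0                        ; base case: nothing left to add
    (+ n (sum-to (dec n))))) ; recursive case: n plus the sum below it

(sum-to 0) ; returns 0
(sum-to 4) ; returns 10
</code></pre>
<p>Each call works on a smaller number, so the chain of calls eventually reaches zero and stops.</p>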
<p>At the bottom, though, are certain fundamental constructs below which you can go no further. Core axioms of the language. Lisp calls these “special forms”. <code>def</code> and <code>let</code> are special forms (well–almost: <code>let</code> is a thin wrapper around <code>let*</code>, which is a special form) in Clojure. These forms are defined by the core implementation of the language, and are not reducible to other Clojure expressions.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">source</span> <span class="nv">def</span><span class="p">)</span>
<span class="nv">Source</span> <span class="nb">not </span><span class="nv">found</span>
</code></pre>
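<p>Although <code>source</code> comes up empty, Clojure can at least tell us which symbols name special forms. A quick sketch using <code>special-symbol?</code> from <code>clojure.core</code>:</p>
<pre><code>(special-symbol? 'def)  ; returns true
(special-symbol? 'let*) ; returns true
(special-symbol? 'let)  ; returns false: let is built on top of let*
(special-symbol? '+)    ; returns false: + is an ordinary function
</code></pre>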
<p>Some Lisps are written <em>entirely</em> in terms of a few special forms, but Clojure is much less pure. Many functions bottom out in Java functions and types, or, for CLJS, in terms of JavaScript. Any time you see an expression like <code>(. clojure.lang.Numbers (add x y))</code>, there’s Java code underneath. Below Java lies the JVM, which might be written in C or C++, depending on which one you use. And underneath C and C++ lie more libraries, the operating system, assembler, microcode, registers, and ultimately, electrons flowing through silicon.</p>
<p>A well-designed language <em>isolates</em> you from details you don’t need to worry about, like which logic gates or registers to use, and lets you focus on the task at hand. Good languages also need to allow escape hatches for performance or access to dangerous functionality, as we saw with Vars. You can write whole programs purely in Clojure, but sometimes, for performance or to use tools from other languages, you’ll rely on Java. The Clojure code is easy to explore with <code>doc</code> and <code>source</code>, but Java can be more opaque–I usually rely on the Java source files and online documentation.</p>
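<p>A few interop calls, as a rough sketch of what “relying on Java” looks like in practice. The particular methods are picked only for illustration:</p>
<pre><code>;; The same static method that + uses under the hood.
(. clojure.lang.Numbers (add 1 2)) ; returns 3

;; Class/method is the more common shorthand for static calls.
(Math/sqrt 16.0) ; returns 4.0

;; A leading dot calls an instance method on the first argument.
(.toUpperCase "meow") ; returns "MEOW"
</code></pre>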
<h2><a href="#review" id="review">Review</a></h2>
<p>We’ve seen how <code>let</code> associates names with values in a particular expression, and how Vars allow for <em>mutable</em> bindings which apply universally, and whose definitions can change over time. We learned that Clojure verbs are functions, which express the general shape of an expression but with certain values <em>unbound</em>. Invoking a function <em>binds</em> those variables to specific values, allowing evaluation of the function to proceed.</p>
<p>Functions <em>decompose</em> programs into simpler pieces, expressed in terms of one another. Short, meaningful names help us understand what those functions (and other values) mean.</p>
<p>Finally, we learned how to introspect Clojure functions with <code>doc</code> and <code>source</code>, and saw the definition of some basic Clojure functions. The <a href="http://clojure.org/cheatsheet">Clojure cheatsheet</a> gives a comprehensive list of the core functions in the language, and is a great starting point when you have to solve a problem but don’t know what functions to use.</p>
<p>We’ll see a broad swath of those functions in <a href="http://aphyr.com/posts/304-clojure-from-the-ground-up-sequences">Chapter 4: Sequences</a>.</p>
<p><em>My thanks to Zach Tellman, Kelly Sommers, and Michael R Bernstein for reviewing drafts of this chapter.</em></p>
https://aphyr.com/posts/302-clojure-from-the-ground-up-basic-typesClojure from the ground up: basic types2013-10-26T14:56:16-05:002013-10-26T14:56:16-05:00Aphyrhttps://aphyr.com/<p>We’ve learned <a href="http://aphyr.com/posts/301-clojure-from-the-ground-up-first-principles">the basics of Clojure’s syntax and evaluation model</a>. Now we’ll take a tour of the basic nouns in the language.</p>
<h2><a href="#types" id="types">Types</a></h2>
<p>We’ve seen a few different values already–for instance, <code>nil</code>, <code>true</code>, <code>false</code>, <code>1</code>, <code>2.34</code>, and <code>"meow"</code>. Clearly all these things are <em>different</em> values, but some of them seem more alike than others.</p>
<p>For instance, <code>1</code> and <code>2</code> are <em>very</em> similar numbers; both can be added, divided, multiplied, and subtracted. <code>2.34</code> is also a number, and acts very much like 1 and 2, but it’s not quite the same. It’s got <em>decimal</em> points. It’s not an <em>integer</em>. And clearly <code>true</code> is <em>not</em> very much like a number. What is true plus one? Or false divided by 5.3? These questions are poorly defined.</p>
<p>We say that a <em>type</em> is a group of values which work in the same way. It’s a <em>property</em> that some values share, which allows us to organize the world into sets of similar things. 1 + 1 and 1 + 2 use <em>the same addition</em>, which adds together integers. Types also help us <em>verify</em> that a program makes sense: that you can only add together numbers, instead of adding numbers to porcupines.</p>
<p>Types can overlap and intersect each other. Cats are animals, and cats are fuzzy too. You could say that a cat is a <em>member</em> (or sometimes “instance”), of the fuzzy and animal types. But there are fuzzy things like moss which <em>aren’t</em> animals, and animals like alligators that aren’t fuzzy in the slightest.</p>
<p>Other types completely subsume one another. All tabbies are housecats, and all housecats are felidae, and all felidae are animals. Everything which is true of an animal is automatically true of a housecat. Hierarchical types make it easier to write programs which don’t need to know all the specifics of every value; and conversely, to create new types in terms of others. But they can also get in the way of the programmer, because not every useful classification (like “fuzziness”) is purely hierarchical. Expressing overlapping types in a hierarchy can be tricky.</p>
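<p>Clojure can answer these membership questions directly. In the sketch below, <code>instance?</code> asks whether a value belongs to a type, and <code>isa?</code> asks whether one type is subsumed by another; both are core functions, though this chapter does not otherwise introduce them.</p>
<pre><code>;; 3 is a member of several overlapping types at once.
(instance? Long 3)       ; returns true
(instance? Number 3)     ; returns true
(instance? Comparable 3) ; returns true

;; 2.34 is a number too, but not a Long.
(instance? Number 2.34) ; returns true
(instance? Long 2.34)   ; returns false

;; Every Long is a Number, but not the other way around.
(isa? Long Number) ; returns true
(isa? Number Long) ; returns false
</code></pre>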
<p>Every language has a <em>type system</em>; a particular way of organizing nouns into types, figuring out which verbs make sense on which types, and relating types to one another. Some languages are strict, and others more relaxed. Some emphasize hierarchy, and others a more ad-hoc view of the world. We call Clojure’s type system <em>strong</em> in that operations on improper types are simply not allowed: the program will explode if asked to subtract a dandelion. We also say that Clojure’s types are <em>dynamic</em> because they are enforced when the program is run, instead of when the program is first read by the computer.</p>
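<p>A small sketch of what “strong” and “dynamic” mean together. The <code>subtract-dandelion</code> function is hypothetical, and <code>try</code>/<code>catch</code> is borrowed from later chapters so the error can be observed without halting the program.</p>
<pre><code>;; Defining the function succeeds: nothing checks the types yet.
(defn subtract-dandelion [n]
  (- n "dandelion"))

;; Only when the code actually runs does Clojure refuse.
(try
  (subtract-dandelion 5)
  (catch ClassCastException e
    :exploded)) ; returns :exploded
</code></pre>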
<p>We’ll learn more about the formal relationships between types later, but for now, keep this in the back of your head. It’ll start to hook in to other concepts later.</p>
<h2><a href="#integers" id="integers">Integers</a></h2>
<p>Let’s find the type of the number 3:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">java.lang.Long</span>
</code></pre>
<p>So 3 is a <code>java.lang.Long</code>, or a “Long”, for short. Because Clojure is built on top of Java, many of its types are plain old Java types.</p>
<p>Longs, internally, are represented as a group of sixty-four binary digits (ones and zeroes), written down in a particular pattern called <a href="http://en.wikipedia.org/wiki/Two's_complement">signed two’s complement representation</a>. You don’t need to worry about the specifics–there are only two things to remember about longs. First, longs use one bit to store the sign: whether the number is positive or negative. Second, the other 63 bits represent the <em>size</em> of the number. That means the biggest number you can represent with a long is 2<sup>63</sup> - 1 (the minus one is because of the number 0), and the smallest long is -2<sup>63</sup>.</p>
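<p>A few expressions make those bit counts concrete. This is only a sketch; it uses <code>Long/SIZE</code>, <code>Long/MIN_VALUE</code>, and <code>Long/toBinaryString</code>, which are standard members of <code>java.lang.Long</code>.</p>
<pre><code>Long/SIZE ; returns 64, the number of bits in a long

;; -1 is written with all 64 bits set to one.
(count (Long/toBinaryString -1)) ; returns 64

;; The smallest long sits one step below the negated largest long.
(= Long/MIN_VALUE (- (- Long/MAX_VALUE) 1)) ; returns true
</code></pre>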
<p>How big is 2<sup>63</sup> - 1?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">Long/MAX_VALUE</span>
<span class="mi">9223372036854775807</span>
</code></pre>
<p>That’s a reasonably big number. Most of the time, you won’t need anything bigger, but… what if you did? What happens if you add one to the biggest Long?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="nv">Long/MAX_VALUE</span><span class="p">)</span>
<span class="nv">ArithmeticException</span> <span class="nv">integer</span> <span class="nv">overflow</span> <span class="nv">clojure.lang.Numbers.throwIntOverflow</span> <span class="p">(</span><span class="nf">Numbers.java</span><span class="ss">:1388</span><span class="p">)</span>
</code></pre>
<p>An error occurs! This is Clojure telling us that something went wrong. The type of error was an <code>ArithmeticException</code>, and its message was “integer overflow”, meaning “this type of number can’t hold a number that big”. The error came from a specific <em>place</em> in the source code of the program: <code>Numbers.java</code>, on line 1388. That’s a part of the Clojure source code. Later, we’ll learn more about how to unravel error messages and find out what went wrong.</p>
<p>The important thing is that Clojure’s type system <em>protected</em> us from doing something dangerous; instead of returning a corrupt value, it aborted evaluation and returned an error.</p>
<p>If you <em>do</em> need to talk about really big numbers, you can use a BigInt: an arbitrary-precision integer. Let’s convert the biggest Long into a BigInt, then increment it:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nf">bigint</span> <span class="nv">Long/MAX_VALUE</span><span class="p">))</span>
<span class="mi">9223372036854775808</span><span class="nv">N</span>
</code></pre>
<p>Notice the N at the end? That’s how Clojure writes arbitrary-precision integers.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mi">5</span><span class="nv">N</span><span class="p">)</span>
<span class="nv">clojure.lang.BigInt</span>
</code></pre>
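<p>If you would rather have Clojure switch to a BigInt only when it has to, the primed arithmetic functions do so. This sketch uses <code>+'</code> and <code>inc'</code> from <code>clojure.core</code>, and contrasts them with <code>unchecked-inc</code>, which skips the overflow check and silently wraps around.</p>
<pre><code>;; Results that fit in a long stay longs.
(type (+' 1 2)) ; returns java.lang.Long

;; Results that do not fit are promoted to BigInt instead of throwing.
(inc' Long/MAX_VALUE) ; returns 9223372036854775808N
(+' Long/MAX_VALUE 1) ; returns 9223372036854775808N

;; unchecked-inc wraps around to the smallest long: exactly the kind
;; of corrupt value the default inc protects us from.
(= Long/MIN_VALUE (unchecked-inc Long/MAX_VALUE)) ; returns true
</code></pre>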
<p>There are also smaller numbers.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">(</span><span class="nb">int </span><span class="mi">0</span><span class="p">))</span>
<span class="nv">java.lang.Integer</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">(</span><span class="nb">short </span><span class="mi">0</span><span class="p">))</span>
<span class="nv">java.lang.Short</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">(</span><span class="nb">byte </span><span class="mi">0</span><span class="p">))</span>
<span class="nv">java.lang.Byte</span>
</code></pre>
<p>Integers are half the size of Longs; they store values in 32 bits. Shorts are 16 bits, and Bytes are 8. That means their biggest values are 2<sup>31</sup>-1, 2<sup>15</sup>-1, and 2<sup>7</sup>-1, respectively.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">Integer/MAX_VALUE</span>
<span class="mi">2147483647</span>
<span class="nv">user=></span> <span class="nv">Short/MAX_VALUE</span>
<span class="mi">32767</span>
<span class="nv">user=></span> <span class="nv">Byte/MAX_VALUE</span>
<span class="mi">127</span>
</code></pre>
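<p>The same protection applies when converting to these smaller types. As a sketch: the checked cast <code>byte</code> rejects values that do not fit, while <code>unchecked-byte</code> (also in <code>clojure.core</code>) simply keeps the low 8 bits. <code>try</code>/<code>catch</code> is borrowed from later chapters.</p>
<pre><code>(byte 127) ; returns 127

;; 128 is one past Byte/MAX_VALUE, so the checked cast refuses.
(try
  (byte 128)
  (catch IllegalArgumentException e
    :out-of-range)) ; returns :out-of-range

;; The unchecked cast wraps around to the smallest byte instead.
(unchecked-byte 128) ; returns -128
</code></pre>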
<h2><a href="#fractional-numbers" id="fractional-numbers">Fractional numbers</a></h2>
<p>To represent numbers <em>between</em> integers, we often use floating-point numbers, which can represent small numbers with fine precision, and large numbers with coarse precision. Floats use 32 bits, and Doubles use 64. Doubles are the default in Clojure.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mf">1.23</span><span class="p">)</span>
<span class="nv">java.lang.Double</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">(</span><span class="nb">float </span><span class="mf">1.23</span><span class="p">))</span>
<span class="nv">java.lang.Float</span>
</code></pre>
<p>Floating point math is <a href="http://en.wikipedia.org/wiki/Floating_point">complicated</a>, and we won’t get bogged down in the details just yet. The important thing to know is floats and doubles are <em>approximations</em>. There are limits to their correctness:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="mf">0.99999999999999999</span>
<span class="mf">1.0</span>
</code></pre>
<p>To represent fractions exactly, we can use the <em>ratio</em> type:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="mi">1</span><span class="nv">/3</span><span class="p">)</span>
<span class="nv">clojure.lang.Ratio</span>
</code></pre>
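<p>To make the contrast concrete, here is a small sketch comparing the same sum in doubles and in ratios. The specific numbers are just a well-known example of floating-point rounding.</p>
<pre><code>;; Neither 0.1 nor 0.2 has an exact binary representation, so the
;; sum picks up a tiny rounding error.
(+ 0.1 0.2)         ; returns 0.30000000000000004
(= 0.3 (+ 0.1 0.2)) ; returns false

;; Ratios store an exact numerator and denominator.
(+ 1/10 2/10)          ; returns 3/10
(= 3/10 (+ 1/10 2/10)) ; returns true
</code></pre>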
<h2><a href="#mathematical-operations" id="mathematical-operations">Mathematical operations</a></h2>
<p>The exact behavior of mathematical operations in Clojure depends on their types. In general, though, Clojure aims to <em>preserve</em> information. Adding two longs returns a long; adding a double and a long returns a double.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mf">2.0</span><span class="p">)</span>
<span class="mf">3.0</span>
</code></pre>
<p><code>3</code> and <code>3.0</code> are <em>not</em> the same number; one is a long, and the other a double. But for most purposes, they’re equivalent, and Clojure will tell you so:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">= </span><span class="mi">3</span> <span class="mf">3.0</span><span class="p">)</span>
<span class="nv">false</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">== </span><span class="mi">3</span> <span class="mf">3.0</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p><code>=</code> asks whether all the things that follow are equal. Since floats are approximations, <code>=</code> considers them different from integers. <code>==</code> also compares things, but a little more loosely: it considers integers equivalent to their floating-point representations.</p>
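<p>A few more comparisons, as a sketch of where the line falls. The values are arbitrary; <code>3N</code> is a BigInt and <code>1/2</code> a ratio.</p>
<pre><code>;; = treats integers of different sizes as equal...
(= 3 3N) ; returns true

;; ...but keeps integers, ratios, and floating-point numbers apart.
(= 3 3.0)   ; returns false
(= 0.5 1/2) ; returns false

;; == compares purely by numeric value.
(== 3 3.0 3N) ; returns true
(== 0.5 1/2)  ; returns true

;; Both accept more than two arguments.
(= 1 1 1) ; returns true
(= 1 1 2) ; returns false
</code></pre>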
<p>We can also subtract with <code>-</code>, multiply with <code>*</code>, and divide with <code>/</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">- </span><span class="mi">3</span> <span class="mi">1</span><span class="p">)</span>
<span class="mi">2</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">* </span><span class="mf">1.5</span> <span class="mi">3</span><span class="p">)</span>
<span class="mf">4.5</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">/ </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">1</span><span class="nv">/2</span>
</code></pre>
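<p>Because division of integers yields an exact ratio, it can help to see the alternatives side by side. The following is a small sketch; <code>quot</code> and <code>rem</code> are <code>clojure.core</code> functions that this chapter does not otherwise discuss.</p>
<pre><code>(/ 1 2)     ; => 1/2, an exact ratio
(/ 1 2.0)   ; => 0.5, because one argument is a double
(/ 4 2)     ; => 2, a ratio that reduces to a whole number
(quot 7 2)  ; => 3, the integer quotient
(rem 7 2)   ; => 1, the remainder
</code></pre>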
<p>Putting the verb <em>first</em> in each list allows us to add or multiply more than one number in the same step:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">6</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">* </span><span class="mi">2</span> <span class="mi">3</span> <span class="mi">1</span><span class="nv">/5</span><span class="p">)</span>
<span class="mi">6</span><span class="nv">/5</span>
</code></pre>
<p>Subtraction with more than 2 numbers subtracts all later numbers from the first. Division divides the first number by all the rest.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">- </span><span class="mi">5</span> <span class="mi">1</span> <span class="mi">1</span> <span class="mi">1</span><span class="p">)</span>
<span class="mi">2</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">/ </span><span class="mi">24</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">4</span>
</code></pre>
<p>By extension, we can define useful interpretations for numeric operations with just a <em>single</em> number:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">+ </span><span class="mi">2</span><span class="p">)</span>
<span class="mi">2</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">- </span><span class="mi">2</span><span class="p">)</span>
<span class="mi">-2</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">* </span><span class="mi">4</span><span class="p">)</span>
<span class="mi">4</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">/ </span><span class="mi">4</span><span class="p">)</span>
<span class="mi">1</span><span class="nv">/4</span>
</code></pre>
<p>We can also add or multiply a list of no numbers at all, obtaining the additive and multiplicative identities, respectively. This might seem odd, especially coming from other languages, but we’ll see later that these generalizations make it easier to reason about higher-level numeric operations.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">+</span><span class="p">)</span>
<span class="mi">0</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">*</span><span class="p">)</span>
<span class="mi">1</span>
</code></pre>
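<p>One place these identities pay off is when the numbers arrive in a collection whose size we don’t know in advance. As a rough sketch, assuming <code>apply</code> and <code>reduce</code> from <code>clojure.core</code> (neither has been introduced yet in this chapter):</p>
<pre><code>(apply + [1 2 3])  ; => 6, the same as (+ 1 2 3)
(apply + [])       ; => 0, the same as (+)
(apply * [])       ; => 1, the same as (*)
(reduce + [])      ; => 0, no special case needed for an empty collection
</code></pre>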
<p>Often, we want to ask which number is bigger, or if one number falls between two others. <code><=</code> means “less than or equal to”, and asserts that all following values are in order from smallest to biggest.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb"><= </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb"><= </span><span class="mi">1</span> <span class="mi">3</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<p><code><</code> means “strictly less than”, and works just like <code><=</code>, except that no two values may be equal.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb"><= </span><span class="mi">1</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">< </span><span class="mi">1</span> <span class="mi">1</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<p>Their friends <code>></code> and <code>>=</code> mean “greater than” and “greater than or equal to”, respectively, and assert that numbers are in descending order.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">> </span><span class="mi">3</span> <span class="mi">2</span> <span class="mi">1</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">> </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<p>Also commonly used are <code>inc</code> and <code>dec</code>, which add one to and subtract one from a number, respectively:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="mi">5</span><span class="p">)</span>
<span class="mi">6</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">dec </span><span class="mi">5</span><span class="p">)</span>
<span class="mi">4</span>
</code></pre>
<p>One final note: equality tests can take more than 2 numbers as well.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">= </span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">2</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">= </span><span class="mi">2</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<h2><a href="#strings" id="strings">Strings</a></h2>
<p>We saw that strings are text, surrounded by double quotes, like <code>"foo"</code>. Strings in Clojure are, like Longs, Doubles, and company, backed by a Java type:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="s">"cat"</span><span class="p">)</span>
<span class="nv">java.lang.String</span>
</code></pre>
<p>We can make almost <em>anything</em> into a string with <code>str</code>. Strings, symbols, numbers, booleans; every value in Clojure has a string representation. Note that <code>nil</code>’s string representation is <code>""</code>, an empty string.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="s">"cat"</span><span class="p">)</span>
<span class="s">"cat"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="ss">'cat</span><span class="p">)</span>
<span class="s">"cat"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="mi">1</span><span class="p">)</span>
<span class="s">"1"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="nv">true</span><span class="p">)</span>
<span class="s">"true"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="s">"(1 2 3)"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="nv">nil</span><span class="p">)</span>
<span class="s">""</span>
</code></pre>
<p><code>str</code> can also <em>combine</em> things together into a single string, which we call “concatenation”.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="s">"meow "</span> <span class="mi">3</span> <span class="s">" times"</span><span class="p">)</span>
<span class="s">"meow 3 times"</span>
</code></pre>
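<p>Since <code>nil</code> becomes an empty string, it simply vanishes during concatenation. A brief sketch of how that plays out:</p>
<pre><code>(str "cat" nil "nip")              ; => "catnip"
(str "legs: " 4 ", tail: " true)   ; => "legs: 4, tail: true"
(str)                              ; => "", no arguments at all
</code></pre>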
<p>To look for patterns in text, we can use a <a href="http://www.regular-expressions.info/tutorial.html">regular expression</a>, which is a tiny language for describing particular arrangements of text. <code>re-find</code> looks for the first occurrence of a regular expression anywhere in a string, while <code>re-matches</code> succeeds only when the entire string matches. To find a cat:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">re-find </span><span class="o">#</span><span class="s">"cat"</span> <span class="s">"mystic cat mouse"</span><span class="p">)</span>
<span class="s">"cat"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">re-find </span><span class="o">#</span><span class="s">"cat"</span> <span class="s">"only dogs here"</span><span class="p">)</span>
<span class="nv">nil</span>
</code></pre>
<p>That <code>#"..."</code> is Clojure’s way of writing a regular expression.</p>
<p>With <code>re-matches</code>, you can extract particular parts of a string which match an expression. Here we find two strings, separated by a <code>:</code>. The parentheses mean that the regular expression should <em>capture</em> that part of the match. <code>re-matches</code> returns the whole match first, followed by each captured part, so we drop the whole match with <code>rest</code>. We get back a list containing the part of the string that matched the first parentheses, followed by the part that matched the second parentheses.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">rest </span><span class="p">(</span><span class="nb">re-matches </span><span class="o">#</span><span class="s">"(.+):(.+)"</span> <span class="s">"mouse:treat"</span><span class="p">))</span>
<span class="p">(</span><span class="s">"mouse"</span> <span class="s">"treat"</span><span class="p">)</span>
</code></pre>
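<p>To make the difference between the two functions concrete, here is a short sketch. It reuses the strings from the examples above; nothing else is assumed.</p>
<pre><code>; Without rest, the whole match comes first, then each captured group.
(re-matches #"(.+):(.+)" "mouse:treat")  ; => ["mouse:treat" "mouse" "treat"]

; re-matches needs the entire string to match; re-find does not.
(re-matches #"cat" "mystic cat mouse")   ; => nil
(re-find    #"cat" "mystic cat mouse")   ; => "cat"
(re-matches #"cat" "cat")                ; => "cat"
</code></pre>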
<p>Regular expressions are a powerful tool for searching and matching text, especially when working with data files. Since regexes work the same in most languages, you can use any guide online to learn more. It’s not something you have to master right away; just learn specific tricks as you find you need them. For a deeper guide, try Fitzgerald’s <a href="http://shop.oreilly.com/product/0636920012337.do">Introducing Regular Expressions</a>.</p>
<h2><a href="#booleans-and-logic" id="booleans-and-logic">Booleans and logic</a></h2>
<p>Everything in Clojure has a sort of charge, a truth value, sometimes called “truthiness”. <code>true</code> is positive and <code>false</code> is negative. <code>nil</code> is negative, too.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="nv">true</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="nv">false</span><span class="p">)</span>
<span class="nv">false</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="nv">nil</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<p>Every other value in Clojure is positive.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="mi">0</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="mi">1</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="s">"hi there"</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">boolean </span><span class="nv">str</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p>If you’re coming from a C-inspired language, where 0 is considered false, this might be a bit surprising. Likewise, in much of POSIX, 0 is considered success and nonzero values are failures. Lisp allows no such confusion: the only negative values are <code>false</code> and <code>nil</code>.</p>
<p>We can reason about truth values using <code>and</code>, <code>or</code>, and <code>not</code>. <code>and</code> returns the first negative value, or the last value if all are truthy.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">and </span><span class="nv">true</span> <span class="nv">false</span> <span class="nv">true</span><span class="p">)</span>
<span class="nv">false</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">and </span><span class="nv">true</span> <span class="nv">true</span> <span class="nv">true</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">and </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
<p>Similarly, <code>or</code> returns the first positive value, or the last value if none are positive.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">or </span><span class="nv">false</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">2</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">or </span><span class="nv">false</span> <span class="nv">nil</span><span class="p">)</span>
<span class="nv">nil</span>
</code></pre>
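<p>Because <code>and</code> and <code>or</code> return one of their arguments rather than a strict boolean, they are often used to pick fallback values. They also stop evaluating as soon as the answer is known. A small sketch, where the strings and the keyword <code>:ok</code> are made up for illustration:</p>
<pre><code>(or nil "default")      ; => "default", a fallback for a missing value
(or "given" "default")  ; => "given"

; The division by zero is never evaluated in either expression.
(and nil (/ 1 0))       ; => nil
(or :ok (/ 1 0))        ; => :ok
</code></pre>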
<p>And <code>not</code> inverts the logical sense of a value:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">not </span><span class="mi">2</span><span class="p">)</span>
<span class="nv">false</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">not </span><span class="nv">nil</span><span class="p">)</span>
<span class="nv">true</span>
</code></pre>
<p>We’ll learn more about Boolean logic when we start talking about <em>control flow</em>: the way we alter evaluation of a program and express ideas like “if I’m a cat, then meow incessantly”.</p>
<h2><a href="#symbols" id="symbols">Symbols</a></h2>
<p>We saw symbols in the previous chapter; they’re bare strings of characters, like <code>foo</code> or <code>+</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">class </span><span class="ss">'str</span><span class="p">)</span>
<span class="nv">clojure.lang.Symbol</span>
</code></pre>
<p>Symbols can have either short or full names. The short name is used to refer to things locally. The <em>fully qualified</em> name is used to refer unambiguously to a symbol from anywhere. If I were a symbol, my name would be “Kyle”, and my full name “Kyle Kingsbury.”</p>
<p>The two parts of a full name are separated with a <code>/</code>. For instance, the symbol <code>str</code> is also present in a family called <code>clojure.core</code>; the corresponding full name is <code>clojure.core/str</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">= str </span><span class="nv">clojure.core/str</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">name </span><span class="ss">'clojure.core/str</span><span class="p">)</span>
<span class="s">"str"</span>
</code></pre>
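<p>The family part of a full name can be inspected too. As a sketch, using <code>namespace</code> and <code>symbol</code> from <code>clojure.core</code> (neither is covered elsewhere in this chapter):</p>
<pre><code>(namespace 'clojure.core/str)  ; => "clojure.core"
(name 'clojure.core/str)       ; => "str"
(namespace 'str)               ; => nil, a short name has no family
(symbol "clojure.core" "str")  ; => clojure.core/str
</code></pre>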
<p>When we talked about the maximum size of an integer, that was a fully-qualified symbol, too.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="ss">'Integer/MAX_VALUE</span><span class="p">)</span>
<span class="nv">clojure.lang.Symbol</span>
</code></pre>
<p>The job of symbols is to <em>refer</em> to things, to <em>point</em> to other values. When evaluating a program, symbols are looked up and replaced by their corresponding values. That’s not the only use of symbols, but it’s the most common.</p>
<h2><a href="#keywords" id="keywords">Keywords</a></h2>
<p>Closely related to symbols and strings are <em>keywords</em>, which begin with a <code>:</code>. Keywords are like strings in that they’re made up of text, but are specifically intended for use as <em>labels</em> or <em>identifiers</em>. These <em>aren’t</em> labels in the sense of symbols: keywords aren’t replaced by any other value. They’re just names, by themselves.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="ss">:cat</span><span class="p">)</span>
<span class="nv">clojure.lang.Keyword</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">str </span><span class="ss">:cat</span><span class="p">)</span>
<span class="s">":cat"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">name </span><span class="ss">:cat</span><span class="p">)</span>
<span class="s">"cat"</span>
</code></pre>
<p>As labels, keywords are most useful when paired with other values in a collection, like a <em>map</em>. Keywords can also be used as verbs to <em>look up specific values</em> in other data types. We’ll learn more about keywords shortly.</p>
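<p>As a brief preview of keywords acting as verbs, here is a sketch that looks values up in a map written with curly braces. The map and its contents are made up for illustration.</p>
<pre><code>(:name {:name "mittens" :weight 9})   ; => "mittens"
(:color {:name "mittens" :weight 9})  ; => nil, there is no :color key

; Keywords can also be built from strings.
(keyword "cat")                       ; => :cat
(= :cat (keyword "cat"))              ; => true
</code></pre>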
<h2><a href="#lists" id="lists">Lists</a></h2>
<p>A collection is a group of values. It’s a <em>container</em> which provides some structure, some framework, for the things that it holds. We say that a collection contains <em>elements</em>, or <em>members</em>. We saw one kind of collection–a list–in the previous chapter.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="nv">clojure.lang.PersistentList</span>
</code></pre>
<p>Remember, we <em>quote</em> lists with a <code>'</code> to prevent them from being evaluated. You can also construct a list using <code>list</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p>Lists are comparable just like every other value:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">= </span><span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span><span class="p">)</span> <span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span><span class="p">))</span>
<span class="nv">true</span>
</code></pre>
<p>You can modify a list by <code>conj</code>oining an element onto it:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">conj </span><span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="mi">4</span><span class="p">)</span>
<span class="p">(</span><span class="mi">4</span> <span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p>We added 4 to the list–but it appeared at the <em>front</em>. Why? Internally, lists are stored as a <em>chain</em> of values: each link in the chain is a tiny box which holds the value and a connection to the next link. This data structure, called a linked list, offers immediate access to the first element.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">first </span><span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="mi">1</span>
</code></pre>
<p>But getting to the second element requires an extra hop down the chain</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">second </span><span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="mi">2</span>
</code></pre>
<p>and the third element a hop after that, and so on.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">nth </span><span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="mi">2</span><span class="p">)</span>
<span class="mi">3</span>
</code></pre>
<p><code>nth</code> gets the element of an ordered collection at a particular <em>index</em>. The first element is index 0, the second is index 1, and so on.</p>
<p>This means that lists are well-suited for small collections, or collections which are read in linear order, but are slow when you want to get arbitrary elements from later in the list. For fast access to every element, we use a <em>vector</em>.</p>
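<p>It may also help to see that <code>conj</code> builds a <em>new</em> list whose first link points at the old one, leaving the original untouched. A minimal sketch, using <code>let</code> to name intermediate values (<code>let</code> and the names <code>xs</code> and <code>ys</code> are assumptions here, not part of this chapter so far):</p>
<pre><code>(let [xs (list 1 2 3)
      ys (conj xs 4)]
  [xs ys])
; => [(1 2 3) (4 1 2 3)], xs is unchanged

(cons 0 (list 1 2 3))  ; => (0 1 2 3), cons also adds at the front
</code></pre>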
<h2><a href="#vectors" id="vectors">Vectors</a></h2>
<p>Vectors are surrounded by square brackets, just like lists are surrounded by parentheses. Because vectors <em>aren’t</em> evaluated like lists are, there’s no need to quote them:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="nv">clojure.lang.PersistentVector</span>
</code></pre>
<p>You can also create vectors with <code>vector</code>, or change other structures into vectors with <code>vec</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">vector </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nf">vec</span> <span class="p">(</span><span class="nb">list </span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">))</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span>
</code></pre>
<p><code>conj</code> on a vector adds to the <em>end</em>, not the <em>start</em>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">conj </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">]</span> <span class="mi">4</span><span class="p">)</span>
<span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span> <span class="mi">4</span><span class="p">]</span>
</code></pre>
<p>Our friends <code>first</code>, <code>second</code>, and <code>nth</code> work here too; but unlike lists, <code>nth</code> is <em>fast</em> on vectors. That’s because internally, vectors are represented as a very broad tree of elements, where each part of the tree branches into 32 smaller trees. Even very large vectors are only a few layers deep, which means getting to elements only takes a few hops.</p>
<p>In addition to <code>first</code>, you’ll often want to get the <em>remaining</em> elements in a collection. There are two ways to do this:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">rest </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">next </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="p">(</span><span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p><code>rest</code> and <code>next</code> both return “everything but the first element”. They differ only by what happens when there are no remaining elements:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">rest </span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
<span class="p">()</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">next </span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
<span class="nv">nil</span>
</code></pre>
<p>When nothing remains, <code>rest</code> returns an empty list, which is logically true, while <code>next</code> returns <code>nil</code>, which is logically false. Each has its uses, but in almost every case they’re equivalent–I interchange them freely.</p>
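<p>The truth values involved can be checked directly with <code>boolean</code>. In this sketch, <code>seq</code> is a <code>clojure.core</code> function not otherwise covered here; it turns an empty collection into <code>nil</code>.</p>
<pre><code>(boolean (rest [1]))  ; => true, an empty list is still a value
(boolean (next [1]))  ; => false, because next returned nil
(seq (rest [1]))      ; => nil
(rest [])             ; => ()
(next [])             ; => nil
</code></pre>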
<p>We can get the final element of any collection with <code>last</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">last </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="mi">3</span>
</code></pre>
<p>And figure out how big the vector is with <code>count</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">count </span><span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="mi">3</span>
</code></pre>
<p>Because vectors are intended for looking up elements by index, we can also use them directly as <em>verbs</em>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">([</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">]</span> <span class="mi">1</span><span class="p">)</span>
<span class="ss">:b</span>
</code></pre>
<p>So we took the vector containing three keywords, and asked “What’s the element at index 1?” Lisp, like most (but not all!) modern languages, counts up from <em>zero</em>, not one. Index 0 is the first element, index 1 is the second element, and so on. In this vector, finding the element at index 1 evaluates to <code>:b</code>.</p>
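<p>Asking for an index that doesn’t exist behaves differently depending on how you ask. The following sketch compares three ways of looking up an index; <code>get</code> and the three-argument form of <code>nth</code> are <code>clojure.core</code> features not discussed above, and <code>:none</code> is an arbitrary placeholder.</p>
<pre><code>([:a :b :c] 1)            ; => :b
(get [:a :b :c] 5)        ; => nil, quietly
(nth [:a :b :c] 5 :none)  ; => :none, a default we supplied

; Using the vector as a verb with a bad index throws an
; IndexOutOfBoundsException instead:
; ([:a :b :c] 5)
</code></pre>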
<p>Finally, note that vectors and lists containing the same elements are considered equal in Clojure:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">= </span><span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span> <span class="p">[</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">])</span>
<span class="nv">true</span>
</code></pre>
<p>In almost all contexts, you can consider vectors, lists, and other sequences as interchangeable. They only differ in their performance characteristics, and in a few data-structure-specific operations.</p>
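<p><code>conj</code> is one of those data-structure-specific operations. As a quick sketch, the same call produces unequal results depending on whether it starts from a list or a vector:</p>
<pre><code>(= '(1 2 3) [1 2 3])                ; => true
(conj '(1 2) 3)                     ; => (3 1 2), added at the front
(conj [1 2] 3)                      ; => [1 2 3], added at the end
(= (conj '(1 2) 3) (conj [1 2] 3))  ; => false
</code></pre>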
<h2><a href="#sets" id="sets">Sets</a></h2>
<p>Sometimes you want an unordered collection of values; especially when you plan to ask questions like “does the collection have the number 3 in it?” Clojure, like most languages, calls these collections <em>sets</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">}</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:c</span> <span class="ss">:b</span><span class="p">}</span>
</code></pre>
<p>Sets are surrounded by <code>#{...}</code>. Notice that though we gave the elements <code>:a</code>, <code>:b</code>, and <code>:c</code>, they came out in a different order. In general, the order of sets can shift at any time. If you want a particular order, you can ask for it as a list or vector:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">vec</span> <span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">})</span>
<span class="p">[</span><span class="ss">:a</span> <span class="ss">:c</span> <span class="ss">:b</span><span class="p">]</span>
</code></pre>
<p>Or ask for the elements in sorted order:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">sort </span><span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">})</span>
<span class="p">(</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">)</span>
</code></pre>
<p><code>conj</code> on a set adds an element:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">conj </span><span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">}</span> <span class="ss">:d</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:c</span> <span class="ss">:b</span> <span class="ss">:d</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">conj </span><span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">}</span> <span class="ss">:a</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:c</span> <span class="ss">:b</span><span class="p">}</span>
</code></pre>
<p>Sets never contain an element more than once, so <code>conj</code>ing an element which is already present does nothing. Conversely, one removes elements with <code>disj</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">disj </span><span class="o">#</span><span class="p">{</span><span class="s">"hornet"</span> <span class="s">"hummingbird"</span><span class="p">}</span> <span class="s">"hummingbird"</span><span class="p">)</span>
<span class="o">#</span><span class="p">{</span><span class="s">"hornet"</span><span class="p">}</span>
</code></pre>
<p>The most common operation with a set is to check whether something is inside it. For this we use <code>contains?</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">contains? </span><span class="o">#</span><span class="p">{</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">}</span> <span class="mi">3</span><span class="p">)</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">contains? </span><span class="o">#</span><span class="p">{</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">}</span> <span class="mi">5</span><span class="p">)</span>
<span class="nv">false</span>
</code></pre>
<p>Like vectors, you can use the set <em>itself</em> as a verb. Unlike <code>contains?</code>, this expression returns the element itself (if it was present), or <code>nil</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="o">#</span><span class="p">{</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">}</span> <span class="mi">3</span><span class="p">)</span>
<span class="mi">3</span>
<span class="nv">user=></span> <span class="p">(</span><span class="o">#</span><span class="p">{</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">}</span> <span class="mi">4</span><span class="p">)</span>
<span class="nv">nil</span>
</code></pre>
<p>You can make a set out of any other collection with <code>set</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">set </span><span class="p">[</span><span class="ss">:a</span> <span class="ss">:b</span> <span class="ss">:c</span><span class="p">])</span>
<span class="o">#</span><span class="p">{</span><span class="ss">:a</span> <span class="ss">:c</span> <span class="ss">:b</span><span class="p">}</span>
</code></pre>
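<p>As a small sketch of how these set operations fit together (the values are made up for illustration), note that every operation returns a <em>new</em> set, and that <code>contains?</code> and calling the set as a verb answer slightly different questions:</p>
<pre><code>;; Building a set from a vector drops the duplicate :a.
(set [:a :b :a])            ; a set containing :a and :b

;; conj and disj can be chained; each returns a new set.
(disj (conj #{1 2} 3) 1)    ; a set containing 2 and 3

;; contains? answers true or false...
(contains? #{1 2 3} 3)      ; true

;; ...while calling the set itself returns the element, or nil.
(#{1 2 3} 4)                ; nil
</code></pre>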
<h2><a href="#maps" id="maps">Maps</a></h2>
<p>The last collection on our tour is the <em>map</em>: a data structure which associates <em>keys</em> with <em>values</em>. In a dictionary, the keys are words and the definitions are the values. In a library, keys are call numbers, and the books are values. Maps are indexes for looking things up, and for representing different pieces of named information together. Here’s a cat:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">{</span><span class="ss">:name</span> <span class="s">"mittens"</span> <span class="ss">:weight</span> <span class="mi">9</span> <span class="ss">:color</span> <span class="s">"black"</span><span class="p">}</span>
<span class="p">{</span><span class="ss">:weight</span> <span class="mi">9</span>, <span class="ss">:name</span> <span class="s">"mittens"</span>, <span class="ss">:color</span> <span class="s">"black"</span><span class="p">}</span>
</code></pre>
<p>Maps are surrounded by braces <code>{...}</code>, filled by alternating keys and values. In this map, the three keys are <code>:name</code>, <code>:color</code>, and <code>:weight</code>, and their values are <code>"mittens"</code>, <code>"black"</code>, and 9, respectively. We can look up the corresponding value for a key with <code>get</code>:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">get </span><span class="p">{</span><span class="s">"cat"</span> <span class="s">"meow"</span> <span class="s">"dog"</span> <span class="s">"woof"</span><span class="p">}</span> <span class="s">"cat"</span><span class="p">)</span>
<span class="s">"meow"</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">get </span><span class="p">{</span><span class="ss">:a</span> <span class="mi">1</span> <span class="ss">:b</span> <span class="mi">2</span><span class="p">}</span> <span class="ss">:c</span><span class="p">)</span>
<span class="nv">nil</span>
</code></pre>
<p><code>get</code> can also take a <em>default</em> value to return instead of nil, if the key doesn’t exist in that map.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">get </span><span class="p">{</span><span class="ss">:glinda</span> <span class="ss">:good</span><span class="p">}</span> <span class="ss">:wicked</span> <span class="ss">:not-here</span><span class="p">)</span>
<span class="ss">:not-here</span>
</code></pre>
<p>Since lookups are so important for maps, we can use a map as a verb directly:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">({</span><span class="s">"amlodipine"</span> <span class="mi">12</span> <span class="s">"ibuprofen"</span> <span class="mi">50</span><span class="p">}</span> <span class="s">"ibuprofen"</span><span class="p">)</span>
<span class="mi">50</span>
</code></pre>
<p>And conversely, keywords can <em>also</em> be used as verbs, which look themselves up in maps:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="ss">:raccoon</span> <span class="p">{</span><span class="ss">:weasel</span> <span class="s">"queen"</span> <span class="ss">:raccoon</span> <span class="s">"king"</span><span class="p">})</span>
<span class="s">"king"</span>
</code></pre>
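<p>Like <code>get</code>, both of these verb forms accept an optional default as a final argument. A brief sketch, reusing the map above (the <code>:otter</code> key and the <code>"nobody"</code> default are invented for this example):</p>
<pre><code>;; A key that is present: the keyword looks itself up.
(:raccoon {:weasel "queen" :raccoon "king"})          ; "king"

;; A missing key yields nil...
(:otter {:weasel "queen" :raccoon "king"})            ; nil

;; ...unless a default is supplied as the last argument.
(:otter {:weasel "queen" :raccoon "king"} "nobody")   ; "nobody"

;; The same works when the map is the verb.
({:weasel "queen"} :otter "nobody")                   ; "nobody"
</code></pre>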
<p>You can add a value for a given key to a map with <code>assoc</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">assoc </span><span class="p">{</span><span class="ss">:bolts</span> <span class="mi">1088</span><span class="p">}</span> <span class="ss">:camshafts</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:camshafts</span> <span class="mi">3</span>, <span class="ss">:bolts</span> <span class="mi">1088</span><span class="p">}</span>
<span class="nv">user=></span> <span class="p">(</span><span class="nb">assoc </span><span class="p">{</span><span class="ss">:camshafts</span> <span class="mi">3</span><span class="p">}</span> <span class="ss">:camshafts</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:camshafts</span> <span class="mi">2</span><span class="p">}</span>
</code></pre>
<p>Assoc adds keys if they aren’t present, and <em>replaces</em> values if they’re already there. If you associate a value onto <code>nil</code>, it creates a new map.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">assoc </span><span class="nv">nil</span> <span class="mi">5</span> <span class="mi">2</span><span class="p">)</span>
<span class="p">{</span><span class="mi">5</span> <span class="mi">2</span><span class="p">}</span>
</code></pre>
<p>You can combine maps together using <code>merge</code>, which yields a map containing all the elements of <em>all</em> given maps, preferring the values from later ones.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">merge </span><span class="p">{</span><span class="ss">:a</span> <span class="mi">1</span> <span class="ss">:b</span> <span class="mi">2</span><span class="p">}</span> <span class="p">{</span><span class="ss">:b</span> <span class="mi">3</span> <span class="ss">:c</span> <span class="mi">4</span><span class="p">})</span>
<span class="p">{</span><span class="ss">:c</span> <span class="mi">4</span>, <span class="ss">:a</span> <span class="mi">1</span>, <span class="ss">:b</span> <span class="mi">3</span><span class="p">}</span>
</code></pre>
<p>Finally, to remove a value, use <code>dissoc</code>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">dissoc </span><span class="p">{</span><span class="ss">:potatoes</span> <span class="mi">5</span> <span class="ss">:mushrooms</span> <span class="mi">2</span><span class="p">}</span> <span class="ss">:mushrooms</span><span class="p">)</span>
<span class="p">{</span><span class="ss">:potatoes</span> <span class="mi">5</span><span class="p">}</span>
</code></pre>
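<p>One thing worth checking for yourself: <code>assoc</code>, <code>merge</code>, and <code>dissoc</code> never change the map they are given; each returns a new map. The sketch below assumes <code>def</code>, which simply gives a value a name, and a made-up map called <code>pantry</code>:</p>
<pre><code>;; pantry is a hypothetical name, used only for this example.
(def pantry {:potatoes 5 :mushrooms 2})

(dissoc pantry :mushrooms)             ; {:potatoes 5}

;; assoc accepts several key/value pairs at once.
(assoc pantry :onions 3 :potatoes 4)   ; {:potatoes 4, :mushrooms 2, :onions 3}

(merge pantry {:mushrooms 0})          ; {:potatoes 5, :mushrooms 0}

;; After all of that, the original map is unchanged.
pantry                                 ; {:potatoes 5, :mushrooms 2}
</code></pre>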
<h2><a href="#putting-it-all-together" id="putting-it-all-together">Putting it all together</a></h2>
<p>All these collections and types can be combined freely. As software engineers, we model the world by creating a particular <em>representation</em> of the problem in the program. Having a rich set of values at our disposal allows us to talk about complex problems. We might describe a person:</p>
<pre><code><span></span><span class="p">{</span><span class="ss">:name</span> <span class="s">"Amelia Earhart"</span>
 <span class="ss">:birth</span> <span class="mi">1897</span>
 <span class="ss">:death</span> <span class="mi">1939</span>
 <span class="ss">:awards</span> <span class="p">{</span><span class="s">"US"</span> <span class="o">#</span><span class="p">{</span><span class="s">"Distinguished Flying Cross"</span> <span class="s">"National Women's Hall of Fame"</span><span class="p">}</span>
          <span class="s">"World"</span> <span class="o">#</span><span class="p">{</span><span class="s">"Altitude record for Autogyro"</span> <span class="s">"First to cross Atlantic twice"</span><span class="p">}}}</span>
</code></pre>
<p>Or a recipe:</p>
<pre><code><span></span><span class="p">{</span><span class="ss">:title</span> <span class="s">"Chocolate chip cookies"</span>
 <span class="ss">:ingredients</span> <span class="p">{</span><span class="s">"flour"</span> <span class="p">[(</span><span class="nb">+ </span><span class="mi">2</span> <span class="mi">1</span><span class="nv">/4</span><span class="p">)</span> <span class="ss">:cup</span><span class="p">]</span>
               <span class="s">"baking soda"</span> <span class="p">[</span><span class="mi">1</span> <span class="ss">:teaspoon</span><span class="p">]</span>
               <span class="s">"salt"</span> <span class="p">[</span><span class="mi">1</span> <span class="ss">:teaspoon</span><span class="p">]</span>
               <span class="s">"butter"</span> <span class="p">[</span><span class="mi">1</span> <span class="ss">:cup</span><span class="p">]</span>
               <span class="s">"sugar"</span> <span class="p">[</span><span class="mi">3</span><span class="nv">/4</span> <span class="ss">:cup</span><span class="p">]</span>
               <span class="s">"brown sugar"</span> <span class="p">[</span><span class="mi">3</span><span class="nv">/4</span> <span class="ss">:cup</span><span class="p">]</span>
               <span class="s">"vanilla"</span> <span class="p">[</span><span class="mi">1</span> <span class="ss">:teaspoon</span><span class="p">]</span>
               <span class="s">"eggs"</span> <span class="mi">2</span>
               <span class="s">"chocolate chips"</span> <span class="p">[</span><span class="mi">12</span> <span class="ss">:ounce</span><span class="p">]}}</span>
</code></pre>
<p>Or the Gini coefficients of nations, as measured over time:</p>
<pre><code><span></span><span class="p">{</span><span class="s">"Afghanistan"</span> <span class="p">{</span><span class="mi">2008</span> <span class="mf">27.8</span><span class="p">}</span>
 <span class="s">"Indonesia"</span> <span class="p">{</span><span class="mi">2008</span> <span class="mf">34.1</span> <span class="mi">2010</span> <span class="mf">35.6</span> <span class="mi">2011</span> <span class="mf">38.1</span><span class="p">}</span>
 <span class="s">"Uruguay"</span> <span class="p">{</span><span class="mi">2008</span> <span class="mf">46.3</span> <span class="mi">2009</span> <span class="mf">46.3</span> <span class="mi">2010</span> <span class="mf">45.3</span><span class="p">}}</span>
</code></pre>
<p>In Clojure, we <em>compose</em> data structures to form more complex values; to talk about bigger ideas. We use operations like <code>first</code>, <code>nth</code>, <code>get</code>, and <code>contains?</code> to extract specific information from these structures, and modify them using <code>conj</code>, <code>disj</code>, <code>assoc</code>, <code>dissoc</code>, and so on.</p>
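<p>To make that concrete, here is a rough sketch of pulling one number out of the nested Gini map above. It assumes <code>def</code> to name the map (<code>gini</code> is an invented name), and uses <code>get-in</code>, a core function not otherwise covered in this chapter, which follows a path of keys through nested maps:</p>
<pre><code>;; gini is a hypothetical name for the map shown above.
(def gini {"Afghanistan" {2008 27.8}
           "Indonesia"   {2008 34.1 2010 35.6 2011 38.1}
           "Uruguay"     {2008 46.3 2009 46.3 2010 45.3}})

;; Two nested gets: first the country, then the year.
(get (get gini "Indonesia") 2010)           ; 35.6

;; get-in walks the same path in one step.
(get-in gini ["Indonesia" 2010])            ; 35.6

;; Like get, it takes a default for missing entries.
(get-in gini ["Uruguay" 2011] :unknown)     ; :unknown

;; On a map, contains? checks for a key.
(contains? (get gini "Afghanistan") 2008)   ; true
</code></pre>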
<p>We started this chapter with a discussion of <em>types</em>: groups of similar objects which obey the same rules. We learned that bigints, longs, ints, shorts, and bytes are all integers, that doubles and floats are approximations to decimal numbers, and that ratios represent fractions exactly. We learned the differences between strings for text, symbols as references, and keywords as short labels. Finally, we learned how to compose, alter, and inspect collections of elements. Armed with the basic nouns of Clojure, we’re ready to write a broad array of programs.</p>
<p>I’d like to conclude this tour with one last type of value. We’ve inspected dozens of types so far–but what happens when you turn the camera on itself?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nf">type</span> <span class="nv">type</span><span class="p">)</span>
<span class="nv">clojure.core$type</span>
</code></pre>
<p>What <em>is</em> this <code>type</code> thing, exactly? What <em>are</em> these verbs we’ve been learning, and where do they come from? This is the central question of <a href="http://aphyr.com/posts/303-clojure-from-the-ground-up-functions">chapter three: functions</a>.</p>
https://aphyr.com/posts/301-clojure-from-the-ground-up-welcome
Clojure from the ground up: welcome
2013-10-26T13:06:30-05:00
2013-10-26T13:06:30-05:00
Aphyr
https://aphyr.com/
<p>This guide aims to introduce newcomers and experienced programmers alike to the
beauty of functional programming, starting with the simplest building blocks of
software. You’ll need a computer, basic proficiency in the command line, a text editor, and an internet connection. By the end of this series, you’ll have a thorough command of the Clojure programming language.</p>
<h2><a href="#who-is-this-guide-for" id="who-is-this-guide-for">Who is this guide for?</a></h2>
<p>Science, technology, engineering, and mathematics are deeply rewarding fields, yet few women enter STEM as a career path. Still more are discouraged by a culture which repeatedly asserts that women lack the analytic aptitude for writing software, that they are not driven enough to be successful scientists, that it’s not cool to pursue a passion for structural engineering. Those few with the talent, encouragement, and persistence to break into science and tech are discouraged by persistent sexism in practice: the old boys’ club of tenure, being passed over for promotions, isolation from peers, and flat-out assault. This landscape sucks. I want to help change it.</p>
<p><a href="https://twitter.com/WomenWhoCode">Women Who Code</a>, <a href="http://www.pyladies.com/">PyLadies</a>, <a href="http://www.blackgirlscode.com/">Black Girls Code</a>, <a href="http://railsbridge.org/">RailsBridge</a>, <a href="http://www.girlswhocode.com/about-us/">Girls Who Code</a>, <a href="http://www.girldevelopit.com/">Girl Develop It</a>, and <a href="http://www.lambdaladies.com/">Lambda Ladies</a> are just a few of the fantastic groups helping women enter and thrive in software. I wholeheartedly support these efforts.</p>
<p>In addition, I want to help in my little corner of the technical community–functional programming and distributed systems–by making high-quality educational resources available for free. The <a href="/tags/jepsen">Jepsen</a> series has been, in part, an effort to share my enthusiasm for distributed systems with beginners of all stripes–but especially for <a href="http://aphyr.com/posts/275-meritocracy-is-short-sighted">women, LGBT folks, and people of color</a>.</p>
<p>As technical authors, we often assume that our readers are white, that our readers are straight, that our readers are traditionally male. This is the invisible default in US culture, and it’s especially true in tech. People continue to assume on the basis of my software and writing that I’m straight, because well hey, it’s a statistically reasonable assumption.</p>
<p>But I’m <em>not</em> straight. I get called faggot, cocksucker, and sinner. People say they’ll pray for me. When I walk hand-in-hand with my boyfriend, people roll down their car windows and stare. They threaten to beat me up or kill me. Every day I’m aware that I’m the only gay person some people know, and that I can show that not all gay people are effeminate, or hypermasculine, or ditzy, or obsessed with image. That you can be a manicurist or a mathematician or both. Being different, being a stranger in your culture, <a href="http://aphyr.com/posts/274-identity-and-state">comes with all kinds of challenges</a>. I can’t speak to everyone’s experience, but I can take a pretty good guess.</p>
<p>At the same time, in the technical community I’ve found overwhelming warmth and support, from people of <em>all</em> stripes. My peers stand up for me every day, and I’m so thankful–especially you straight dudes–for understanding a bit of what it’s like to be different. I want to extend that same understanding, that same empathy, to people unlike myself. Moreover, I want to reassure everyone that though they may feel different, they <em>do</em> have a place in this community.</p>
<p>So before we begin, I want to reinforce that you <em>can</em> program, that you <em>can</em> do math, that you <em>can</em> design car suspensions and fire suppression systems and spacecraft control software and distributed databases, regardless of what your classmates and media and even fellow engineers think. You don’t have to be white, you don’t have to be straight, you don’t have to be a man. You can grow up never having touched a computer and still become a skilled programmer. Yeah, it’s harder–and yeah, people will give you shit, but that’s not your fault and has nothing to do with your <em>ability</em> or your <em>right</em> to do what you love. All it takes to be a good engineer, scientist, or mathematician is your curiosity, your passion, the right teaching material, and putting in the hours.</p>
<p>There’s nothing in this guide that’s just for lesbian grandmas or just for mixed-race kids; bros, you’re welcome here too. There’s nothing dumbed down. We’re gonna go as deep into the ideas of programming as I know how to go, and we’re gonna do it with everyone on board.</p>
<p>No matter who you are or who people <em>think</em> you are, this guide is for you.</p>
<h2><a href="#why-clojure" id="why-clojure">Why Clojure?</a></h2>
<p>This book is about how to program. We’ll be learning in Clojure, which is a modern dialect of a very old family of computer languages, called Lisp. You’ll find that many of this book’s ideas will translate readily to other languages, though they may be <a href="http://aphyr.com/posts/266-core-language-concepts">expressed in different ways</a>.</p>
<p>We’re going to explore the nature of syntax, metalanguages, values, references, mutation, control flow, and concurrency. Many languages leave these ideas implicit in the language construction, or don’t have a concept of metalanguages or concurrency at all. Clojure makes these ideas explicit, first-class language constructs.</p>
<p>At the same time, we’re going to defer or omit any serious discussion of static type analysis, hardware, and performance. This is not to say that these ideas aren’t <em>important</em>; just that they don’t fit well within this particular narrative arc. For a deep exploration of type theory I recommend a study in Haskell, and for a better understanding of underlying hardware, learning C and an assembly language will undoubtedly help.</p>
<p>In more general terms, Clojure is a well-rounded language. It offers broad library support and runs on multiple operating systems. Clojure performance is not terrific, but is orders of magnitude faster than Ruby, Python, or JavaScript. Unlike some faster languages, Clojure emphasizes <em>safety</em> in its type system and approach to parallelism, making it easier to write correct multithreaded programs. Clojure is <em>concise</em>, requiring very little code to express complex operations. It offers a REPL and dynamic type system: ideal for beginners to experiment with, and well-suited for manipulating complex data structures. A consistently designed standard library and full-featured set of core datatypes round out the Clojure toolbox.</p>
<p>Finally, there are some drawbacks. As a compiled language, Clojure is much slower to start than a scripting language; this makes it unsuitable for writing small scripts for interactive use. Clojure is also <em>not</em> well-suited for high-performance numeric operations. Though it is possible, you have to jump through hoops to achieve performance comparable with Java. I’ll do my best to call out these constraints and shortcomings as we proceed through the text.</p>
<p>With that context out of the way, let’s get started by installing Clojure!</p>
<h2><a href="#getting-set-up" id="getting-set-up">Getting set up</a></h2>
<p>First, you’ll need a Java Virtual Machine, or JVM, and its associated
development tools, called the JDK. This is the software which <em>runs</em> a Clojure
program. If you’re on Windows, install <a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Oracle
JDK 1.7</a>. If you’re on OS X or Linux, you may already have a JDK installed.
In a terminal, try:</p>
<pre><code><span></span>which javac
</code></pre>
<p>If you see something like</p>
<pre><code><span></span>/usr/bin/javac
</code></pre>
<p>Then you’re good to go. If you don’t see any output from that command, install the appropriate
<a href="http://www.oracle.com/technetwork/java/javase/downloads/index.html">Oracle
JDK 1.7</a> for your operating system, or whatever JDK your package manager has available.</p>
<p>When you have a JDK, you’ll need <a href="http://leiningen.org">Leiningen</a>,
the Clojure build tool. If you’re on a Linux or OS X computer, the instructions below
should get you going right away. If you’re on Windows, see the Leiningen page
for an installer. If you get stuck, you might want to start with a
<a href="http://blog.teamtreehouse.com/command-line-basics">primer on command line basics</a>.</p>
<pre><code><span></span>mkdir -p ~/bin
<span class="nb">cd</span> ~/bin
curl -O https://raw.githubusercontent.com/technomancy/leiningen/stable/bin/lein
chmod a+x lein
</code></pre>
<p>Leiningen automatically handles installing Clojure, finding libraries from the
internet, and building and running your programs. We’ll create a new Leiningen
project to play around in:</p>
<pre><code><span></span><span class="nb">cd</span>
lein new scratch
</code></pre>
<p>This creates a new directory in your homedir, called <code>scratch</code>. If you see <code>command not found</code> instead, it means the directory <code>~/bin</code> isn’t registered with your terminal as a place to search for programs. To fix this, add the line</p>
<pre><code><span></span><span class="nb">export</span> <span class="nv">PATH</span><span class="o">=</span><span class="s2">"</span><span class="nv">$PATH</span><span class="s2">"</span>:~/bin
</code></pre>
<p>to the file <code>.bash_profile</code> in your home directory, then run <code>source ~/.bash_profile</code>. Re-running <code>lein new scratch</code> should work.</p>
<p>Let’s enter that directory, and start using Clojure itself:</p>
<pre><code><span></span><span class="nb">cd</span> scratch
lein repl
</code></pre>
<h2><a href="#the-structure-of-programs" id="the-structure-of-programs">The structure of programs</a></h2>
<p>When you type <code>lein repl</code> at the terminal, you’ll see something like this:</p>
<pre><code>aphyr@waterhouse:~/scratch$ lein repl
nREPL server started on port 45413
REPL-y 0.2.0
Clojure 1.5.1
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
user=>
</code></pre>
<p>This is an interactive Clojure environment called a REPL, for “Read, Evaluate,
Print Loop”. It’s going to <em>read</em> a program we enter, run that program, and
print the results. REPLs give you quick feedback, so they’re a great way to
explore a program interactively, run tests, and prototype new ideas.</p>
<p>Let’s write a simple program. The simplest, in fact. Type “nil”, and hit enter.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">nil</span>
<span class="nv">nil</span>
</code></pre>
<p><code>nil</code> is the most basic value in Clojure. It represents emptiness,
nothing-doing, not-a-thing. The absence of information.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">true</span>
<span class="nv">true</span>
<span class="nv">user=></span> <span class="nv">false</span>
<span class="nv">false</span>
</code></pre>
<p><code>true</code> and <code>false</code> are a pair of special values called <em>Booleans</em>. They mean
exactly what you think: whether a statement is true or false. <code>true</code>, <code>false</code>, and <code>nil</code> form the three poles of the Lisp logical system.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="mi">0</span>
<span class="mi">0</span>
</code></pre>
<p>This is the number zero. Its numeric friends are <code>1</code>, <code>-47</code>, <code>1.2e-4</code>, <code>1/3</code>,
and so on. We might also talk about <em>strings</em>, which are chunks of text surrounded by double quotes:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="s">"hi there!"</span>
<span class="s">"hi there!"</span>
</code></pre>
<p><code>nil</code>, <code>true</code>, <code>0</code>, and <code>"hi there!"</code> are all different types of <em>values</em>; the
nouns of programming. Just as one could say “House.” in English, we can write a
program like <code>"hello, world"</code> and it evaluates to itself: the string <code>"hello, world"</code>. But most sentences aren’t just about stating the existence of a thing; they involve <em>action</em>. We need <em>verbs</em>.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="nv">inc</span>
<span class="o">#</span><span class="nv"><core$inc</span> <span class="nv">clojure.core$inc</span><span class="o">@</span><span class="mi">6</span><span class="nv">f7ef41c></span>
</code></pre>
<p>This is a verb called <code>inc</code>–short for “increment”. Specifically, <code>inc</code> is a
<em>symbol</em> which <em>points</em> to a verb: <code>#<core$inc clojure.core$inc@6f7ef41c></code>–
just like the word “run” is a <em>name</em> for the <em>concept</em> of running.</p>
<p>There’s a key distinction here–that a signifier, a reference, a label, is not
the same as the signified, the referent, the concept itself. If you write the
word “run” on paper, the ink means nothing by itself. It’s just a symbol. But
in the mind of a reader, that symbol takes on <em>meaning</em>; the idea of running.</p>
<p>Unlike the number 0, or the string “hi”, symbols are references to other
values. When Clojure evaluates a symbol, it looks up that symbol’s meaning.
Look up <code>inc</code>, and you get <code>#<core$inc clojure.core$inc@6f7ef41c></code>.</p>
<p>Can we refer to the symbol itself, <em>without</em> looking up its meaning?</p>
<pre><code><span></span><span class="nv">user=></span> <span class="ss">'inc</span>
<span class="nv">inc</span>
</code></pre>
<p>Yes. The single quote <code>'</code> <em>escapes</em> a sentence. In programming languages, we call sentences <code>expressions</code> or <code>statements</code>. A quote says “Rather than <em>evaluating</em> this expression’s text, simply return the text itself, unchanged.” Quote a symbol, get a symbol. Quote a number, get a number. Quote anything, and get it back exactly as it came in.</p>
<pre><code><span></span><span class="nv">user=></span> <span class="ss">'123</span>
<span class="mi">123</span>
<span class="nv">user=></span> <span class="o">'</span><span class="s">"foo"</span>
<span class="s">"foo"</span>
<span class="nv">user=></span> <span class="o">'</span><span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
<span class="p">(</span><span class="mi">1</span> <span class="mi">2</span> <span class="mi">3</span><span class="p">)</span>
</code></pre>
<p>A new kind of value, surrounded by parentheses: the <em>list</em>. LISP originally stood for LISt Processing, and lists are still at the core of the language. In fact, they form the most basic way to compose expressions, or sentences. A list is a single expression which has <em>multiple parts</em>. For instance, this list contains three elements: the numbers 1, 2, and 3. Lists can contain anything: numbers, strings, even other lists:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">'</span><span class="p">(</span><span class="nf">nil</span> <span class="s">"hi"</span><span class="p">)</span>
<span class="p">(</span><span class="nf">nil</span> <span class="s">"hi"</span><span class="p">)</span>
</code></pre>
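<p>As an aside, the single quote is shorthand for a longer form, <code>quote</code>. The two spellings should behave identically; a short sketch (using <code>=</code> to compare values and <code>symbol?</code> to test for symbols, both assumed here rather than introduced above):</p>
<pre><code>;; The long form of 'inc.
(quote inc)                          ; inc

;; Both spellings produce the same symbol.
(= 'inc (quote inc))                 ; true
(symbol? 'inc)                       ; true

;; The same holds for quoted lists.
(= '(nil "hi") (quote (nil "hi")))   ; true
</code></pre>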
<p>A list containing two elements: the number 1, and a second list. That list
contains two elements: the number 2, and another list. <em>That</em> list contains two
elements: 3, and an empty list.</p>
<pre><code>user=> '(1 (2 (3 ())))
(1 (2 (3 ())))
</code></pre>
<p>You could think of this structure as a tree–which is a provocative idea,
because <em>languages</em> are like trees too: sentences are composed of clauses, which can be nested, and each clause may have subjects modified by adjectives, and verbs modified by adverbs, and so on. “Lindsay, my best friend, took the dog which we found together at the pound on fourth street, for a walk with her mother Michelle.”</p>
<pre><code>Took
  Lindsay
    my best friend
  the dog
    which we found together
      at the pound
        on fourth street
  for a walk
    with her mother
      Michelle
</code></pre>
<p>But let’s try something simpler. Something we know how to talk about.
“Increment the number zero.” As a tree:</p>
<pre><code>Increment
  the number zero
</code></pre>
<p>We have a symbol for incrementing, and we know how to write the number zero.
Let’s combine them in a list:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="o">'</span><span class="p">(</span><span class="nb">inc </span><span class="mi">0</span><span class="p">)</span>
<span class="p">(</span><span class="nb">inc </span><span class="mi">0</span><span class="p">)</span>
</code></pre>
<p>A basic sentence. Remember, since it’s quoted, we’re talking about the tree,
the text, the expression, by itself. Absent interpretation. If we remove the
single-quote, Clojure will <em>interpret</em> the expression:</p>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="mi">0</span><span class="p">)</span>
<span class="mi">1</span>
</code></pre>
<p>Incrementing zero yields one. And if we wanted to increment <em>that</em> value?</p>
<pre><code>Increment
  increment
    the number zero
</code></pre>
<pre><code><span></span><span class="nv">user=></span> <span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">inc </span><span class="mi">0</span><span class="p">))</span>
<span class="mi">2</span>
</code></pre>
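<p>The difference between the quoted tree and its interpretation can be poked at directly. This sketch assumes <code>count</code> and <code>eval</code>, two core functions not otherwise used in this chapter: the first counts the parts of a list, and the second asks Clojure to interpret a quoted expression on request.</p>
<pre><code>;; Quoted, the expression is just a list of two parts:
;; the symbol inc, and another list.
(first '(inc (inc 0)))   ; inc
(count '(inc (inc 0)))   ; 2

;; eval interprets that same list as a sentence.
(eval '(inc (inc 0)))    ; 2
</code></pre>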
<p>A sentence in Lisp is a list. It starts with a verb, and is followed by zero or
more objects for that verb to act on. Each part of the list can <em>itself</em> be
another list, in which case that nested list is evaluated first, just like a
nested clause in a sentence. When we type</p>
<pre><code><span></span><span class="p">(</span><span class="nb">inc </span><span class="p">(</span><span class="nb">inc </span><span class="mi">0</span><span class="p">))</span>
</code></pre>
<p>Clojure first looks up the meanings for the symbols in the code:</p>
<pre><code><span></span><span class="p">(</span><span class="o">#</span><span class="nv"><core$inc</span> <span class="nv">clojure.core$inc</span><span class="o">@</span><span class="mi">6</span><span class="nv">f7ef41c></span>
  <span class="p">(</span><span class="o">#</span><span class="nv"><core$inc</span> <span class="nv">clojure.core$inc</span><span class="o">@</span><span class="mi">6</span><span class="nv">f7ef41c></span>
    <span class="mi">0</span><span class="p">))</span>
</code></pre>
<p>Then evaluates the innermost list <code>(inc 0)</code>, which becomes the number 1:</p>
<pre><code><span></span><span class="p">(</span><span class="o">#</span><span class="nv"><core$inc</span> <span class="nv">clojure.core$inc</span><span class="o">@</span><span class="mi">6</span><span class="nv">f7ef41c></span>
  <span class="mi">1</span><span class="p">)</span>
</code></pre>
<p>Finally, it evaluates the outer list, incrementing the number 1:</p>
<pre><code><span></span><span class="mi">2</span>
</code></pre>
<p>Every list starts with a verb. Parts of a list are evaluated from left to
right. Innermost lists are evaluated before outer lists.</p>
<pre><code><span></span><span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="p">(</span><span class="nb">- </span><span class="mi">5</span> <span class="mi">2</span><span class="p">)</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">3</span> <span class="mi">4</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">3</span> <span class="p">(</span><span class="nb">+ </span><span class="mi">3</span> <span class="mi">4</span><span class="p">))</span>
<span class="p">(</span><span class="nb">+ </span><span class="mi">1</span> <span class="mi">3</span> <span class="mi">7</span><span class="p">)</span>
<span class="mi">11</span>
</code></pre>
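<p>To watch that ordering happen, here is a small sketch using two things we haven't covered yet: <code>defn</code>, which defines a function, and <code>println</code>, which prints text. The names <code>noisy</code> and <code>traced-sum</code> are made up for this example. <code>noisy</code> announces its label and then hands back its value, so each number reports the moment it gets evaluated.</p>
<pre><code>;; Hypothetical helper: prints a label when evaluated, then returns value.
(defn noisy [label value]
  (println "evaluating" label)
  value)

;; Same shape as (+ 1 (- 5 2) (+ 3 4)), with each number wrapped in noisy.
(defn traced-sum []
  (+ (noisy "1" 1)
     (- (noisy "5" 5) (noisy "2" 2))
     (+ (noisy "3" 3) (noisy "4" 4))))

(traced-sum)
</code></pre>
<p>Calling <code>(traced-sum)</code> prints the labels 1, 5, 2, 3, 4 in that order: left to right, with each inner list finished before the outer list that contains it. The result is still 11.</p>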
<p>That’s it.</p>
<p>The entire grammar of Lisp: the structure for every expression in the language.
We transform expressions by <em>substituting</em> meanings for symbols, and obtain
some result. This is the core of the <a href="http://en.wikipedia.org/wiki/Lambda_calculus">Lambda Calculus</a>, and it is the theoretical basis for almost all computer languages.
Ruby, JavaScript, C, Haskell: all languages express the text of their programs in different ways, but internally all construct a <em>tree</em> of expressions. Lisp simply makes it explicit.</p>
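<p>Because a quoted expression is just a list, we can poke at that tree directly. This is only a sketch, and it leans on things we haven't met yet: <code>def</code> to name a value, and the core functions <code>first</code>, <code>rest</code>, <code>nth</code>, and <code>eval</code>. The name <code>expr</code> is made up for this example.</p>
<pre><code>;; Quoting keeps the list as data instead of evaluating it.
(def expr '(+ 1 (- 5 2) (+ 3 4)))

(first expr)   ; the verb: the symbol +
(rest expr)    ; the objects: (1 (- 5 2) (+ 3 4))
(nth expr 2)   ; one nested branch of the tree: (- 5 2)
(eval expr)    ; evaluating the whole tree gives 11
</code></pre>
<p>The program and the data structure describing it are the same thing. The only difference is whether we quote the list or let Clojure evaluate it.</p>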
<h2><a href="#review" id="review">Review</a></h2>
<p>We started by learning a few basic nouns: numbers like <code>5</code>, strings like
<code>"cat"</code>, and symbols like <code>inc</code> and <code>+</code>. We saw how quoting makes the
difference between an <em>expression</em> itself and the thing it <em>evaluates</em> to. We
discovered symbols as <em>names</em> for other values, just like how words represent
concepts in any other language. Finally, we combined lists to make trees, and
used those trees to represent a program.</p>
<p>With these basic elements of syntax in place, it’s time to expand our
vocabulary with new verbs and nouns, learning to <a href="http://aphyr.com/posts/302-clojure-from-the-ground-up-basic-types">represent more complex values and transform them in different ways</a>.</p>