Rust for Clojurists (2015)

Why care about Rust?

You already write software in Clojure. It pays your bills. You enjoy it. You’re in an industry reaping disproportionate benefit from loose money policies, leading to a trend-chasing culture of overpaid nerds making web apps. You feel guilty about this, but there is nothing you can do about it because you have no other talents that a rational person would pay you for.

Learning Rust will probably not do much to solve that problem for you. It won’t assist you in making the ontological leap from a tired stereotype into something sentient and real. You will remain a replaceable silhouette with no discernible identity. It might even exacerbate the problem. However, it will give you a useful tool for writing low-level software.

Let’s start by reaffirming why we love Clojure:

  • Expressiveness (Lisp syntax, functional programming)
  • Interoperability (hosted on the JVM and JavaScript)
  • Concurrency (pmap/pvalues, atoms/agents/refs, core.async)

Now let’s think about Clojure’s weaknesses:

  • Performance (fast for a dynamic lang, but slow for a compiled lang)
  • Safety (would you program avionics or pacemakers in it?)
  • Embeddability (garbage collected, requires an external runtime)

Rust’s strengths are Clojure’s weaknesses, and vice-versa. Rust isn’t as expressive or interoperable, and its concurrency story isn’t as complete. That said, it’s much better for performance or safety critical needs, and it can be embedded inside other programs or on very limited hardware.

Many people try to compare Rust to Go, but this is flawed. Go is an ancient board game that emphasizes strategy. Rust is more appropriately compared to Chess, a board game focused on low-level tactics. Clojure, with its high-level purview, is a better analogy to the enduring game of stones.

The Toolchain

With Clojure, we typically start by installing Leiningen, which builds projects and deploys libraries to clojars.org. The project.clj file at the root of a project specifies metadata like dependencies. One elegant aspect of Clojure we take for granted is that it is just a library, so projects can specify in this file what version of Clojure to use just like any other library.

With Rust, we start by installing Cargo, which builds projects and deploys libraries to crates.io. The Cargo.toml file at the root of a project specifies metadata like dependencies. Rust takes the more traditional approach of being bundled with its build tool; it isn’t a library and its compiler can’t be embedded into programs for REPL-driven development.

Creating a New Project

With Clojure, we start an app with lein new app hello-world, which creates a project containing this:

(ns hello_world.core
  (:gen-class))

(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (println "Hello, World!"))

With Rust, we start an app with cargo new hello_world --bin, which creates a project containing this:

fn main() {
    println!("Hello, world!");
}

As you can see, the process is basically identical, and apart from syntactic differences, they both start you off with the same main function. With Cargo, if you leave out the --bin flag, it will create a library instead. Either way, be sure to think hard about your project’s name. A name like “Rust” ensures many opportunities for clever puns that will surely never get tiresome or old.

Modules

While the Rust project doesn’t start off with any equivalent to Clojure’s namespace declaration, once you move beyond a single source file you’ll need to use it. This isn’t C or C++, where you just include files like a caveman. Rust separates code into modules, and each source file is automatically given one based on the file name. We can make functions in a separate file like this:

// utils.rs

pub fn say_hello() {
    println!("Hello, world!");
}

pub fn say_goodbye() {
    println!("Goodbye, world!");
}

// main.rs

mod utils;

fn main() {
    utils::say_hello();
    utils::say_goodbye();
}

Rust’s mod is similar to Clojure’s ns in that it creates a module, but the mod declarations all go at the top of main.rs instead of in the files that the modules come from. From there, we can just prepend utils:: to the function names to use them. Note that they are declared with pub. Unlike Clojure, Rust makes functions private by default.

Rust’s use is similar to Clojure’s require in that it brings in an existing module. Here’s a slightly modified main.rs, where we bring in symbols explicitly so we don’t need to prefix them, much like Clojure’s require does with the :refer keyword:

// main.rs

use utils::{say_hello, say_goodbye};

mod utils;

fn main() {
    say_hello();
    say_goodbye();
}
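
If you want to poke at this without creating a second file, here is a minimal single-file sketch. It assumes an inline module (mod utils { ... }) in place of utils.rs, and the greeting, hello, and goodbye functions are made up for illustration. They return strings instead of printing so the privacy rules are easy to see:

```rust
// Inline module: `mod utils { ... }` plays the role of a separate utils.rs.
mod utils {
    // Private by default: only code inside `utils` can call this helper.
    fn greeting(word: &str) -> String {
        format!("{}, world!", word)
    }

    // `pub` exposes these functions to the rest of the crate.
    pub fn hello() -> String {
        greeting("Hello")
    }

    pub fn goodbye() -> String {
        greeting("Goodbye")
    }
}

// Bring one name into scope, much like Clojure's `:refer`.
use utils::hello;

fn main() {
    println!("{}", hello()); // referred name, no prefix needed
    println!("{}", utils::goodbye()); // fully qualified path
    // utils::greeting("Hi"); // would not compile: `greeting` is private
}
```

Uncommenting the last line should produce a compile error about a private function, which is the point: nothing leaks out of a module unless you say pub.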

Read Crates and Modules to learn more.

Crates

As you know, languages with their own package managers usually have a special format for their libraries with their own unique name. Python has its “eggs” and Ruby has its “gems”. Clojure has the disadvantage of being on an existing ecosystem, so it couldn’t invent its own format; it uses the same boring “jars” as other JVM languages.

Thankfully, Rust does not have this problem, and it chose to call its format “crates”. This reflects the language’s industrial roots and the humble, blue collar town of its sponsor: Mountain View, California. To use a crate, you add it to Cargo.toml much like you would with project.clj. Here’s what mine looks like after adding the time crate:

[package]

name = "hello_world"
version = "0.0.1"
authors = ["oakes "]

[dependencies]

time = "0.1.2"

To use it, we first need to declare the crate at the top of main.rs:

// main.rs

extern crate time;

use utils::{say_hello, say_goodbye};

mod utils;

fn main() {
    say_hello();
    say_goodbye();
}

Then, in the file we want to use it in, we’ll bring it in with use:

// utils.rs

use time;

pub fn say_hello() {
    println!("Hello, world at {}!", time::now().asctime());
}

pub fn say_goodbye() {
    println!("Goodbye, world at {}!", time::now().asctime());
}

Read Crates and Modules to learn more.

Types

Until now, we’ve avoided seeing types because none of our functions take arguments. Rust is statically typed. The upside is, you will curse it at compile-time instead of at runtime. The downside is, “exploratory programming” means exploring how to convince the compiler to let you try an idea. Let’s modify our functions so we pass the “Hello” or “Goodbye” as an argument:

// main.rs

extern crate time;

use utils::say_something;

mod utils;

fn main() {
    say_something("Hello");
    say_something("Goodbye");
}

// utils.rs

use time;

pub fn say_something(word: &str) {
    let t = time::now();
    println!("{}, world at {}!", word, t.asctime());
}

So, the syntax for arguments is similar to ML, where the name comes first, followed by a colon and then the type. In Rust, string literals are statically allocated and have the type &str, which is pronounced “string slice”. Growable, heap-allocated strings have the type String. This is a distinction you don’t find in Clojure or other high-level languages. Read Strings to learn more.

Note that in this latest revision, we also moved the time object into a local variable using let, which should be familiar to a Clojure user. In Rust, you are required to specify the types of top-level functions, but almost never for local variables. Rust has type inference, so it can figure out the type of t on its own. It happens to be Tm.
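
To make the two string types and the inference concrete, here is a small sketch that uses only the standard library. The greet function is hypothetical and returns its result rather than printing it:

```rust
// Accepts any string slice, whether it points at static or heap memory.
fn greet(word: &str) -> String {
    format!("{}, world!", word)
}

fn main() {
    // A literal is baked into the program binary; its type is &'static str.
    let literal = "Hello";

    // A String owns a growable heap buffer. The type is inferred from
    // String::from, so no annotation is needed on the local variable.
    let mut owned = String::from("Good");
    owned.push_str("bye");

    println!("{}", greet(literal));
    // &owned is a &String, which coerces to &str automatically.
    println!("{}", greet(&owned));
}
```

Note that the function signature is fully annotated while neither local variable is, which matches the rule of thumb above.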

As the Tm docs indicate, the asctime function returns a TmFmt. Although println! has no idea what that is, it doesn’t matter. It implements a trait — similar to a Clojure protocol — called Display, which is all that println! needs. This mechanism is used pervasively in Rust. Read Traits to learn more.
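
Here is a rough sketch of how a type opts into that mechanism. Clock is a hypothetical stand-in for something like TmFmt, not a type from the time crate. Once it implements Display from the standard library, println! and format! accept it without knowing anything else about it:

```rust
use std::fmt;

// Hypothetical type for illustration only.
struct Clock {
    hours: u8,
    minutes: u8,
}

// Implementing Display is all that the formatting macros require for `{}`.
impl fmt::Display for Clock {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Zero-pad both fields to two digits.
        write!(f, "{:02}:{:02}", self.hours, self.minutes)
    }
}

fn main() {
    let c = Clock { hours: 9, minutes: 5 };
    println!("Hello, world at {}!", c); // prints "Hello, world at 09:05!"
}
```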

References

The previous section’s distinction between statically allocated and heap-allocated strings is worth focusing on. In high-level languages, you can’t control where values are allocated, so you never think about it. In C and C++, you have complete control over it, but only at the price of being more error-prone. Rust promises to give you that control while being as safe as high-level languages.

When you have direct control over memory allocation, you also have control over how values are passed to functions. In high-level languages, you normally just pass a value to a function and the language will decide whether to only pass a reference to the value or to pass an entire copy of the value. In Rust you explicitly pass references to values.

That is what the & means in &str. Literal strings are automatically represented as references, but under normal circumstances things will start their life as a value, and to pass them as a reference you will need to prepend them with &. For example, let’s pass the Tm object to the say_something function:

// main.rs

extern crate time;

use utils::say_something;

mod utils;

fn main() {
    let t = time::now();
    say_something("Hello", &t);
    say_something("Goodbye", &t);
}

// utils.rs

use time;

pub fn say_something(word: &str, t: &time::Tm) {
    println!("{}, world at {}!", word, t.asctime());
}

What would happen if we just did say_something("Hello", t); and changed the argument’s type to t: time::Tm? The value t would be “moved” into the function, and would no longer be available outside of it. Since say_something("Goodbye", t); is called afterward, the compiler would reject the program. Read References and Borrowing to learn more.
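
The difference between borrowing and moving can be sketched with the standard library alone. This example uses a String, because types that implement the Copy trait are silently copied rather than moved and would hide the effect. The function names are made up for illustration:

```rust
// Takes ownership: the caller's String is moved into `s`.
fn length_by_value(s: String) -> usize {
    s.len()
}

// Borrows: the caller keeps ownership and can use the value again.
fn length_by_ref(s: &str) -> usize {
    s.len()
}

fn main() {
    let word = String::from("Hello");

    // Borrowing can be repeated as often as we like.
    println!("{}", length_by_ref(&word));
    println!("{}", length_by_ref(&word));

    // Passing by value moves `word` into the function...
    println!("{}", length_by_value(word));
    // ...so a second call would be rejected at compile time:
    // println!("{}", length_by_value(word)); // error: use of moved value
}
```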

Mutability

A Clojure programmer will be pleased to find that Rust shares a belief in data being immutable by default. The Tm object in the previous section cannot be mutated — you’ll get a compile error. For example, because it implements the Clone trait, it has a function called clone_from, which lets you replace it with a completely new Tm object. This is obviously a mutation, so if we want to use it, we must declare it with let mut:

// main.rs

extern crate time;

use utils::say_something;

mod utils;

fn main() {
    let mut t = time::now();
    t.clone_from(&time::now_utc());
    say_something("Hello", &t);
    say_something("Goodbye", &t);
}

In that example, the t object is being completely replaced by a new one that uses UTC time instead of local time. Interestingly, the say_something function still cannot mutate it, because references are immutable by default as well. If we wanted to run the clone_from function there, we would have to use a mutable reference:

// main.rs

extern crate time;

use utils::say_something;

mod utils;

fn main() {
    let mut t = time::now();
    say_something("Hello", &mut t);
    say_something("Goodbye", &mut t);
}

// utils.rs

use time;

pub fn say_something(word: &str, t: &mut time::Tm) {
    t.clone_from(&time::now_utc());
    println!("{}, world at {}!", word, t.asctime());
}

The neat thing about this is that you can tell when a function is mutating an argument by simply looking at its type signature. If you don’t see &mut, it can’t do so (unless it’s internally mutable). It could still perform I/O like writing to the disk or requesting a network resource, so it’s not necessarily pure in that sense, but at least we know that it’s pure vis-à-vis its own arguments.
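
As a small sketch of reading mutation off a signature, here are two hypothetical functions over a plain counter, using nothing outside the standard library:

```rust
// Shared reference: this function can read the counter but not change it.
fn describe(visits: &u32) -> String {
    format!("{} visits", visits)
}

// Mutable reference: the signature announces that the argument may change.
fn record_visit(visits: &mut u32) {
    *visits += 1;
}

fn main() {
    // The binding itself must be declared `mut` before we can lend it mutably.
    let mut visits = 0;
    record_visit(&mut visits);
    record_visit(&mut visits);
    println!("{}", describe(&visits)); // prints "2 visits"
}
```

Trying *visits += 1 inside describe would fail to compile, so the caller never has to read the body to know the counter is safe there.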

Nullability

In Clojure, we have the concept of nil to represent the lack of a value. It is convenient, but if we forget to check for it, we get the dreaded NullPointerException. Rust follows in the footsteps of languages like Haskell by doing the same thing it does with mutability: making it explicitly part of the type.

For example, let’s say we want the say_something function to let you pass a Tm reference or nothing at all. If you do the latter, it will just create its own Tm object using time::now_utc(). To express this, we have to make it an optional type. That means changing the type to Option<&time::Tm> and changing the value we pass to it like this:

// main.rs

extern crate time;

use utils::say_something;

mod utils;

fn main() {
    let t = time::now();
    say_something("Hello", Some(&t));
    say_something("Goodbye", None);
}

// utils.rs

use time;

pub fn say_something(word: &str, t: Option<&time::Tm>) {
    if t.is_some() {
        println!("{}, world at {}!", word, t.unwrap().asctime());
    } else {
        println!("{}, world at {}!", word, time::now_utc().asctime());
    }
}

So, if we actually want to pass a value, we surround it with Some(...), and if we want to pass the equivalent of Clojure’s nil, we pass in None. Then, in say_something, we can check if t contains a value using the is_some function, and if so, we call unwrap on it to get its value.

This may seem like a lot of work compared to just using nil, but the advantage is that NullPointerExceptions are impossible. We are forced by the compiler to check if it contains a value. Additionally, it has the same advantage that &mut has in the previous section; just by looking at its type signature, we know which arguments allow no value to be passed.
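
For the common case of “use this value, or fall back to a default”, Option also has helper methods in the standard library. Here is a minimal sketch with a hypothetical greeting function that takes an optional name instead of a Tm, so it runs without any crates:

```rust
// The Option in the signature tells callers that `name` may be absent.
fn greeting(word: &str, name: Option<&str>) -> String {
    // unwrap_or returns the contained value, or the fallback when it is None.
    format!("{}, {}!", word, name.unwrap_or("world"))
}

fn main() {
    println!("{}", greeting("Hello", Some("Clojure"))); // prints "Hello, Clojure!"
    println!("{}", greeting("Goodbye", None)); // prints "Goodbye, world!"
}
```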

Pattern Matching

In Clojure, we can get very powerful pattern matching capabilities using the core.match library. Rust has a similar mechanism baked into the language. This can be used to simplify complicated conditional statements using the match keyword. Read Match to learn more.

For our purposes, pattern matching can help us make our if statement safer. In the previous section, say_something is not very idiomatic, because it manually checks t.is_some() and calls t.unwrap(). It is much better to use the if let syntax like this:

// utils.rs

use time;

pub fn say_something(word: &str, t: Option<&time::Tm>) {
    if let Some(t_ptr) = t {
        println!("{}, world at {}!", word, t_ptr.asctime());
    } else {
        println!("{}, world at {}!", word, time::now_utc().asctime());
    }
}

Clojure, of course, has its own if-let, and the concept is very similar. The only difference is that we must use pattern matching to pull the value out of the option type. That’s what Some(t_ptr) = t is doing. Pattern matching is used pervasively in Rust for everything from error handling to destructuring. Read Patterns to learn more.
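
When there are more than two cases, the full match form earns its keep. This is a sketch with a made-up describe function over an optional hour. The compiler checks that every possibility is covered, so deleting the None arm would be a compile error:

```rust
fn describe(hour: Option<u32>) -> String {
    match hour {
        // Patterns can match literal values inside the Option...
        Some(0) => String::from("midnight"),
        // ...bind the inner value to a name, optionally with a guard...
        Some(h) if h < 12 => format!("{} in the morning", h),
        Some(h) => format!("{} o'clock", h),
        // ...and must account for the empty case too.
        None => String::from("no time given"),
    }
}

fn main() {
    println!("{}", describe(Some(9))); // prints "9 in the morning"
    println!("{}", describe(None)); // prints "no time given"
}
```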

Expressions

In Clojure, everything is an expression, which means we can embed code inside of code without any restriction. In Rust, it’s not quite as pervasive, but nonetheless almost everything is an expression. The only things you’ve run into that can’t be expressions are declarations, such as mod, use, fn, and let.

What about if statements? In the previous section, say_something is deliberately verbose. There is clearly no benefit to writing redundant code like the calls to println! beyond ensuring one’s own job security. In Rust, if statements are expressions, so we can embed one in a let statement like this:

// utils.rs

use time;

pub fn say_something(word: &str, t: Option<&time::Tm>) {
    let t_val = if let Some(t_ptr) = t {
        *t_ptr
    } else {
        time::now_utc()
    };
    println!("{}, world at {}!", word, t_val.asctime());
}

Here, we are making the local variable t_val, which will contain either the value inside t, or a new object if t is None. Notice the * before t_ptr. This is doing the opposite of & by grabbing the value that the reference is referring to. We need to do this because time::now_utc() returns a value, and we need to make sure both return the same type.

Also notice that neither expression in our if statement ends with a semicolon. Semicolons are used to demarcate statements. To return a value, we just write an expression without a semicolon. This is similar to what we do in Clojure. When we want to return a value, we simply put it at the end. Read Expressions vs. Statements to learn more.

Note that the same thing is done to return a value at the end of a function. If we wanted say_something to return our Tm object, all we need to do is indicate that in the type signature and then put t_val at the end of the function:

// utils.rs

use time;

pub fn say_something(word: &str, t: Option<&time::Tm>) -> time::Tm {
    let t_val = if let Some(t_ptr) = t {
        *t_ptr
    } else {
        time::now_utc()
    };
    println!("{}, world at {}!", word, t_val.asctime());
    t_val
}
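
The same ideas can be tried without the time crate. In this sketch (classify and twelve_hour are hypothetical helpers), an if is the entire function body, a local binding is followed by a tail expression, and an if feeds a let:

```rust
// The `if` is the last expression in the body, so its value is returned.
fn classify(hour: u32) -> &'static str {
    if hour < 12 { "morning" } else { "afternoon" }
}

// A statement (ends with a semicolon) followed by a tail expression.
fn twelve_hour(hour: u32) -> u32 {
    let h = hour % 12;
    if h == 0 { 12 } else { h }
}

fn main() {
    let hour = 15;
    // An `if` expression embedded directly in a let statement.
    let greeting = if hour < 18 { "Hello" } else { "Good evening" };
    println!(
        "{}, it is {} in the {}",
        greeting,
        twelve_hour(hour),
        classify(hour)
    ); // prints "Hello, it is 3 in the afternoon"
}
```

Adding a semicolon after the final expression in classify or twelve_hour would turn it into a statement, and the compiler would complain that the function returns nothing.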

Macros

You may have wondered this entire time why println! ends with a bang. In Clojure, it is idiomatic to do this for functions that are side-effecting. In Rust, it is the compiler-enforced syntax for macros. Users of Lisp dialects like Clojure are certainly fond of their macros, as there is a tremendous power, simplicity, and hubristic feeling of personal superiority they afford due to their homoiconic syntax.

Rust is not homoiconic, and unsurprisingly its macro system isn’t as powerful. The primary purpose of Rust macros is similar to that of C macros: to reduce code duplication through symbol replacement. Unlike C macros, however, they are hygienic. Read Macros to learn more. If you are looking for the ability to run arbitrary code at compile-time, you may need to write a compiler plugin instead.
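
Here is a minimal sketch of what these macros look like, using macro_rules! from the language itself. Both macros are made up for illustration. The first shows pattern-based expansion with two arities, and the second shows hygiene: the x inside the macro does not collide with the caller’s x:

```rust
// Each arm matches a different argument shape, a bit like a multi-arity defn.
macro_rules! greet {
    ($word:expr) => {
        format!("{}, world!", $word)
    };
    ($word:expr, $name:expr) => {
        format!("{}, {}!", $word, $name)
    };
}

// The `x` declared here is private to the expansion (hygiene), so the
// caller's own `x` inside `$e` still refers to the caller's variable.
macro_rules! double {
    ($e:expr) => {{
        let x = $e;
        x * 2
    }};
}

fn main() {
    println!("{}", greet!("Hello")); // prints "Hello, world!"
    println!("{}", greet!("Goodbye", "Clojure")); // prints "Goodbye, Clojure!"

    let x = 5;
    println!("{}", double!(x + 1)); // prints "12"
}
```

With a C-style textual macro, the double example would expand into a let that refers to itself. Here it just works.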

Learn More

There is much more to learn about Rust from here. We haven’t touched on lifetimes, the mechanism for achieving memory safety without garbage collection. We haven’t looked at FFI, the mechanism for introducing segfaults and stack corruption into your program. The Rust Book, which I’ve been linking to all along, is a great next step for the reader.
