I will try to introduce concepts gradually without assuming prior knowledge of Clojure (or any other Lisp dialect). However, I will assume that you are already an experienced developer in another popular language such as Java, C/C++, Python or JavaScript. General programming concepts such as functions, parameters, recursion and objects, and common data structures such as linked lists, maps (or dictionaries), vectors and sets, are assumed to be already known.
The REPL
The REPL is (IMHO) one of the key Clojure features. REPL stands for Read Eval Print Loop, and although one is present in many languages such as Python, Ruby and (since Java 9, via JShell) Java as well, in Clojure it is part of the main development workflow. In other words, if you are not using the REPL for your Clojure development you are doing it wrong!
The REPL allows you to connect to a running system, inspect runtime values, and even make live changes in your code without having to restart your system.
It is the best way to explore a system or a dataset and get familiar with its domain.
In terms of feedback, the Read Eval Print Loop is so much better than TDD that it has inspired a development methodology of its own: REPL Driven Development.
For this session we are going to use the REPL to explore Clojure features; this might give you a glimpse of what is possible with the Clojure REPL.
Clojure syntax
Clojure syntax is very simple.
A program is composed of s-expressions, and every s-expr is delimited by a pair of parentheses. Line comments are made with a semicolon (`;`), and by convention a full-line comment starts with two or more consecutive semicolons (`;;`), while an in-line comment uses only one (`;`).
You can skip the evaluation and execution of a block with the `comment` form. However, this isn’t a complete comment in the same way as the semicolon, as it still gets parsed by the reader like the rest of the code. Therefore the content of a `comment` block has to be valid Clojure code. For example:
;; this is a valid comment
;; a : b : c
;; while this won't be readable
(comment a : b : c)
The difference is that the semicolon comment is ignored by the reader, while the `comment` block is something that you could have implemented yourself using macros; in this case the macro just tells the compiler not to generate any code.
We will denote the output of a REPL evaluation by prefixing the result with `;;=>`. So every time you see a Clojure expression followed by `;;=>` and a value, it means that the value is the result of evaluating the last expression.
The function call.
The first concept I will introduce is how to make a function call. We will see more about functions later, but for the moment I want to make sure that you will understand the next few examples. Let’s start by comparing method or function calls in a few different languages:
// java and C++
myObject.myFunction(arg1, arg2, arg3);
// C
myFunction(myStruct, arg1, arg2, arg3);
;; Clojure
(my-function myObject arg1 arg2 arg3)
As you can see, in Clojure the brackets surround the function and all its arguments. In object-oriented languages such as Java and C++ the object comes before the method or function name. In C and Clojure the function comes first, followed by the target object. Let’s see a concrete example; for the sake of brevity I will omit the required package imports.
// java
"Hello World!".toLowerCase();
// C - single char
tolower(*c);
// C - Whole string
for ( ; *c; ++c) *c = tolower(*c);
                      ^^^^^^^^^^^
;; Clojure
(lower-case "Hello World!")
NOTE: In the standard C library there is only a function to turn a single character into its lowercase form; that’s why there is a loop. However, in `tolower(*c)` we can see that the function comes first, followed by its arguments surrounded by brackets. In Clojure, the expression (called an s-expr) starts with an open bracket, followed by a function, followed by a list of arguments.
The following code is designed to run in the Clojure REPL. The convention I will follow throughout the text is to display the result of an expression’s evaluation prefixed with the evaluation marker `;;=>`. So every time you see a Clojure expression followed by `;;=>` and another value, it means that what follows the marker is the result of evaluating the prior expression. For example, the evaluation of the expression `(+ 1 1)` with its result will be noted as follows:
(+ 1 1)
;;=> 2
Booleans
In Clojure we have boolean values like in many other languages. No surprise here: we have two values, `true` and `false`, which just evaluate to themselves. We can use the function `type` to see the concrete type of these values on the host platform, and if we check we’ll find that they are just plain Java `java.lang.Boolean` objects.
true
;;=> true
false
;;=> false
(type true)
;;=> java.lang.Boolean
Boolean values are often associated with boolean logic and the concept of “truthiness”. In strongly typed languages such as Java you can only use booleans in conditional operations. Some other languages, such as C/C++, have a looser definition of “truthiness”. In Clojure everything is considered **true** with the exception of `false` and `nil`.
For example we can use the form `(if condition truthy falsey)`, which evaluates the given `condition`; if the condition has a logical value of true it evaluates the `truthy` form, otherwise it evaluates the `falsey` form.
(if true "it's true" "it's false")
;;=> "it's true"
(if false "it's true" "it's false")
;;=> "it's false"
(if nil "it's true" "it's false")
;;=> "it's false"
(if "HELLO" "it's true" "it's false")
;;=> "it's true"
(if 1 "it's true" "it's false")
;;=> "it's true"
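A point that often surprises newcomers is how few values are falsey. The short sketch below, meant to be tried at the REPL, checks a few values that count as false in some other languages; it also uses the core function `boolean`, which coerces any value to `true` or `false`.

```clojure
;; 0, the empty string and empty collections are all logically true
(if 0 "truthy" "falsey")
;;=> "truthy"
(if "" "truthy" "falsey")
;;=> "truthy"
(if [] "truthy" "falsey")
;;=> "truthy"

;; `boolean` makes the logical value of any expression explicit
(boolean nil)
;;=> false
(boolean 0)
;;=> true
```

Only `nil` and `false` take the falsey branch, so an empty collection has to be checked explicitly (for instance with `empty?`).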
Numbers
Clojure has rather unique support for numerical values. As you would expect, every number just evaluates to itself.
Integers
Integer literals are mapped to `java.lang.Long`, but since they can be arbitrarily large they are promoted to `clojure.lang.BigInt` once they go beyond `java.lang.Long#MAX_VALUE`.
1 ;;=> 1
-4 ;;=> -4
9223372036854775807 ; java.lang.Long#MAX_VALUE
;;=> 9223372036854775807
(type 1)
;;=> java.lang.Long
(type 9223372036854775807)
;;=> java.lang.Long
29384756298374652983746528376529837456
;;=> 29384756298374652983746528376529837456N
(type 29384756298374652983746528376529837456)
;;=> clojure.lang.BigInt
(type 1N)
;;=> clojure.lang.BigInt
You can also define integer literals in other bases such as octal, hexadecimal and binary.
127 ;;=> 127 ; decimal
0x7F ;;=> 127 ; hexadecimal
0177 ;;=> 127 ; octal
32r3V ;;=> 127 ; base 32
2r01111111 ;;=> 127 ; binary
36r3J ;;=> 127 ; base 36
36rClojure ;;=> 27432414842
2r0111001101010001001001 ;;=> 1889353
In Clojure there are no operators; in fact `+`, `-`, `*` and `/` are normal functions.
(+ 1 2 3 4 5)
;;=> 15
You can access static fields by providing the fully qualified class name followed by a slash (`/`) and the field name, for example `java.lang.Long/MAX_VALUE`.
java.lang.Long/MAX_VALUE
;;=> 9223372036854775807
(- java.lang.Long/MAX_VALUE 1)
;;=> 9223372036854775806
(+ 1 java.lang.Long/MAX_VALUE)
;;=> ArithmeticException integer overflow
Clojure has a number of functions which automatically promote a number to a bigger type in case the result doesn’t fit in a 64-bit Java `Long`. These functions include `+'`, `-'` and `*'`.
(+' 1 java.lang.Long/MAX_VALUE)
;;=> 9223372036854775808N
(*' java.lang.Long/MAX_VALUE java.lang.Long/MAX_VALUE)
;;=> 85070591730234615847396907784232501249N
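As a small complement, the sketch below shows the promoting variants of increment and decrement, `inc'` and `dec'`, which belong to the same family in `clojure.core`. It also shows that promotion only happens when it is needed: a result that fits in a `Long` stays a `Long`.

```clojure
;; promoting increment and decrement
(inc' java.lang.Long/MAX_VALUE)
;;=> 9223372036854775808N
(dec' java.lang.Long/MIN_VALUE)
;;=> -9223372036854775809N

;; no promotion when the result fits in 64 bits
(type (+' 1 2))
;;=> java.lang.Long
(type (+' 1 java.lang.Long/MAX_VALUE))
;;=> clojure.lang.BigInt
```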
Decimals
Clojure supports floating-point decimals and exact decimals. Floating-point decimals are mapped to `java.lang.Double` and they evaluate to themselves, while exact decimals are mapped to `java.math.BigDecimal` and they also evaluate to themselves. Use the latter when you require exact decimals, but be careful with numbers which can’t be represented exactly as decimals, such as 1 divided by 3 (0.3333333…), as the decimal part continues forever.
3.2
;;=> 3.2
(type 3.2)
;;=> java.lang.Double
3.2M
;;=> 3.2M
(type 3.2M)
;;=> java.math.BigDecimal
(+ 0.3 0.3 0.3 0.1) ;; floating point
;;=> 0.9999999999999999
(+ 0.3M 0.3M 0.3M 0.1M) ;; big-decimal
;;=> 1.0M
(/ 1.0M 3.0M)
;;=> ArithmeticException Non-terminating decimal expansion; no exact representable decimal result.
(with-precision 10 (/ 1.0M 3.0M))
;;=> 0.3333333333M
Rationals
Numbers like 1 divided by 3 are called rational numbers, and Clojure supports them. You can mix them in your calculations, and as long as you don’t introduce floating-point values the result will retain its precision.
(/ 1 3)
;;=> 1/3
(type 1/3)
;;=> clojure.lang.Ratio
(+ 1/3 1/3 1/3)
;;=> 1N
(/ 21 6)
;;=> 7/2
(+ 1/3 1/3 1/3 1)
;;=> 2N
(+ 1/3 1/3 0.333)
;;=> 0.9996666666666667
Characters
So far we have seen the rich support for numerical values in Clojure. Clojure supports character and string literals as well. Characters map to `java.lang.Character`, support Unicode and, like all value types, evaluate to themselves.
\a ; this is the character 'a'
\A ; this is the character 'A'
\\ ; this is the character '\'
\u0041 ; this is unicode for 'A'
\tab ; this is the tab character
\newline ; this is the newline character
\space ; this is the space character
\a ;;=> \a
(type \a)
;;=> java.lang.Character
Strings
String literals hold no surprises. They map to `java.lang.String`, they can span multiple lines, like in Java they are immutable, and they evaluate to themselves.
"This is a string"
;;=> "This is a string"
(type "This is a string")
;;=> java.lang.String
"Strings in Clojure
 can be multi lines
 as well!!"
;;=> "Strings in Clojure\n can be multi lines\n as well!!"
Via the Java interop infrastructure you can call all `java.lang.String` methods directly:
(.toUpperCase "This is a String")
;;=> "THIS IS A STRING"
You can use the function `str` to concatenate strings or to convert numbers into strings (via the `Object#toString()` method).
(str "This" " is " "a" " concatenation.")
;;=> "This is a concatenation."
(str "Number of lines: " 123)
;;=> "Number of lines: 123"
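Besides Java interop, the standard library ships a `clojure.string` namespace with the most common string functions. The sketch below assumes the namespace is loaded under the alias `string` (the alias name is my own choice) and shows a handful of its functions.

```clojure
;; load the namespace and give it a short alias
(require '[clojure.string :as string])

(string/upper-case "hello")
;;=> "HELLO"
(string/trim "  hello  ")
;;=> "hello"
(string/replace "hello" "l" "L")
;;=> "heLLo"
(string/includes? "hello" "ell")
;;=> true
(string/blank? "   ")
;;=> true
```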
Keywords
Keywords are labels for things in our programs. They evaluate to themselves and can be used to give names to things, similarly to Java’s enumerations. They are mostly used as keys in maps (we will see this later), and the Clojure runtime maintains them in an internal pool (similarly to interned strings in Java), which guarantees that only one copy of a particular keyword will ever exist in a program. For this reason they provide a very fast equality test. Equality testing in Clojure is done via the function `=`, with semantics similar to Java’s `.equals()` method, while identity comparison is done via the function `identical?`, which in turn implements Java’s `==` operator. You can use the function `keyword` to create a keyword out of a string.
:words
;;=> :words
(type :this-is-a-keyword)
;;=> clojure.lang.Keyword
(keyword "blue")
;;=> :blue
(= :blue :blue)
;;=> true
(= (str "bl" "ue") (str "bl" "ue"))
;;=> true
(identical? :blue :blue)
;;=> true
(identical? (str "bl" "ue") (str "bl" "ue"))
;;=> false
(identical? (keyword (str "bl" "ue")) (keyword (str "bl" "ue")))
;;=> true
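Going the other way, from a keyword back to a string, is done with `name`. The sketch below also shows the predicate `keyword?` and a namespaced keyword; `:colour/blue` is just an illustrative name.

```clojure
(name :blue)
;;=> "blue"
(keyword? :blue)
;;=> true
(keyword? "blue")
;;=> false

;; keywords can carry a namespace part before the slash
(namespace :colour/blue)
;;=> "colour"
(name :colour/blue)
;;=> "blue"
```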
Collections
In Java the only collection literal available is the array. Clojure, like most modern languages, offers a variety of collection literals, which makes the language more expressive. Out-of-the-box support is provided for the following collection literals: singly linked lists, vectors, maps (or dictionaries) and sets. Clojure also supports a larger number of data structures which are built with functions, such as sorted maps, sorted sets, array maps, hash maps and hash sets. Many more data structures, such as graphs, ring buffers and AVL trees, are available in community-maintained libraries. All Clojure collections can contain a mixture of values.
Lists
Clojure has singly linked lists built in and, like all other Clojure collections, they are immutable. Lists guarantee `O(1)` insertion at the head, and `O(n)` traversal and element search. To create a list you can use the function `list`.
(list 1 2 3 4 5)
;;=> (1 2 3 4 5)
To “add” an element to the front of the list you can use the `cons` function.
(cons 0 (list 1 2 3 4 5))
;;=> (0 1 2 3 4 5)
As the output suggests, list literals in Clojure are expressed as a sequence of values surrounded by brackets, which is the same syntax as a function call. That is the reason why the following line throws an error.
(1 2 3 4 5)
;;=> ClassCastException java.lang.Long cannot be cast to clojure.lang.IFn
To be able to express a list of values as a literal we have to use the `quote` form, which preserves the list without initiating the function call.
(quote (1 2 3 4 5))
;;=> (1 2 3 4 5)
As syntactic sugar we can use the single quote sign `'` instead of the longer `(quote ,,,)` form.
'(1 2 3 4 5)
;;=> (1 2 3 4 5)
'(1 "hi" :test 4/5 \c)
;;=> (1 "hi" :test 4/5 \c)
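You can get the head of the list with the function `first` and use `rest` or `next` to get the tail (they differ only when the tail is empty: `rest` returns an empty list, while `next` returns `nil`). `count` returns the number of elements in it. `nth` returns the element at the given zero-based position in the list, while `last` returns the last item in the list.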
(first '(1 2 3 4 5))
;;=> 1
(rest '(1 2 3 4 5))
;;=> (2 3 4 5)
(next '(1 2 3 4 5))
;;=> (2 3 4 5)
(rest '(1))
;;=> ()
(next '(1))
;;=> nil
(count '(5))
;;=> 1
(count '(1 2 3 4 5))
;;=> 5
(nth '(1 2 3 4 5) 0)
;;=> 1
(nth '(1 2 3 4 5) 1)
;;=> 2
(nth '(1 2 3 4 5) 10)
;;=> IndexOutOfBoundsException
(nth '(1 2 3 4 5) 10 :not-found)
;;=> :not-found
(last '(1 2 3 4 5))
;;=> 5
(last '(1))
;;=> 1
(last '())
;;=> nil
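The `O(1)` insertion at the head can also be seen with the general-purpose function `conj`, which “adds” an element wherever it is cheapest for the given collection. A small sketch: for lists that place is the front, so several elements added in one call end up in reverse order.

```clojure
;; conj on a list adds at the head
(conj '(1 2 3) 0)
;;=> (0 1 2 3)

;; each element is added at the front in turn
(conj '(1 2 3) 4 5)
;;=> (5 4 1 2 3)

(empty? '())
;;=> true
(empty? '(1))
;;=> false
```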
Vectors
Vectors are collections of values which are indexed by their position in the vector (starting from 0), called the index. Insertion at the end of the vector is near `O(1)`, as is retrieval of an element by its index. The literal is expressed as a sequence of values surrounded by square brackets, or you can use the `vector` function to construct one. You can append an element at the end of the vector with `conj` and use `get` to retrieve the element at a specific index. Functions such as `first`, `next`, `rest`, `last` and `count` work just as well with vectors.
[1 2 3 4 5]
;;=> [1 2 3 4 5]
[1 "hi" :test 4/5 \c]
;;=> [1 "hi" :test 4/5 \c]
(vector 1 2 3 4 5)
;;=> [1 2 3 4 5]
(conj [1 2 3 4 5] 6)
;;=> [1 2 3 4 5 6]
(count [1 2])
;;=> 2
(first [:a :b :c])
;;=> :a
(get [:a :b :c] 1)
;;=> :b
([:a :b :c] 1)
;;=> :b
(get [:a :b :c] 10)
;;=> nil
(get [:a :b :c] 10 :z)
;;=> :z
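A few more core functions take advantage of the indexed nature of vectors. The sketch below shows `assoc` to “replace” the value at an index, `peek` and `pop` to work on the end of the vector, and `subvec` to take a slice (start index included, end index excluded).

```clojure
;; "replace" the element at index 1
(assoc [:a :b :c] 1 :z)
;;=> [:a :z :c]

;; last element, and the vector without it
(peek [:a :b :c])
;;=> :c
(pop [:a :b :c])
;;=> [:a :b]

;; slice from index 1 (included) to index 3 (excluded)
(subvec [:a :b :c :d :e] 1 3)
;;=> [:b :c]
```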
One important thing to note is that Clojure’s data structures are persistent, which has nothing to do with durability (as in disk persistence). Persistent data structures use structural sharing: a “modified” version shares most of its internal structure with the original instead of copying it. To understand more about this you can read the following blog post: Understanding Clojure’s Persistent
Maps
Maps are associative data structures (often called dictionaries) which map keys to their corresponding values. Maps have a literal form, which is expressed as any number of key/value pairs surrounded by curly brackets, or they can be built with the `hash-map` or `array-map` functions. Hash maps provide near `O(1)` insertion time and near `O(1)` lookup time. You can use `assoc` to “add or overwrite” a pair, `dissoc` to “remove” a key and its value, and `get` to retrieve the value of a given key.
{"jane" "jane@acme.com"
"fred" "fred@acme.com"
"rob" "rob@acme.com"}
;;=> {"jane" "jane@acme.com", "fred" "fred@acme.com", "rob" "rob@acme.com"}
{:a 1, :b 2, :c 3}
;;=> {:a 1, :b 2, :c 3}
(hash-map :a 1, :b 2, :c 3)
;;=> {:c 3, :b 2, :a 1}
(array-map :a 1, :b 2, :c 3)
;;=> {:a 1, :b 2, :c 3}
(assoc {:a 1, :b 2, :c 3} :d 4)
;;=> {:a 1, :b 2, :c 3, :d 4}
(assoc {:a 1, :b 2, :c 3} :b 10)
;;=> {:a 1, :b 10, :c 3}
(dissoc {:a 1, :b 2, :c 3} :b)
;;=> {:a 1, :c 3}
(count {:a 1, :b 2, :c 3})
;;=> 3
(get {:a 1, :b 2, :c 3} :a)
;;=> 1
(get {:a 1, :b 2, :c 3} :a :not-found)
;;=> 1
(get {:a 1, :b 2, :c 3} :ZULU :not-found)
;;=> :not-found
(:a {:a 1, :b 2, :c 3})
;;=> 1
({:a 1, :b 2, :c 3} :a)
;;=> 1
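Maps are often nested, and the core library has functions to read and “change” values deep inside them. The sketch below uses a made-up `profile` map (the name and its content are only for illustration) together with `get-in`, `assoc-in`, `update`, `keys`, `vals` and `merge`.

```clojure
;; a hypothetical nested map
(def profile {:name "jane"
              :address {:city "London" :zip "N1"}})

;; follow a path of keys
(get-in profile [:address :city])
;;=> "London"

;; "change" a nested value, the original map is untouched
(assoc-in profile [:address :city] "Paris")
;;=> {:name "jane", :address {:city "Paris", :zip "N1"}}

;; apply a function to the current value of a key
(update {:a 1, :b 2} :a + 10)
;;=> {:a 11, :b 2}

(keys {:a 1, :b 2})
;;=> (:a :b)
(vals {:a 1, :b 2})
;;=> (1 2)
(merge {:a 1} {:b 2})
;;=> {:a 1, :b 2}
```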
Sets
Sets are a type of collection which doesn’t allow duplicate values. While lists and vectors can have duplicate elements, a set eliminates all duplicates. Clojure has a literal form for sets, which is expressed as a sequence of values surrounded by `#{` and `}`. Otherwise you can construct a set using the `set` function. With `conj` you can “add” a new element to an existing set, and with `disj` you can “remove” an element from the set. With `clojure.set/union`, `clojure.set/difference` and `clojure.set/intersection` you have the typical set operations. `count` returns the number of elements in the set in `O(1)` time.
#{1 2 4}
;;=> #{1 4 2}
If you put the same element twice in a set literal, your Clojure code will be syntactically incorrect. At the REPL you will get an error.
#{ 1 1 3 5} ;;=> IllegalArgumentException Duplicate key: 1
#{:a 4 5 :d "hello"}
;;=> #{"hello" 4 5 :d :a}
(type #{:a :z})
;;=> clojure.lang.PersistentHashSet
(set [:a :b :c])
;;=> #{:c :b :a}
(conj #{:a :c} :b)
;;=> #{:c :b :a}
(conj #{:a :c} :c)
;;=> #{:c :a}
(disj #{:a :b :c} :b)
;;=> #{:c :a}
(clojure.set/union #{:a} #{:a :b} #{:c :a})
;;=> #{:c :b :a}
(clojure.set/difference #{:a :b} #{:c :a})
;;=> #{:b}
(clojure.set/intersection #{:a :b} #{:c :a})
;;=> #{:a}
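The most common question asked of a set is whether it contains a given element. A small sketch: you can ask with `contains?`, or, like vectors and maps, use the set itself in function position, in which case it returns the element or `nil`.

```clojure
(contains? #{:a :b :c} :a)
;;=> true
(contains? #{:a :b :c} :z)
;;=> false

;; a set used as a function looks up its argument
(#{:a :b :c} :b)
;;=> :b
(#{:a :b :c} :z)
;;=> nil

;; building a set from a vector drops the duplicates
(count (set [1 1 2 2 3]))
;;=> 3
```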
The sequence abstraction
One of the most powerful abstractions over Clojure’s data structures is the sequence (`clojure.lang.ISeq`), which every data structure can produce. This interface resembles a Java iterator, and it declares methods like `first()`, `next()`, `more()` and `cons()`. The power of this abstraction is that it is general enough to be used with all data structures (lists, vectors, maps, sets and even strings can all produce sequences) and there are loads of functions which manipulate it. Functions such as `first`, `rest`, `next` and `last`, and many others such as `reverse`, `shuffle`, `drop`, `take`, `partition`, `filter` etc., are all built on top of the sequence abstraction. So if you create your own data structure and you implement the four methods of the `clojure.lang.ISeq` interface, you can benefit from all these functions without having to re-implement them for your specific data structure.
You can create a sequence explicitly with the `seq` function, but there are loads of functions which already return a sequence. The sequence of a list is the list itself, while other data structures will produce one. Maps produce a sequence of map entries, where each entry can be represented as a vector of two values (the key and its value).
(seq '(1 2 3 4))
;;=> (1 2 3 4)
(seq [1 2 3 4])
;;=> (1 2 3 4)
(seq #{1 2 3 4})
;;=> (1 4 3 2)
(seq {:a 1, :b 2, :c 3})
;;=> ([:a 1] [:b 2] [:c 3])
There is no need to call `seq` explicitly; in most cases, functions which take a sequence can work with all data structures directly.
(first [1 2 3 4])
;;=> 1
(take 3 [:a :b :c :d])
;;=> (:a :b :c)
(shuffle [1 2 3 4])
;;=> [1 3 2 4]
(shuffle #{1 2 3 4})
;;=> [2 4 1 3]
(reverse [1 2 3 4])
;;=> (4 3 2 1)
(last (reverse {:a 1 :b 2 :c 3}))
;;=> [:a 1]
Because a String can produce a sequence too, you can treat a String as a sequence of characters.
(seq "Hello World!")
;;=> (\H \e \l \l \o \space \W \o \r \l \d \!)
(first "Hello")
;;=> \H
(rest "Hello")
;;=> (\e \l \l \o)
(count "Hello World!")
;;=> 12
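To make the point about generality concrete, here is a small sketch applying a few more of the sequence functions named above to different kinds of collections: a vector, a list, a string and a map.

```clojure
(drop 2 [1 2 3 4 5])
;;=> (3 4 5)
(partition 2 [1 2 3 4 5 6])
;;=> ((1 2) (3 4) (5 6))
(filter even? '(1 2 3 4 5 6))
;;=> (2 4 6)

;; strings are treated as sequences of characters
(take 2 "Hello")
;;=> (\H \e)
(reverse "abc")
;;=> (\c \b \a)

;; maps are treated as sequences of entries
(first {:a 1})
;;=> [:a 1]
```

Note that the result is always a sequence, whatever the type of the input collection.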
Lazy Sequences
Some of the sequences produced by the core library are lazy, which means that the entire collection won’t be created (realised) all at once. At first, an iterator-like structure is created, with subsequent calls to `next()` causing chunks of items to be fetched/computed. This is a very important element of the language, which allows the easy expression of infinite sequences without running out of memory. For example the function `range` returns a lazy sequence of the integers between two given numbers (start included, end excluded). When it is called without arguments it returns a lazy sequence of all natural numbers, yet it doesn’t run out of memory. What it really produces is just an iterator that computes the next chunk of numbers when `next()` is called.
NOTE: As subsequent calls are made to `next()`, it is advisable not to hold references to earlier lazy sequence items (in particular the head of the sequence) for too long. This allows earlier items to be garbage collected and prevents an OOM (OutOfMemoryError).
(range 5 10)
;;=> (5 6 7 8 9)
WARNING!!! Evaluating this from your REPL might hang/crash your process, as it will try to evaluate an infinite lazy sequence all at once.
(range)
;;=> (0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 ...)
(take 10 (range))
;;=> (0 1 2 3 4 5 6 7 8 9)
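Other core functions return infinite lazy sequences as well, and they are safe to use as long as you only consume a finite part of them. A small sketch with `cycle`, `repeat` and `iterate`, always bounded by `take`, `drop` or `first`:

```clojure
;; only as many numbers as needed are computed
(take 3 (filter even? (range)))
;;=> (0 2 4)
(first (drop 1000000 (range)))
;;=> 1000000

;; endlessly repeat a collection or a single value
(take 5 (cycle [:a :b]))
;;=> (:a :b :a :b :a)
(take 3 (repeat "x"))
;;=> ("x" "x" "x")

;; x, (f x), (f (f x)), ...
(take 4 (iterate inc 10))
;;=> (10 11 12 13)
```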
Regular expression patterns
Clojure also supports regular expression patterns as literals, which map directly to `java.util.regex.Pattern`, and offers a number of functions to match, find and extract patterns. For example, `re-find` and `re-seq` find respectively the first or all occurrences of a matching pattern. With `re-pattern` you can programmatically create a pattern out of a string.
#"[\w\d.-]+@[\w\d-.]+\.[\w]+"
;;=> #"[\w\d.-]+@[\w\d-.]+\.[\w]+"
(type #"[\w\d.-]+@[\w\d-.]+\.[\w]+")
;;=> java.util.regex.Pattern
(re-find #"[0-9]+" "only 123 numbers")
;;=> "123"
(re-find #"[0-9]+" "no numbers")
;;=> nil
(re-find #"[\w\d.-]+@[\w\d-.]+\.[\w]+"
         "bob.smith@acme.org")
;;=> "bob.smith@acme.org"
(if (re-find #"^[\w\d.-]+@[\w\d-.]+\.[\w]+$"
             "bob.smith@acme.org")
  "it's an email"
  "it's not an email")
;;=> "it's an email"
(re-seq #"[0-9]+" "25, 43, 54, 12, 15, 65")
;;=> ("25" "43" "54" "12" "15" "65")
(re-pattern "[0-9]{1,3}(\\.[0-9]{1,3}){3}")
;;=> #"[0-9]{1,3}(\.[0-9]{1,3}){3}"
(re-find
 (re-pattern "[0-9]{1,3}(\\.[0-9]{1,3}){3}")
 "my IP is: 192.168.0.12")
;;=> ["192.168.0.12" ".12"]
The functions `re-matcher`, `re-matches` and `re-groups` give you fine control over the capturing groups.
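A minimal sketch of these three functions follows; the pattern and the name `range-matcher` are made up for the example. `re-matches` succeeds only if the whole string matches, while `re-matcher` creates a stateful Java matcher on which each call to `re-find` advances to the next match and `re-groups` returns the groups of the most recent match.

```clojure
;; the whole string must match; the result is [match group1 group2]
(re-matches #"(\d+)-(\d+)" "12-34")
;;=> ["12-34" "12" "34"]
(re-matches #"\d+" "12 apples")
;;=> nil

;; a stateful matcher over a string with two occurrences
(def range-matcher (re-matcher #"(\d+)-(\d+)" "10-20 and 30-40"))

(re-find range-matcher)
;;=> ["10-20" "10" "20"]
(re-groups range-matcher)
;;=> ["10-20" "10" "20"]
(re-find range-matcher)
;;=> ["30-40" "30" "40"]
(re-find range-matcher)
;;=> nil
```

Because the matcher is a mutable Java object, it should not be shared between threads.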
Symbols and Vars
Symbols in Clojure are a way to identify things in your programs which may have various values at runtime. Like in mathematical notation, `x` is something not known which could assume several different values. In a programming context, Clojure symbols are similar to variables in other languages, but not exactly the same. In other languages variables are places where you store information; symbols in Clojure cannot contain data themselves. Vars in Clojure are (one type of) containers of data, and symbols are a way to identify them and give vars meaningful names for your program.
Everything we have seen so far were pure values, and as such they all evaluated to themselves. Just as `42` is just `42`, the vector `[:a "hello" 9]` evaluates to itself; it is just a value. Symbols, however, are replaced during evaluation with the current value of the var they point to. If you try to evaluate a symbol which doesn’t refer to any var you will get an error.
Symbols are organised into namespaces. We will not explore much about namespaces here, but it will suffice to know that symbols belong to a namespace in which they assume a particular value, and you can have the same symbol name in different namespaces pointing to different values.
In Clojure symbols start with a non-numeric character and can contain letters, numbers, dashes, some punctuation marks and other characters. Basically, anything which doesn’t belong to the Clojure syntax is a valid symbol character (the following characters aren’t accepted in symbol names: `@#,/.[]{}()`).
You can create symbols by quoting a word with the `quote` form or the single quote character, or you can use the function `symbol`, but most commonly you will use symbols in place of vars and locals, which are defined with the special forms `def` and `let` respectively. A symbol name which is NOT quoted will be resolved to the current value of the associated var.
As we will see in the following examples, symbols are untyped and can refer to any Clojure value, including `nil`.
(symbol "username")
;;=> username
(type (symbol "username"))
;;=> clojure.lang.Symbol
(type 'username)
;;=> clojure.lang.Symbol
(def username "bruno1")
;;=> #'learn-clojure.basics/username
username
;;=> "bruno1"
age ;; a symbol with no var behind it produces an error
;;=> Unable to resolve symbol: age in this context
(def age 21)
;;=> #'learn-clojure.basics/age
age
;;=> 21
(type 'age)
;;=> clojure.lang.Symbol
(type age)
;;=> java.lang.Long
(def user {:username "bruno1"
           :score 12345
           :level 32
           :achievements #{:fast-run :precision10
                           :strategy}})
;;=> #'learn-clojure.basics/user
user
;;=> {:username "bruno1", :score 12345, :level 32, :achievements #{:precision10 :strategy :fast-run}}
(def user nil)
;;=> #'learn-clojure.basics/user
user
;;=> nil
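The examples above all use `def`, which creates vars visible in the whole namespace. For locals, the `let` form binds symbols to values only inside its body. A small sketch follows; the names `width`, `height` and `level` are arbitrary.

```clojure
;; symbols bound by let exist only inside the let body
(let [width 10
      height 20]
  (* width height))
;;=> 200

;; a local can shadow a var with the same name...
(def level 1)

(let [level 2]
  level)
;;=> 2

;; ...but the var itself is unchanged outside the let
level
;;=> 1
```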
Immutability
All basic data types in Clojure are immutable, including the collections. This is a very important aspect of Clojure’s approach to functional programming. In Clojure functions transform values into new values, and values are just values. Since it is absurd to think of changing a number (1 is always 1), composite data structures are treated in the same way. So functions do not mutate values; they just produce new ones. Just as adding `1` to `42` produces `43` but doesn’t really change the number `42`, which keeps on existing on its own, adding an element to a list produces a new list while the old one stays the same, unmodified.
The advantage of the immutability is that values (even deeply nested and complex structures) can be safely shared across threads and with function callers without worrying about unsafe or uncoordinated changes. This simple constraint makes Clojure programs so much easier to reason about, as the only way to produce a new value is via a functional transformation.
(def colours '(:red :green :blue))
;;=> #'learn-clojure.basics/colours
(def new-colours (cons :black colours))
;;=> #'learn-clojure.basics/new-colours
new-colours
;;=> (:black :red :green :blue)
colours
;;=> (:red :green :blue)
(def user {:username "bruno1"
           :score 12345
           :level 32})
;;=> #'learn-clojure.basics/user
(def user' (assoc user :level 33))
;;=> #'learn-clojure.basics/user'
user'
;;=> {:username "bruno1", :score 12345, :level 33}
user
;;=> {:username "bruno1", :score 12345, :level 32}
Functions
So far we have seen how to represent data in our system, now we will see how to make sense of this data and how to extract/process/transform it. The way we express this in Clojure is via functions.
Purity
While Clojure doesn’t enforce purity at compiler level, it certainly promotes pure-functions. Pure functions are those functions in which the processing doesn’t use or produce any side effect, which means it will use only the input parameters to compute the resulting value, and given the same parameters it will always produce the same result.
When a function, given a certain input, always produces the same output, it is said to be referentially transparent, because the function call itself can be replaced with its value without altering the rest of the expression.
Pure functions are important because they are incredibly easy to test as they don’t depend on external state.
Here are two examples: the first is the function `+`, which we have already seen, and the second is the function `rand-int`, which produces a random integer between `0` (inclusive) and the given integer (exclusive). While the first is pure, because given the same input parameters it will always produce the same output, the second, given the same input, may return a different value every time.
(+ 1 2 3)
;;=> 6
(rand-int 100)
;;=> 18
(rand-int 100)
;;=> 85
(+ 1 2 (+ 1 1 1)) ;; (+ 1 1 1) is referentially transparent
;;=> 6
Function definition
To define a function you have to use the special form `fn` or the `defn` macro, with the following syntax. For example, if you want to define a function which increments its input parameter by 1, you will write something as follows:
 /- fn, special form
 |   /- parameter vector, 1 param called `n`
 |   |   /- body -> expression to evaluate when
 |   |   |    this function is called
(fn [n] (+ n 1))
This is the simplest way to define a function. Now, to refer to this function in our code we need to give it a name. We can do so with `def`, as we have done earlier.
(def plus-one (fn [n] (+ n 1)))
;;=> #'learn-clojure.basics/plus-one
(plus-one 10)
;;=> 11
(plus-one -42)
;;=> -41
As mentioned earlier, during the evaluation process the symbol `plus-one` is simply replaced with its value; in the same way, we can replace the symbol with the function definition and obtain the same result. So symbols can also refer to functions.
((fn [n] (+ n 1)) 10)
;;=> 11
((fn [n] (+ n 1)) -42)
;;=> -41
Since defining functions is very common, there is a shorthand for the idiom `(def function-name (fn [parameter list] (expression)))` via the `defn` form, which just combines the `def` and `fn` forms. So we can redefine the previous function in the following way:
(defn plus-one [n]
  (+ n 1))
;;=> #'learn-clojure.basics/plus-one
(plus-one 1)
;;=> 2
It is good practice to include a short description (called a docstring) in the function.
(defn plus-one
  "Returns a number which is one greater than the given `n`."
  [n]
  (+ n 1))
;;=> #'learn-clojure.basics/plus-one
NOTE: Clojure core already contains such a function, called `inc`, while the function `dec` decrements the given value by 1.
(inc 10)
;;=> 11
In the following example we see how to create functions with multiple parameters. Let’s assume we have to create a function which creates a corporate email address for an employee. Oftentimes this type of email follows a very specific pattern. In this case we will take the first letter of the first name, followed by the last name, then `@` and the company domain.
(defn email-address [firstname lastname domain]
  (clojure.string/lower-case (str (first firstname) lastname "@" domain)))
;;=> #'learn-clojure.basics/email-address
(email-address "John" "Smith" "acme.org")
;;=> "jsmith@acme.org"
(email-address "Walter" "White" "breakingbad.org")
;;=> "wwhite@breakingbad.org"
Function with multi-arities
So far we’ve seen how to create functions which accept a fixed number of parameters. In Clojure it is also possible to create functions which accept different numbers of parameters (‘arities’).
(defn simple-greet
  ([]
   (simple-greet "World"))
  ([name]
   (str "Hello " name "!")))
;;=> #'learn-clojure.basics/simple-greet
(simple-greet)
;;=> "Hello World!"
(simple-greet "Fred")
;;=> "Hello Fred!"
(defn greet
  ([]
   "Hey, Stranger!")
  ([name]
   (str "Hello " name))
  ([firstname lastname]
   (str "Hi, you must be: " lastname ", " firstname " " lastname))
  ([title firstname lastname]
   (str "Hello " title " " firstname " " lastname)))
;;=> #'learn-clojure.basics/greet
(greet)
;;=> "Hey, Stranger!"
(greet "James")
;;=> "Hello James"
(greet "James" "Bond")
;;=> "Hi, you must be: Bond, James Bond"
(greet "Dr" "John H." "Watson")
;;=> "Hello Dr John H. Watson"
It is also possible to create functions which accept any number of parameters. These are called variadic functions.
(defn de-dup [& names]
  (seq (set names)))
;;=> #'learn-clojure.basics/de-dup
(de-dup "John" "Fred" "Lara" "John" "John" "Susan")
;;=> ("Susan" "Fred" "John" "Lara")
(defn short-name [firstname & names]
  (str firstname " " (last names)))
;;=> #'learn-clojure.basics/short-name
(short-name "Maria" "Teresa" "Jiulia" "Ramírez de Arroyo" "García")
;;=> "Maria García"
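Inside a variadic function the extra arguments arrive as a sequence, and the core function `apply` does the opposite job: it calls a function with the elements of a collection as its arguments. A minimal sketch, where `sum-all` is a hypothetical function written for this example:

```clojure
;; `numbers` is a sequence holding all the arguments (nil when there are none)
(defn sum-all [& numbers]
  (apply + numbers))

(sum-all 1 2 3)
;;=> 6
(sum-all)
;;=> 0

;; spread a vector over the parameters of a function
(apply sum-all [1 2 3 4])
;;=> 10
(apply str ["a" "b" "c"])
;;=> "abc"
```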
High-order functions
In Clojure functions are reified constructs, therefore we can treat them as normal values. As such, functions can be passed as parameters to other functions or returned as the result of a function call.
(defn is-commutative? [op a b]
  (= (op a b) (op b a)))
;;=> #'learn-clojure.basics/is-commutative?
(is-commutative? + 3 7)
;;=> true
(is-commutative? / 3 7)
;;=> false
(defn multiplier [m]
  (fn [n]
    (* n m)))
;;=> #'learn-clojure.basics/multiplier
(def doubler (multiplier 2))
;;=> #'learn-clojure.basics/doubler
(doubler 5)
;;=> 10
(doubler 10)
;;=> 20
(def mult-10x (multiplier 10))
;;=> #'learn-clojure.basics/mult-10x
(mult-10x 35)
;;=> 350
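The core library itself is full of higher-order functions. The sketch below shows a few of the most used ones: `map`, `filter` and `reduce` take a function and a collection, while `partial` and `comp` take functions and return new functions.

```clojure
;; apply a function to every element
(map inc [1 2 3])
;;=> (2 3 4)

;; keep only the elements for which the function returns a truthy value
(filter odd? [1 2 3 4 5])
;;=> (1 3 5)

;; combine all the elements using a function of two arguments
(reduce + [1 2 3 4])
;;=> 10

;; fix the first argument of *, much like (multiplier 2)
((partial * 2) 5)
;;=> 10

;; compose functions right to left: first inc, then str
((comp str inc) 41)
;;=> "42"
```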
Anonymous functions or lambda functions
Oftentimes you want to create a function for a specific task in a local context. Such functions don’t have any reason to have a global name, as they are meaningful only in that specific context. In this case you can create anonymous functions (also called lambda functions), and Clojure has some support to make this easier. We have already seen an example of an anonymous function with our very first function example.
(fn [n] (+ n 1)) ;;Evaluates to an object/value. In Clojure, functions are values
;;=> #function[learn-clojure.basics/eval20002/fn--20003]
((fn [n] (+ n 1)) 10)
;;=> 11
Here the function we built hasn’t got a name. We then used a `def` form to give it the `plus-one` name. This anonymous function could also be written in the following way.
#(+ % 1)
;;=> #function[learn-clojure.basics/eval20028/fn--20029]
(#(+ % 1) 10)
;;=> 11
In this function the symbol `%` stands for the argument. If you have more than one parameter you can denote them as `%1` (or `%`), `%2`, `%3`, `%4` and so on. For example, in our `is-commutative?` function we expect an operation which accepts two arguments:
(is-commutative? #(+ %1 %2) 9 8)
;;=> true
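A short sketch comparing the two notations side by side. It also shows `%&`, which collects any remaining arguments in the same way `&` does in a `fn` parameter vector. Note that the `#(...)` forms cannot be nested inside each other.

```clojure
;; the long form and the short form are interchangeable
((fn [a b] (* a b)) 6 7)
;;=> 42

(#(* %1 %2) 6 7)
;;=> 42

;; %& holds all the arguments not covered by %1, %2, ...
(#(count %&) :a :b :c)
;;=> 3
```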
Closures
Closures (with the `s`) are lambdas which refer to a context (or values from another context). These functions are said to be “closing over” the environment. This means that they can access parameters and values which are NOT in their own parameter list. As in our `multiplier` function example, the returned function is closing over the value `m`, which is not in its parameter list but is a parameter of the parent context, the `multiplier` fn. While `n` is a normal parameter, `m` is the value we are “closing over”, providing a context for that function.
(defn multiplier [m]
(fn [n]
(* n m)))
;;=> #'learn-clojure.basics/multiplier
Let’s see another example. Here we want to create a function which takes a number and returns a logical value representing whether the number is between two limits (limits included). For this purpose we can use the function `>=`, which returns whether a number is greater than or equal to the other one. When given more than two arguments, `>=` checks that each one is greater than or equal to the next. Other similar functions are `>`, `<`, `<=`, `=` and `not=`.
(>= 10 3) ;; like 10 >= 3
;;=> true
(>= 3 10)
;;=> false
(>= 6 6)
;;=> true
(>= 6 5 2)
;;=> true
(defn limit-checker [min max]
(fn [n]
(>= max n min)))
;;=> #'learn-clojure.basics/limit-checker
(def legal-value (limit-checker 5 10))
;;=> #'learn-clojure.basics/legal-value
(legal-value 1)
;;=> false
(legal-value 7)
;;=> true
(legal-value 10)
;;=> true
(legal-value 11)
;;=> false
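Every call to the outer function creates a new closure with its own captured values. A minimal sketch, using a hypothetical `make-greeter` function introduced only for this illustration:

```clojure
;; hypothetical example: `greeting` is captured by the returned function
(defn make-greeter [greeting]
  (fn [name]
    (str greeting ", " name "!")))

;; two independent closures, each with its own `greeting`
(def hello (make-greeter "Hello"))
(def ciao  (make-greeter "Ciao"))

(hello "James")
;;=> "Hello, James!"

(ciao "James")
;;=> "Ciao, James!"
```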
Recursion
A recursive function is a function which calls itself. There are two types of recursion: mundane recursion and tail recursion.
Let’s see an example of both with a function which, given a number, calculates the sum of all natural numbers from 1 to the given number.
(defn sum1
([n]
(sum1 n 0))
([n accumulator]
(if (< n 1)
accumulator
;; else
(sum1 (dec n) (+ n accumulator)))))
;;=> #'learn-clojure.basics/sum1
(sum1 1)
;;=> 1
(sum1 3)
;;=> 6
(sum1 10)
;;=> 55
This type of recursion is called mundane recursion. Every new call allocates a new frame on the stack, so if you run this with high enough numbers it will blow your stack.
(sum1 10000)
;;=> StackOverflowError
Let’s see how we can write this function with tail recursion using `recur`.
(defn sum2
([n]
(sum2 n 0))
([n accumulator]
(if (< n 1)
accumulator
;; else
(recur (dec n) (+ n accumulator)))))
;;=> #'learn-clojure.basics/sum2
(sum2 10)
;;=> 55
(sum2 10000)
;;=> 50005000
(sum2 1000000)
;;=> 500000500000
(sum2 100000000)
;;=> 5000000050000000
As you can see the function can recur many more times without exploding; this is because it doesn’t consume stack. Tail recursion can be used only when the recursion point is in the tail position (a return position).
Now in `sum1` and `sum2` we had to add another function arity just to keep track of the `accumulator`. This is very common in recursion: while recurring you have to keep track of some accumulated value. Clojure makes this simpler by providing another form called `loop`, which plays well with `recur`. In Clojure you’ll often hear about the `loop/recur` construct. Let’s see how we can rewrite the previous function to leverage the `loop/recur` construct.
(defn sum3
[num]
(loop [n num
accumulator 0]
(if (< n 1)
accumulator
;; else
(recur (dec n) (+ n accumulator)))))
;;=> #'learn-clojure.basics/sum3
(sum3 10)
;;=> 55
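The same pattern applies to any computation that carries an accumulated value. As a sketch, here is a hypothetical `factorial` function (not part of the original examples) written with `loop/recur`; the `loop` bindings play the role that the extra arity played in `sum2`.

```clojure
;; hypothetical example of the loop/recur pattern
(defn factorial
  [num]
  (loop [n   num
         acc 1]            ; initial values of the loop bindings
    (if (< n 2)
      acc
      ;; recur jumps back to `loop` with new values for n and acc
      (recur (dec n) (* n acc)))))

(factorial 5)
;;=> 120

(factorial 20)
;;=> 2432902008176640000
```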
Let’s see another example with the Fibonacci sequence. Let’s start with the mundane recursion.
(defn fibonacci1
[n]
(if (< n 2)
1
;; else
(+ (fibonacci1 (- n 1))
(fibonacci1 (- n 2)))))
;;=> #'learn-clojure.basics/fibonacci1
(fibonacci1 1)
;;=> 1
(fibonacci1 10)
;;=> 89
Now this is a simple and very functional definition of the Fibonacci sequence, however it is particularly bad in terms of computational complexity. In fact this is O(2^n). Let’s use the `time` form to measure how long it takes to compute the 35th number in the sequence.
(time
(fibonacci1 35))
;;=> "Elapsed time: 1806.753129 msecs"
;;=> 14930352
Let’s try to use tail recursion. As you will see we have to restructure our function to allow the recursion to happen in the tail position.
(defn fibonacci2
[n]
(loop [i n c 1 p 1]
(if (< i 2)
c
(recur (dec i) (+' c p) c))))
;;=> #'learn-clojure.basics/fibonacci2
(fibonacci2 10)
;;=> 89
(time
(fibonacci2 35))
;;=> "Elapsed time: 0.04467 msecs"
;;=> 14930352
(time
(fibonacci2 1000))
;;=> "Elapsed time: 1.145227 msecs"
;;=> 70330367711422815821835254877183549770181269836358732742604905087154537118196933579742249494562611733487750449241765991088186363265450223647106012053374121273867339111198139373125598767690091902245245323403501N
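The `fibonacci2` function uses `+'` rather than `+`, which is why the last result ends with an `N`. A minimal sketch of the difference: `+` works on 64-bit integers and throws when the result doesn’t fit, while `+'` promotes the result to an arbitrary-precision integer.

```clojure
;; `+` throws when the result overflows a 64-bit long
(try
  (+ Long/MAX_VALUE 1)
  (catch ArithmeticException e
    (.getMessage e)))
;;=> "integer overflow"

;; `+'` promotes to a BigInt instead (note the trailing N)
(+' Long/MAX_VALUE 1)
;;=> 9223372036854775808N
```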
Function composition and partial functions
We have seen earlier that there are functions such as `first`, `second`, `last` and `rest` to access respectively the first item of the sequence, the second item, the last item and the tail of the sequence. These functions can be combined to create other functions for accessing the third, fourth, fifth and other positional items. The following functions are an example of how to construct two such functions.
(defn third
[coll]
(first (rest (rest coll))))
(third '(1 2 3 4 5))
;;=> 3
(defn fourth
[coll]
(first (rest (rest (rest coll)))))
(fourth '(1 2 3 4 5))
;;=> 4
But there is another way. If, like in this case, the output of a function can be passed directly into the input of the next one as a simple pipeline of functions, then you can just use the `comp` function.
(comp f1 f2 f3 ... fn)
(def third (comp first rest rest))
(def fourth (comp first rest rest rest))
(third '(1 2 3 4 5))
;;=> 3
(fourth '(1 2 3 4 5))
;;=> 4
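One detail worth keeping in mind is the order of application: the functions passed to `comp` are applied from right to left, so the last one listed runs first. A small sketch:

```clojure
;; (comp f g) means: apply g first, then f
((comp inc #(* 2 %)) 5)      ; same as (inc (* 2 5))
;;=> 11

((comp #(* 2 %) inc) 5)      ; same as (* 2 (inc 5))
;;=> 12

;; with three functions: double, then increment, then convert to string
((comp str inc #(* 2 %)) 5)
;;=> "11"
```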
Let’s see another example. Let’s assume we have to write a function which, given a number, doubles it and subtracts 1 from it. We can use the `multiplier` function we wrote earlier to accomplish the first part and the Clojure core `dec` to decrement it by one, and compose them together with `comp`.
(defn multiplier [m]
(fn [n]
(* n m)))
(def doubler (multiplier 2))
(def almost-twice (comp dec doubler))
(almost-twice 5)
;;=> 9
(almost-twice 9)
;;=> 17
Now let’s say we want to create a function which, given a number, performs `almost-twice` two times.
(def almost-twice-twice (comp almost-twice almost-twice))
(almost-twice-twice 5)
;;=> 17
(almost-twice-twice 10)
;;=> 37
Another way we could have written the `doubler` function is by using the partial application of the function `*`. In Clojure this is achieved via the function `partial`.
(partial f arg1 ... argn)
(def doubler (partial * 2))
(doubler 5)
;;=> 10
What happens here is that the `partial` function returns a function which calls `*` with the parameters of the partial and the parameter of the final call, all in one call.
Another nice example is using the function `format`, which takes a format-string and a bunch of arguments and formats the string accordingly. This is very similar to the C `printf` function, however Clojure uses the Java `String.format` implementation. So we can use this to create a function that produces a string which contains a zero-padded formatted version of the given number.
(def pad0 (partial format "%013d"))
(pad0 43)
;;=> "0000000000043"
(pad0 2346765847)
;;=> "0002346765847"
(def item-location (partial format "Section: %d, Row %d, Shelve: %s"))
(item-location 3 12 "F")
;;=> "Section: 3, Row 12, Shelve: F"
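The function returned by `partial` accepts any number of additional arguments, and `partial` itself can fix more than one leading argument. A short sketch, where `add-10` and `in-section-3` are hypothetical names introduced for illustration:

```clojure
(def add-10 (partial + 10))

(add-10 5)
;;=> 15

;; extra arguments are appended after the fixed ones: (+ 10 1 2 3)
(add-10 1 2 3)
;;=> 16

;; fixing both the format string and the first value
(def in-section-3 (partial format "Section: %d, Row %d" 3))

(in-section-3 12)
;;=> "Section: 3, Row 12"
```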
Vars, namespaces, scope and local bindings
When defining a var using `def` or `defn` followed by a symbol, the symbol is created in the current namespace. When starting the REPL in an empty project the default namespace is called `user`, so unless you configure it differently all your vars will be created there. Namespaces are like containers in which vars live, but namespaces, once defined, are globally accessible. As a consequence, when you define a var using `def` or `defn` it will be accessible globally.
We will use `ns`, which creates a namespace if not present and switches to it, and `in-ns`, which just changes the current namespace. We will also see how to load the namespaces we need with `require` and how vars are globally accessible.
(ns user.test.one)
;;=> nil
(def my-name "john")
;;=> #'user.test.one/my-name
my-name
;;=> "john"
(ns user.test.two)
;;=> nil
(def my-name "julie")
;;=> #'user.test.two/my-name
my-name
;;=> "julie"
user.test.one/my-name
;;=> "john"
user.test.two/my-name
;;=> "julie"
(in-ns 'user.test.one)
;;=> #namespace[user.test.one]
my-name
;;=> "john"
(ns user.test.one)
;;=> nil
(def my-name (clojure.string/upper-case "john"))
;;=> #'user.test.one/my-name
my-name
;;=> "JOHN"
(ns user.test.one
(:require [clojure.string :as s]))
;;=> nil
(def my-name (s/upper-case "john"))
;;=> #'user.test.one/my-name
(ns user.test.one
(:require [clojure.string :refer [upper-case]]))
;;=> nil
(def my-name (upper-case "john"))
;;=> #'user.test.one/my-name
my-name
;;=> "JOHN"
(ns user.test.one
(:require [clojure.string :refer [upper-case]])
(:require [user.test.two :as two]))
;;=> nil
(def my-name (upper-case two/my-name))
;;=> #'user.test.one/my-name
my-name
;;=> "JULIE"
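At the REPL you can also load a namespace without redefining your own, by calling `require` directly. As a sketch: outside of the `ns` form `require` is a normal function, so the library spec has to be quoted.

```clojure
;; note the quote before the vector
(require '[clojure.string :as s])

(s/upper-case "john")
;;=> "JOHN"

(s/join ", " ["john" "julie"])
;;=> "john, julie"
```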
The globally accessible vars (globals) are one level of scoping. If you don’t want to have globally accessible vars then you have to use local bindings.
We already had a glimpse of these while defining functions. In fact parameters are only visible inside the function:
(defn sum
[v1 v2]
(+ v1 v2))
In this example `v1` and `v2` are only accessible inside the function. Outside the function they might be undefined or have a different value:
(def v1 "hello")
(def v2 "world")
(sum 10 25)
;;=> 35
v1
;;=> "hello"
v2
;;=> "world"
There is another way to create local bindings, which are valid only inside the s-expr block, using `let`. With the `let` form you can create local variables which are visible only inside the block.
(let [v1 23
v2 45]
;; inside this block v1 v2 have the values 23 and 45
(+ v1 v2))
;;=> 68
Outside the block `v1` and `v2` are resolved in the parent scope, which in this case is the namespace (global) scope. You can even nest `let` bindings and use them inside functions. Here we use `println` to print a message to the standard output.
(let [v1 "this is a local value"] ;; outer block
(println "outer-v1:" v1)
(let [v1 1] ;; inner block
(println "inner-v1:" v1))
(println "after-v1:" v1))
(println "global-v1:" v1) ;; global
;;=> outer-v1: this is a local value
;;=> inner-v1: 1
;;=> after-v1: this is a local value
;;=> global-v1: hello
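The bindings of a `let` are evaluated in order, so each binding can refer to the ones defined before it. A minimal sketch, with names chosen only for illustration:

```clojure
(let [price    20
      quantity 3
      total    (* price quantity)        ; uses the two bindings above
      message  (str "Total: " total)]    ; uses total
  message)
;;=> "Total: 60"
```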
Destructuring
Destructuring is a simple, yet powerful feature of Clojure. There are several ways in which you can leverage destructuring to make your code cleaner, less repetitive and less bug-prone. Destructuring is a way to unpack a collection into values and bind them to locals. It takes a bit of practice to get used to reading destructuring forms, but once you do, the code appears much cleaner. I won’t cover destructuring here, however I wrote a detailed post about the topic which you can find here: The complete guide to Clojure destructuring
Flow control
We briefly introduced `if` for flow control, which is the basic form on top of which all the others are built. There are more options for flow control in Clojure which we will see, namely `if`, `not`, `and`, `or`, `if-not`, `when`, `when-not`, `cond` and `case`.
(if condition
then
else)
The `condition` doesn’t necessarily have to be a boolean expression as, in Clojure, anything is considered to be true except `false` and `nil`. As you would expect, if the `condition` evaluates to true the `then` expression is evaluated, otherwise the `else` expression is evaluated. The overall result is the result of the expression which is evaluated.
(if (= 1 1)
"this is true"
"this is false")
;;=> "this is true"
(if (not (= 1 1))
"this is true"
"this is false")
;;=> "this is false"
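If you come from languages where `0` or empty strings count as false, the truthiness rule is worth a quick check at the REPL. A small sketch showing that only `nil` and `false` take the else branch:

```clojure
(if 0     :truthy :falsey)
;;=> :truthy
(if ""    :truthy :falsey)
;;=> :truthy
(if []    :truthy :falsey)
;;=> :truthy
(if nil   :truthy :falsey)
;;=> :falsey
(if false :truthy :falsey)
;;=> :falsey
```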
Sometimes you don’t have an `else` clause, so you can omit it.
(if (not= 1 0)
(println "that's odd"))
;;=> that's odd
;;=> nil
When you have `if` and `not` together you can combine them in `if-not`.
(if-not (= 1 0)
(println "that's odd"))
;;=> that's odd
;;=> nil
But when there is no else expression a more idiomatic way to write it in Clojure is to use the form `when`, and similarly when you have a negation in your condition you can use `when-not`.
(when (not= 1 0)
(println "that's odd"))
;;=> that's odd
;;=> nil
(when-not (= 1 0)
(println "that's odd"))
;;=> that's odd
;;=> nil
`when` accepts more than one expression and the result of the overall expression is the result of the last form, or `nil` if the condition is false.
(when true
1
2
3
4)
;;=> 4
However `if` accepts only one form for the then branch and another form for the `else`, when given. If you have to invoke several functions, perhaps with side effects, then you have to use the `do` form.
(do
1
2
3
4)
;;=> 4
(if true
(do
(println "this is executed when true")
(println "this one too.")
(println "the next line is the value returned")
:ok)
(do
(println "this is executed in the else")
:this-is-else))
;;=> this is executed when true
;;=> this one too.
;;=> the next line is the value returned
;;=> :ok
If you have to check the equality against many different values you can use `case`, which is similar to the `switch/case` of many languages. In Clojure it looks like this:
(case value
val1 expr1
val2 expr2
val3 expr3
default-exp)
(let [order-status :completed]
(case order-status
:new "We have received your order, thanks."
:processing "We are processing your order"
:ready "We are processing your order"
:shipped "Your order is on it's way"
:completed "This order has been already delivered"
"This order is not found"))
;;=> "This order has been already delivered"
If you have multiple values with the same expression you can group them in a list.
(let [order-status :ready]
(case order-status
:new "We have received your order, thanks."
(:processing :ready) "We are processing your order"
:shipped "Your order is on it's way"
:completed "This order has been already delivered"
"This order is not found"))
;;=> "We are processing your order"
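The default expression is optional, but leaving it out changes the behaviour: when no clause matches, `case` throws an `IllegalArgumentException` instead of returning `nil`. A sketch using a hypothetical `http-family` function:

```clojure
;; hypothetical example: note there is no default expression
(defn http-family [status]
  (case status
    (200 201 204) :success
    (301 302)     :redirect
    404           :not-found))

(http-family 201)
;;=> :success

(try
  (http-family 500)
  (catch IllegalArgumentException e
    (.getMessage e)))
;;=> "No matching clause: 500"
```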
Another very popular conditional form is `cond`, which is used in place of the `if/else-if/else-if/else` of other languages.
(cond
condition1 expr1
condition2 expr2
condition3 expr3
:else default-expr)
(let [age 21]
(cond
(< age 16) "You are too young to drive"
(<= 16 age 18) "You can start your driving lessons"
(>= 100 age 18) "You can drive only if you have got a license"
:else "Maybe you should let someone else driving."))
;;=> "You can drive only if you have got a license"
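The conditions are tested from top to bottom and the first truthy one wins, so the order of the clauses matters. The `:else` keyword has no special meaning: it works simply because a keyword is always truthy. A sketch with a hypothetical `describe-temperature` function:

```clojure
;; hypothetical example: each clause relies on the previous ones having failed
(defn describe-temperature [celsius]
  (cond
    (< celsius 0)  "freezing"
    (< celsius 15) "cold"
    (< celsius 25) "mild"
    :else          "hot"))

(describe-temperature -5)
;;=> "freezing"
(describe-temperature 20)
;;=> "mild"
(describe-temperature 30)
;;=> "hot"
```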
If you have complicated conditions you might have to combine the conditions logically with `and`, `or` and `not`. We’ve already seen `not`, which negates the given condition, while `and` and `or` work as you would expect.
(and
condition1
condition2
condition3)
The value of the entire expression is the value of the last condition. If a condition is found to be falsey (`false` or `nil`) the evaluation is interrupted and the whole expression will have the value of the last evaluated expression.
(and true true true)
;;=> true
(and 1 2 3 4)
;;=> 4
(and 1 2 nil 4 5)
;;=> nil
Similarly `or` accepts multiple conditions, and they are evaluated in the given order. The first condition which is found to be truthy will stop the evaluation and its value is returned as the value of the whole expression.
(or false false nil true)
;;=> true
(or false 1 nil 3)
;;=> 1
`or` is often used to provide default values for function parameters via destructuring, however it can be used in normal code as well.
(defn connection-url [config-map resource]
(let [protocol (or (:protocol config-map) "http")
hostname (or (:hostname config-map) "localhost")
port (or (:port config-map) 8080)]
(str protocol "://" hostname ":" port resource )))
(connection-url {} "/users")
;;=> "http://localhost:8080/users"
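One caveat with this idiom: since `or` skips every falsey value, an explicit `false` in the map is replaced by the default too. A sketch of the difference, using a hypothetical `:verbose` option; `get` (or the keyword itself) accepts a default which is used only when the key is missing.

```clojure
;; hypothetical configuration with an explicit false
(def config {:verbose false})

;; `or` cannot tell "missing" from "false"
(or (:verbose config) true)
;;=> true

;; the default is used only when the key is absent
(get config :verbose true)
;;=> false
(:verbose config true)
;;=> false
(get {} :verbose true)
;;=> true
```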
Obviously you can combine `and`, `or` and `not` to create arbitrarily complex conditions.
Core functions
The core has hundreds of functions defined, which all work on the basic data structures that we’ve seen so far. You can find the full list in the Clojure cheatsheet
The function: apply
For the purpose of this course we will only see a few examples, starting with `apply`. As the name suggests, it “applies” a function to a given list of arguments.
(apply f args)
(apply f x args)
(def words ["Hello" " " "world!"])
(str ["Hello" " " "world!"])
;;=> "[\"Hello\" \" \" \"world!\"]"
(apply str ["Hello" " " "world!"])
;;=> "Hello world!"
(apply str "first-argument: " ["Hello" " " "world!"])
;;=> "first-argument: Hello world!"
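`apply` is useful whenever a function expects separate arguments but you have them in a collection. A short sketch with `max` and `+`:

```clojure
;; without apply, the vector is a single argument and is returned as is
(max [3 9 2])
;;=> [3 9 2]

;; with apply, this becomes (max 3 9 2)
(apply max [3 9 2])
;;=> 9

;; leading arguments can be given before the collection: (+ 1 2 3 4)
(apply + 1 2 [3 4])
;;=> 10
```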
The function: map
Next we will see one of the most used functions in the core, `map`, which has nothing to do with the associative maps (data structures) we have seen before. `map` comes from set theory and is a function which takes a function and a sequence of values and applies the function to all values in the sequence. It returns a lazy sequence, which means that the function application is not performed when calling `map`, but it will be performed when the result is consumed.
(map f coll)
(map clojure.string/upper-case
["Hello" "world!"])
;;=> ("HELLO" "WORLD!")
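Two more behaviours of `map` that are handy to know, sketched below: it accepts several collections, passing one item from each to the function, and thanks to laziness it can work on an infinite sequence as long as you only consume part of the result. The `take` and `range` core functions are used here only for the illustration.

```clojure
;; one item from each collection is passed to the function
(map + [1 2 3] [10 20 30])
;;=> (11 22 33)

;; the result is as long as the shortest collection
(map + [1 2 3] [10 20])
;;=> (11 22)

;; (range) is infinite, but only the first three items are computed
(take 3 (map inc (range)))
;;=> (1 2 3)
```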
The function: mapcat
Sometimes the application of the function `f` returns a list of things. In the following example, applying the split function to each sentence splits it and returns a list of words.
(map #(clojure.string/split % #"\W+")
["Lorem ipsum dolor sit amet, consectetur adipiscing elit."
"Duis vel ante est."
"Pellentesque habitant morbi tristique"
"senectus et netus et malesuada fames ac turpis egestas."])
;;=> (["Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing" "elit"] ["Duis" "vel" "ante" "est"] ["Pellentesque" "habitant" "morbi" "tristique"] ["senectus" "et" "netus" "et" "malesuada" "fames" "ac" "turpis" "egestas"])
The application of the split function to a single sentence produces a list of words. Consequently the application of the function to all sentences produces a list of lists. If we would rather have a single list with all the words we then need to concatenate all the sub-lists into one. To do so Clojure core has the `concat` function, which just concatenates multiple lists into one.
(concat [0 1 2 3] [:a :b :c] '(d e f))
;;=> (0 1 2 3 :a :b :c d e f)
To obtain a single list of all words we just need to apply the `concat` function to the `map` result.
(apply concat
(map #(clojure.string/split % #"\W+")
["Lorem ipsum dolor sit amet, consectetur adipiscing elit."
"Duis vel ante est."
"Pellentesque habitant morbi tristique"
"senectus et netus et malesuada fames ac turpis egestas."]))
;;=> ("Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing" "elit" "Duis" "vel" "ante" "est" "Pellentesque" "habitant" "morbi" "tristique" "senectus" "et" "netus" "et" "malesuada" "fames" "ac" "turpis" "egestas")
This construct is common enough that Clojure has a core function that does just this, called `mapcat`.
(mapcat #(clojure.string/split % #"\W+")
["Lorem ipsum dolor sit amet, consectetur adipiscing elit."
"Duis vel ante est."
"Pellentesque habitant morbi tristique"
"senectus et netus et malesuada fames ac turpis egestas."])
;;=> ("Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing" "elit" "Duis" "vel" "ante" "est" "Pellentesque" "habitant" "morbi" "tristique" "senectus" "et" "netus" "et" "malesuada" "fames" "ac" "turpis" "egestas")
The function: reduce
Hadoop uses the two concepts of `map` and `reduce` to perform arbitrary computation on large data. Clojure has `reduce` as a core function as well. While `map` is applied one-by-one to all items with the objective of performing a transformation, `reduce` seeks to summarize many values into one. For example if you want to find the total sum of a list of values you can use `reduce` in the following way.
(reduce f coll)
It can be used with many core functions like the arithmetic functions `+` and `*`, but also with functions like `max` and `min`, which respectively return the highest and the lowest value passed. It can be used with your own functions too.
(reduce + [10 15 23 32 43 54 12 11])
;;=> 200
(reduce * [10 15 23 32 43 54 12 11])
;;=> 33838041600
(reduce max [10 15 23 32 43 54 12 11])
;;=> 54
(reduce str ["Hello" " " "world!"])
;;=> "Hello world!"
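As a sketch of using `reduce` with your own function: the reducing function receives the accumulated value and the next item. `reduce` also accepts an optional initial value, placed before the collection, which is needed when the accumulated value has a different type from the items.

```clojure
;; total number of characters: the accumulator is a number, the items are strings
(reduce (fn [total word]
          (+ total (count word)))
        0                              ; initial value of `total`
        ["Hello" " " "world!"])
;;=> 12

;; with an initial value, an empty collection just returns it
(reduce max 0 [])
;;=> 0
```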
The function: filter
The next function in the core is `filter`, which takes a predicate function and a collection and returns a lazy sequence of the items in the collection for which the application of the function returns a “truthy” value. Predicate functions are functions which take one parameter and return a logical true or false.
(filter pred? coll)
For example:
(filter odd? [0 1 2 3 4 5 6 7])
;;=> (1 3 5 7)
(filter #(> (count %) 5)
["Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing"])
;;=> ("consectetur" "adipiscing")
`identity` is a function which, given a value, will just return the value. This is often used when a function transformation is required as a parameter, but no transformation is wanted. Another idiomatic use of it is to remove `nil` and `false` from a collection.
(filter identity
["Lorem" "ipsum" nil "sit" nil "consectetur" nil])
;;=> ("Lorem" "ipsum" "sit" "consectetur")
The function `remove` is the dual of `filter`, in the sense that it will remove the items for which the predicate function returns true.
(filter odd? [0 1 2 3 4 5 6 7])
;;=> (1 3 5 7)
(remove odd? [0 1 2 3 4 5 6 7])
;;=> (0 2 4 6)
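The duality can be made explicit with the core function `complement`, which takes a predicate and returns its negation. The sketch below also chains `filter` and `map`, which is a very common combination.

```clojure
;; remove is equivalent to filter with the negated predicate
(filter (complement odd?) [0 1 2 3 4 5 6 7])
;;=> (0 2 4 6)

;; drop the nils first, then transform what is left
(map count
     (filter identity ["Lorem" "ipsum" nil "sit" nil "consectetur"]))
;;=> (5 5 3 11)
```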
The function: sort
`sort`, as you would expect, returns a sorted sequence of the elements in the given collection.
(sort coll)
(sort comp coll)
(sort [8 3 5 2 5 7 9 4 3 1 0])
;;=> (0 1 2 3 3 4 5 5 7 8 9)
(sort > [8 3 5 2 5 7 9 4 3 1 0])
;;=> (9 8 7 5 5 4 3 3 2 1 0)
(sort-by count
["Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing"])
;;=> ("sit" "amet" "Lorem" "ipsum" "dolor" "adipiscing" "consectetur")
(sort-by count >
["Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing"])
;;=> ("consectetur" "adipiscing" "Lorem" "ipsum" "dolor" "amet" "sit")
(sort-by :score >
[{:user "john1" :score 345}
{:user "fred3" :score 75}
{:user "sam2" :score 291}])
;;=> ({:user "john1", :score 345} {:user "sam2", :score 291} {:user "fred3", :score 75})
A similar function is `sort-by`, used in the last three examples above, which accepts a key function that is applied to each item before the comparison.
The function: group-by
Out of the box in Clojure you have a function to perform grouping on your data. `group-by` accepts a function and a collection; it applies the given function to all items in the collection and then groups the items using the result of the function, i.e. items that give the same result when the function is applied end up in the same group. Each group is associated with its common function result. It returns a map where each key is a group’s common function result, and the corresponding value is a vector of the items which belong to that group.
(group-by odd? (range 10))
;;=> {false [0 2 4 6 8], true [1 3 5 7 9]}
(group-by count ["Lorem" "ipsum" "dolor" "sit" "amet" "consectetur" "adipiscing"])
;;=> {5 ["Lorem" "ipsum" "dolor"], 3 ["sit"], 4 ["amet"], 11 ["consectetur"], 10 ["adipiscing"]}
(group-by :user-id [{:user-id 1 :uri "/"}
{:user-id 2 :uri "/foo"}
{:user-id 1 :uri "/account"}])
;;=> {1 [{:user-id 1, :uri "/"} {:user-id 1, :uri "/account"}], 2 [{:user-id 2, :uri "/foo"}]}
The function: frequencies
When you want to count how frequently an item appears in a collection, for example to compute histograms, you can use the function called `frequencies`.
(frequencies ["john" "fred" "alice" "fred" "jason" "john" "alice" "john"])
;;=> {"john" 3, "fred" 2, "alice" 2, "jason" 1}
(frequencies [1 2 3 1 2 3 2 3 1 2 3 3 2 3 2 3 4 4])
;;=> {1 3, 2 6, 3 7, 4 2}
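To get a feel for what `frequencies` does, here is a rough sketch of how a similar function could be written with `reduce`. The `my-frequencies` name is hypothetical and this is not how the core function is actually implemented.

```clojure
;; hypothetical re-implementation, for illustration only
(defn my-frequencies [coll]
  (reduce (fn [counts item]
            ;; read the current count (0 when missing) and store it incremented
            (assoc counts item (inc (get counts item 0))))
          {}
          coll))

(my-frequencies ["john" "fred" "john"])
;;=> {"john" 2, "fred" 1}
```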
The function: partition
Another interesting group of functions in the Clojure core are `partition`, `partition-all` and `partition-by`. Here we will see only the first two. `partition` chunks the given sequence into (lazy) sub-sequences of `n` items each.
(partition n coll)
(partition n step coll)
(partition 3 (range 11))
;;=> ((0 1 2) (3 4 5) (6 7 8))
`partition-all` does the same, but it also returns the chunks which are incomplete.
(partition-all 3 (range 11))
;;=> ((0 1 2) (3 4 5) (6 7 8) (9 10))
The `step` parameter tells the function how many items it has to move forward after every chunk. If not given, `step` is equal to `n`.
(partition 3 1 (range 11))
;;=> ((0 1 2) (1 2 3) (2 3 4) (3 4 5) (4 5 6) (5 6 7) (6 7 8) (7 8 9) (8 9 10))
(partition 3 5 (range 11))
;;=> ((0 1 2) (5 6 7))
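A step smaller than `n` produces overlapping chunks, which is a handy way to build a sliding window. As a sketch, the following computes the difference between consecutive values; the `readings` data is made up for the example.

```clojure
;; made-up sample data
(def readings [1 4 9 16 25])

;; pairs of consecutive items
(partition 2 1 readings)
;;=> ((1 4) (4 9) (9 16) (16 25))

;; difference between each item and the previous one
(map #(- (second %) (first %))
     (partition 2 1 readings))
;;=> (3 5 7 9)
```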
The function: into
`into` is used to create a new collection of a given type with all items from another collection put “into” it. Items are conjoined using `conj`. It is often used to change the type of a collection, or to build a map out of key/value pairs.
(into dest source)
(into [] '(0 1 2 3 4 5 6 7 8 9))
;;=> [0 1 2 3 4 5 6 7 8 9]
(into '() '(0 1 2 3 4 5 6 7 8 9))
;;=> (9 8 7 6 5 4 3 2 1 0)
(into (sorted-map) {:b 2, :c 3, :a 1})
;;=> {:a 1, :b 2, :c 3}
(into {} [[:a 1] [:b 2] [:c 3]])
;;=> {:a 1, :b 2, :c 3}
(map (fn [e] [(first e) (inc (second e))])
{:a 1, :b 2, :c 3})
;;=> ([:a 2] [:b 3] [:c 4])
(into {}
(map (fn [e] [(first e) (inc (second e))])
{:a 1, :b 2, :c 3}))
;;=> {:a 2, :b 3, :c 4}
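The destination doesn’t have to be empty, and because items are added with `conj` the result depends on the destination type. A small sketch:

```clojure
;; vectors grow at the end
(into [1 2] [3 4])
;;=> [1 2 3 4]

;; lists grow at the front
(into '(1 2) [3 4])
;;=> (4 3 1 2)

;; maps are merged
(into {:a 1} {:b 2})
;;=> {:a 1, :b 2}

;; sets discard duplicates
(= #{1 2 3} (into #{} [1 2 2 3 3 3]))
;;=> true
```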
The function: juxt
This function takes a set of functions, and returns a function which, when called with an argument, returns a vector with all the functions applied to the argument in the given order.
(juxt f1 f2 f3 ... fn)
It returns a function which is equivalent to:
(fn [x] (vector (f1 x) (f2 x) (f3 x) ...))
(def string-info
(juxt identity clojure.string/upper-case count frequencies))
(string-info "Hello World")
;;=> ["Hello World" "HELLO WORLD" 11 {\H 1, \e 1, \l 3, \o 2, \space 1, \W 1, \r 1, \d 1}]
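Since keywords can be used as functions on maps, `juxt` is a compact way to extract several fields at once. Here is a sketch combining it with `map` and `into`; the `users` data is made up for the example.

```clojure
;; made-up sample data
(def users [{:user "john1" :score 345}
            {:user "fred3" :score 75}
            {:user "sam2"  :score 291}])

;; extract a [user score] pair from every map
(map (juxt :user :score) users)
;;=> (["john1" 345] ["fred3" 75] ["sam2" 291])

;; the pairs can be poured into a map to build an index
(into {} (map (juxt :user :score) users))
;;=> {"john1" 345, "fred3" 75, "sam2" 291}
```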
Operation with files
To open, read and write files there are wrappers around the Java machinery for files. However here we will only see how to read and write text files which are small enough to fit in memory.
To write some text to a file you can use the function `spit`, while to read the content of a file as a string you can use `slurp`.
(spit "/tmp/my-file.txt"
"This is the content")
;;=> nil
(slurp "/tmp/my-file.txt")
;;=> "This is the content"
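By default `spit` overwrites the file. To add to an existing file you can pass the `:append true` option, as in this sketch. The temporary file and the `tmp-file` name are assumptions made so the example doesn’t touch any real file.

```clojure
;; create a scratch file in the system temp directory
(def tmp-file
  (str (java.io.File/createTempFile "learn-clojure" ".txt")))

(spit tmp-file "first line\n")

;; without :append true this call would replace the content
(spit tmp-file "second line\n" :append true)

(slurp tmp-file)
;;=> "first line\nsecond line\n"
```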
Error handling
What happens if the file you are trying to read doesn’t exist? Or the device you are trying to write to is full? The underlying Java APIs will throw an exception. Clojure provides access to the Java machinery for error handling and you can use `try`, `catch`, `finally` and `throw` with the same semantics as their Java counterparts. You have to surround the code which might throw an exception with a `try` form, then you can handle the errors by their native type with a `catch` block. `finally` is a block that gets executed no matter what happens in the try block and whether or not an exception is raised. `throw` is used to throw an exception from your own code.
(slurp "/this_doesnt_exists.txt")
;;=> FileNotFoundException /this_doesnt_exists.txt (No such file or directory)
(try
(slurp "/this_doesnt_exists.txt")
(catch Exception x
(println "unable to read file.")
""))
;;=> unable to read file.
;;=> ""
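The example above doesn’t show `throw` and `finally`, so here is a minimal sketch of both. The `parse-age` function is hypothetical; `ex-info` is a core function which creates an exception carrying a map of data, and `ex-data` reads that map back.

```clojure
;; hypothetical example: throws for invalid input
(defn parse-age [s]
  (let [age (Integer/parseInt s)]      ; throws NumberFormatException on bad input
    (if (neg? age)
      (throw (ex-info "age cannot be negative" {:age age}))
      age)))

(try
  (parse-age "abc")
  (catch NumberFormatException e
    :not-a-number)
  (finally
    ;; runs whether or not an exception was thrown
    (println "done parsing")))
;;=> done parsing
;;=> :not-a-number

(try
  (parse-age "-3")
  (catch clojure.lang.ExceptionInfo e
    (ex-data e)))
;;=> {:age -3}
```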
Oftentimes while working with network requests, you might want to retry a given request a number of times before giving up. In such cases there is a library called safely which might be handy.
Macros
Macros are functions which are executed at compile time by the compiler. They take code as input, and the output is still code. The code is expressed in the same stuff you have seen so far: lists, symbols, keywords, vectors, maps, strings etc., and from a user point of view they look just like normal Clojure functions (almost). It is a great way to extend the language to meet your domain needs. However I think this is a topic for a more advanced course. If you want to learn the basics of macros you can read the following blog post: