

Julia 0.5 Highlights
Posted by Julia Computing – May 10, 2017

To follow along with the examples in this blog post and run them live, you can go to JuliaBox, create a free login, and open the “Julia 0.5 Highlights” notebook under “What’s New in 0.5”. The notebook can also be downloaded from here.
Julia 0.5 is a pivotal release. It introduces more transformative features than any release since the first official version. Moreover, several of these features set the stage for even more to come in the lead up to Julia 1.0. In this post, we’ll go through some of the major changes in 0.5, including improvements to functional programming, comprehensions, generators, arrays, strings, and more.
Functions
Julia has always supported functional programming features:
- anonymous functions (lambdas),
- inner functions that close over local variables (closures),
- functions passed to and from other functions (first-class and higher-order functions).
Before this release, however, these features all came with a significant performance cost. In a language that targets high-performance technical computing, that’s a serious limitation. So the Julia standard library and ecosystem have been rife with work-arounds to get the expressiveness of functional programming without the performance problems. But the right solution, of course, is to make functional programming fast – ideally just as fast as the optimal hand-written version of your code would be. In Julia 0.5, it is. And that changes everything.
This change is so important that there will be a separate blog post about it in the coming weeks, explaining how higher-order functions, closures and lambdas have been made so efficient, as well as detailing the kinds of zero-cost abstractions these changes enable. But for now, I’ll just tease with a little timing comparison. First, some definitions – they’re the same in both 0.4 and 0.5:
v = rand(10^7); # 10 million random numbers
double_it_vec(v) = 2v # vectorized doubling of input
double_it_map(v) = map(x->2x, v) # map a lambda over input
First, a timing comparison in Julia 0.4:
julia> VERSION
v"0.4.7"
julia> mean([@elapsed(double_it_vec(v)) for _=1:100])
0.024444888209999998
julia> mean([@elapsed(double_it_map(v)) for _=1:100])
0.5515606454499999
On 0.4, the functional version using map is 22 times slower than the vectorized version, which uses specialized generated code for maximal speed. Now, the same comparison in Julia 0.5:
julia> VERSION
v"0.5.0"
julia> mean([@elapsed(double_it_vec(v)) for _=1:100])
0.024549842180000003
julia> mean([@elapsed(double_it_map(v)) for _=1:100])
0.023871925960000002
The version using map is as fast as the vectorized one in 0.5. In this case, writing 2v happens to be more convenient than writing map(x->2x, v), so we may choose not to use map here, but there are many cases where functional constructs are clearer, more general, and more convenient. Now, they are also fast.
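As a small illustrative sketch (not one of the timings above), here are two such cases: a sum of squares and a conditional count, each written directly with a higher-order function and a lambda, with no intermediate array allocated:

```julia
v = rand(10^7)

# Sum of squares: the lambda is applied element by element,
# with no temporary array allocated along the way.
s = sum(x -> x^2, v)

# Count the elements above 0.5 using a predicate function.
n = count(x -> x > 0.5, v)
```

Before 0.5, the lambdas here would have carried a real performance penalty; now these read like the math and run like the loop.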
Ambiguous methods
One design decision that any multiple dispatch language must make is how to handle dispatch ambiguities: cases where none of the methods applicable to a given set of arguments is more specific than the rest. Suppose, for example, that a generic function, f, has the following methods:
f(a::Int, b::Real) = 1
f(a::Real, b::Int) = 2
In Julia 0.4 and earlier, the second method definition causes an ambiguity warning:
WARNING: New definition
f(Real, Int64) at none:1
is ambiguous with:
f(Int64, Real) at none:1.
To fix, define
f(Int64, Int64)
before the new definition.
This warning is clear and gets right to the point: the case f(a,b) where a and b are of type Int (aka Int64 on 64-bit systems) is ambiguous. Evaluating f(3,4) calls the first method of f – but this behavior is undefined. Giving a warning whenever methods could be ambiguous is a fairly conservative choice: it urges people to define a method covering the ambiguous intersection before even defining the methods that overlap. When we decided to give warnings for potentially ambiguous methods, we hoped that people would avoid ambiguities and all would be well in the world.
Warning about method ambiguities turns out to be both too strict and too lenient. It’s far too easy for ambiguities to arise when shared generic functions serve as extension points across unrelated packages. When many packages extend the same generic functions, it’s common for the methods added to have some ambiguous overlap. This happens even when each package has no ambiguities on its own. Worse still, slight changes to one package can introduce ambiguities elsewhere, resulting in the least fun game of whack-a-mole ever. At the same time, the fact that ambiguities only cause warnings means that people learn to ignore them, which is annoying at best, and dangerous at worst: it’s far too easy for a real problem to be hidden by a barrage of insignificant ambiguity warnings. In particular, on 0.4 and earlier if an ambiguous method is actually called, no error occurs. Instead, one of the possible methods is called, based on the order in which methods were defined – which is essentially arbitrary when they come from different packages. Usually the method works – it does apply, after all – but this is clearly not the right thing to do.
The solution is simple: in Julia 0.5 the existence of potential ambiguities is fine, but actually calling an ambiguous method is an immediate error. The above method definitions for f, which previously triggered a warning, are now silent, but calling f with two Int arguments is a method dispatch error:
julia> f(3,4)
ERROR: MethodError: f(::Int64, ::Int64) is ambiguous. Candidates:
f(a::Real, b::Int64) at REPL[2]:1
f(a::Int64, b::Real) at REPL[1]:1
in eval(::Module, ::Any) at ./boot.jl:231
in macro expansion at ./REPL.jl:92 [inlined]
in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:46
This improves the experience of using the Julia package ecosystem considerably, while also making Julia safer and more reliable. No more torrent of insignificant ambiguity warnings. No more playing ambiguity whack-a-mole when someone else refactors their code and accidentally introduces ambiguities in yours. No more risk that a method call could be silently broken because of warnings that we’ve all learned to ignore.
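When the ambiguous case does need to work, the remedy is the same one the old warning suggested: define a method covering the intersection of the ambiguous signatures. A minimal sketch:

```julia
f(a::Int, b::Real) = 1
f(a::Real, b::Int) = 2

# Defining a method for the ambiguous intersection resolves it:
# this signature is more specific than both of the methods above.
f(a::Int, b::Int) = 3

f(3, 4)    # dispatches to the new method, returning 3
f(3, 4.0)  # still unambiguous: matches only the first method
```

The difference in 0.5 is simply that this resolution is only required when the intersection is actually exercised, not preemptively at definition time.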
Return type annotations
A long-requested feature has been the ability to annotate method definitions with an explicit return type. This aids the clarity of code, serves as self-documentation, helps the compiler reason about code, and ensures that return types are what programmers intend them to be. In 0.5, you can annotate method definitions with a return type like so:
function clip{T<:Real}(x::T, lo::Real, hi::Real)::T
    if x < lo
        return lo
    elseif x > hi
        return hi
    else
        return x
    end
end
This function is similar to the built-in clamp function, but let’s consider this definition for the sake of example. The return annotation on clip has the effect of inserting implicit calls to x->convert(T, x) at each return point of the method. It has no effect on any other method of clip, only the one where the annotation occurs. In this case, the annotation ensures that this method always returns a value of the same type as x, regardless of the types of lo and hi:
julia> clip(0.5, 1, 2) # convert(T, lo)
1.0
julia> clip(1.5, 1, 2) # convert(T, x)
1.5
julia> clip(2.5, 1, 2) # convert(T, hi)
2.0
You’ll note that the annotated return type here is T, which is a type parameter of the clip method. Not only is that allowed, but the return type can be an arbitrary expression of argument values, type parameters, and values from outer scopes. For example, here is a variation that promotes its arguments:
function clip2(x::Real, lo::Real, hi::Real)::promote_type(typeof(x), typeof(lo), typeof(hi))
    if x < lo
        return lo
    elseif x > hi
        return hi
    else
        return x
    end
end
julia> clip2(2, 1, 3)
2
julia> clip2(2, 1, 13//5)
2//1
julia> clip2(2.5, 1, 13//5)
2.5
Return type annotations are a fairly simple syntactic transformation, but they make it easier to write methods with consistent and predictable return types. If different branches of your code can lead to slightly different types, the fix is now as simple as putting a single type annotation on the entire method.
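To make that concrete, here is an illustrative definition (not from the post) whose branches return different types, made predictable with a single annotation:

```julia
# Without an annotation, the return type depends on the branch taken:
# x ÷ 2 yields an Int, while x / 2 yields a Float64.
halve_unstable(x::Int) = iseven(x) ? x ÷ 2 : x / 2

# With a return type annotation, every return value is passed
# through convert(Float64, ...), so the type is always the same.
halve(x::Int)::Float64 = iseven(x) ? x ÷ 2 : x / 2
```

With the annotation, halve(4) returns the Float64 value 2.0 rather than the Int 2, so callers see one consistent type from both branches.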
Vectorized function calls
Julia 0.5 introduces the syntax f.(A1, A2, ...) for vectorized function calls. This syntax translates to broadcast(f, A1, A2, ...), where broadcast is a higher-order function (introduced in 0.2), which generically implements the kind of broadcasting behavior found in Julia’s “dotted operators” such as .+, .-, .*, and ./. Since higher-order functions are now efficient, writing broadcast(f,v,w) and f.(v,w) are both about as fast as loops specialized for the operation f and the shapes of v and w. This syntax lets you vectorize your scalar functions the way built-in vectorized functions like log, exp, and atan2 work. In fact, in the future, this syntax will likely replace the pre-vectorized methods of functions like exp and log, so that users will write exp.(v) to exponentiate a vector of values. This may seem a little bit uglier, but it’s more consistent than choosing an essentially arbitrary set of functions to pre-vectorize, and as I’ll explain below, this approach can also have significant performance benefits.
To give a more concrete sense of what this syntax can be used for, consider the clip function defined above for real arguments. This scalar function can be applied to vectors using vectorized call syntax without any further method definitions:
julia> v = randn(10)
10-element Array{Float64,1}:
-0.868996
1.79301
-0.309632
1.16802
-1.57178
-0.223385
-0.608423
-1.54862
-1.33672
0.864448
julia> clip(v, -1, 1)
ERROR: MethodError: no method matching clip(::Array{Float64,1}, ::Int64, ::Int64)
Closest candidates are:
clip{T<:Real}(::T<:Real, ::Real, ::Real) at REPL[2]:2
julia> clip.(v, -1, 1)
10-element Array{Float64,1}:
-0.868996
1.0
-0.309632
1.0
-1.0
-0.223385
-0.608423
-1.0
-1.0
0.864448
The second and third arguments don’t need to be scalars – as with dotted operators, they can be vectors as well, and the clip operation will be applied to each corresponding triple of values:
julia> clip.(v, repmat([-1,0.5],5), repmat([-0.5,1],5))
10-element Array{Float64,1}:
-0.868996
1.0
-0.5
1.0
-1.0
0.5
-0.608423
0.5
-1.0
0.864448
From these examples, it may be unclear why this operation is called “broadcast”. The function gets its name from the following behavior: wherever one of its arguments has a singleton dimension (i.e. a dimension of size 1), it “broadcasts” that value along the corresponding dimension of the other arguments when applying the operator. Broadcasting allows dotted operations to easily do handy tricks like mean-centering the columns of a matrix:
julia> A = rand(3,4);
julia> B = A .- mean(A,1)
3×4 Array{Float64,2}:
0.343976 0.427378 -0.503356 -0.00448691
-0.210096 -0.531489 0.168928 -0.128212
-0.13388 0.104111 0.334428 0.132699
julia> mean(B,1)
1×4 Array{Float64,2}:
0.0 0.0 0.0 0.0
The matrix A is 3×4 and mean(A,1) is 1×4, so the .- operator broadcasts the subtraction of each mean value along the corresponding column of A, thereby mean-centering each column. Combining this broadcasting behavior with vectorized call syntax lets us write some fairly fancy custom array operations very concisely:
julia> clip.(B, [-0.3, -0.2, -0.1], [0.4, 0.3, 0.2, 0.1]')
3×4 Array{Float64,2}:
0.343976