Julia 0.5 Highlights

To follow along with the examples in this blog post and run them live, you can go to JuliaBox, create a free login, and open the “Julia 0.5 Highlights” notebook under “What’s New in 0.5”. The notebook can also be downloaded from here.

Julia 0.5 is a pivotal release. It introduces more transformative features than any release since the first official version. Moreover, several of these features set the stage for even more to come in the lead up to Julia 1.0. In this post, we’ll go through some of the major changes in 0.5, including improvements to functional programming, comprehensions, generators, arrays, strings, and more.

Functions

Julia has always supported functional programming features: higher-order functions like map and filter, anonymous functions (lambdas), and closures that capture variables from enclosing scopes.
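
Here's a tiny demonstration of all three (my example, not from the original post):

julia> map(x -> 2x, [1, 2, 3])  # a lambda passed to the higher-order function map
3-element Array{Int64,1}:
 2
 4
 6

julia> adder(n) = x -> x + n;   # adder returns a closure that captures n

julia> map(adder(10), [1, 2, 3])
3-element Array{Int64,1}:
 11
 12
 13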

Before this release, however, these features all came with a significant performance cost. In a language that targets high-performance technical computing, that’s a serious limitation. So the Julia standard library and ecosystem have been rife with work-arounds to get the expressiveness of functional programming without the performance problems. But the right solution, of course, is to make functional programming fast – ideally just as fast as the optimal hand-written version of your code would be. In Julia 0.5, it is. And that changes everything.

This change is so important that there will be a separate blog post about it in the coming weeks, explaining how higher-order functions, closures and lambdas have been made so efficient, as well as detailing the kinds of zero-cost abstractions these changes enable. But for now, I’ll just tease with a little timing comparison. First, some definitions – they’re the same in both 0.4 and 0.5:

v = rand(10^7);                   # 10 million random numbers
double_it_vec(v) = 2v             # vectorized doubling of input
double_it_map(v) = map(x->2x, v)  # map a lambda over input

First, a timing comparison in Julia 0.4:

julia> VERSION
v"0.4.7"

julia> mean([@elapsed(double_it_vec(v)) for _=1:100])
0.024444888209999998

julia> mean([@elapsed(double_it_map(v)) for _=1:100])
0.5515606454499999

On 0.4, the functional version using map is 22 times slower than the vectorized version, which uses specialized generated code for maximal speed. Now, the same comparison in Julia 0.5:

julia> VERSION
v"0.5.0"

julia> mean([@elapsed(double_it_vec(v)) for _=1:100])
0.024549842180000003

julia> mean([@elapsed(double_it_map(v)) for _=1:100])
0.023871925960000002

The version using map is as fast as the vectorized one in 0.5. In this case, writing 2v happens to be more convenient than writing map(x->2x, v), so we may choose not to use map here, but there are many cases where functional constructs are clearer, more general, and more convenient. Now, they are also fast.
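
As one small illustration (my example, not from the original post), a functional sum of squares via mapreduce now also runs at compiled-loop speed:

julia> mapreduce(x -> x^2, +, [1, 2, 3])  # 1 + 4 + 9
14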

Ambiguous methods

One design decision that any multiple dispatch language must make is how to handle dispatch ambiguities: cases where none of the methods applicable to a given set of arguments is more specific than the rest. Suppose, for example, that a generic function, f, has the following methods:

f(a::Int, b::Real) = 1
f(a::Real, b::Int) = 2

In Julia 0.4 and earlier, the second method definition causes an ambiguity warning:

WARNING: New definition
    f(Real, Int64) at none:1
is ambiguous with:
    f(Int64, Real) at none:1.
To fix, define
    f(Int64, Int64)
before the new definition.

This warning is clear and gets right to the point: the case f(a,b) where a and b are of type Int (aka Int64 on 64-bit systems) is ambiguous. Evaluating f(3,4) calls the first method of f – but this behavior is undefined. Giving a warning whenever methods could be ambiguous is a fairly conservative choice: it urges people to define a method covering the ambiguous intersection before even defining the methods that overlap. When we decided to give warnings for potentially ambiguous methods, we hoped that people would avoid ambiguities and all would be well in the world.

Warning about method ambiguities turns out to be both too strict and too lenient. It’s far too easy for ambiguities to arise when shared generic functions serve as extension points across unrelated packages. When many packages extend the same generic functions, it’s common for the methods added to have some ambiguous overlap. This happens even when each package has no ambiguities on its own. Worse still, slight changes to one package can introduce ambiguities elsewhere, resulting in the least fun game of whack-a-mole ever. At the same time, the fact that ambiguities only cause warnings means that people learn to ignore them, which is annoying at best, and dangerous at worst: it’s far too easy for a real problem to be hidden by a barrage of insignificant ambiguity warnings. In particular, on 0.4 and earlier if an ambiguous method is actually called, no error occurs. Instead, one of the possible methods is called, based on the order in which methods were defined – which is essentially arbitrary when they come from different packages. Usually the method works – it does apply, after all – but this is clearly not the right thing to do.

The solution is simple: in Julia 0.5 the existence of potential ambiguities is fine, but actually calling an ambiguous method is an immediate error. The above method definitions for f, which previously triggered a warning, are now silent, but calling f with two Int arguments is a method dispatch error:

julia> f(3,4)
ERROR: MethodError: f(::Int64, ::Int64) is ambiguous. Candidates:
  f(a::Real, b::Int64) at REPL[2]:1
  f(a::Int64, b::Real) at REPL[1]:1
 in eval(::Module, ::Any) at ./boot.jl:231
 in macro expansion at ./REPL.jl:92 [inlined]
 in (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:46
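
As the old warning suggested, the fix is still to define a method covering the ambiguous intersection; once it exists, the call dispatches to it unambiguously (the return value 3 here is arbitrary, chosen for illustration):

julia> f(a::Int, b::Int) = 3
f (generic function with 3 methods)

julia> f(3,4)
3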

This improves the experience of using the Julia package ecosystem considerably, while also making Julia safer and more reliable. No more torrent of insignificant ambiguity warnings. No more playing ambiguity whack-a-mole when someone else refactors their code and accidentally introduces ambiguities in yours. No more risk that a method call could be silently broken because of warnings that we’ve all learned to ignore.

Return type annotations

A long-requested feature has been the ability to annotate method definitions with an explicit return type. This aids the clarity of code, serves as self-documentation, helps the compiler reason about code, and ensures that return types are what programmers intend them to be. In 0.5, you can annotate method definitions with a return type like so:

function clip{T<:Real}(x::T, lo::Real, hi::Real)::T
    if x < lo
        return lo
    elseif x > hi
        return hi
    else
        return x
    end
end

This function is similar to the built-in clamp function, but let’s consider this definition for the sake of example. The return annotation on clip has the effect of inserting an implicit convert(T, ...) call at each return point of the method. It affects only the method where the annotation occurs, not any other method of clip. In this case, the annotation ensures that this method always returns a value of the same type as x, regardless of the types of lo and hi:

julia> clip(0.5, 1, 2) # convert(T, lo)
1.0

julia> clip(1.5, 1, 2) # convert(T, x)
1.5

julia> clip(2.5, 1, 2) # convert(T, hi)
2.0
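
Conceptually, the annotation behaves as if every return value were wrapped in a conversion, roughly like this hand-desugared sketch (the name clip_desugared is mine; this is not the actual lowered code):

function clip_desugared{T<:Real}(x::T, lo::Real, hi::Real)
    if x < lo
        return convert(T, lo)   # inserted by the return annotation
    elseif x > hi
        return convert(T, hi)
    else
        return convert(T, x)
    end
end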

You’ll note that the annotated return type here is T, which is a type parameter of the clip method. Not only is that allowed, but the return type can be an arbitrary expression of argument values, type parameters, and values from outer scopes. For example, here is a variation that promotes its arguments:

function clip2(x::Real, lo::Real, hi::Real)::promote_type(typeof(x), typeof(lo), typeof(hi))
    if x < lo
        return lo
    elseif x > hi
        return hi
    else
        return x
    end
end

julia> clip2(2, 1, 3)
2

julia> clip2(2, 1, 13//5)
2//1

julia> clip2(2.5, 1, 13//5)
2.5

Return type annotations are a fairly simple syntactic transformation, but they make it easier to write methods with consistent and predictable return types. If different branches of your code can lead to slightly different types, the fix is now as simple as putting a single type annotation on the entire method.
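
For example (a hypothetical method of my own), a function whose branches would otherwise return a mix of Int and Float64 becomes type-stable with a single annotation:

function halve_positive(x::Real)::Float64
    x <= 0 && return 0   # an Int literal, converted to 0.0 by the annotation
    return x / 2         # converted to Float64 if the division yields another Real type
end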

Vectorized function calls

Julia 0.5 introduces the syntax f.(A1, A2, ...) for vectorized function calls. This syntax translates to broadcast(f, A1, A2, ...), where broadcast is a higher-order function (introduced in 0.2) that generically implements the kind of broadcasting behavior found in Julia’s “dotted operators” such as .+, .-, .*, and ./. Since higher-order functions are now efficient, broadcast(f,v,w) and f.(v,w) are both about as fast as a loop specialized for the operation f and the shapes of v and w. This syntax lets you vectorize your scalar functions the way built-in vectorized functions like log, exp, and atan2 work. In fact, in the future, this syntax will likely replace the pre-vectorized methods of functions like exp and log, so that users will write exp.(v) to exponentiate a vector of values. This may seem a little bit uglier, but it’s more consistent than choosing an essentially arbitrary set of functions to pre-vectorize, and as I’ll explain below, this approach can also have significant performance benefits.
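
A quick check of that equivalence (g is a throwaway function of my own):

julia> g(x, y) = 3x + 4y;

julia> g.([1, 2], [3, 4]) == broadcast(g, [1, 2], [3, 4])
true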

To give a more concrete sense of what this syntax can be used for, consider the clip function defined above for real arguments. This scalar function can be applied to vectors using vectorized call syntax without any further method definitions:

julia> v = randn(10)
10-element Array{Float64,1}:
 -0.868996
  1.79301
 -0.309632
  1.16802
 -1.57178
 -0.223385
 -0.608423
 -1.54862
 -1.33672
  0.864448

julia> clip(v, -1, 1)
ERROR: MethodError: no method matching clip(::Array{Float64,1}, ::Int64, ::Int64)
Closest candidates are:
  clip{T<:Real}(::T<:Real, ::Real, ::Real) at REPL[2]:2

julia> clip.(v, -1, 1)
10-element Array{Float64,1}:
 -0.868996
  1.0
 -0.309632
  1.0
 -1.0
 -0.223385
 -0.608423
 -1.0
 -1.0
  0.864448

The second and third arguments don’t need to be scalars – as with dotted operators, they can be vectors as well, and the clip operation will be applied to each corresponding triple of values:

julia> clip.(v, repmat([-1,0.5],5), repmat([-0.5,1],5))
10-element Array{Float64,1}:
 -0.868996
  1.0
 -0.5
  1.0
 -1.0
  0.5
 -0.608423
  0.5
 -1.0
  0.864448

From these examples, it may be unclear why this operation is called “broadcast”. The function gets its name from the following behavior: wherever one of its arguments has a singleton dimension (i.e. dimension of size 1), it “broadcasts” that value along the corresponding dimension of the other arguments when applying the operator. Broadcasting allows dotted operations to easily do handy tricks like mean-centering the columns of a matrix:

julia> A = rand(3,4);

julia> B = A .- mean(A,1)
3×4 Array{Float64,2}:
  0.343976   0.427378  -0.503356  -0.00448691
 -0.210096  -0.531489   0.168928  -0.128212
 -0.13388    0.104111   0.334428   0.132699

julia> mean(B,1)
1×4 Array{Float64,2}:
 0.0  0.0  0.0  0.0

The matrix A is 3×4 and mean(A,1) is 1×4 so the .- operator broadcasts the subtraction of each mean value along the corresponding column of A, thereby mean-centering each column. Combining this broadcasting behavior with vectorized call syntax lets us write some fairly fancy custom array operations very concisely:

julia> clip.(B, [-0.3, -0.2, -0.1], [0.4, 0.3, 0.2, 0.1]')
3×4 Array{Float64,2}:
  0.343976   0.3        -0.3        -0.00448691
 -0.2       -0.2         0.168928   -0.128212
 -0.1        0.104111    0.2         0.1

Here the column vector supplies a per-row lower bound and the transposed row vector a per-column upper bound, so each element B[i,j] is clipped to the interval [lo[i], hi[j]] in a single broadcasted call.