Publication quality plots in Julia

2017/12/01 julia plots

In light of recent discussions on Julia's Discourse forum about getting “publication-quality” or simply “nice” plots in Julia, I thought it would be worthwhile to briefly summarize what works for me.[1] If you are a seasoned Julia user, this post may have nothing new for you, but I hope that newcomers to Julia find it useful.

Generate the data

I try to separate data generation and plotting. The first may be time-consuming (some calculations can take hours or days to run), and I find it best to save the results independently of any plotting. Recently I was at a conference where a presentation on a really interesting topic had some plots that were extremely hard to read: if I remember correctly, something like 10×2 subplots, with almost all fine detail lost to the resolution of the projector or the human eye. When someone in the audience asked about this, the presenting author replied that he was aware of the issue, but remaking the plots would involve rerunning the calculations, which take weeks. Saving the data separately ensures that you are never in this situation; you can also benefit from updates to plotting libraries when tweaking your plots.
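A minimal sketch of this separation, assuming Julia 1.x: the hypothetical helper load_or_compute runs the expensive step only when no saved result exists yet. It uses the Serialization standard library purely as a stand-in for whatever file format you prefer, and the temporary path and the calculation are placeholders.

```julia
using Serialization  # standard library; a stand-in for your preferred format

# Hypothetical helper: run `compute` only when no saved result exists at `path`.
function load_or_compute(compute, path)
    if isfile(path)
        return deserialize(path)      # reuse the saved result
    end
    result = compute()                # the expensive step
    serialize(path, result)           # save independently of any plotting
    return result
end

# Placeholder location; in practice this would be a stable project path.
path = joinpath(mktempdir(), "results.bin")

results = load_or_compute(path) do
    [sum(abs2, randn(100)) for _ in 1:1000]   # placeholder for a long calculation
end
```

Calling load_or_compute again with the same path returns the saved values without rerunning the calculation, so plotting scripts can be rerun as often as needed.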

For saving results, JLD2 is probably the most convenient tool: while it is technically a work in progress, it is stable and fast.[2] The key question is where to save the data: I find it best to use a consistent path that you can just include in scripts.

You have several options:

  1. define a global variable for your projects in your ~/.juliarc.jl (~/.julia/config/startup.jl in Julia 1.0 and later), and construct a path with joinpath,

  2. if you have packaged your code, Pkg.dir can be used to obtain a subdirectory in the package root,

  3. if your code is in a module, you can wrap @__DIR__ in a function to obtain a directory.

For this blog post I used the first option, while in practice I use the second and the third.
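As a rough sketch of the first and third options, assuming Julia 1.x: the names PROJECTS_ROOT, project_path, and data_path, the "projects" directory, and the "data" subdirectory are all hypothetical. The second option is left out because it needs an installed package to be runnable.

```julia
# Option 1: a global defined once in the startup file.
# PROJECTS_ROOT and its value are hypothetical.
const PROJECTS_ROOT = joinpath(homedir(), "projects")
project_path(parts...) = joinpath(PROJECTS_ROOT, parts...)

# Option 3: wrap @__DIR__ in a function (normally placed inside your module),
# so that paths are resolved relative to the source file that defines it.
# The "data" subdirectory is an assumption.
data_path(parts...) = joinpath(@__DIR__, "data", parts...)

project_path("plot-workflow", "data.jld2")   # path under the projects root
data_path("data.jld2")                       # path next to the source file
```

Either way, scripts only ever mention the file name, and the location is decided in one place.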

To illustrate plots, I use the code below to generate random variates for sample skewness, and save it.

download as data.jl

using StatsBase                 # for skewness
using JLD2                      # saving data
cd(joinpath(BLOG_POSTS, "plot-workflow")) # default path; BLOG_POSTS is the global from the startup file
sample_skewness = [skewness(randn(100)) for _ in 1:1000]
@save "data.jld2" sample_skewness # save data

Make the plot

No plotting so far, so let's remedy that. I use Plots.jl, which is a metapackage that unifies syntax for plotting via various plotting backends in Julia. I find this practical, because I can quickly switch backends for different purposes, and experiment with various options when I find the output suboptimal. The price you pay for this flexibility is compilation time, a known issue which means that you have to wait a bit to get your first plot. Separating plotting and data generation has the advantage that once I fire up the plotting infrastructure, I switch to “plotting mode” and clean up several plots at the same time.

Users frequently ask what the “best” backend is. This all depends on your needs, but these days I use the pgfplots() backend almost exclusively.[3] The gr() backend is also useful, because it is very fast.

Time to tweak the plot! I find the attributes documentation the most useful for this. For this plot I need axis labels, a title, and prefer to disable the legend since I am plotting a single series. I am also using LaTeXStrings.jl, which means that I can use LaTeX-compatible syntax for labels seamlessly (notice the L before the string).

download as plot.jl

using JLD2                      # loading data
using Plots; pgfplots()         # PGFPlots backend
using LaTeXStrings              # nice LaTeX strings
cd(joinpath(BLOG_POSTS, "plot-workflow")) # default path
@load "data.jld2"                         # load data
# make plot and tweak; this is the end result
plot(histogram(sample_skewness, normalize = true),
     xlab = L"\gamma_1", fillcolor = :lightgray,
     yaxis = ("frequency", (0, 2)), title = "sample skewness", legend = false)
# finally save
savefig("sample_skewness.svg")  # for quick viewing and web content
savefig("sample_skewness.tex")  # for inclusion into papers
savefig("sample_skewness.pdf")  # for quick viewing

Having generated the plot, I save it in various formats with savefig. The SVG output is shown below.

The plot

How to get help

If you cannot achieve the desired output, you can

  1. reread the Plots.jl manual,

  2. study the example plots,

  3. ask for help in the Visualization topic.

For the third option, make sure you include a self-contained minimal working example,[4] which also generates or loads the data, so that others can run your code as is. Randomly generated data should be fine, or standard datasets from RDatasets.jl.
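For illustration, an MWE might look like the sketch below, assuming Julia 1.x with Plots.jl installed and the gr() backend mentioned above. The data, labels, and output file name are placeholders; the point is that everything needed to reproduce the plot is in one short script.

```julia
using Random                    # standard library, for a reproducible seed
using Plots; gr()               # name the backend explicitly

Random.seed!(1)                 # same random data on every run (for a given Julia version)
x = randn(1000)                 # generated data, no external files needed

# the plot you are asking about, with only the attributes that matter
p = histogram(x, normalize = true, legend = false,
              xlabel = "x", ylabel = "density", title = "MWE: histogram")

output = joinpath(mktempdir(), "mwe.png")  # placeholder output location
savefig(p, output)                         # attach this image to your question
```

Stating the backend and fixing the seed means that others see the same plot you do, which makes it much easier to point at what should change.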

Sometimes you will find that the feature you are looking for is not (yet) supported. You should check if there is an open issue for your problem (the discussion forum linked above is useful for this), and if not, open one.

When asking for help or just discussing plotting libraries in Julia, please keep in mind that they are a community effort with volunteers devoting their time to address a very difficult problem. Plotting is not a well-defined exercise: it involves a lot of heuristics and special cases, and most languages took years to get it right (for a given value of “right”). Make it easy for people to help you by making a reproducible, clean MWE: it is very hard to explain how to improve your plot without the actual code and output.

  1. Code in this post was written in December 2017; you may need to tweak it if the API of the packages changes.
  2. In the worst case scenario, you can always regenerate the data ☺
  3. Note that you need a working TeX installation, which is easy to obtain on Linux.
  4. You should use triple-backticks ``` to format your code.