There are four foundations upon which this package rests:

  • the Altair Python package, to build chart specifications
  • the reticulate R package, to provide inter-operability with Python
  • the Vega-Lite JavaScript framework, to render chart specifications in an HTML file
  • the htmlwidgets R package, to provide inter-operability with HTML and JavaScript

This article deals with the first two items; the Field Guide to Rendering deals with the other two.

This document collects in one place, in a semi-organized fashion, the fiddly bits we have run into when dealing with Python. If you get a cryptic Python error, check here. If you find a workaround for something that isn’t covered here, please let us know!

Overview

The Altair documentation is the best resource for learning how to create charts. In the course of building and documenting this package, we have noted a few “gotchas” and their workarounds. If you find another, please let us know!

Here’s the short version:

  • Where you see a ., use a $ instead.

  • Altair methods return a copy of the object. Assignment of a Python object returns a reference, not a copy.

  • To get a copy of a “bare” Altair object, use its $copy() method.

  • If you have a dataset that has variables with dots in their names, e.g. Sepal.Width, you have to make some accommodation when referring to such names in Altair. As a workaround, you can use square brackets to refer to “[Sepal.Width]”.

  • There is an Altair Chart method called repeat(), which in R is a reserved word, so it needs to be enclosed in backticks: $`repeat`().

  • Where you see an inversion operator, ~, like ~highlight, in Altair examples, call the method explicitly from R: highlight$`__invert__`(). Alternatively, you may be able to rearrange the code so as to avoid using the inversion.

  • Where you see a hyphen in the name of a Python object, use an underscore in R: vega_data$sf_temps()

  • Where you see a Python list, ["foo", "bar"], in Altair examples, use an unnamed list in R: list("foo", "bar").

  • Where you see a Python dictionary, {'a': "foo", 'b': "bar"}, in Altair examples, use a named list in R: list(a = "foo", b = "bar").

  • Where you see a None in Altair examples, use a NULL in R.

  • You may see a function call with **, baz(a = 1, **{'foo': 'bar'}), in an Altair example. In R, interpolate the dictionary into the rest of the arguments, baz(a = 1, foo = "bar").

Method chaining

When reticulate returns a Python object with a custom class, it appears in R as an S3 object that behaves like a reference class. This means that a Python method chain such as alt.Chart(data).mark_point() is written in R as alt$Chart(data)$mark_point().

In essence, wherever you see a . in Python, use a $ in R.

Altair method returns copy

In Python, Altair methods return a copy of the object. To verify this, let’s use `pryr::

Although this looks like a reference-class method, the Altair method acts like an S3 method.
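To make the copy-on-method-call behavior concrete, here is a minimal plain-Python sketch. ToyChart is a hypothetical stand-in written for this illustration; it is not Altair's implementation, and it only mimics the pattern of a method that returns a modified copy rather than changing the object it was called on.

```python
import copy


class ToyChart:
    """Hypothetical stand-in for a chart object; not part of Altair."""

    def __init__(self, data):
        self.data = data
        self.mark = None

    def mark_point(self):
        # Work on a copy so that the calling object is left untouched.
        new = copy.copy(self)
        new.mark = "point"
        return new


base = ToyChart(data=[1, 2, 3])
points = base.mark_point()

print(points is base)  # the method handed back a different object
print(base.mark)       # the original is unchanged
print(points.mark)     # the copy carries the modification
```

Seen from R through reticulate, the same call would be spelled with $ instead of a dot, and the returned object would again be distinct from the one the method was called on.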

Python assignment returns reference

The object returned by an Altair method is a modified copy of the calling object, much as we are accustomed to in R. However, it is important to note that using an R assignment operator (<-, =, ->) on a Python object returns a reference to the object rather than a copy.

This becomes apparent when assigning a “bare” object:

To return a copy of the object, use a copy method.
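The same reference-versus-copy distinction can be seen in plain Python, without Altair. The sketch below uses an ordinary dictionary as an assumed stand-in for a “bare” object, and the dictionary's own copy() method as a stand-in for an Altair copy method.

```python
settings = {"width": 300}  # any mutable Python object will do

# Assignment binds a second name to the same object; nothing is copied.
alias = settings
alias["width"] = 500

print(alias is settings)   # both names point at one object
print(settings["width"])   # so the change shows up under the original name

# An explicit copy is independent of the original.
independent = settings.copy()
independent["width"] = 100

print(independent is settings)
print(settings["width"])
```

The design point is that a change made through the alias is visible through the original name, whereas a change made to the explicit copy is not.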

Dots in variable names

In Python, dots can refer to a nested structure within a data-frame variable. Vega-Lite supports such nesting, so it assumes that a dot in a variable name refers to a nested variable.

This means that we can run into trouble using R’s iris dataset:

The problem here is that there are variables whose names have dots in them, e.g. Sepal.Width. One workaround is to use square brackets when referring to such variable names; another is to use backslashes, \\:

As you can see, this has the side-effect of showing the brackets and slashes in the scale labels.

To fix the fix, you can set the title for each axis:
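As a rough sketch of what the two workarounds and the title fix amount to, the plain-Python snippet below builds the bracketed and backslashed spellings of a dotted name, then pairs each with an explicit axis title in a Vega-Lite-style encoding dictionary. The helper names escaped_forms() and channel() are hypothetical, and the snippet only assembles a dictionary; it does not call Altair or render a chart.

```python
import json


def escaped_forms(name):
    """Return the bracketed and backslashed spellings of a dotted name."""
    bracketed = "[" + name + "]"
    backslashed = name.replace(".", "\\.")
    return bracketed, backslashed


def channel(field, title):
    """Build a channel definition with an explicit axis title (hypothetical helper)."""
    return {"field": field, "type": "quantitative", "axis": {"title": title}}


bracketed, backslashed = escaped_forms("Sepal.Width")

encoding = {
    "x": channel(bracketed, "Sepal.Width"),
    "y": channel(escaped_forms("Sepal.Length")[1], "Sepal.Length"),
}

print(json.dumps(encoding, indent=2))
```

The escaped spelling is used only to look up the column, while the title restores the readable label.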

Repeat

As shown in the View Composition article, you can use the repeat() method to compose one or more charts such that the only thing different among them is an encoding.

However, as the article notes, there is a catch: repeat is a reserved word in R, so we have to enclose it in backticks, e.g. $`repeat`().

Inversion: ~

This is another case where an operator has a completely different meaning in Python than it has in R. In R, the ~ operator is used to construct a formula; in Python, it is the bitwise inversion operator.

You might come across this in an Altair example where the operator is used to invert a selection.

There are a couple of alternatives available here. The first is to invoke the $`__invert__`() method explicitly.

The second alternative is to swap the order of the if_true and if_false arguments in alt$condition().
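Both alternatives can be illustrated on the Python side with a small toy. ToySelection and toy_condition() are hypothetical names invented for this sketch and are not Altair's classes or functions; the point is only that ~x is shorthand for x.__invert__(), and that negating a predicate is equivalent to swapping the two branches.

```python
class ToySelection:
    """Hypothetical stand-in for a selection; not part of Altair."""

    def __init__(self, name, negated=False):
        self.name = name
        self.negated = negated

    def __invert__(self):
        # Python calls this method when it evaluates ~selection.
        return ToySelection(self.name, not self.negated)


def toy_condition(predicate, if_true, if_false):
    """Hypothetical helper: resolve a negated predicate by swapping branches."""
    if predicate.negated:
        return (predicate.name, if_false, if_true)
    return (predicate.name, if_true, if_false)


highlight = ToySelection("highlight")

# Alternative 1: the operator and the explicit method call do the same thing.
via_operator = ~highlight
via_method = highlight.__invert__()
print(via_operator.negated == via_method.negated)

# Alternative 2: inverting the predicate equals swapping if_true and if_false.
inverted = toy_condition(~highlight, "gray", "red")
swapped = toy_condition(highlight, "red", "gray")
print(inverted == swapped)
```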

Lists: [] and Dictionaries: {}

A Python list corresponds to an atomic vector in R; a Python dictionary corresponds to a named list in R.

In practice, we find that reticulate does the right thing if we provide an R unnamed list where Altair expects a list, and an R named list where Altair expects a dictionary.

example_list <- list(1, 2, 3)
example_dictionary <- list(a = 1, b = 2, c = 3)
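For comparison, here is a plain-Python sketch of the shapes those two R objects are meant to stand in for. It is only an illustration of the Python side and does not go through reticulate.

```python
import json

# Python counterpart of the unnamed R list.
example_list = [1, 2, 3]

# Python counterpart of the named R list.
example_dictionary = {"a": 1, "b": 2, "c": 3}

print(json.dumps(example_list))
print(json.dumps(example_dictionary))
```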

Consider this Altair example that uses lists and dictionaries. Here are some of the Python bits:

Here’s an R translation of the complete example, which demonstrates interactive cross-filtering.

None and **{}

These concepts are not related, other than that they are found in the same example:

In this example, we have a list containing None, which reticulate associates with R’s NULL.

We also have some syntax, **{'as': 'TotalTime'}. This is a mechanism to pass additional arguments to a Python function, perhaps similar to ... in R. It is passing a dictionary, so in R we can try adding its entry as an ordinary named argument, as = "TotalTime".
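The Python mechanics behind ** can be sketched without Altair. toy_transform() below is a hypothetical function that simply records the arguments it receives. One likely reason the Python example spells the argument as a dictionary is that as is a reserved word in Python, so it cannot be written as a plain keyword argument there.

```python
def toy_transform(groupby=None, **kwargs):
    """Hypothetical function that just records its arguments."""
    return {"groupby": groupby, **kwargs}


# 'as' is reserved in Python, so it is passed by unpacking a dictionary.
unpacked = toy_transform(groupby=None, **{"as": "TotalTime"})
print(unpacked)

# For names that are not reserved, the two spellings are interchangeable.
with_dict = toy_transform(groupby=["site"], **{"foo": "bar"})
spelled_out = toy_transform(groupby=["site"], foo="bar")
print(with_dict == spelled_out)
```

Because as is not a reserved word in R, the named-argument spelling is available on the R side.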