Creating a chart

First we’ll make a chart; then we’ll look at the bits and pieces used to put it together.

library("altair")

vega_data <- import_vega_data()

chart <- 
  alt$Chart(vega_data$cars())$
  mark_point()$
  encode(
    x = "Horsepower:Q",
    y = "Miles_per_Gallon:Q",
    color = "Origin:N",
    tooltip = c("Name", "Horsepower", "Miles_per_Gallon", "Origin")
  )

vegawidget(chart)

Installation

The first part of the code block is a call to load the package:

library("altair")

This assumes that your computer has a Python installation and an Altair installation. Please see the Installation article for more information on how to get up-and-running.

The next step is to get access to the Vega datasets, which also have their own article in this collection.

vega_data <- import_vega_data()

Chart object

The next part of the code block creates a chart object:

chart <- alt$Chart(vega_data$cars())

There are a few things going on here. The first is that the altair (R) package exposes Altair (Python) methods and objects using the variable alt; the same convention is used in the Altair documentation.

# Python
import altair as alt

We expose the “same” variable, by default, as a part of the package-loading process.

The next step is to create the chart itself. In Python, we would use language like this:

# Python
chart = alt.Chart(...)

Here’s where reticulate does its job. It exposes Python objects as R S3 objects that behave like Reference Class objects. In practical terms, wherever you see a . in the Altair Python documentation, use a $ in your R code:

chart <- alt$Chart(...)

The argument to the Chart() function merits some explanation. In Python, a data argument is expected to be a Pandas DataFrame. When you use this package, reticulate automatically converts R data frames to Pandas DataFrames.

The reticulate package also offers the function r_to_py() and its complement py_to_r(), which manage conversion back and forth between some common data types “shared” by R and Python. The reticulate documentation has more information.
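
As a minimal sketch of that round trip, assuming reticulate can find a working Python installation, the following converts a named R list to a Python dict and back. The variable names are ours, chosen only for illustration.

```r
library("reticulate")

# A named R list, similar in shape to a field definition
r_list <- list(field = "Horsepower", type = "quantitative")

# R -> Python: a named list becomes a Python dict
py_dict <- r_to_py(r_list)

# Python -> R: the dict comes back as a named list
r_again <- py_to_r(py_dict)

# Check that the values survive the round trip
identical(r_again$field, r_list$field)
```

In everyday use you should rarely need to call these functions yourself, because the conversion happens automatically when you pass R objects to Altair methods.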

As you follow the Altair examples, you will come across other Python-R translation issues. We have compiled our discoveries and workarounds in an article, Field Guide to Python.

The data argument to a chart function need not be a data frame; it can also be a reference to a data source, such as a URL:

vega_data$cars$url
#> [1] "https://cdn.jsdelivr.net/npm/vega-datasets@v1.29.0/data/cars.json"

chart <- alt$Chart(vega_data$cars$url)

Note that referring to an external URL may not be allowed in the RStudio IDE, so your chart may not render in the IDE. However, it will render in a regular browser.

Adding a mark

chart <- 
  alt$Chart(vega_data$cars())$
  mark_point()

In Vega, “mark” is a similar concept to “geom” in ggplot2. In this case, we are saying we want to represent our data using points.

In Python, methods can be chained using the . operator ($ in our case), a little bit like how the %>% operator is used in the Tidyverse. To specify that we want points, we append $mark_point() to our Chart object.

Note that we can extend the $ operation across lines, giving us the illusion of piping. In the future, it may be interesting to wrap methods like $foo() using functions that could be piped, alt_foo(chart, ...).
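
To illustrate what such wrappers might look like, here is a rough sketch. The functions alt_mark_point() and alt_encode() are hypothetical; they are not part of the altair package and only forward their arguments to the corresponding method on the chart object.

```r
# Hypothetical wrappers, NOT provided by the altair package.
# Each one takes the chart as its first argument so that it can be piped,
# then calls the method of the same name using $.
alt_mark_point <- function(chart, ...) {
  chart$mark_point(...)
}

alt_encode <- function(chart, ...) {
  chart$encode(...)
}
```

With these defined, and assuming R 4.1 or later for the native pipe, alt$Chart(vega_data$cars()) |> alt_mark_point() |> alt_encode(x = "Horsepower:Q") should build the same kind of chart as the $-chained form.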

Adding encoding

chart <- 
  alt$Chart(vega_data$cars())$
  mark_point()$
  encode(
    x = "Horsepower:Q",
    y = "Miles_per_Gallon:Q",
    color = "Origin:N",
    tooltip = c("Name", "Horsepower", "Miles_per_Gallon", "Origin")
  )

In Vega, “encode” plays a similar role to “aesthetics” in ggplot2. We are mapping variables in the data to scales in the plot. We can pass multiple variables to the tooltip by supplying a character vector of variable names, which reticulate converts to a Python list.

What we see here is, in fact, a shorthand. As explained in the Altair documentation, there’s a longer version:

chart <- 
  alt$Chart(vega_data$cars())$
  mark_point()$
  encode(
    x = alt$X("Horsepower", type = "quantitative"),
    y = alt$Y("Miles_per_Gallon", type = "quantitative"),
    color = alt$Color("Origin", type = "nominal"),
    tooltip = 
      list(
        alt$Tooltip(field = "Name", type = "nominal"),
        alt$Tooltip(field = "Horsepower", type = "quantitative"),
        alt$Tooltip(field = "Miles_per_Gallon", type = "quantitative"),
        alt$Tooltip(field = "Origin", type = "nominal")
      )
  )

Altair recognizes four types of data: "quantitative", "nominal", "ordinal", and "temporal". In the shorthand, these are abbreviated as Q, N, O, and T.
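
To make the link between the shorthand and the longer form concrete, here is a small helper that splits a string such as "Horsepower:Q" into a field name and a full type name. The function expand_shorthand() is hypothetical and exists only for illustration; Altair does this parsing itself, and this sketch handles only the simple field:code form.

```r
# Hypothetical helper for illustration only; not part of altair or Altair.
expand_shorthand <- function(shorthand) {
  # One-letter type codes and the type names they stand for
  type_codes <- c(
    Q = "quantitative",
    N = "nominal",
    O = "ordinal",
    T = "temporal"
  )

  # Split "field:code" into its two parts
  parts <- strsplit(shorthand, ":", fixed = TRUE)[[1]]

  # The type is NA when the shorthand carries no type code
  list(field = parts[1], type = unname(type_codes[parts[2]]))
}

str(expand_shorthand("Horsepower:Q"))
```

A string without a type code, such as "Name", yields a missing type in this sketch.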

Displaying and Examining

Now that we have specified our chart, it remains to display it. This package provides a function, vegawidget(), that takes a chart object, then renders and embeds it as an htmlwidget.

vegawidget(chart)

However, if you are using its defaults, you need not call the vegawidget() function explicitly, as this package provides a print() method and a knit_print() method.

chart

More details on rendering charts, including how sizing works, are found in the article Field Guide to Rendering.

You can examine the chart specification by using the vegawidget::vw_examine() function, which wraps listviewer::jsonedit(). To use this function, you will need to install the listviewer package from CRAN.

vegawidget::vw_examine(chart, mode = "code")