Please think of this as an experimental function

vw_serialize_data(data, iso_dttm = FALSE, iso_date = TRUE)

Arguments

data

data.frame, data to be serialized

iso_dttm

logical, indicates if datetimes (POSIXct) are to be formatted using ISO-8601

iso_date

logical, indicates if dates (Date) are to be formatted using ISO-8601

Value

object with the same type as data

Details

In Vega, for now, there are only two time-zones available: the local time-zone of the browser where the spec is rendered, and UTC. This differs from R, where a time-zone attribute is available to POSIXct vectors. Accordingly, when designing a vegaspec that uses time, you have to make some compromises. This function helps you to implement your compromise in a principled way, as explained in the opinions below.

Let's assume that your POSIXct data has a time-zone attached. There are three different scenarios for rendering this data:

  • using the time-zone of the browser

  • using UTC

  • using the time-zone of the data

If you intend to display the data using the time-zone of the browser, or using UTC, you should serialize datetimes using ISO-8601, i.e. iso_dttm = TRUE. In the rest of your vegaspec, you should choose local or UTC time-scales accordingly. However, in either case, you should use local time-units. No compromise is necessary.

If you intend to display the data using the time-zone of the data, this is where you will have to compromise. In this case, you should serialize using iso_dttm = FALSE. By doing this, your datetimes will be serialized using a non-ISO-8601 format, and notably, using the time-zone of the datetime. When you design your vegaspec, you should treat this as if it were a UTC time. You should direct Vega to parse this data as UTC, i.e. {"foo": "utc:'%Y-%m-%d %H:%M:%S'"}. In other words, Vega should interpret your local timestamp as if it were a UTC timestamp. As in the first UTC case, you should use UTC time-scales and local time-units.
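As a rough sketch of what this can look like, the fragment below writes a Vega-Lite spec as a plain R list. The field names date and temp, the two inline rows, and the line mark are illustrative assumptions; only the utc: parse string is taken from the text above.

```r
# Illustrative spec fragment as a plain R list (assumed field names: date, temp).
# The datetime strings carry no time-zone; Vega is told to read them as UTC.
spec <- list(
  data = list(
    values = list(
      list(date = "2010-01-01 01:00:00", temp = 4.0),
      list(date = "2010-01-01 02:00:00", temp = 3.9)
    ),
    format = list(
      # parse directive from the text above, applied to the `date` field
      parse = list(date = "utc:'%Y-%m-%d %H:%M:%S'")
    )
  ),
  mark = "line",
  encoding = list(
    # UTC time-scale, so the axis shows the wall-clock time of the data
    x = list(field = "date", type = "temporal", scale = list(type = "utc")),
    y = list(field = "temp", type = "quantitative")
  )
)
```

Because both the parsing and the scale are set to UTC, the axis labels should show the wall-clock times of the data regardless of the time-zone of the browser.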

The compromise you are making is this: the internal representation of the instants in time will be different in Vega than it will be in R. You are losing information because you are converting from a POSIXct object with a time-zone to a timestamp without a time-zone. Note also that the time information in your Vega plot should not be used anywhere else: the plot should be the last place this serialized data is used, because it is no longer trustworthy. In return, you gain the ability to show the data in the context of its time-zone.
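The size of this shift can be made concrete with base R alone. The sketch below does not call this function; it uses format() to approximate the two serializations (without the millisecond suffix) for a single, hypothetical Seattle datetime, then re-reads the local string as UTC, as Vega would be directed to do.

```r
# Hypothetical datetime: 1 a.m. in Seattle, which is 9 a.m. UTC
x <- as.POSIXct("2010-01-01 01:00:00", tz = "America/Los_Angeles")

# ISO-8601 style: the same instant, expressed in UTC
iso <- format(x, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
iso
#> [1] "2010-01-01T09:00:00Z"

# non-ISO style: wall-clock time in the data's time-zone, zone dropped
local <- format(x, "%Y-%m-%d %H:%M:%S", tz = "America/Los_Angeles")
local
#> [1] "2010-01-01 01:00:00"

# the instant Vega works with when it parses `local` as UTC
as_vega <- as.POSIXct(local, tz = "UTC")

# hours between the true instant and the one Vega holds
shift_hours <- as.numeric(difftime(x, as_vega, units = "hours"))
shift_hours
#> [1] 8
```

The eight-hour gap is the information that is lost: the plot shows the right wall-clock time, but the underlying instant no longer matches the one in R.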

Dates can be different creatures than datetimes. I think that UTC can be a "common currency" for dates, because it is more common to compare across different locations using dates as a common index. For example, you might compare daily stock-market data from NYSE, CAC-40, and Hang Seng. To maintain a common time-index, you might choose UTC to represent the dates in all three locations, despite the time-zone differences.

This is why the default for iso_date is TRUE. In this scenario, you need not specify to Vega how to parse the date; because of its ISO-8601 format, it will parse to UTC. As with the other UTC cases, you should use UTC time-scales and local time-units.
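For comparison, the two date formats can be approximated in base R with format(). This is only a sketch of the string shapes involved, using a single hypothetical date; it does not call this function.

```r
# Hypothetical date value
d <- as.Date("2012-01-01")

# ISO-8601 form, as with iso_date = TRUE
iso_date <- format(d, "%Y-%m-%d")
iso_date
#> [1] "2012-01-01"

# slash-separated, non-ISO form, as with iso_date = FALSE
non_iso_date <- format(d, "%Y/%m/%d")
non_iso_date
#> [1] "2012/01/01"
```

Only the first form is ISO-8601, so only the first form benefits from the automatic UTC parsing described above.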

Examples

  # datetimes
  data_seattle_hourly %>% head()
#> # A tibble: 6 × 2
#>   date                 temp
#>   <dttm>              <dbl>
#> 1 2010-01-01 01:00:00   4  
#> 2 2010-01-01 02:00:00   3.9
#> 3 2010-01-01 03:00:00   3.8
#> 4 2010-01-01 04:00:00   3.8
#> 5 2010-01-01 05:00:00   3.7
#> 6 2010-01-01 06:00:00   3.7
  data_seattle_hourly %>% head() %>% vw_serialize_data(iso_dttm = TRUE)
#> # A tibble: 6 × 2
#>   date                      temp
#>   <chr>                    <dbl>
#> 1 2010-01-01T09:00:00.000Z   4  
#> 2 2010-01-01T10:00:00.000Z   3.9
#> 3 2010-01-01T11:00:00.000Z   3.8
#> 4 2010-01-01T12:00:00.000Z   3.8
#> 5 2010-01-01T13:00:00.000Z   3.7
#> 6 2010-01-01T14:00:00.000Z   3.7
  data_seattle_hourly %>% head() %>% vw_serialize_data(iso_dttm = FALSE)
#> # A tibble: 6 × 2
#>   date                     temp
#>   <chr>                   <dbl>
#> 1 2010-01-01 01:00:00.000   4  
#> 2 2010-01-01 02:00:00.000   3.9
#> 3 2010-01-01 03:00:00.000   3.8
#> 4 2010-01-01 04:00:00.000   3.8
#> 5 2010-01-01 05:00:00.000   3.7
#> 6 2010-01-01 06:00:00.000   3.7

  # dates
  data_seattle_daily %>% head()
#> # A tibble: 6 × 6
#>   date       precipitation temp_max temp_min  wind weather
#>   <date>             <dbl>    <dbl>    <dbl> <dbl> <chr>  
#> 1 2012-01-01           0       12.8      5     4.7 drizzle
#> 2 2012-01-02          10.9     10.6      2.8   4.5 rain   
#> 3 2012-01-03           0.8     11.7      7.2   2.3 rain   
#> 4 2012-01-04          20.3     12.2      5.6   4.7 rain   
#> 5 2012-01-05           1.3      8.9      2.8   6.1 rain   
#> 6 2012-01-06           2.5      4.4      2.2   2.2 rain   
  data_seattle_daily %>% head() %>% vw_serialize_data(iso_date = TRUE)
#> # A tibble: 6 × 6
#>   date       precipitation temp_max temp_min  wind weather
#>   <chr>              <dbl>    <dbl>    <dbl> <dbl> <chr>  
#> 1 2012-01-01           0       12.8      5     4.7 drizzle
#> 2 2012-01-02          10.9     10.6      2.8   4.5 rain   
#> 3 2012-01-03           0.8     11.7      7.2   2.3 rain   
#> 4 2012-01-04          20.3     12.2      5.6   4.7 rain   
#> 5 2012-01-05           1.3      8.9      2.8   6.1 rain   
#> 6 2012-01-06           2.5      4.4      2.2   2.2 rain   
  data_seattle_daily %>% head() %>% vw_serialize_data(iso_date = FALSE)
#> # A tibble: 6 × 6
#>   date       precipitation temp_max temp_min  wind weather
#>   <chr>              <dbl>    <dbl>    <dbl> <dbl> <chr>  
#> 1 2012/01/01           0       12.8      5     4.7 drizzle
#> 2 2012/01/02          10.9     10.6      2.8   4.5 rain   
#> 3 2012/01/03           0.8     11.7      7.2   2.3 rain   
#> 4 2012/01/04          20.3     12.2      5.6   4.7 rain   
#> 5 2012/01/05           1.3      8.9      2.8   6.1 rain   
#> 6 2012/01/06           2.5      4.4      2.2   2.2 rain