Please think of this as an experimental function
vw_serialize_data(data, iso_dttm = FALSE, iso_date = TRUE)
data: data.frame, data to be serialized
iso_dttm: logical, indicates if datetimes (POSIXct) are to be formatted using ISO-8601
iso_date: logical, indicates if dates (Date) are to be formatted using ISO-8601
object with the same type as data
In Vega, for now, there are only two time-zones available: the local
time-zone of the browser where the spec is rendered, and UTC. This differs
from R, where a time-zone attribute is available to POSIXct vectors.
Accordingly, when designing a vegaspec that uses time, you have to make some
compromises. This function helps you to implement your compromise in
a principled way, as explained in the opinions below.
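As a minimal base-R sketch of the difference described above: a POSIXct value stores one instant plus a time-zone attribute that only affects how the instant is printed. The timestamp and the "America/Los_Angeles" zone are illustrative assumptions chosen to resemble the Seattle data in the examples; nothing here calls vegawidget.

```r
# One instant in time, with a time-zone attribute attached (assumed zone)
x <- as.POSIXct("2010-01-01 01:00:00", tz = "America/Los_Angeles")

# The attribute that R keeps alongside the instant
tz_attr <- attr(x, "tzone")
tz_attr
#> [1] "America/Los_Angeles"

# The same instant, printed in the zone of the data and in UTC
in_data_tz <- format(x, "%Y-%m-%d %H:%M:%S")
in_utc <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
in_data_tz
#> [1] "2010-01-01 01:00:00"
in_utc
#> [1] "2010-01-01 09:00:00"
```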
Let's assume that your POSIXct data has a time-zone attached.
There are three different scenarios for rendering this data:
using the time-zone of the browser
using UTC
using the time-zone of the data
If you intend to display the data using the time-zone of the browser,
or using UTC, you should serialize datetimes using ISO-8601, i.e.
iso_dttm = TRUE. In the rest of your vegaspec, you should choose
local or UTC time-scales accordingly. However, in either case, you should
use local time-units. No compromise is necessary.
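To make the ISO-8601 case concrete, here is a base-R sketch that produces a string of the same shape as the iso_dttm = TRUE output. It is an approximation written for illustration, not the package's implementation; the timestamp and zone are assumptions.

```r
# Assumed example instant, with a non-UTC time-zone attached
x <- as.POSIXct("2010-01-01 01:00:00", tz = "America/Los_Angeles")

# Express the instant in UTC and mark it with a trailing "Z";
# "%OS3" prints seconds with three decimal places
iso_utc <- format(x, "%Y-%m-%dT%H:%M:%OS3Z", tz = "UTC")
iso_utc
#> [1] "2010-01-01T09:00:00.000Z"
```

Because the string carries its own UTC marker, the instant is unambiguous, and the choice between browser-local and UTC display is left to the scales in the vegaspec.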
If you intend to display the data using the time-zone of the data,
this is where you will have to compromise. In this case, you should
serialize using iso_dttm = FALSE. By doing this, your datetimes will be
serialized using a non-ISO-8601 format, and notably, using the time-zone
of the datetime. When you design your vegaspec, you should treat this as
if it were a UTC time. You should direct Vega to parse this data as UTC,
i.e. {"foo": "utc:'%Y-%m-%d %H:%M:%S'"}. In other words, Vega should
interpret your local timestamp as if it were a UTC timestamp.
As in the first UTC case, you should use UTC time-scales and local
time-units.
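The scenario that displays the data in its own time-zone can be sketched as follows. The serialized string is built with base R to mimic the iso_dttm = FALSE output, and the vegaspec fragment is written as a plain R list. The field name date, the single data row, and the overall layout of the list are assumptions for illustration; the parse string is the one quoted in the text above.

```r
# Assumed example instant, with a non-UTC time-zone attached
x <- as.POSIXct("2010-01-01 01:00:00", tz = "America/Los_Angeles")

# Non-ISO-8601 string: wall-clock time in the zone of the data, no offset
local_chr <- format(x, "%Y-%m-%d %H:%M:%OS3")
local_chr
#> [1] "2010-01-01 01:00:00.000"

# Hypothetical Vega-Lite fragment as an R list: parse the field as UTC
# and use a UTC time-scale, so the wall-clock time is displayed unchanged
spec_fragment <- list(
  data = list(
    values = list(list(date = local_chr, temp = 4)),
    format = list(parse = list(date = "utc:'%Y-%m-%d %H:%M:%S'"))
  ),
  encoding = list(
    x = list(field = "date", type = "temporal", scale = list(type = "utc"))
  )
)
```

If parsing fails in your setup, it may be worth checking whether the format string needs to account for the fractional seconds in the serialized strings.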
The compromise you are making is this: the internal representation of
the instants in time will be different in Vega than it will be in R.
You are losing information because you are converting from a POSIXct
object with a time-zone to a timestamp without a time-zone. It is also
worth noting that the time information in your Vega plot should not
be used anywhere else. The plot should be the last place this serialized
data is used, because it is no longer trustworthy. In exchange,
you gain the ability to show the data in the context of its
time-zone.
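The size of this compromise can be checked in base R. The sketch below re-reads the local string as UTC, which imitates what Vega is being asked to do, and compares the result with the original instant. The timestamp and zone are illustrative assumptions.

```r
# Original instant, in the (assumed) time-zone of the data
x <- as.POSIXct("2010-01-01 01:00:00", tz = "America/Los_Angeles")

# Serialize without any time-zone information
local_chr <- format(x, "%Y-%m-%d %H:%M:%S")

# Re-interpret the same wall-clock string as if it were UTC
as_vega_sees_it <- as.POSIXct(local_chr, tz = "UTC")

# The two values now denote different instants
shift <- difftime(x, as_vega_sees_it, units = "hours")
shift
#> Time difference of 8 hours
```

The shift equals the UTC offset of the data's time-zone at that moment, which is why the serialized values should not be reused outside the plot.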
Dates can be different creatures than datetimes. I think that UTC can be a "common currency" for dates, because it is more common to compare across different locations using dates as a common index. For example, you might compare daily stock-market data from NYSE, CAC-40, and Hang Seng. To maintain a common time-index, you might choose UTC to represent the dates in all three locations, despite the time-zone differences.
This is why the default for iso_date is TRUE. In this scenario,
you need not specify to Vega how to parse the date; because of its
ISO-8601 format, it will parse to UTC. As with the other UTC cases,
you should use UTC time-scales and local time-units.
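For dates, the advice above might translate into an encoding like the following sketch, written as a plain R list. The field name date and the choice of the yearmonth time-unit are assumptions for illustration; the point is the pairing of a UTC scale with a local (non-utc-prefixed) time-unit.

```r
# ISO-8601 date string, the shape produced when iso_date = TRUE
date_chr <- format(as.Date("2012-01-01"), "%Y-%m-%d")
date_chr
#> [1] "2012-01-01"

# Hypothetical Vega-Lite encoding fragment: no explicit parse directive,
# a UTC time-scale, and a local time-unit
encoding_fragment <- list(
  x = list(
    field = "date",
    type = "temporal",
    timeUnit = "yearmonth",
    scale = list(type = "utc")
  )
)
```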
# datetimes
data_seattle_hourly %>% head()
#> # A tibble: 6 × 2
#> date temp
#> <dttm> <dbl>
#> 1 2010-01-01 01:00:00 4
#> 2 2010-01-01 02:00:00 3.9
#> 3 2010-01-01 03:00:00 3.8
#> 4 2010-01-01 04:00:00 3.8
#> 5 2010-01-01 05:00:00 3.7
#> 6 2010-01-01 06:00:00 3.7
data_seattle_hourly %>% head() %>% vw_serialize_data(iso_dttm = TRUE)
#> # A tibble: 6 × 2
#> date temp
#> <chr> <dbl>
#> 1 2010-01-01T09:00:00.000Z 4
#> 2 2010-01-01T10:00:00.000Z 3.9
#> 3 2010-01-01T11:00:00.000Z 3.8
#> 4 2010-01-01T12:00:00.000Z 3.8
#> 5 2010-01-01T13:00:00.000Z 3.7
#> 6 2010-01-01T14:00:00.000Z 3.7
data_seattle_hourly %>% head() %>% vw_serialize_data(iso_dttm = FALSE)
#> # A tibble: 6 × 2
#> date temp
#> <chr> <dbl>
#> 1 2010-01-01 01:00:00.000 4
#> 2 2010-01-01 02:00:00.000 3.9
#> 3 2010-01-01 03:00:00.000 3.8
#> 4 2010-01-01 04:00:00.000 3.8
#> 5 2010-01-01 05:00:00.000 3.7
#> 6 2010-01-01 06:00:00.000 3.7
# dates
data_seattle_daily %>% head()
#> # A tibble: 6 × 6
#> date precipitation temp_max temp_min wind weather
#> <date> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 2012-01-01 0 12.8 5 4.7 drizzle
#> 2 2012-01-02 10.9 10.6 2.8 4.5 rain
#> 3 2012-01-03 0.8 11.7 7.2 2.3 rain
#> 4 2012-01-04 20.3 12.2 5.6 4.7 rain
#> 5 2012-01-05 1.3 8.9 2.8 6.1 rain
#> 6 2012-01-06 2.5 4.4 2.2 2.2 rain
data_seattle_daily %>% head() %>% vw_serialize_data(iso_date = TRUE)
#> # A tibble: 6 × 6
#> date precipitation temp_max temp_min wind weather
#> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 2012-01-01 0 12.8 5 4.7 drizzle
#> 2 2012-01-02 10.9 10.6 2.8 4.5 rain
#> 3 2012-01-03 0.8 11.7 7.2 2.3 rain
#> 4 2012-01-04 20.3 12.2 5.6 4.7 rain
#> 5 2012-01-05 1.3 8.9 2.8 6.1 rain
#> 6 2012-01-06 2.5 4.4 2.2 2.2 rain
data_seattle_daily %>% head() %>% vw_serialize_data(iso_date = FALSE)
#> # A tibble: 6 × 6
#> date precipitation temp_max temp_min wind weather
#> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 2012/01/01 0 12.8 5 4.7 drizzle
#> 2 2012/01/02 10.9 10.6 2.8 4.5 rain
#> 3 2012/01/03 0.8 11.7 7.2 2.3 rain
#> 4 2012/01/04 20.3 12.2 5.6 4.7 rain
#> 5 2012/01/05 1.3 8.9 2.8 6.1 rain
#> 6 2012/01/06 2.5 4.4 2.2 2.2 rain