3 - Manipulating a track table

Simon Garnier

Under the hood, a track table is a data frame with a few extra bells and whistles. Therefore, you can manipulate a track table in the same way you would a base::data.frame, tibble::tibble, or data.table::data.table (depending on the data frame class you used as a base for your track table). Anything that you can do with one of these three data frame classes can be done the same way with a track table.

There are, however, a few additional things that are specific to track tables and we will review them in this vignette.

But first, let’s load a track table that is provided with trackdf:

library(trackdf)

data("tracks")
print(tracks, max = 10 * ncol(tracks))
## Track table [7195 observations]
## Number of tracks:  2 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=longlat 
## Table class:  data frame ('data.frame')
##    id                   t        x         y
## 1   1 2015-09-10 07:00:00 15.76468 -22.37957
## 2   1 2015-09-10 07:00:01 15.76468 -22.37957
## 3   1 2015-09-10 07:00:04 15.76468 -22.37958
## 4   1 2015-09-10 07:00:05 15.76468 -22.37958
## 5   1 2015-09-10 07:00:08 15.76467 -22.37959
## 6   1 2015-09-10 07:00:09 15.76467 -22.37959
## 7   1 2015-09-10 07:00:09 15.76467 -22.37959
## 8   1 2015-09-10 07:00:10 15.76467 -22.37959
## 9   1 2015-09-10 07:00:11 15.76467 -22.37959
## 10  1 2015-09-10 07:00:12 15.76467 -22.37959
##  [ reached 'max' / getOption("max.print") -- omitted 7185 rows ]

This track table contains the GPS coordinates of two goats foraging through the Tsaobis Nature Park in Namibia, sometime in 2015.


3.1 - Basic information about the track table

In addition to the usual information that you can ask about a data frame (e.g., the number of rows and columns, the class of each column, etc.), you can access additional information about the content of a track table.

First, you can check whether an object is indeed a track table as follows:

is_track(tracks)
## [1] TRUE

You can also check whether the track table contains geographic coordinates or not as follows:

is_geo(tracks)
## [1] TRUE

You can find out the number of different tracks included in the track table as follows:

n_tracks(tracks)
## [1] 2

Finally, you can retrieve the dimensionality (2D or 3D) of the track table as follows:

n_dims(tracks)
## [1] 2
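
The four checks above can be gathered into a single summary, which can be handy at the top of an analysis script. The following is only a minimal sketch: `describe_track` is a hypothetical helper name, not a function provided by trackdf, and it relies only on the functions shown in this section.

```r
library(trackdf)

data("tracks")

# Hypothetical helper (not part of trackdf): collect the basic
# properties of a track table in a named list.
describe_track <- function(x) {
  # Stop early if the object is not a track table
  stopifnot(is_track(x))
  list(
    n_tracks = n_tracks(x),   # number of distinct tracks
    n_dims = n_dims(x),       # dimensionality (2 or 3)
    geographic = is_geo(x)    # TRUE if coordinates are geographic
  )
}

info <- describe_track(tracks)
str(info)
```

Calling `describe_track` on an object that is not a track table stops with an error instead of returning misleading values.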

3.2 - Accessing data

Accessing and modifying the different parts (rows, columns, elements) of a track table is similar to accessing and modifying the different parts of the underlying data frame. We will, therefore, not discuss this topic further as it is something that you should already be very familiar with.

Note, however, that different data frame classes may do things slightly differently from each other. Make sure you know which class is used by the track tables you are working with. For instance, the track table that we loaded for this tutorial is of class data.frame, as indicated in the 6th line of the printout of the track table:

print(tracks, max = 10 * ncol(tracks))
## Track table [7195 observations]
## Number of tracks:  2 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=longlat 
## Table class:  data frame ('data.frame')
##    id                   t        x         y
## 1   1 2015-09-10 07:00:00 15.76468 -22.37957
## 2   1 2015-09-10 07:00:01 15.76468 -22.37957
## 3   1 2015-09-10 07:00:04 15.76468 -22.37958
## 4   1 2015-09-10 07:00:05 15.76468 -22.37958
## 5   1 2015-09-10 07:00:08 15.76467 -22.37959
## 6   1 2015-09-10 07:00:09 15.76467 -22.37959
## 7   1 2015-09-10 07:00:09 15.76467 -22.37959
## 8   1 2015-09-10 07:00:10 15.76467 -22.37959
## 9   1 2015-09-10 07:00:11 15.76467 -22.37959
## 10  1 2015-09-10 07:00:12 15.76467 -22.37959
##  [ reached 'max' / getOption("max.print") -- omitted 7185 rows ]
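
If you need the same information programmatically, for example inside a function that must behave differently for each data frame class, you can test the class of the object directly. The following is a minimal sketch using base R only; `base_class` is a hypothetical helper name, and it assumes that the track table inherits from the class of its underlying data frame, as the printout suggests.

```r
library(trackdf)

data("tracks")

# Hypothetical helper (not part of trackdf): report which data frame
# class a table is built on. data.table and tibble objects also inherit
# from data.frame, so the more specific classes are tested first.
base_class <- function(x) {
  if (inherits(x, "data.table")) {
    "data.table"
  } else if (inherits(x, "tbl_df")) {
    "tibble"
  } else if (inherits(x, "data.frame")) {
    "data.frame"
  } else {
    NA_character_
  }
}

base_class(tracks)

# Standard data frame accessors work as usual
nrow(tracks)          # number of observations
names(tracks)         # column names
head(tracks$x, 3)     # first three x coordinates
first_rows <- tracks[1:5, ]   # first five observations
```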

3.3 - Accessing the projection information of a track table

One particularity of track tables over regular data frames is that they can store geographic data explicitly and perform projection operations to change their coordinate reference system if necessary.

In order to access the coordinate reference system (or projection) of a track table containing geographic data, you simply need to execute the following command:

projection(tracks)
## Coordinate Reference System:
##   User input: +proj=longlat 
##   wkt:
## GEOGCRS["unknown",
##     DATUM["World Geodetic System 1984",
##         ELLIPSOID["WGS 84",6378137,298.257223563,
##             LENGTHUNIT["metre",1]],
##         ID["EPSG",6326]],
##     PRIMEM["Greenwich",0,
##         ANGLEUNIT["degree",0.0174532925199433],
##         ID["EPSG",8901]],
##     CS[ellipsoidal,2],
##         AXIS["longitude",east,
##             ORDER[1],
##             ANGLEUNIT["degree",0.0174532925199433,
##                 ID["EPSG",9122]]],
##         AXIS["latitude",north,
##             ORDER[2],
##             ANGLEUNIT["degree",0.0174532925199433,
##                 ID["EPSG",9122]]]]

This returns an object of class crs, which is a list consisting of an input object (usually the character string that you passed to track through the proj parameter) and a wkt object, which is an automatically generated WKT 2 representation of the coordinate reference system.
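
The two components described above can be extracted individually. This is a minimal sketch that assumes the crs object supports the usual `$` accessor for its `input` and `wkt` elements.

```r
library(trackdf)

data("tracks")

crs <- projection(tracks)

# The string that was supplied when the projection was set
crs$input

# The automatically generated WKT 2 representation (a single string)
cat(crs$wkt, "\n")
```

Comparing the `input` strings of two track tables is a quick, though not exhaustive, way to check whether they were given the same projection.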

You can modify the projection of a track table in place as follows. This will automatically convert the x and y coordinates contained in the track table to the new coordinate reference system:

projection(tracks) <- "+proj=somerc +lat_0=46.9524056 +lon_0=7.43958333 +ellps=bessel +x_0=2600000 +y_0=1200000 +towgs84=674.374,15.056,405.346 +units=m +k_0=1 +no_defs"
print(tracks, max = 10 * ncol(tracks))
## Track table [7195 observations]
## Number of tracks:  2 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=somerc +lat_0=46.9524056 +lon_0=7.43958333 +ellps=bessel +x_0=2600000 +y_0=1200000 +towgs84=674.374,15.056,405.346 +units=m +k_0=1 +no_defs 
## Table class:  data frame ('data.frame')
##    id                   t       x        y
## 1   1 2015-09-10 07:00:00 4927487 -9217299
## 2   1 2015-09-10 07:00:01 4927487 -9217299
## 3   1 2015-09-10 07:00:04 4927487 -9217301
## 4   1 2015-09-10 07:00:05 4927487 -9217302
## 5   1 2015-09-10 07:00:08 4927486 -9217304
## 6   1 2015-09-10 07:00:09 4927485 -9217305
## 7   1 2015-09-10 07:00:09 4927485 -9217305
## 8   1 2015-09-10 07:00:10 4927485 -9217306
## 9   1 2015-09-10 07:00:11 4927485 -9217306
## 10  1 2015-09-10 07:00:12 4927485 -9217306
##  [ reached 'max' / getOption("max.print") -- omitted 7185 rows ]

And back to the original projection:

projection(tracks) <- "+proj=longlat" 
print(tracks, max = 10 * ncol(tracks))
## Track table [7195 observations]
## Number of tracks:  2 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=longlat 
## Table class:  data frame ('data.frame')
##    id                   t        x         y
## 1   1 2015-09-10 07:00:00 15.76468 -22.37957
## 2   1 2015-09-10 07:00:01 15.76468 -22.37957
## 3   1 2015-09-10 07:00:04 15.76468 -22.37958
## 4   1 2015-09-10 07:00:05 15.76468 -22.37958
## 5   1 2015-09-10 07:00:08 15.76467 -22.37959
## 6   1 2015-09-10 07:00:09 15.76467 -22.37959
## 7   1 2015-09-10 07:00:09 15.76467 -22.37959
## 8   1 2015-09-10 07:00:10 15.76466 -22.37959
## 9   1 2015-09-10 07:00:11 15.76466 -22.37959
## 10  1 2015-09-10 07:00:12 15.76466 -22.37959
##  [ reached 'max' / getOption("max.print") -- omitted 7185 rows ]

If you prefer not to modify the original object, you can create a new one with the new projection using the project function as follows:

tracks_somerc <- project(tracks, "+proj=somerc +lat_0=46.9524056 +lon_0=7.43958333 +ellps=bessel +x_0=2600000 +y_0=1200000 +towgs84=674.374,15.056,405.346 +units=m +k_0=1 +no_defs")
print(tracks_somerc, max = 10 * ncol(tracks_somerc))
## Track table [7195 observations]
## Number of tracks:  2 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=somerc +lat_0=46.9524056 +lon_0=7.43958333 +ellps=bessel +x_0=2600000 +y_0=1200000 +towgs84=674.374,15.056,405.346 +units=m +k_0=1 +no_defs 
## Table class:  data frame ('data.frame')
##    id                   t       x        y
## 1   1 2015-09-10 07:00:00 4927487 -9217299
## 2   1 2015-09-10 07:00:01 4927487 -9217299
## 3   1 2015-09-10 07:00:04 4927487 -9217301
## 4   1 2015-09-10 07:00:05 4927487 -9217302
## 5   1 2015-09-10 07:00:08 4927486 -9217304
## 6   1 2015-09-10 07:00:09 4927485 -9217305
## 7   1 2015-09-10 07:00:09 4927485 -9217305
## 8   1 2015-09-10 07:00:10 4927485 -9217306
## 9   1 2015-09-10 07:00:11 4927485 -9217306
## 10  1 2015-09-10 07:00:12 4927485 -9217306
##  [ reached 'max' / getOption("max.print") -- omitted 7185 rows ]
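
To convince yourself that project leaves the original object untouched, you can compare the two track tables after the call. This is a minimal sketch that starts from a freshly loaded copy of the data and reuses the projection string from the example above.

```r
library(trackdf)

data("tracks")

somerc <- "+proj=somerc +lat_0=46.9524056 +lon_0=7.43958333 +ellps=bessel +x_0=2600000 +y_0=1200000 +towgs84=674.374,15.056,405.346 +units=m +k_0=1 +no_defs"

# Create a projected copy; `tracks` itself is not modified
tracks_somerc <- project(tracks, somerc)

# The original still uses longitude/latitude coordinates...
projection(tracks)$input
range(tracks$x)

# ...while the copy uses the new coordinate reference system
projection(tracks_somerc)$input
range(tracks_somerc$x)
```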

3.4 - Combining track tables

Combining track tables requires a bit of caution. Indeed, traditional methods to combine data frames (e.g., base::rbind, data.table::rbindlist, or dplyr::bind_rows) will successfully bind together multiple track tables but they will not check whether these track tables are compatible with each other. For instance, they will not check that the coordinates are using the same coordinate reference system or that the time stamps are all in the same time zone.

In order to ensure that different track tables can be combined without creating problems down the analysis pipeline, trackdf provides its own method to bind multiple track tables together: bind_tracks.

To demonstrate how bind_tracks works, let’s first create 3 track tables, 2 that are compatible with each other, and 1 that is not.

raw1 <- read.csv(system.file("extdata/gps/02.csv", package = "trackdf"))
raw2 <- read.csv(system.file("extdata/gps/03.csv", package = "trackdf"))
raw3 <- read.csv(system.file("extdata/video/01.csv", package = "trackdf"))

track1 <- track(x = raw1$lon, y = raw1$lat, t = paste(raw1$date, raw1$time), 
                id = 1, proj = "+proj=longlat", tz = "Africa/Windhoek")
track2 <- track(x = raw2$lon, y = raw2$lat, t = paste(raw2$date, raw2$time), 
                id = 2, proj = "+proj=longlat", tz = "Africa/Windhoek")
track3 <- track(x = raw3$x, y = raw3$y, t = raw3$frame, id = raw3$track_fixed, 
                origin = "2019-03-24 12:55:23", period = "0.04S", 
                tz = "America/New_York")

If you try to combine the 3 track tables using bind_tracks, an error will be thrown to let you know that they are not compatible with each other:

bounded_tracks <- bind_tracks(track1, track2, track3)
## Error in bind_tracks(track1, track2, track3): All track tables should have the same projection.

Compare this to what happens with one of the traditional binding methods:

bounded_tracks <- rbind(track1, track2, track3)
print(bounded_tracks, max = 10 * ncol(bounded_tracks))
## Track table [29182 observations]
## Number of tracks:  81 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=longlat 
## Table class:  data frame ('data.frame')
##    id                   t        x         y
## 1   1 2015-09-10 07:00:00 15.76459 -22.37971
## 2   1 2015-09-10 07:00:01 15.76459 -22.37971
## 3   1 2015-09-10 07:00:02 15.76459 -22.37971
## 4   1 2015-09-10 07:00:03 15.76459 -22.37971
## 5   1 2015-09-10 07:00:04 15.76459 -22.37971
## 6   1 2015-09-10 07:00:05 15.76459 -22.37971
## 7   1 2015-09-10 07:00:06 15.76459 -22.37971
## 8   1 2015-09-10 07:00:07 15.76459 -22.37971
## 9   1 2015-09-10 07:00:08 15.76459 -22.37971
## 10  1 2015-09-10 07:00:09 15.76459 -22.37971
##  [ reached 'max' / getOption("max.print") -- omitted 29172 rows ]

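Here, the track tables are combined with each other despite having different coordinate reference systems and time zones. Note that the printout still reports the result as geographic with a +proj=longlat projection, even though the coordinates in track3 come from a video and were not given a projection. Using bind_tracks instead ensures that this cannot happen.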
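
For completeness, here is what the safe path looks like. This is a minimal sketch that rebuilds the three track tables from the example above, binds the two compatible ones with bind_tracks, and uses base R's tryCatch to capture the error raised for the incompatible one without interrupting the script.

```r
library(trackdf)

raw1 <- read.csv(system.file("extdata/gps/02.csv", package = "trackdf"))
raw2 <- read.csv(system.file("extdata/gps/03.csv", package = "trackdf"))
raw3 <- read.csv(system.file("extdata/video/01.csv", package = "trackdf"))

track1 <- track(x = raw1$lon, y = raw1$lat, t = paste(raw1$date, raw1$time),
                id = 1, proj = "+proj=longlat", tz = "Africa/Windhoek")
track2 <- track(x = raw2$lon, y = raw2$lat, t = paste(raw2$date, raw2$time),
                id = 2, proj = "+proj=longlat", tz = "Africa/Windhoek")
track3 <- track(x = raw3$x, y = raw3$y, t = raw3$frame, id = raw3$track_fixed,
                origin = "2019-03-24 12:55:23", period = "0.04S",
                tz = "America/New_York")

# track1 and track2 share the same projection and time zone,
# so they can be bound safely
gps_tracks <- bind_tracks(track1, track2)
print(gps_tracks, max = 10 * ncol(gps_tracks))

# Adding track3 fails; capture the error message instead of stopping
msg <- tryCatch(
  {
    bind_tracks(track1, track2, track3)
    NA_character_  # only reached if no error is raised
  },
  error = function(e) conditionMessage(e)
)
msg
```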


3.5 - Tidyverse

Track tables are compatible with most of the functions from the “tidyverse”. For instance, you can use all the dplyr verbs to filter, mutate, group, etc., a track table, in the same way you would do with a tibble::tibble or a base::data.frame. As long as the result of the operation that you are applying to a track table does not affect its fundamental structure (see vignette “Building a track table”), the output that you will get will remain a track table with its specific attributes.

For instance, here is how to filter a track table to keep only the observations between 2 specific time stamps:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
filtered_tracks <- tracks %>%
  filter(t >= as.POSIXct("2015-09-10 07:01:00", tz = "Africa/Windhoek"),
         t <= as.POSIXct("2015-09-10 07:11:00", tz = "Africa/Windhoek"))
print(filtered_tracks, max = 10 * ncol(filtered_tracks))
## Track table [1202 observations]
## Number of tracks:  2 
## Dimensions:  2D 
## Geographic:  TRUE 
## Projection:  +proj=longlat 
## Table class:  data frame ('data.frame')
##    id                   t        x         y
## 1   1 2015-09-10 07:01:00 15.76468 -22.37961
## 2   1 2015-09-10 07:01:01 15.76469 -22.37960
## 3   1 2015-09-10 07:01:02 15.76469 -22.37960
## 4   1 2015-09-10 07:01:03 15.76469 -22.37960
## 5   1 2015-09-10 07:01:04 15.76470 -22.37960
## 6   1 2015-09-10 07:01:05 15.76469 -22.37960
## 7   1 2015-09-10 07:01:06 15.76469 -22.37960
## 8   1 2015-09-10 07:01:07 15.76469 -22.37959
## 9   1 2015-09-10 07:01:08 15.76469 -22.37959
## 10  1 2015-09-10 07:01:09 15.76469 -22.37959
##  [ reached 'max' / getOption("max.print") -- omitted 1192 rows ]
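
Other dplyr verbs can be chained in the same way. The following is a minimal sketch that keeps a single track and adds a column with the time elapsed since its first observation. The column name `elapsed` is an arbitrary choice for this illustration, and the final check relies on the behavior described above, namely that operations preserving the id, t, x and y columns return a track table.

```r
library(trackdf)
library(dplyr)

data("tracks")

track_1 <- tracks %>%
  # Keep only the observations of the track with id 1
  filter(id == 1) %>%
  # Add the number of seconds elapsed since the first observation
  mutate(elapsed = as.numeric(difftime(t, min(t), units = "secs")))

# The result should still be recognized as a track table
is_track(track_1)
n_tracks(track_1)
```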

3.6 - Plotting

You can use any plotting method accepting a data frame of any class to represent the data in a track table.

Here is an example using ggplot2:

library(ggplot2)

ggplot(data = tracks) +
  aes(x = x, y = y, color = id) +
  geom_path() +
  coord_map()
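
The same data can also be drawn with base R graphics if you prefer not to depend on ggplot2. This is a minimal sketch that draws one line per track. Unlike coord_map, setting `asp = 1` only equalizes the scale of the two axes and does not apply a map projection, so treat the result as a quick look rather than a faithful map.

```r
library(trackdf)

data("tracks")

ids <- unique(tracks$id)

# Set up an empty plot spanning all the coordinates
plot(tracks$x, tracks$y, type = "n", asp = 1,
     xlab = "Longitude", ylab = "Latitude")

# Draw each track as a separate line with its own color
for (i in seq_along(ids)) {
  sel <- tracks$id == ids[i]
  lines(tracks$x[sel], tracks$y[sel], col = i)
}

legend("topright", legend = ids, col = seq_along(ids), lty = 1, title = "id")
```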