---
title: "Working with Different Types of Time Indices Using `tind` Class"
author: "Grzegorz Klima"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Working with Different Types of Time Indices Using `tind` Class}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r, echo = FALSE}
library("tind")
```

The `tind` class is designed to represent time indices of different types and to perform computations with them. Indices are represented as vectors of integers or doubles. The following types of time indices are supported:

* years,
* quarters,
* months,
* weeks (ISO 8601),
* dates,
* time of day,
* date-time,
* arbitrary integer and numeric indices.

Time indices can be constructed via calls to the `tind` constructor, using `as.tind` methods, or by calling the `parse_t` and `strptind` functions for parsing arbitrary time index formats.

Before proceeding, let us load the package.

```{r color}
library("tind")
```

## Years

The internal code for years is `y`. The simplest way to construct year indices is to invoke the `tind` constructor with a single argument `y`.

```{r}
(ys <- tind(y = 2010:2020))
```

In the `as.tind` method, integers in the range 1800--2199 are automatically interpreted as years. For instance:

```{r}
(ys <- as.tind(2010:2020))
```

To let `as.tind` know that numbers outside the 1800--2199 range should be interpreted as years, the `type` argument has to be provided, as in:

```{r}
(ys <- as.tind(c(1700, 1800, 1900, 2000, 2100), type = "y"))
```

The convenience function `as.year` is a shortcut for `as.tind(*, type = "y")`:

```{r}
(ys <- as.year(c(1700, 1800, 1900, 2000, 2100)))
```

Four-digit character strings in the format `YYYY` are also automatically interpreted as year indices:

```{r}
(ys <- as.tind(c("1700", "1800")))
```

Years as four-digit strings are indicated by the `%Y` format specifier.
The last examples could also be written as:

```{r}
(ys <- as.tind(c("1700", "1800", "1900", "2000", "2100"), format = "%Y"))
```

Two-digit year indices are indicated by the `%y` format specifier. By default, numbers in the range 69--99 are interpreted as years 1969--1999 and numbers 00--68 as years 2000--2068:

```{r}
(ys <- as.tind(c("98", "99", "00", "01", "02"), format = "%y"))
format(ys, "%Y")
format(ys, "%y")
```

`as.year` guesses the short year format:

```{r}
(ys <- as.year(c("98", "99", "00", "01", "02")))
```

Treatment of two-digit years is controlled by the `tind.abbr.year.start` option.

```{r}
options("tind.abbr.year.start")
```

## Quarters

The internal code for quarters is `q`. Quarters can be constructed using the `tind` constructor with arguments `y` and `q`. Arguments of the `tind` constructor are recycled if necessary.

```{r}
(qs <- tind(y = rep(2020:2023, each = 4), q = 1:4))
```

The default format for quarters is `"%YQ%q"`, where the `%q` format specifier is a `tind` extension giving the quarter number. This format is recognized automatically:

```{r}
(qs <- as.tind(c("2020Q1", "2020Q2", "2020Q3", "2020Q4")))
format(qs)
format(qs, "%YQ%q")
```

Less common formats can be parsed using combinations of the `%Y` (or `%y`) and `%q` specifiers. Consider the `YYYY.Q` format:

```{r}
as.tind("2023.2", format = "%Y.%q")
```

One can also specify the order of index components using the `order` argument:

```{r}
as.tind("2023.2", order = "yq")
```

If the order is `yq`, quarters are parsed automatically when `type = "q"` is set:

```{r}
as.tind(c("2020 1", "2020 2", "2020 3", "2020 4"), type = "q")
```

The convenience function `as.quarter` is a shortcut for `as.tind(*, type = "q")`:

```{r}
as.quarter(c("2020 1", "2020 2", "2020 3", "2020 4"))
```

Packages `stats` and `zoo` (class `yearqtr`) represent quarters as year fractions, with, e.g., `2020.0`, `2020.25`, `2020.5`, `2020.75` representing `2020Q1`, `2020Q2`, `2020Q3`, `2020Q4`, respectively. Conversion from this format can be done using the `yf2tind` function.

```{r}
yf2tind(2020 + (0:3) / 4, "q")
```

## Months

The internal code for months is `m`. Months can be constructed using the `tind` constructor with arguments `y` and `m`. Arguments of the `tind` constructor are recycled if necessary.

```{r}
(ms <- tind(y = 2023, m = 1:12))
```

The default format for months is `"%Y-%m"`, where the `%m` format specifier represents the month as a two-digit number.

```{r}
format(ms)
format(ms, "%Y-%m")
```

If the order is `ym`, months are parsed automatically when `type = "m"` is set:

```{r}
as.tind("2023-11", type = "m")
```

The convenience function `as.month` is a shortcut for `as.tind(*, type = "m")`:

```{r}
as.month("2023-11")
```

The format specifier `%b` denotes the abbreviated month name, as in the following example. See the documentation of `month_names` for a discussion of locale settings.

```{r}
(shrtms <- format(ms, "%b %y", locale = "C"))
```

The above can be parsed using either a format specification or an order specification (with month first).

```{r}
as.tind(shrtms, format = "%b %y", locale = "C")
as.tind(shrtms, order = "my", locale = "C")
```

Packages `stats` and `zoo` (class `yearmon`) represent months as year fractions, with, e.g., `2020.0`, `2020.0833`, `2020.1667`, `2020.25` representing `2020-01`, `2020-02`, `2020-03`, `2020-04`, respectively. Conversion from this format can be done using the `yf2tind` function.

```{r}
yf2tind(2020 + (0:11) / 12, "m")
```

## Weeks (ISO 8601)

`tind` supports ISO 8601 weeks, i.e., weeks starting on Monday, with the first week of a year being the week that contains the first Thursday of that year. See https://en.wikipedia.org/wiki/ISO_week_date.

The internal code for weeks is `w`. Weeks can be constructed using the `tind` constructor with arguments `y` and `w`. Arguments of the `tind` constructor are recycled if necessary.

```{r}
(ws <- tind(y = 2024, w = 1:52))
```

The default format for weeks is `%G-W%V`, where `%G` is the week-based year (ISO week-numbering year, ISO year) and `%V` is the ISO week number.

```{r}
format(ws)
format(ws, format = "%G-W%V")
```

Note that the `%Y`, `%W`, and `%U` specifiers cannot be used with weeks. The ISO week-numbering year may differ from the Gregorian (i.e., calendar) year `%Y`. The `%W` and `%U` formats refer to non-ISO weeks and are not supported.

`as.week` (a shortcut for `as.tind(*, type = "w")`) automatically recognizes the default format.

```{r}
as.week("2024-W51")
```

## Dates

Dates (code `d`) are most easily constructed from the `y`, `m`, and `d` components passed to the `tind` constructor. The order of arguments is irrelevant.

```{r}
tind(m = 3, d = 15, y = 2024)
(ds <- tind(y = 2024, m = rep(1:3, each = 2), d = c(1, 16)))
```

The default format for dates is `%Y-%m-%d` (ISO format, shortcut `%F`).

```{r}
format(ds)
format(ds, format = "%Y-%m-%d")
format(ds, format = "%F")
```

`as.date` (a shortcut for `as.tind(*, type = "d")`) automatically recognizes this format.

```{r}
as.date("2024-12-31")
```

The US format `%m/%d/%y` (shortcut `%D`) is also automatically recognized.

```{r}
as.date("12/31/24")
format(as.date("12/31/24"), "%m/%d/%y")
format(as.date("12/31/24"), "%D")
```

Month names can also be used. See the documentation of `month_names` for a discussion of locale settings.

```{r}
(chds <- format(ds, "%b %d, %y", locale = "C"))
as.tind(chds, order = "mdy", locale = "C")
```

## Time of Day

Type `h` (as in *h*our) represents times between midnight (00:00) and midnight of the next day (24:00).

Time of day can be constructed from the `H`, `M`, and `S` arguments passed to the `tind` constructor.

```{r}
tind(H = 0:23)
tind(H = 13, M = (0:3) * 15)
(tod1 <- tind(H = 13, M = 30, S = (0:11) * 5))
```

Sub-second accuracy is also supported.

```{r}
(tod2 <- tind(H = 13, M = 30, S = 17 + (0:9) / 10))
```

As seen above, `tind` automatically determines whether seconds should be shown and whether sub-second accuracy is required.

Format specifiers `%H`, `%M`, and `%S` can be used with time of day.

```{r}
format(tod1, "%H:%M")
format(tod1, "%H:%M:%S")
as.tind("13:47", format = "%H:%M")
as.tind("13:47:39", format = "%H:%M:%S")
```

Sub-second accuracy can be explicitly requested via the `%OS[0-6]` format specifier.

```{r}
format(tod2, "%H:%M:%S")
format(tod2, "%H:%M:%OS1")
format(tod2, "%H:%M:%OS2")
format(tod2, "%H:%M:%OS3")
```

For parsing, `%OS` (without digits) should be used.

```{r}
as.tind("13:47:39.89", format = "%H:%M:%OS")
```

`H`, `M`, and `S` can be used for order specification. With the order specifier `S`, sub-second accuracy is determined automatically.

```{r}
as.tind("13", order = "H")
as.tind("13:47", order = "HM")
as.tind("13:47:39", order = "HMS")
as.tind("13:47:39.9", order = "HMS")
as.tind("13:47:39.89", order = "HMS")
```

A 12-hour clock can be used with the help of the `%I` (hour in 12-hour clock) and `%p` (ante meridiem, post meridiem) specifiers.

```{r}
format(tind(H = 0:23), "%I %p")
as.tind("9:30am", format = "%I:%M%p")
```

Alternatively, the `I` and `p` order specifiers can be used.

```{r}
as.tind("9:30am", order = "IMp")
```

## Date-time

Date-time indices (code `t`) can be constructed from the components required for a date plus at least the hour (`H`) component.

```{r}
tind(y = 2024, m = 8, d = 2, H = 16)
tind(y = 2024, m = 8, d = 2, H = 16, M = (0:3) * 15)
tind(y = 2024, m = 8, d = 2, H = 16, M = 0, S = 10 * (0:5))
```

Date-time indices always have the time zone attribute set. For more information, see the documentation of the `tzone` function. By default, the time zone is set to the system time zone, but it can be set explicitly using the `tz` argument.

```{r}
tind(y = 2024, m = 8, d = 2, H = 16, M = (0:3) * 15, tz = "UTC")
```

Date-time indices can also be constructed from date and time of day indices using the `date_time` function.

```{r}
(dt1 <- date_time(tind(y = 2024, m = 8, d = 2), tind(H = 16, M = (0:3) * 15)))
(dt2 <- date_time(tind(y = 2024, m = 8, d = 2), tind(H = 16, M = (0:3) * 15), tz = "UTC"))
```

The reverse operation can be performed using the `date_time_split` function.

```{r}
date_time_split(dt1)
```

Since lists can be trivially converted to data frames, `date_time_split` only has to be wrapped in `as.data.frame` if a data frame is desired.

```{r}
as.data.frame(date_time_split(dt1))
```

As seen above, formatting of date-time indices depends on the actual indices (whether seconds or sub-second accuracy are needed). Moreover, the `as.character` and `format` methods differ. The former returns the UTC offset (or `Z` for UTC), the latter the time zone abbreviation.

```{r}
as.character(dt1)
as.character(dt2)
format(dt1)
format(dt2)
```

All format specifiers that can be used with dates and time of day can also be used with date-time. The `%z` specifier represents the UTC offset, and `%Z` returns the time zone abbreviation.

```{r}
format(dt1, "%F %H:%M")
format(dt1, "%F %H:%M%z")
format(dt1, "%F %H:%M %Z")
format(dt1, "%D %I:%M%p")
```

Standard formats are automatically recognized during parsing.

```{r}
as.tind("2025-02-01 13:03")
as.tind("2025-02-01 13:03:34")
as.tind("2025-02-01 13:03:34.534")
```

When converting to `tind`, either the `format` or the `order` argument can be provided.

```{r}
as.tind("2025-02-01 13:03:34.534", format = "%F %H:%M:%OS")
as.tind("02/01/25 01:03:34pm", format = "%D %I:%M:%OS%p")
as.tind("2025-02-01 13:03:34.534", order = "ymdHMS")
as.tind("2025-02-01 13:03:34.534", order = "ymdHMS", tz = "UTC")
as.tind("02/01/25 01:03:34pm", order = "mdyIMSp")
as.tind("02/01/25 01:03:34pm", order = "mdyIMSp", tz = "UTC")
```

The parser recognizes time zone abbreviations:

```{r}
as.tind("2025-02-22 09:54:04 CET", tz = "Europe/Warsaw")
as.tind("2025-08-23 09:54:04 CEST", tz = "Europe/Warsaw")
as.tind("2/22/25 9:54 a.m. EST", order = "mdyIMpz", tz = "America/New_York")
as.tind("8/23/25 9:54 a.m. EDT", order = "mdyIMpz", tz = "America/New_York")
```

When the time zone is not provided, the parser tries to guess it:

```{r}
as.tind("2025-02-22 09:54:04 CET")
as.tind("2025-08-23 09:54:04 CEST")
as.tind("2/22/25 9:54 a.m. EST", order = "mdyIMpz")
as.tind("8/23/25 9:54 a.m. EDT", order = "mdyIMpz")
```

When a time zone abbreviation can denote different UTC offsets (unfortunately, this can be the case), `NA`s are introduced with a warning.

## Arbitrary Numeric Indices

For completeness, `tind` supports arbitrary integer indices (code `i`) and arbitrary numeric indices (code `n`).

```{r}
as.tind(0:9, type = "i")
as.tind(0:9 / 10, type = "n")
```

## Index Conversion

Time index conversion is straightforward with the `tind` class. The `as.tind` method as well as the `as.year`, `as.date`, etc. convenience functions can be used for this.

```{r}
ms
as.quarter(ms)
as.year(ms)
as.date(ms)
as.date_time(ms)
as.date_time(ms, tz = "UTC")
```

## Coercion to Base R Types

`as.Date`, `as.POSIXct`, and `as.POSIXlt` can be used to convert time indices to base R date and date-time classes.

```{r}
ds
as.Date(ds)
dt1
as.POSIXct(dt1)
dt2
as.POSIXlt(dt2)
```

## Matching Periods, Comparisons, `cut` Method

The `match_t` function and the `%in_t%` operator match time indices against another set of time indices, possibly of a different type (lower resolution). In the following example, a sequence of dates is matched to months:

```{r}
(x <- as.date("2025-03-02") + 15 * (0:5))
(table <- as.month("2025-03") + -1:1)
match_t(x, table)
```

Below we check which dates fall in March 2025:

```{r}
x %in_t% "2025-03"
```

Comparison operators (e.g., `>`, `>=`) can be used to compare time indices. Below we check which dates fall in or after April 2025 and which fall before April 2025:

```{r}
x >= "2025-04"
x < "2025-04"
```

The `cut` method for objects of class `tind` divides time indices into periods. Using `x` from the last example, we can split dates into months and quarters:

```{r}
cut(x, "m")
cut(x, "q")
```
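
The operations in this section can be combined. The following is a minimal sketch rather than an excerpt from the package documentation: it rebuilds the same `x`, keeps only the March 2025 dates, and counts dates per month. It assumes that logical subsetting with `[` works on `tind` vectors and that `as.month` accepts date indices in the same way as the conversions shown in the Index Conversion section. The names `in_march`, `march_dates`, and `per_month` are illustrative only.

```{r}
library("tind")

# dates every 15 days starting on 2025-03-02 (same `x` as above)
x <- as.date("2025-03-02") + 15 * (0:5)

# logical mask of dates falling in March 2025
in_march <- x %in_t% "2025-03"

# keep only the March dates (assumes `[` subsetting of tind vectors)
(march_dates <- x[in_march])

# number of dates per month (assumes as.month() converts dates to months)
(per_month <- table(format(as.month(x))))
```

Converting to months before tabulating keeps the grouping key in the default `%Y-%m` format, so the table labels can be read directly as months.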