---
title: "Introduction to openaq"
subtitle: "Get started with the openaq package"
author: "Russ Biggs"
date: "2025-01-17"
description: >
  "Get started with the openaq package"
output:
  rmarkdown::html_vignette:
    df_print: kable
vignette: >
  %\VignetteIndexEntry{Introduction to openaq}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

``` r
library(openaq)
```

This guide provides an overview of the key features of the openaq package. For detailed information on the functions provided in the package, see the reference section. For more general documentation on the OpenAQ platform and API, see the main OpenAQ documentation site at [docs.openaq.org](https://docs.openaq.org).

## Key concepts

### API Key

An API key is required to use the OpenAQ API. Register for an account at [https://explore.openaq.org/register](https://explore.openaq.org/register) to get an API key.

By default the OpenAQ R client looks for an API key in the `OPENAQ_API_KEY` system environment variable. The package also provides a helper function called `set_api_key()` to set this value.

``` r
set_api_key("my-super-secret-openaq-key-1234")
```

Alternatively, the API key can be set on individual resource function calls, e.g.

``` r
list_locations(api_key = "my-super-secret-openaq-key-1234")
```

An API key passed to an individual function call always takes precedence over an API key set in the environment variable:

``` r
set_api_key("my-super-secret-openaq-key-1234")
list_locations(api_key = "this-is-my-alternate-api-key")
```

### Rate limits

The OpenAQ API limits the number of requests a single API key can make in a set time to ensure fair access for all users and prevent overuse. The API provides custom rate limit headers to indicate the number of requests used, the number remaining, the rate limit allowance, and the number of seconds remaining in the current period until reset.
These headers are preserved by default in the openaq package as object attributes on the output data frame:

* `x_ratelimit_used`
* `x_ratelimit_remaining`
* `x_ratelimit_limit`
* `x_ratelimit_reset`

``` r
locations <- list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166
)
headers <- attr(locations, "headers")
print(headers[["x_ratelimit_remaining"]])
```

```
## [1] 50
```

Read more about the headers and rate limits in the OpenAQ API documentation under [Rate Limits](https://docs.openaq.org/using-the-api/rate-limits).

#### Automatic rate limit handling

The openaq package provides optional functionality to automatically throttle requests when the rate limit has been reached. This feature uses httr2's built-in retry mechanism to handle rate limit errors.

You can enable automatic rate limiting in two ways:

**Option 1: Enable globally for your session**

``` r
# Enable automatic rate limiting for all subsequent requests
enable_rate_limit()

# Now all API calls will automatically handle rate limits
locations <- list_locations(limit = 1000, parameters_id = 2)
```

```
## Setting `max_tries = 2`.
```

``` r
nrow(locations)
```

```
## [1] 1000
```

**Option 2: Enable per request**

``` r
# Enable rate limiting for a single function call
locations <- list_locations(
  limit = 1000,
  parameters_id = 2,
  rate_limit = TRUE
)
```

```
## Setting `max_tries = 2`.
```

``` r
head(locations)
```

```
##   id                name is_mobile is_monitor     timezone countries_id
## 1  3          NMA - Nima     FALSE       TRUE Africa/Accra          152
## 2  4          NMT - Nima     FALSE       TRUE Africa/Accra          152
## 3  5     JTA - Jamestown     FALSE       TRUE Africa/Accra          152
## 4  6   ADT - Asylum Down     FALSE       TRUE Africa/Accra          152
## 5  7 ADEPA - Asylum Down     FALSE       TRUE Africa/Accra          152
## 6  8   ADA - Asylum Down     FALSE       TRUE Africa/Accra          152
##   country_name country_iso latitude  longitude datetime_first datetime_last
## 1        Ghana          GH 5.583890 -0.1996800             NA            NA
## 2        Ghana          GH 5.581650 -0.1989800             NA            NA
## 3        Ghana          GH 5.540114 -0.2103972             NA            NA
## 4        Ghana          GH 5.570722 -0.2120555             NA            NA
## 5        Ghana          GH 5.567833 -0.2040278             NA            NA
## 6        Ghana          GH 5.566722 -0.2077778             NA            NA
##                          owner_name providers_id
## 1 Unknown Governmental Organization          209
## 2 Unknown Governmental Organization          209
## 3 Unknown Governmental Organization          209
## 4 Unknown Governmental Organization          209
## 5 Unknown Governmental Organization          209
## 6 Unknown Governmental Organization          209
##                        provider_name
## 1 Dr. Raphael E. Arku and Colleagues
## 2 Dr. Raphael E. Arku and Colleagues
## 3 Dr. Raphael E. Arku and Colleagues
## 4 Dr. Raphael E. Arku and Colleagues
## 5 Dr. Raphael E. Arku and Colleagues
## 6 Dr. Raphael E. Arku and Colleagues
```

This is particularly useful when making many sequential requests or when working with large datasets where you might exceed the rate limit. The automatic retry mechanism will pause execution until the rate limit resets, then continue automatically without raising an error.

### Pagination

The OpenAQ API uses pagination to provide access to large amounts of data in "pages". The number of results per page is controlled by the `limit` parameter, which defaults to 100 and can be set as high as 1000. If your query matches more results than the page limit, you can page through them using the `page` parameter. With `limit = 1000`, `page = 1` contains results 1-1000, `page = 2` contains results 1001-2000, and so on.
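
The page arithmetic above can be sketched in a few lines of base R. Both helpers below are hypothetical and not part of the openaq package. `page_range()` only shows which result positions a given `page` and `limit` cover. `fetch_all_pages()` takes any function that accepts `page` and `limit` and stops once a page comes back with fewer rows than `limit`, which is an assumed stopping rule rather than documented package behaviour. A fake fetch function stands in for the API so the sketch runs offline.

``` r
# Hypothetical helper (not part of openaq): result positions covered by a page.
page_range <- function(page, limit = 100) {
  stopifnot(page >= 1, limit >= 1, limit <= 1000)
  first <- (page - 1) * limit + 1
  last <- page * limit
  c(first = first, last = last)
}

page_range(1, limit = 1000) # first = 1, last = 1000
page_range(2, limit = 1000) # first = 1001, last = 2000

# Hypothetical sketch: collect pages until one returns fewer rows than `limit`.
# `fetch_page` is any function(page, limit) that returns a data frame.
fetch_all_pages <- function(fetch_page, limit = 1000, max_pages = 10) {
  pages <- list()
  for (page in seq_len(max_pages)) {
    res <- fetch_page(page = page, limit = limit)
    pages[[page]] <- res
    if (nrow(res) < limit) break
  }
  do.call(rbind, pages)
}

# Offline stand-in for an API call: pretends the query matches 2350 results.
fake_page_fetch <- function(page, limit) {
  ids <- seq_len(2350)
  rng <- page_range(page, limit)
  data.frame(id = ids[ids >= rng[["first"]] & ids <= rng[["last"]]])
}

all_rows <- fetch_all_pages(fake_page_fetch, limit = 1000)
nrow(all_rows) # 2350: two full pages plus a final page of 350
```

With the package itself, `fetch_page` could be a small wrapper such as `function(page, limit) list_locations(page = page, limit = limit)`. When requesting many pages in a row, consider enabling the rate limit handling described earlier.
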
The `page` and `limit` parameters are available on any resource function that can return more than one result, i.e. "list" functions such as `list_locations()`, `list_licenses()` or `list_sensor_measurements()`.

Examples:

``` r
locs <- list_locations(
  limit = 1000,
  page = 1
)
```

```
## Setting `max_tries = 2`.
```

``` r
head(locs)
```

```
##   id                name is_mobile is_monitor     timezone countries_id
## 1  3          NMA - Nima     FALSE       TRUE Africa/Accra          152
## 2  4          NMT - Nima     FALSE       TRUE Africa/Accra          152
## 3  5     JTA - Jamestown     FALSE       TRUE Africa/Accra          152
## 4  6   ADT - Asylum Down     FALSE       TRUE Africa/Accra          152
## 5  7 ADEPA - Asylum Down     FALSE       TRUE Africa/Accra          152
## 6  8   ADA - Asylum Down     FALSE       TRUE Africa/Accra          152
##   country_name country_iso latitude  longitude datetime_first datetime_last
## 1        Ghana          GH 5.583890 -0.1996800             NA            NA
## 2        Ghana          GH 5.581650 -0.1989800             NA            NA
## 3        Ghana          GH 5.540114 -0.2103972             NA            NA
## 4        Ghana          GH 5.570722 -0.2120555             NA            NA
## 5        Ghana          GH 5.567833 -0.2040278             NA            NA
## 6        Ghana          GH 5.566722 -0.2077778             NA            NA
##                          owner_name providers_id
## 1 Unknown Governmental Organization          209
## 2 Unknown Governmental Organization          209
## 3 Unknown Governmental Organization          209
## 4 Unknown Governmental Organization          209
## 5 Unknown Governmental Organization          209
## 6 Unknown Governmental Organization          209
##                        provider_name
## 1 Dr. Raphael E. Arku and Colleagues
## 2 Dr. Raphael E. Arku and Colleagues
## 3 Dr. Raphael E. Arku and Colleagues
## 4 Dr. Raphael E. Arku and Colleagues
## 5 Dr. Raphael E. Arku and Colleagues
## 6 Dr. Raphael E. Arku and Colleagues
```

``` r
locs <- list_locations(
  limit = 1000,
  page = 2
)
```

```
## Setting `max_tries = 2`.
```

``` r
head(locs)
```

```
##     id            name is_mobile is_monitor            timezone countries_id
## 1 1119         HANOVER     FALSE       TRUE    America/New_York          155
## 2 1120  HAMPTON - NASA     FALSE       TRUE    America/New_York          155
## 3 1121     Jerome Mack     FALSE       TRUE America/Los_Angeles          155
## 4 1122     Jersey City     FALSE       TRUE    America/New_York          155
## 5 1123 46th and Farnam     FALSE       TRUE     America/Chicago          155
## 6 1124        Joe Neal     FALSE       TRUE America/Los_Angeles          155
##    country_name country_iso latitude  longitude      datetime_first
## 1 United States          US 37.60613  -77.21880 2016-03-06 20:00:00
## 2 United States          US 37.10373  -76.38702 2016-03-10 08:00:00
## 3 United States          US 36.14187 -115.07874 2016-03-06 20:00:00
## 4 United States          US 40.73169  -74.06657 2016-03-06 20:00:00
## 5 United States          US 41.25732  -95.98383 2016-03-06 20:00:00
## 6 United States          US 36.27059 -115.23828 2016-03-06 20:00:00
##         datetime_last                        owner_name providers_id
## 1 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 2 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 3 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 4 2026-03-09 20:00:00 Unknown Governmental Organization          119
## 5 2018-04-25 05:00:00 Unknown Governmental Organization          119
## 6 2026-03-09 20:00:00 Unknown Governmental Organization          119
##   provider_name
## 1        AirNow
## 2        AirNow
## 3        AirNow
## 4        AirNow
## 5        AirNow
## 6        AirNow
```

## Features

### Queryable resources

The OpenAQ API follows a resource-oriented design, allowing developers to retrieve air quality data through standardized HTTP requests to specific endpoints representing data resources like measurements, locations, and parameters. The OpenAQ R package provides functions that correspond to these API resources, simplifying the process of querying and retrieving data resources.

#### Countries

``` r
get_country()
```

```
## Error in get_country(): argument "countries_id" is missing, with no default
```

``` r
list_countries()
```

#### Instruments

``` r
get_instrument()
```

```
## Error in get_instrument(): argument "instruments_id" is missing, with no default
```

``` r
list_instruments()
```

``` r
list_manufacturer_instruments()
```

#### Latest

``` r
list_location_latest()
```

``` r
list_parameter_latest()
```

#### Licenses

``` r
list_licenses()
```

``` r
get_license()
```

#### Locations

``` r
list_locations()
```

``` r
get_location()
```

#### Manufacturers

``` r
list_manufacturers()
```

``` r
get_manufacturer()
```

#### Measurements

``` r
list_sensor_measurements()
```

#### Owners

``` r
list_owners()
```

``` r
get_owner()
```

#### Parameters

``` r
list_parameters()
```

``` r
get_parameter()
```

#### Providers

``` r
list_providers()
```

``` r
get_provider()
```

#### Sensors

``` r
get_sensor()
```

``` r
get_location_sensors()
```

### Data frames

All resource functions return a typed data frame by default. If you prefer to work with JSON parsed as a standard list, you can turn off data frame parsing with the `as_data_frame` function parameter.

``` r
list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  as_data_frame = FALSE
)
#> list()
#> attr(,"meta")
#> attr(,"meta")$name
#> [1] "openaq-api"
#>
#> attr(,"meta")$website
#> [1] "/"
#>
#> attr(,"meta")$page
#> [1] 1
#> ...
```

`as.data.frame` methods are provided for all resource classes as well. JSON results are parsed with the `httr2::resp_body_json()` function under the hood.

### Automatic rate limiting

All resource functions provide an option to enable automatic rate limiting so that you do not exceed your account's rate limits. You can of course implement your own rate limiting, but the built-in functionality is provided as an easy-to-use option.
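
For comparison, rolling your own throttling might look roughly like the sketch below: read the rate limit headers attached to a result and pause when no requests remain. This is an illustration under stated assumptions, not the package's implementation. `seconds_to_wait()` and `with_manual_throttle()` are hypothetical helpers, and the sketch assumes the `headers` attribute holds `x_ratelimit_remaining` and `x_ratelimit_reset` values that can be coerced to numbers. A fake fetch function stands in for a real API call so the example runs offline.

``` r
# Hypothetical helper: seconds to pause given a list of rate limit headers.
seconds_to_wait <- function(headers) {
  remaining <- as.numeric(headers[["x_ratelimit_remaining"]])
  reset <- as.numeric(headers[["x_ratelimit_reset"]])
  # No header, unparseable header, or requests left: no need to wait.
  if (length(remaining) == 0 || is.na(remaining) || remaining > 0) {
    return(0)
  }
  if (length(reset) == 0 || is.na(reset)) 0 else reset
}

# Hypothetical wrapper: call `fetch`, then pause if the allowance is used up.
with_manual_throttle <- function(fetch, ..., sleep = Sys.sleep) {
  res <- fetch(...)
  wait <- seconds_to_wait(attr(res, "headers"))
  if (wait > 0) sleep(wait)
  res
}

# Offline stand-in for a resource function that attaches headers to its result.
fake_limited_fetch <- function(remaining, reset) {
  res <- data.frame(id = 1:3)
  attr(res, "headers") <- list(
    x_ratelimit_remaining = remaining,
    x_ratelimit_reset = reset
  )
  res
}

seconds_to_wait(list(x_ratelimit_remaining = 50, x_ratelimit_reset = 42)) # 0
seconds_to_wait(list(x_ratelimit_remaining = 0, x_ratelimit_reset = 42))  # 42

# Record the pause instead of really sleeping, to keep the demo fast.
waited <- numeric(0)
res <- with_manual_throttle(
  fake_limited_fetch,
  remaining = 0, reset = 42,
  sleep = function(s) waited <<- c(waited, s)
)
waited # 42
```

The built-in option needs none of this; passing `rate_limit = TRUE` is enough:
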

``` r
list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  rate_limit = TRUE
)
```

This functionality uses the OpenAQ API's [rate limit headers](https://docs.openaq.org/using-the-api/rate-limits#rate-limit-headers) and the `httr2::req_retry()` function under the hood.

### Debugging

Every resource function provides an optional parameter named `dry_run` that prevents a full HTTP request to the API and instead prints a summary of how the request would have been made.

``` r
list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  dry_run = TRUE
)
```

This can be helpful when debugging to identify issues and compare the raw query URL and headers. This functionality uses the `httr2::req_dry_run()` function under the hood.
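
To get a feel for what a dry run reports, the sketch below builds a comparable request directly with httr2 and passes it to `httr2::req_dry_run()`. It is not how the openaq package constructs its requests internally. The base URL `https://api.openaq.org/v3/locations` and the `X-API-Key` header name are assumptions made for illustration, and the key is a placeholder. Nothing is sent to the OpenAQ API. Depending on the installed httr2 version, `req_dry_run()` may also need the httpuv package.

``` r
library(httr2)

# Assumed endpoint and header name, for illustration only.
req <- request("https://api.openaq.org/v3/locations") |>
  req_url_query(limit = 1000, parameters_id = 2, providers_id = 166) |>
  req_headers(`X-API-Key` = "my-super-secret-openaq-key-1234")

# Inspect the URL that would be requested.
req$url

# Print the HTTP request that would be made, without contacting the API.
req_dry_run(req)
```

Comparing this output with the output of `dry_run = TRUE` on a resource function can help confirm that query parameters are encoded as expected.
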