Introduction to openaq

Get started with the openaq package

Russ Biggs

2025-01-17

library(openaq)

This guide provides an overview of the key features of the openaq package. For detailed information on the functions provided in the package see the reference section.

For more general documentation on the OpenAQ platform and API see the main OpenAQ documentation site at docs.openaq.org.

Key concepts

API Key

An API key is required for using the OpenAQ API. Register for an account at https://explore.openaq.org/register to get an API key.

By default the OpenAQ R client looks for an API key in the OPENAQ_API_KEY system environment variable. The package also provides a helper function called set_api_key() to set this value.

set_api_key("my-super-secret-openaq-key-1234")

Alternatively, the API key can be passed directly to an individual resource function call, for example:

list_locations(api_key = "my-super-secret-openaq-key-1234")

An API key passed to an individual function call always takes precedence over the key set in the environment variable:

set_api_key("my-super-secret-openaq-key-1234")
list_locations(api_key = "this-is-my-alternate-api-key")
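The precedence rule can be sketched in base R without calling the API. `resolve_api_key()` below is a hypothetical helper written for illustration, not a function from the openaq package. It only assumes that the key lives in the OPENAQ_API_KEY environment variable, as described above.

```r
# Hypothetical helper (not part of openaq): pick the key a request would use.
# An explicit argument wins; otherwise fall back to OPENAQ_API_KEY.
resolve_api_key <- function(api_key = NULL) {
  if (!is.null(api_key) && nzchar(api_key)) {
    return(api_key)
  }
  key <- Sys.getenv("OPENAQ_API_KEY", unset = "")
  if (!nzchar(key)) {
    stop("No API key found: set OPENAQ_API_KEY or pass `api_key`.")
  }
  key
}

# Set the environment variable for the current R session only.
Sys.setenv(OPENAQ_API_KEY = "my-super-secret-openaq-key-1234")

# No argument: the environment variable is used.
env_key <- resolve_api_key()

# Explicit argument: it overrides the environment variable.
override_key <- resolve_api_key("this-is-my-alternate-api-key")
```

To keep the key out of scripts, the variable can also be defined in a `.Renviron` file, which R reads at startup.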

Rate limits

The OpenAQ API limits the number of requests a single API key can make in a set time to ensure fair access for all users and prevent overuse.

The API provides custom rate limit headers to indicate the number of requests used, the number remaining, the rate limit allowance, and the number of seconds remaining in the current period until reset. These headers are preserved by default in the openaq package as object attributes on the output data frame:

locations <- list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166
)
headers <- attr(locations, "headers")
print(headers[["x_ratelimit_remaining"]])
## [1] 50

Read more about the headers and rate limits in the OpenAQ API documentation under Rate Limits.
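As a rough illustration of manual throttling, the headers attribute can be turned into a wait time. This is a minimal sketch: `seconds_to_wait()` and `header_value()` are hypothetical helpers, and the `x_ratelimit_reset` name is an assumption modelled on the `x_ratelimit_remaining` name shown above. A hand-built list stands in for `attr(locations, "headers")` so the example runs offline.

```r
# Hypothetical helper (not part of openaq): read one header as a number,
# returning NA when the header is absent or not numeric.
header_value <- function(headers, name) {
  value <- headers[[name]]
  if (is.null(value)) NA_real_ else suppressWarnings(as.numeric(value))
}

# Hypothetical helper: seconds to pause before the next request.
seconds_to_wait <- function(headers, min_remaining = 1) {
  remaining <- header_value(headers, "x_ratelimit_remaining")
  reset <- header_value(headers, "x_ratelimit_reset")  # assumed header name
  if (is.na(remaining) || is.na(reset)) {
    return(0)  # headers missing or unparsable: do not wait
  }
  if (remaining < min_remaining) reset else 0
}

# Mock headers standing in for attr(locations, "headers").
exhausted <- list(x_ratelimit_remaining = 0, x_ratelimit_reset = 42)
healthy <- list(x_ratelimit_remaining = 50, x_ratelimit_reset = 42)

wait_exhausted <- seconds_to_wait(exhausted)  # quota used up: wait for reset
wait_healthy <- seconds_to_wait(healthy)      # requests remain: no wait
```

In a real loop the result would be passed to `Sys.sleep()` before issuing the next request.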


Automatic rate limit handling

The openaq package provides optional functionality to automatically throttle requests when the rate limit has been reached. This feature uses httr2’s built-in retry mechanism to intelligently handle rate limit errors.

Using the openaq package you can enable automatic rate limiting in two ways:

Option 1: Enable globally for your session

# Enable automatic rate limiting for all subsequent requests
enable_rate_limit()

# Now all API calls will automatically handle rate limits
locations <- list_locations(limit = 1000, parameters_id = 2)
## Setting `max_tries = 2`.
nrow(locations)
## [1] 1000

Option 2: Enable per request

# Enable rate limiting for a single function call
locations <- list_locations(
  limit = 1000,
  parameters_id = 2,
  rate_limit = TRUE
)
## Setting `max_tries = 2`.
head(locations)
##   id                name is_mobile is_monitor     timezone countries_id
## 1  3          NMA - Nima     FALSE       TRUE Africa/Accra          152
## 2  4          NMT - Nima     FALSE       TRUE Africa/Accra          152
## 3  5     JTA - Jamestown     FALSE       TRUE Africa/Accra          152
## 4  6   ADT - Asylum Down     FALSE       TRUE Africa/Accra          152
## 5  7 ADEPA - Asylum Down     FALSE       TRUE Africa/Accra          152
## 6  8   ADA - Asylum Down     FALSE       TRUE Africa/Accra          152
##   country_name country_iso latitude  longitude datetime_first datetime_last
## 1        Ghana          GH 5.583890 -0.1996800             NA            NA
## 2        Ghana          GH 5.581650 -0.1989800             NA            NA
## 3        Ghana          GH 5.540114 -0.2103972             NA            NA
## 4        Ghana          GH 5.570722 -0.2120555             NA            NA
## 5        Ghana          GH 5.567833 -0.2040278             NA            NA
## 6        Ghana          GH 5.566722 -0.2077778             NA            NA
##                          owner_name providers_id
## 1 Unknown Governmental Organization          209
## 2 Unknown Governmental Organization          209
## 3 Unknown Governmental Organization          209
## 4 Unknown Governmental Organization          209
## 5 Unknown Governmental Organization          209
## 6 Unknown Governmental Organization          209
##                        provider_name
## 1 Dr. Raphael E. Arku and Colleagues
## 2 Dr. Raphael E. Arku and Colleagues
## 3 Dr. Raphael E. Arku and Colleagues
## 4 Dr. Raphael E. Arku and Colleagues
## 5 Dr. Raphael E. Arku and Colleagues
## 6 Dr. Raphael E. Arku and Colleagues

This is particularly useful when making many sequential requests or when working with large datasets where you might exceed the rate limit. The automatic retry mechanism will pause execution until the rate limit resets, then continue automatically without raising an error.

Features

Queryable resources

The OpenAQ API follows a resource-oriented design, allowing developers to retrieve air quality data through standardized HTTP requests to specific endpoints representing data resources like measurements, locations, and parameters. The OpenAQ R package provides functions that correspond to these API resources, simplifying the process of querying and retrieving data resources.

Countries

get_country()
list_countries()

Instruments

get_instrument()
list_instruments()
list_manufacturer_instruments()

Latest

list_location_latest()
list_parameter_latest()

Licenses

list_licenses()
get_license()

Locations

list_locations()
get_location()

Manufacturers

list_manufacturers()
get_manufacturer()

Measurements

list_sensor_measurements()

Owners

list_owners()
get_owner()

Parameters

list_parameters()
get_parameter()

Providers

list_providers()
get_provider()

Sensors

get_sensor()
get_location_sensors()

Data frames

All resource functions return a typed data frame by default. If you prefer to work with the JSON parsed as a standard list, you can turn off data frame parsing by setting the as_data_frame parameter to FALSE.

list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  as_data_frame = FALSE
)

#> list()
#> attr(,"meta")
#> attr(,"meta")$name
#> [1] "openaq-api"
#>
#> attr(,"meta")$website
#> [1] "/"
#>
#> attr(,"meta")$page
#> [1] 1
#> ...

as.data.frame methods are provided for all resource classes as well.

JSON results are parsed with the httr2::resp_body_json() function under the hood.
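The printed output above shows that the list result carries response metadata in a `meta` attribute. The sketch below reads that attribute from a hand-built mock, so it runs offline. Only the fields visible in the output above (`name`, `website`, `page`) are assumed to exist, and the mock is not a real API response.

```r
# Mock of a list-style result; a real object would come from
# list_locations(..., as_data_frame = FALSE).
result <- structure(
  list(),
  meta = list(name = "openaq-api", website = "/", page = 1L)
)

# Response metadata is stored as an attribute, not as a list element.
meta <- attr(result, "meta")
current_page <- meta[["page"]]

# The list itself holds the records; this mock has none.
n_results <- length(result)
```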

Automatic rate limiting

All resource functions provide an option to enable automatic rate limiting so that you do not exceed your account's rate limits. You can of course implement your own rate limiting, but the built-in functionality is provided as an easy-to-use option.

list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  rate_limit = TRUE
)

This functionality uses the OpenAQ API’s rate limit headers and the httr2::req_retry() function under the hood.
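For readers curious how such a retry policy can be expressed with httr2 directly, here is a minimal sketch. It is not the package's internal implementation: the endpoint URL, the choice of HTTP 429 as the retry condition, and the `x-ratelimit-reset` header name are all assumptions made for illustration. The request is only constructed, never sent.

```r
library(httr2)

# Illustrative request; the URL and the header name below are assumptions.
req <- request("https://api.openaq.org/v3/locations") |>
  req_url_query(limit = 1000, parameters_id = 2) |>
  req_retry(
    max_tries = 2,
    # Treat HTTP 429 (Too Many Requests) as a retryable condition.
    is_transient = function(resp) resp_status(resp) == 429,
    # Wait for the number of seconds the API reports until the limit resets.
    after = function(resp) as.numeric(resp_header(resp, "x-ratelimit-reset"))
  )

# The request is only built here; req_perform(req) would actually send it.
```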

Debugging

Every resource function provides an optional parameter named dry_run that prevents a full HTTP request to the API and instead prints a summary of how the request would have been made.

list_locations(
  limit = 1000,
  parameters_id = 2,
  providers_id = 166,
  dry_run = TRUE
)

This can be helpful when debugging to identify issues and compare the raw query URL and headers.

This functionality uses the httr2::req_dry_run() function under the hood.
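The same kind of summary can be produced with httr2 alone, which may help when comparing against the package's output. This is a sketch under stated assumptions: the endpoint URL and the `X-API-Key` header name are not taken from this guide, and the request is hand-built rather than generated by openaq. Depending on the installed httr2 version, `req_dry_run()` may also require the httpuv package.

```r
library(httr2)

# Hand-built request; the URL and header name are assumptions.
req <- request("https://api.openaq.org/v3/locations") |>
  req_url_query(limit = 1000, parameters_id = 2, providers_id = 166) |>
  req_headers(`X-API-Key` = "my-super-secret-openaq-key-1234")

# Print the HTTP request that would be sent, without contacting the API.
req_dry_run(req)
```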