--- title: "Getting started with edr4r" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Getting started with edr4r} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- `edr4r` is a small, tidy client for any service that speaks [OGC API - Environmental Data Retrieval](https://ogcapi.ogc.org/edr/). Most of the real-world use to date has been against in-situ monitoring networks -- stream gauges, weather stations, snow telemetry, reservoir telemetry -- but the package itself is generic. Two example endpoints you can point it at right now: - [USGS waterdata OGC API](https://api.waterdata.usgs.gov/ogcapi/beta/) - [Western Water Datahub](https://api.wwdh.internetofwater.app) This vignette runs end-to-end against the Western Water Datahub's `rise-edr` collection -- a [pygeoapi](https://pygeoapi.io) wrap of USBR's reservoir telemetry. The USGS example lives in its own article, `vignette("usgs-streamgages")`. ## 1. Create a client ``` r library(edr4r) library(ggplot2) client <- edr_client("https://api.wwdh.internetofwater.app") client #> #> base_url: #> user_agent: edr4r/0.1.0 (+https://github.com/ksonda/edr4r) #> timeout: 60s #> max_tries: 3 ``` The client just stores connection settings (base URL, user agent, timeout, retry policy). Pass `verbose = TRUE` to echo every request URL, or `headers =` to attach auth tokens. ## 2. Discover collections and parameters `edr_collections()` lists every EDR collection the service serves. ``` r collections <- edr_collections(client) collections[, c("id", "title", "data_queries")] #> # A tibble: 35 × 3 #> id title data_queries #> #> 1 rise-edr USBR Reclamation Information Sharing Environ… #> 2 snotel-edr USDA Snowpack Telemetry Network (SNOTEL) #> 3 awdb-forecasts-edr USDA Air and Water Database (AWDB) Forecasts #> 4 snotel-huc06-means USDA Snotel Snow Water Equivalent Aggregated… #> 5 usace-edr USACE Access2Water API #> 6 noaa-qpf-day-1 NOAA Weather Prediction Center Quantitative … #> 7 noaa-qpf-day-2 NOAA Weather Prediction Center Quantitative … #> 8 noaa-qpf-day-3 NOAA Weather Prediction Center Quantitative … #> 9 noaa-qpf-day-4-5 NOAA Weather Prediction Center Quantitative … #> 10 noaa-qpf-day-6-7 NOAA Weather Prediction Center Quantitative … #> # ℹ 25 more rows ``` The `data_queries` column tells you which EDR query types each collection supports (`locations`, `cube`, `area`, ...). Hit a verb the server doesn't implement and you get an HTTP error. To see the data parameters a collection exposes (the values you can pass to `parameter_name =` on the query verbs), use `edr_parameters()`: ``` r params <- edr_parameters(client, "rise-edr") nrow(params) #> [1] 782 head(params[, c("id", "name", "unit_symbol")]) #> # A tibble: 6 × 3 #> id name unit_symbol #> #> 1 1835 Secondary Canal Stage ft #> 2 1834 Lake/Reservoir Elevation ft #> 3 1830 Lake/Reservoir Release - Total cfs #> 4 1818 Total Dissolved Gas (TDG) % #> 5 1817 Growing Degree Days (50 Degree Base Temperature) GDD #> 6 1816 Calculated Unregulated Flow taf ``` `edr_queryables()` is something different -- it returns the OGC queryables JSON Schema (filter properties for CQL2 / OGC API Features). For discovering parameter names, `edr_parameters()` is what you want. ``` r # Pick out the daily reservoir storage parameter (id "3"): params[params$id == "3", c("id", "name", "unit_symbol", "unit_label")] #> # A tibble: 1 × 4 #> id name unit_symbol unit_label #> #> 1 3 Daily Lake/Reservoir Storage (af) acre·ft Acre Foot ``` ## 3. Find locations With no filters, `edr_locations()` returns the station index as a GeoJSON `FeatureCollection`, promoted to an `sf` object when [`sf`](https://r-spatial.github.io/sf/) is installed: ``` r stations <- edr_locations(client, "rise-edr") nrow(stations) #> [1] 926 head(stations[, c("_id", "locationName")]) #> Simple feature collection with 6 features and 2 fields #> Geometry type: POINT #> Dimension: XY #> Bounding box: xmin: -122.474 ymin: 32.8834 xmax: -101.2656 ymax: 48.2667 #> Geodetic CRS: WGS 84 #> # A tibble: 6 × 3 #> `_id` locationName geometry #> #> 1 1 Marys Lake (-105.5343 40.34408) #> 2 2 Audubon Lake North Dakota (-101.2656 47.6114) #> 3 3 Gray Reef Reservoir and Dam (-106.6989 42.5656) #> 4 5 Lake Frances Reservoir (-112.2 48.2667) #> 5 6 Spring Creek Reservoir and Debris Dam (-122.474 40.629) #> 6 7 Imperial Reservoir at Imperial Dam (-114.4676 32.8834) ``` The WWDH `rise-edr` locations index uses `_id` as the identifier column, not `id`. That's fine -- the query verbs accept either, and `edr_map()` will auto-detect. ## 4. Retrieve data for one station Pick a known station (Lake Mead, `_id` 3514) and a parameter (`3`, Daily Reservoir Storage) and you get CoverageJSON back. Flatten it with `covjson_to_tibble()`: ``` r resp <- edr_location( client, "rise-edr", location_id = 3514, datetime = "2023-01-01/2023-06-30", parameter_name = "3" ) df <- covjson_to_tibble(resp) head(df) #> # A tibble: 6 × 9 #> coverage_id parameter parameter_label unit datetime x y #> #> 1 1 3 Lake/Reservoir St… af 2023-01-01 07:00:00 -115. 36.0 #> 2 1 3 Lake/Reservoir St… af 2023-01-02 07:00:00 -115. 36.0 #> 3 1 3 Lake/Reservoir St… af 2023-01-03 07:00:00 -115. 36.0 #> 4 1 3 Lake/Reservoir St… af 2023-01-04 07:00:00 -115. 36.0 #> 5 1 3 Lake/Reservoir St… af 2023-01-05 07:00:00 -115. 36.0 #> 6 1 3 Lake/Reservoir St… af 2023-01-06 07:00:00 -115. 36.0 #> # ℹ 2 more variables: z , value ``` ## 5. Plot the time series `edr_plot()` is a small `ggplot2` wrapper for the tidy tibble: ``` r edr_plot(resp) + ggtitle("Lake Mead — daily reservoir storage") ```
plot of chunk plot-lake-mead

plot of chunk plot-lake-mead

Faceted by parameter (so different units don't share a y-axis) and coloured by station. Add layers or themes like any other ggplot. ## 6. Map stations with per-station popups For collections that support `cube`, `edr_explore()` fetches every station's data in **one** bulk request and renders the map with per-station popups -- much faster than fetching one station at a time. `rise-edr` advertises `cube`, so the default `method = "auto"` takes that path. ``` r m <- edr_explore( client, "rise-edr", bbox = c(-114, 32, -111, 38), # SW US datetime = "2023-01-01/2023-06-30", parameter_name = "3", popup = "plot+csv", label_col = "locationName", quiet = TRUE ) ``` ```{r include-full-station-map, echo=FALSE, results='asis'} if (Sys.getenv("EDR4R_FULL_VIGNETTES") == "true") { station_map <- "getting-started-full/station-map-iframe.html" if (!file.exists(station_map)) { stop("Full station map output is missing: ", station_map) } cat(readLines(station_map, warn = FALSE), sep = "\n") } ``` The object prints as an interactive leaflet map in an R session; each blue marker opens the larger plot/CSV popup for the station. Markers with data are blue (clickable popup with plot + CSV); stations the bbox covers that returned no data for the requested parameter/window are dimmed grey. A legend in the bottom-right explains the distinction. Save the same map to standalone HTML (selfcontained, no sidecar directory): ``` r edr_save_html(m, "wwdh-southwest.html") ``` ## 7. Map a gridded PRISM coverage Station maps are only one side of EDR. The Western Water Datahub also publishes gridded PRISM climate data as the `usgs-prism` collection. That collection advertises `cube`, so you can request a small bounding box and map the returned grid directly. This example pulls two monthly PRISM variables over a small northern Arizona window. `edr_map()` detects the grid and puts the parameter and time selectors inside the leaflet map, so the slice choice lives with the interactive map rather than in a separate plotting call. ``` r prism <- edr_cube( client, "usgs-prism", bbox = c(-112.6, 36.7, -112.2, 37.1), datetime = "2023-01-01/2023-02-28", parameter_name = c("ppt", "tmx") ) prism_grid <- covjson_to_tibble(prism) prism_grid[, c("parameter", "datetime", "x", "y", "value")] #> # A tibble: 324 × 5 #> parameter datetime x y value #> #> 1 ppt 2023-01-01 00:00:00 -113. 37.1 116. #> 2 ppt 2023-01-01 00:00:00 -113. 37.1 120. #> 3 ppt 2023-01-01 00:00:00 -112. 37.1 131. #> 4 ppt 2023-01-01 00:00:00 -112. 37.1 126. #> 5 ppt 2023-01-01 00:00:00 -112. 37.1 115. #> 6 ppt 2023-01-01 00:00:00 -112. 37.1 115. #> 7 ppt 2023-01-01 00:00:00 -112. 37.1 116. #> 8 ppt 2023-01-01 00:00:00 -112. 37.1 111. #> 9 ppt 2023-01-01 00:00:00 -112. 37.1 93.6 #> 10 ppt 2023-01-01 00:00:00 -113. 37.0 100. #> # ℹ 314 more rows ``` ```{r include-full-prism-map, echo=FALSE, results='asis'} if (Sys.getenv("EDR4R_FULL_VIGNETTES") == "true") { prism_map <- "getting-started-full/prism-map-iframe.html" if (!file.exists(prism_map)) { stop("Full PRISM map output is missing: ", prism_map) } cat(readLines(prism_map, warn = FALSE), sep = "\n") } ```
PRISM precipitation and maximum monthly temperature over the sample bounding box.

PRISM precipitation and maximum monthly temperature over the sample bounding box.

Use the selectors in the top-right of the map to switch between precipitation (`ppt`) and maximum monthly temperature (`tmx`), and between January and February 2023. The map colors stay focused on the selected slice; click a grid cell to open the active value and a time-series chart for that pixel across the available dates. The same one-shot helper works for gridded collections too: ``` r edr_explore( client, "usgs-prism", bbox = c(-112.6, 36.7, -112.2, 37.1), datetime = "2023-01-01/2023-02-28", parameter_name = c("ppt", "tmx"), method = "cube" ) ``` ## A few things worth knowing - `datetime` is forgiving: pass `"start/end"`, an open interval like `"2020-01-01/.."`, or a length-2 character vector. It gets normalised into the ISO-8601 form the server expects. - `parameter_name` is a character vector. It's sent as one comma-separated `parameter-name` query parameter. - Some monitoring networks use compound station IDs (colon-separated triplets like `"1185:CO:SNTL"` show up in snow and forecast networks). Those work as-is -- reserved characters get URL-encoded for you. A literal `/` in an ID is rejected, because it can't survive a round trip through HTTP path segments no matter how you encode it. - Not every server implements every EDR verb. `locations`, `position`, `cube`, and `area` are common; `radius`, `trajectory`, and `corridor` less so. The client supports them all per the [spec](https://ogcapi.ogc.org/edr/), but a call against a collection that doesn't implement a given verb returns an HTTP error. ## See also - `vignette("usgs-streamgages")` -- a walk-through against the USGS waterdata endpoint, which only advertises `locations` (so the per-station fallback kicks in).