Crosswalk Lake IDs with mwlaxeref

mwlaxeref is an R package for converting between different lake and waterbody identifiers, such as NHDHR+, NHD, LAGOS, and local state waterbody IDs.



Basic Usage

The examples in this vignette use the following data from Wisconsin:

head(wis_lakes, n = 3)
#>   state      county lake.id      lake.name
#> 1    wi fond_du_lac    8900    forest_lake
#> 2    wi     burnett 2638700 big_trade_lake
#> 3    wi    washburn 2451300      bass_lake


Each crosswalk function is named for the IDs it converts from and to. For example, to crosswalk these Wisconsin lake IDs to NHDHR+, use local_to_nhdhr() as in the code below. The from_colname argument must be specified so that the function knows which column in wis_lakes contains the local IDs (the “lake.id” column in this case).

nhdhr_ids <- local_to_nhdhr(wis_lakes, from_colname = "lake.id", states = "wi")
head(nhdhr_ids, n = 3)
#> # A tibble: 3 × 5
#>   state county      lake.id nhdhr.id                               lake.name    
#>   <chr> <chr>       <chr>   <chr>                                  <chr>        
#> 1 wi    fond_du_lac 8900    139268654                              forest_lake  
#> 2 wi    burnett     2638700 91678049                               big_trade_la…
#> 3 wi    washburn    2451300 {AC03C0F2-2D44-4F50-8CF3-197E0EB7BF42} bass_lake
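
Conceptually, a crosswalk like this behaves much like a left join against a lookup table. The base R sketch below illustrates that idea only; it is not the package's implementation. The objects toy_xref and toy_lakes are hypothetical, the three ID pairs are copied from the output above, and the ID "999" is made up to show an unmatched row. How mwlaxeref itself treats unmatched IDs is not covered here.

```r
# Toy stand-in for a lookup table (hypothetical object; the three ID pairs
# are copied from the output shown above).
toy_xref <- data.frame(
  local.id = c("8900", "2638700", "2451300"),
  nhdhr.id = c("139268654", "91678049",
               "{AC03C0F2-2D44-4F50-8CF3-197E0EB7BF42}"),
  stringsAsFactors = FALSE
)

# Toy input data; "999" is a made-up ID with no entry in the lookup table.
toy_lakes <- data.frame(
  lake.id   = c("8900", "2638700", "2451300", "999"),
  lake.name = c("forest_lake", "big_trade_lake", "bass_lake", "unknown_lake"),
  stringsAsFactors = FALSE
)

# Left join: keep every input row and attach nhdhr.id where the IDs match.
joined <- merge(toy_lakes, toy_xref,
                by.x = "lake.id", by.y = "local.id",
                all.x = TRUE, sort = FALSE)

# In this toy join, IDs missing from the lookup table come back as NA.
unmatched <- joined[is.na(joined$nhdhr.id), "lake.id"]
```

Checking for NA values in the new ID column after a join of this kind is a quick way to see which input lakes had no counterpart in the lookup table.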


Similarly, NHDHR+ IDs can be converted to LAGOS IDs.

nhdhr_ids <- nhdhr_ids[, "nhdhr.id"]
lagos_ids <- nhdhr_to_lagos(nhdhr_ids)
head(lagos_ids, n = 3)
#> # A tibble: 3 × 2
#>   nhdhr.id                               lagos.id
#>   <chr>                                  <chr>   
#> 1 139268654                              4993    
#> 2 91678049                               4362    
#> 3 {AC03C0F2-2D44-4F50-8CF3-197E0EB7BF42} 4554



Lake Identifiers

There are six lake identification fields, and crosswalk functions exist in both directions for each of them. The six ID fields are the first six columns of the lake_id_xref data.frame.

head(lake_id_xref, n = 3)
#> # A tibble: 3 × 9
#>   nhdhr.id  nhd.comid nhd.id   lagos.id mglp.id   local.id state agency id.field
#>   <chr>         <int> <chr>    <chr>    <chr>     <chr>    <chr> <chr>  <chr>   
#> 1 120017928        NA <NA>     139100   WI600079… 100      wi    wisco… WBIC    
#> 2 151959502        NA 13062951 119419   <NA>      100000   wi    wisco… WBIC    
#> 3 70331693         NA 13393553 100263   WI600009… 1000000  wi    wisco… WBIC



State Shortcuts

Each state has its own shortcut functions to the other lake identifiers. For example, to go from a Wisconsin local ID to a LAGOS ID, use the following:

lagos_id <- wi_to_lagos(wis_lakes, from_colname = "lake.id")
head(lagos_id, n = 3)
#> # A tibble: 3 × 5
#>   state county      lake.id lagos.id lake.name     
#>   <chr> <chr>       <chr>   <chr>    <chr>         
#> 1 wi    fond_du_lac 8900    4993     forest_lake   
#> 2 wi    burnett     2638700 4362     big_trade_lake
#> 3 wi    washburn    2451300 4554     bass_lake
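
One way to picture a state shortcut is as a thin wrapper around a general crosswalk function with the state filled in. The base R sketch below illustrates that pattern under stated assumptions: toy_crosswalk() and toy_wi_to_lagos() are hypothetical names rather than mwlaxeref functions, and the lookup table holds only the three ID pairs shown above.

```r
# Toy lookup table (hypothetical object; ID pairs copied from the output above).
toy_xref <- data.frame(
  state    = "wi",
  local.id = c("8900", "2638700", "2451300"),
  lagos.id = c("4993", "4362", "4554"),
  stringsAsFactors = FALSE
)

# Hypothetical general-purpose crosswalk: look up each local ID and attach the
# requested ID column. match() returns only the first hit per ID, so this toy
# version ignores one-to-many relationships.
toy_crosswalk <- function(data, from_colname, to, states = NULL) {
  xref <- toy_xref
  if (!is.null(states)) {
    xref <- xref[xref$state %in% states, ]
  }
  idx <- match(data[[from_colname]], xref$local.id)
  data[[to]] <- xref[[to]][idx]
  data
}

# Hypothetical state shortcut: the same call with the target ID and state fixed.
toy_wi_to_lagos <- function(data, from_colname) {
  toy_crosswalk(data, from_colname = from_colname,
                to = "lagos.id", states = "wi")
}

lakes <- data.frame(lake.id = c("2451300", "8900"), stringsAsFactors = FALSE)
out <- toy_wi_to_lagos(lakes, from_colname = "lake.id")
```

The wrapper saves the caller from repeating the state and target ID on every call, which is the convenience the shortcut functions offer.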



Certain State Caveats

In some cases a single state agency maintains more than one unique identifier field. In other cases multiple state agencies each have their own unique ID. These parallel IDs often map to the same NHDHR+, LAGOS, or other ID, so the agency and id_field arguments let you specify which agency’s unique ID to use (or which unique identification field, if several exist within the same agency).

For example, in Michigan many lakes have both a UNIQUE ID and a NEW KEY field. Going from NHDHR+ or LAGOS to local IDs for these lakes yields duplicate rows because there are two local IDs.

mi_nhdhr <- data.frame(nhdhr.id = "123397651")
nhdhr_to_mi(mi_nhdhr)
#> Warning in crosswalk_lake_id(data, from = "nhdhr", to = "local", from_colname = from_colname, : 
#>   Some of records in the output may be duplicated due to one-to-many relationships among lake identifiers.
#>   i.e. newdat <- nhdhr_to_mi(mi_nhdhr)
#>   nrow(newdat) > nrow(mi_nhdhr))
#>   This likely means duplicated data. Proceed with caution.
#>   Some states have multiple ID fields. Consider using the id_field argument
#>    nhdhr.id local.id
#> 1 123397651   27-265
#> 2 123397651    13524
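
A quick way to spot this situation programmatically is to compare row counts before and after the crosswalk and to count rows per input ID. The base R sketch below rebuilds the two-row result above by hand, so it runs without the package; the object names are illustrative only.

```r
# Input and output of the crosswalk above, rebuilt by hand for illustration.
input <- data.frame(nhdhr.id = "123397651", stringsAsFactors = FALSE)
result <- data.frame(
  nhdhr.id = c("123397651", "123397651"),
  local.id = c("27-265", "13524"),
  stringsAsFactors = FALSE
)

# More output rows than input rows signals a one-to-many match.
grew <- nrow(result) > nrow(input)

# Count output rows per NHDHR+ ID and keep the IDs that appear more than once.
n_per_id <- table(result$nhdhr.id)
multi <- names(n_per_id)[n_per_id > 1]
```

Here grew is TRUE and multi holds the single NHDHR+ ID that matched two local IDs.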

This duplication can be overcome by specifying the id_field argument as follows:

nhdhr_to_mi(mi_nhdhr, id_field = "NEW_KEY")
#>    nhdhr.id local.id
#> 1 123397651   27-265

The ID fields available for each state are listed in the id.field column of lake_id_xref. For Michigan, for example, see unique(lake_id_xref$id.field[lake_id_xref$state == "mi"]).
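
The same idea extends to every state at once. The base R sketch below runs on a small hand-built table rather than the real lake_id_xref; the state and id.field column names follow the table shown earlier, "WBIC" and "NEW_KEY" appear in this vignette, and the spelling "UNIQUE_ID" is an assumption made for illustration.

```r
# Toy table with the same state and id.field columns as lake_id_xref.
# "UNIQUE_ID" is an assumed spelling, not confirmed by the package.
toy_xref <- data.frame(
  state    = c("wi", "wi", "mi", "mi"),
  id.field = c("WBIC", "WBIC", "NEW_KEY", "UNIQUE_ID"),
  stringsAsFactors = FALSE
)

# Distinct ID fields per state.
fields_by_state <- lapply(split(toy_xref$id.field, toy_xref$state), unique)

# States with more than one ID field, where id_field may be worth setting.
multi_field_states <- names(fields_by_state)[lengths(fields_by_state) > 1]
```

Replacing toy_xref with lake_id_xref should give the per-state listing for the real table, assuming the column names match those shown earlier.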