---
title: "pointcoral workflow"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{pointcoral workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

`pointcoral` turns local CPCe point-count annotations into ecological summaries, QC overlays, and ML-ready datasets. The package is fully local and does not require MERMAID, CoralNet, or any closed platform.

The key label convention is that `raw_label` remains the original short CPCe label from the `.cpc` file. You can run the bare workflow directly from those raw labels. A crosswalk is optional when you want to populate `full_label`, `clean_label`, `label_class`, `major_category`, `ml_class`, and project-specific `class_id` values. The bundled example maps short labels such as `SPO`, `CALG`, and `PEFL` to full labels such as `Sponge`, `Coralline algae`, and `Peyssonnelia flavescens`, and to major classes such as `SPONGES (S)`, `CORALLINE ALGAE (CA)`, and `PEYSSONNELIACEAE`.

```{r setup}
library(pointcoral)
library(dplyr)
```

## 1. Import CPCe data

```{r import}
example_dir <- system.file("extdata", package = "pointcoral")

points_raw <- read_cpce_folder(
  path = example_dir,
  image_root = example_dir,
  recursive = FALSE
)

dplyr::glimpse(points_raw)
```

## 2. Match images

`read_cpce_folder()` can match images during import when `image_root` is provided. You can also match later:

```{r match}
points_raw <- match_images(points_raw, image_root = example_dir)
```

## 3. Bare workflow from raw CPCe labels

The `.cpc` files already contain point labels. In this bare workflow, `summarize_images()` and `make_ml_points()` fall back to `raw_label` because `major_category` and `ml_class` have not been populated by a crosswalk.
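The fallback described here can be pictured with a small base R sketch. The helper `pick_class_col()` and the toy data frames are hypothetical and exist only for illustration; the sketch mimics the behaviour described in this section and is not how pointcoral resolves the column internally.

```{r fallback-sketch}
# Hypothetical helper, for illustration only: use the preferred column when it
# exists and holds at least one non-missing value, otherwise use raw_label.
pick_class_col <- function(points, preferred, fallback = "raw_label") {
  has_values <- preferred %in% names(points) && !all(is.na(points[[preferred]]))
  if (has_values) preferred else fallback
}

# Toy points before and after a crosswalk has filled in major_category.
toy_raw <- data.frame(raw_label = c("SPO", "CALG", "SPO"))
toy_clean <- transform(
  toy_raw,
  major_category = c("SPONGES (S)", "CORALLINE ALGAE (CA)", "SPONGES (S)")
)

pick_class_col(toy_raw, preferred = "major_category")   # no crosswalk applied yet
pick_class_col(toy_clean, preferred = "major_category") # crosswalk column present
```

The point of the sketch is only that the bare workflow needs nothing beyond `raw_label`; the crosswalk columns change which grouping is used, not whether the workflow runs.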
```{r bare}
validate_points(points_raw)

summarize_images(points_raw)

points_split_raw <- split_ml_points(points_raw, split_by = "image", seed = 1)
ml_points_raw <- make_ml_points(points_split_raw)

dplyr::count(ml_points_raw, label, class_id, sort = TRUE)
```

## 4. Optional: read a crosswalk table

```{r crosswalk}
crosswalk_path <- system.file(
  "extdata",
  "pointcoral_example_crosswalk.csv",
  package = "pointcoral"
)

crosswalk <- read_label_crosswalk(crosswalk_path)
dplyr::glimpse(crosswalk)
```

## 5. Optional: check unmapped labels

```{r check}
check_crosswalk(points_raw, crosswalk)
```

## 6. Optional: apply label standardization

```{r standardize}
points_clean <- standardize_labels(
  points_raw,
  crosswalk,
  unknown_action = "warn"
)

points_clean |>
  count(raw_label, full_label, label_class, major_category, class_id, sort = TRUE)
```

## 7. Validate standardized points

```{r validate}
validate_points(points_clean)
```

## 8. Create standardized ecological summary tables

```{r summarize}
summarize_images(points_clean, class_col = "major_category")
summarize_transects(points_clean, class_col = "major_category")
summarize_sites(points_clean, class_col = "major_category")
summarize_images(points_clean, class_col = "clean_label")
```

## 9. Create ML point-label CSVs

```{r ml}
points_split <- split_ml_points(points_clean, split_by = "image", seed = 1)
ml_points <- make_ml_points(points_split, class_col = "ml_class")

out_dir <- tempfile("pointcoral-vignette-")
dir.create(out_dir)

write_ml_points_csv(ml_points, file.path(out_dir, "ml"))
```

## 10. Extract point patches

Patch extraction writes image crops to disk. For large projects, run this in a dedicated output folder.

```{r patches, eval = FALSE}
extract_point_patches(
  points_split,
  image_root = example_dir,
  out_dir = file.path(out_dir, "patches"),
  patch_size = 224,
  class_col = "ml_class",
  edge = "skip"
)
```

## 11. Create sparse masks

Sparse masks are weak labels. Pixels outside point disks remain `ignore_index` by default.
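To make the weak-label idea concrete, the following base R sketch builds a tiny mask by hand. It is not the `make_sparse_masks()` implementation: the helper `sketch_sparse_mask()`, the value 255 for `ignore_index`, and the toy image size and point coordinates are all assumptions chosen for illustration.

```{r sparse-mask-sketch}
# Illustrative only: a hand-rolled sparse mask, not pointcoral's implementation.
# ignore_index = 255 is an assumed value, used here purely as an example.
sketch_sparse_mask <- function(height, width, points, radius = 3,
                               ignore_index = 255L) {
  # Start with every pixel marked as "ignore".
  mask <- matrix(ignore_index, nrow = height, ncol = width)
  rows <- row(mask)
  cols <- col(mask)

  for (i in seq_len(nrow(points))) {
    # Pixels within `radius` of the annotated point take that point's class id.
    in_disk <- (rows - points$y[i])^2 + (cols - points$x[i])^2 <= radius^2
    mask[in_disk] <- points$class_id[i]
  }

  mask
}

# Two toy annotation points on a 20 x 20 image (x = column, y = row).
toy_points <- data.frame(x = c(5, 15), y = c(5, 12), class_id = c(1L, 2L))
toy_mask <- sketch_sparse_mask(
  height = 20, width = 20, points = toy_points, radius = 3
)

# Tabulate pixel values: labelled disks versus ignored background.
table(toy_mask)
```

Only the small disks around each point carry a class id, so a model trained on such masks should exclude the ignored pixels from its loss rather than treat them as background.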
```{r masks, eval = FALSE}
make_sparse_masks(
  points_split,
  image_root = example_dir,
  out_dir = file.path(out_dir, "sparse_masks"),
  radius = 3
)
```

## 12. Create QC overlays

```{r qc, eval = FALSE}
write_qc_overlays(
  points_split,
  image_root = example_dir,
  out_dir = file.path(out_dir, "qc"),
  label_col = "ml_class"
)
```

## Full workflow wrapper

```{r wrapper, eval = FALSE}
run_pointcoral(
  cpce_dir = example_dir,
  image_root = example_dir,
  out_dir = file.path(out_dir, "outputs"),
  make_patches = TRUE,
  make_masks = TRUE,
  make_qc = TRUE
)
```

Add `crosswalk_path = crosswalk_path` to that wrapper call when you want the standardized labels and major classes shown above.