harness launches a command-line coding agent of your
choice in a terminal tab pre-configured for a professional R role. A
role is described by a curated harness: a subset of community skills, a
system prompt, a folder layout, and quality gates.
The package does not run an agent loop and does not call a language model. It discovers the chosen coder binary, generates its configuration, links the curated skills, and opens the terminal. Code written by the agent is run manually by the user, so that every generated script passes through a human audit gate before execution.
This is the second package of the r-cs-packages family,
after gpumetropolis.
For several years the most capable agentic coding tools reached R
users first as dedicated editors or as extensions for general-purpose
IDEs. That path moved part of the R workflow out of RStudio, into an
environment built around other languages. harness reverses
the move. Modern command-line coding agents are editor-agnostic: they
run in any terminal, including the RStudio terminal tab. The package
wires them there, anchored in the project directory, while the console,
the plots, the environment pane and the data viewer stay where the R
user already works. The agentic session and the analytical session share
one window again.
Three properties separate this from opening a coder in a bare terminal:
source(). The human gate is the design, not a
restriction: nothing the agent produces reaches the session state until
a person reads it and chooses to run it.

The case for staying in RStudio is therefore concrete rather than nostalgic: the R-native environment hosts the agent in place, adds role-aware curation that a generic terminal session lacks, and enforces a review step before execution. The package competes with neither RStudio’s own assistants nor the command-line coders it launches; it positions the agent inside the R workflow and curates it for the work at hand.
The development version can be installed from GitHub:
# install.packages("devtools")
devtools::install_github("pcbrom/harness")

The curated skills come from the external community-skills catalogue, which is never bundled with this package. When the catalogue is not found, the package points at the command that fetches it as soon as it is loaded:
library(harness)
#> harness: community-skills catalogue not found.
#> Fetch it with: harness::clone_community_skills()
#> Or set COMMUNITY_SKILLS_PATH to an existing checkout.

clone_community_skills() clones the catalogue into
~/.community-skills/, one of the discovery paths, so the
next call finds it with no further configuration:
clone_community_skills()

To use a checkout you already keep elsewhere, point the
COMMUNITY_SKILLS_PATH environment variable at it instead of cloning.
The catalogue is discovered through COMMUNITY_SKILLS_PATH,
~/.community-skills/ or ~/projects/community-skills/.
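The discovery step can be pictured with a minimal sketch. The helper name `find_skills_checkout()` is hypothetical and not part of the package, and the precedence shown (environment variable first, then the two directories in the order listed) is an assumption; the package's own `community_skills_path()` may resolve the paths differently.

```r
# Hypothetical helper, not the package's implementation: return the first
# candidate directory that exists, or NA when none does.
# The precedence order of the candidates is an assumption.
find_skills_checkout <- function(
    candidates = c(
      Sys.getenv("COMMUNITY_SKILLS_PATH", unset = NA),
      "~/.community-skills",
      "~/projects/community-skills"
    )) {
  # Drop an unset or empty environment variable before testing the paths
  candidates <- candidates[!is.na(candidates) & nzchar(candidates)]
  hits <- candidates[dir.exists(path.expand(candidates))]
  if (length(hits) > 0) path.expand(hits[[1]]) else NA_character_
}

find_skills_checkout()
```

Returning `NA` rather than raising an error mirrors the documented behaviour of reporting a missing catalogue instead of failing.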
update_community_skills() runs a fast-forward
git pull on the checkout so the curated skills track
upstream:
update_community_skills()

The update can also run when the package is attached, but only as an explicit opt-in. The default does nothing on load, so the package never accesses the network without instruction. Enable the behaviour with an option or an environment variable:
options(harness.auto_update = TRUE) # in .Rprofile, for example
# or
Sys.setenv(HARNESS_AUTO_UPDATE = "true")
library(harness)
#> harness: community-skills updated at /home/you/.community-skills.

library(harness)
# Inspect the environment: skills checkout, roles, adapters
status()
# List the curated roles, names only
available_roles()
# Tabulate the roles with version, skill count and description
role_list()
# Inspect the skills of a role
role("data-scientist")$skills
role_skills("data-scientist")
# Skills of a role, flagged by presence in the community-skills checkout
role_skills("data-scientist", available = TRUE)
# Show the full harness configuration of a role, including the system prompt
role_config("data-scientist")
# Validate the environment for a role and scaffold its folder layout
setup("data-scientist", scaffold = TRUE)
# Launch the chosen coder in a terminal tab, configured for the role
launch("claude", role = "data-scientist")

launch() opens the terminal with
rstudioapi::terminalCreate when run inside RStudio, and
falls back to an external terminal emulator or, when none is available,
reports the command for the user to run.
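The fallback order described above can be sketched as a small decision function. This is an illustration under assumptions, not the package's code: `choose_launch_mode()` is a hypothetical name, and the terminal emulators probed are examples only.

```r
# Hypothetical sketch of the fallback order; not the package's own code.
choose_launch_mode <- function(in_rstudio, emulator) {
  if (isTRUE(in_rstudio)) {
    "rstudio"   # the package would call rstudioapi::terminalCreate() here
  } else if (nzchar(emulator)) {
    "external"  # an external terminal emulator was found
  } else {
    "report"    # nothing to open: print the command for the user to run
  }
}

# Probe the current session. The emulator names are illustrative only.
in_rstudio <- requireNamespace("rstudioapi", quietly = TRUE) &&
  rstudioapi::isAvailable()
found <- Sys.which(c("gnome-terminal", "konsole", "xterm"))
emulator <- unname(found[nzchar(found)])[1]
if (is.na(emulator)) emulator <- ""

choose_launch_mode(in_rstudio, emulator)
```

Keeping the decision separate from the probing makes each branch easy to check without RStudio or a terminal emulator installed.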
The public functions fall into three groups: discovering roles and skills, preparing the environment and the catalogue, and launching a coder.
| Function | Purpose |
|---|---|
| status() | report the environment: checkout, roles, adapters |
| available_roles() | role names, as a character vector |
| role_list() | roles with version, skill count and description |
| role_skills(name, available =) | skills of a role, optionally flagged by checkout presence |
| role(name) | load the full role object |
| role_config(name) | print the full configuration, including the system prompt |
| community_skills_path() | resolve the community-skills checkout |
| clone_community_skills() | clone the external catalogue |
| update_community_skills() | fast-forward the catalogue |
| setup(name, scaffold =) | validate the environment and scaffold the layout |
| scaffold_layout(name, dir, create =) | create the role’s folder layout |
| adapters() | registered coder names |
| launch(adapter, role, ...) | open the coder in a terminal tab |
available_roles()
#> [1] "bioinformatician" "causal-inference" "clinical-biostat" ...
role_list()
#> role version skills description
#> 1 bioinformatician 0.1.0 5 Harness for bioinformatics in R ...
#> ...
# The skills of a role, optionally flagged by presence in the checkout
role_skills("data-scientist")
role_skills("data-scientist", available = TRUE)
# The full role object, for programmatic access
ds <- role("data-scientist")
ds$skills
ds$layout
ds$system_prompt
# The full configuration printed for reading, including the system prompt
role_config("data-scientist")

# Where the external catalogue is, or NA if not found
community_skills_path()
# Clone the catalogue into ~/.community-skills (a discovery path)
clone_community_skills()
# Fast-forward the catalogue to track upstream
update_community_skills()
# Report the environment: checkout, roles, adapters and their binaries
status()
# Validate the environment for a role and create its folder layout
setup("data-scientist", scaffold = TRUE)
# Create only the folder layout, without the rest of setup()
scaffold_layout("data-scientist", project_dir = ".", create = TRUE)

# The registered coders
adapters()
#> [1] "claude" "opencode" "codex"
# Open the coder in a terminal tab, configured for the role
launch("claude", role = "data-scientist")

launch() accepts:
| Argument | Default | Meaning |
|---|---|---|
| adapter | "claude" | the coder to open; see adapters() |
| role | (required) | the role; see available_roles() |
| project_dir | getwd() | the project root |
| scaffold | TRUE | create the role’s folder layout |
| dry_run | FALSE | configure everything but do not open a terminal |
| config_home | adapter default | where the coder configuration is written |
| skills_path | discovered | override the community-skills checkout |
| binary | discovered | override the coder binary path |
The package never talks to a language model. It prepares the project and opens the coder; the conversation happens inside the coder, in the terminal. A session has four steps:
1. launch(adapter, role) writes the role system prompt where the coder reads it (.claude/CLAUDE.md for claude, AGENTS.md for opencode and codex), links the curated skills, creates the folder layout, and opens the terminal tab anchored in the project.
2. You state a concrete task in the coder terminal.
3. The agent writes a script into the role’s layout and a decision log under logs/, and it does not run anything.
4. You read the script and run it yourself with source(). Nothing the agent produced reaches the session state until you choose to run it.

Set up a role and launch a coder:
library(harness)
setup("data-scientist", scaffold = TRUE) # validate and create the layout
launch("claude", role = "data-scientist") # open the coder in a terminal tab

In the coder terminal, state a concrete task, for example: classify the species in the iris dataset, with an exploratory figure, a stratified train/test split, a multinomial model and the test-set accuracy.
The agent writes, but does not run, a script under
analysis/scripts/ and a decision log under
logs/:
analysis/scripts/2026-06-04_iris-classification.R
logs/2026-06-04_01_iris-classification.md
The log records the decision, its justification and the result, leaving the run outcome blank until execution. You read the script, then run it yourself:
source("analysis/scripts/2026-06-04_iris-classification.R")
#> Test accuracy: 0.911

Nothing the agent produced reached the session state until you chose to run it.
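The first step of the session writes the system prompt to a file that depends on the coder. A minimal sketch of that lookup follows; the file names are the ones stated above, while the helper `prompt_file()` is hypothetical and is not a function of the package.

```r
# Hypothetical lookup, not part of the package: where the role system prompt
# would be written for each adapter, relative to the project root.
prompt_file <- function(adapter, project_dir = ".") {
  rel <- switch(adapter,
    claude   = file.path(".claude", "CLAUDE.md"),
    opencode = ,                       # falls through to the codex entry
    codex    = "AGENTS.md",
    stop("unknown adapter: ", adapter)
  )
  file.path(project_dir, rel)
}

prompt_file("claude")
prompt_file("codex", project_dir = "my-project")
```

Because opencode and codex share one prompt file, switching between them in the same project would reuse that file.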
The coder is selected by the first argument of launch().
The current adapters are claude, opencode and
codex; aider and gemini-cli
arrive in a later phase. The same role drives any adapter, so switching
coder keeps the skills, the prompt and the folder convention:
launch("opencode", role = "data-scientist")
launch("codex", role = "data-scientist")

To experiment without touching a real coder configuration, redirect the config home to a temporary directory:
launch("opencode", role = "data-scientist", config_home = tempfile("opencode-home"))
launch("codex", role = "data-scientist", config_home = tempfile("codex-home"))

Because the same role drives any coder, a single project can run several coders on one problem and keep their outputs apart. Scaffold the role once, then open each coder and give it the same task, pointing each at its own scripts folder:
setwd("~/Downloads/testes")
library(harness)
setup("data-scientist", scaffold = TRUE)
launch("claude", role = "data-scientist")
launch("codex", role = "data-scientist")
launch("opencode", role = "data-scientist")

In each coder terminal, paste the same task and direct it to a coder-specific folder. For claude:
Classify the iris species. Write a SINGLE R script to
analysis/scripts_claude/2026-06-04_iris-classification.R that uses set.seed(42),
a stratified 70/30 split by Species, nnet::multinom, the test-set accuracy and a
confusion matrix, and saves two ggplot2 figures to output/figures/. Follow the
project instructions: native pipe, a short comment above each block, do not
execute anything, only write the script.
Repeat in the codex and opencode terminals with
analysis/scripts_codex/ and
analysis/scripts_opencode/. Each agent writes its script
and a decision log under logs/, and runs nothing. You then
read and run each script yourself and compare:
source("analysis/scripts_claude/2026-06-04_iris-classification.R")
source("analysis/scripts_codex/2026-06-04_iris-classification.R")
source("analysis/scripts_opencode/2026-06-04_iris-classification.R")

The separate folders keep the three implementations side by side, while the decision logs record why each agent made its choices.
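When the three scripts define objects with the same names, sourcing them one after another into the global environment lets each run overwrite the previous one. One possible way to keep the results apart, offered as a sketch and not as part of the package, is to run each reviewed script in its own environment with base R's `sys.source()`. The helper `run_in_own_env()` is hypothetical.

```r
coders <- c("claude", "codex", "opencode")

# One script path per coder, following the folder convention used above
script_paths <- file.path(
  "analysis", paste0("scripts_", coders), "2026-06-04_iris-classification.R"
)
names(script_paths) <- coders

# Hypothetical helper: evaluate a script in a fresh environment and return
# that environment, so objects from different coders do not collide.
run_in_own_env <- function(path) {
  env <- new.env(parent = globalenv())
  sys.source(path, envir = env)
  env
}

# Run only scripts that exist and that you have already read.
approved <- script_paths[file.exists(script_paths)]
results <- lapply(approved, run_in_own_env)
```

Each element of `results` is then an environment that can be inspected with `ls()` or `get()`, which makes a side-by-side comparison of the fitted objects straightforward.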
The comparison above was run once with the three coders on the iris task, to check that the curation and the audit rules hold across coders. The observations below are from that single run and depend on the coder and model versions used; they are recorded as a worked example, not as a benchmark.
| Observation | claude | codex | opencode |
|---|---|---|---|
| Wrote the script, ran nothing | yes | yes | yes |
| Wrote the decision log | yes | yes | varied between runs |
| Loaded tidyverse components, not the meta-package | yes | yes | yes |
| Test accuracy of the produced model | about 0.91 | about 0.91 | about 0.91 |
| Train/test split | index-based | rsample::initial_split | anti_join |
What held for every coder is what the package guarantees: each agent wrote into the role’s layout, ran nothing, and left the execution to the user. What varied is what the package does not fix: the split strategy, the choice of figures, and how closely each agent followed every convention. The decision logs made those choices auditable after the fact, and the separate folders kept the runs from overwriting each other. The reproducible point is the workflow and the audit gate, not a ranking of coders.
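For readers unfamiliar with the index-based strategy named in the table, a minimal base R sketch of a stratified 70/30 split by Species is given here. It illustrates the technique only and is not the script any of the agents produced.

```r
set.seed(42)

# Group the row indices by species, then sample 70% within each group
by_species <- split(seq_len(nrow(iris)), iris$Species)
train_idx <- unlist(lapply(by_species, function(rows) {
  sample(rows, size = round(0.7 * length(rows)))
}))

train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Each species contributes 35 rows to train and 15 to test
table(train$Species)
table(test$Species)
```

Sampling within each species keeps the class proportions identical in both sets, which is what "stratified" means in the task description.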
Seventeen curated harnesses ship in the current development version.
List them with role_list():
| Role | Focus |
|---|---|
| data-scientist | exploratory analysis and communication with the tidyverse |
| statistician | mixed models, survival, Bayesian inference, marginal effects |
| package-maintainer | package development, tests, documentation, CRAN preparation |
| paper-author | reproducible papers in R Markdown or Quarto |
| data-engineer | columnar formats, embedded engines, database pipelines |
| ml-engineer | tidymodels training, evaluation and deployment artifacts |
| shiny-developer | modular Shiny applications |
| code-documenter | roxygen2 docstrings and reference sites |
| econometrician | panel models, fixed effects, time series |
| epidemiologist | outbreak reconstruction and reproduction numbers |
| clinical-biostat | CDISC derivation and regulatory tables with the pharmaverse |
| geospatial-analyst | vector and raster analysis, thematic mapping |
| causal-inference | difference-in-differences, matching, causal graphs |
| forecast-specialist | time series forecasting with the tidyverts stack |
| reproducibility-engineer | dependency pinning and pipeline orchestration |
| bioinformatician | Bioconductor sequence and expression analysis |
| performance-engineer | optimisation under a hard output-equivalence gate |
Each harness is a proposal open to contribution; refinements are welcome by pull request.
Every harness pins execution_policy: manual. The package
rejects, at load time, any harness that does not. The system prompt of
each role instructs the agent to write scripts into the role’s layout
folders and to leave execution to the user. The agent writes, the user
runs.
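The load-time rejection can be pictured as a simple guard. The sketch below assumes a role is held as a list with an `execution_policy` element; that representation and the name `check_execution_policy()` are assumptions made for illustration, not the package's internals.

```r
# Hypothetical guard, not the package's code: accept a role only when its
# execution policy is exactly "manual".
check_execution_policy <- function(role) {
  if (!identical(role$execution_policy, "manual")) {
    stop("harness rejected: execution_policy must be \"manual\"",
         call. = FALSE)
  }
  invisible(role)
}

# A role with the pinned policy passes silently
check_execution_policy(list(name = "data-scientist",
                            execution_policy = "manual"))

# A role with another policy, or with none, is rejected
rejected <- try(check_execution_policy(list(name = "rogue",
                                            execution_policy = "auto")),
                silent = TRUE)
inherits(rejected, "try-error")
```

Using `identical()` means a missing field fails the check as well, so a harness cannot opt out by omission.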
Every role, including roles contributed later, carries a decision-log
convention. The agent writes one Markdown file per step to
logs/, named
<YYYY-MM-DD>_<NN>_<slug>.md, with three
sections: Decision, Justification and
Result. The Result section lists the files
written and leaves a line for the run outcome, filled after the user
runs the script. The logs/ directory is scaffolded for
every role and the entries form an audit trail that pairs each generated
artifact with the reasoning behind it.
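The naming convention is easy to reproduce when checking or tidying the audit trail by hand. The helpers below are hypothetical and not part of the package. The file name pattern follows the convention above, whereas the heading level and the two labels under Result are assumptions about how an entry might be laid out.

```r
# Hypothetical helper: build a log file name as <YYYY-MM-DD>_<NN>_<slug>.md
log_filename <- function(slug, n, date = Sys.Date()) {
  sprintf("%s_%02d_%s.md", format(date, "%Y-%m-%d"), as.integer(n), slug)
}

# Hypothetical skeleton with the three sections; the heading level and the
# labels under Result are assumptions.
log_skeleton <- function() {
  c("## Decision", "",
    "## Justification", "",
    "## Result", "",
    "Files written:",
    "Run outcome:")
}

log_filename("iris-classification", 1, as.Date("2026-06-04"))
#> [1] "2026-06-04_01_iris-classification.md"

writeLines(log_skeleton())
```

The zero-padded step number keeps the entries of one day in execution order when the directory is listed alphabetically.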
MIT, see LICENSE.