First stable release, and a rename.
Renamed to rtransparency. The
package is renamed from rtransparent (the name of the
original tool by Serghiou et al.) to avoid confusion with that project.
The GitHub repository is also renamed to
choxos/rtransparency (old URLs redirect): install with
remotes::install_github("choxos/rtransparency") and load
with library(rtransparency). Function names
(rt_*) are unchanged. Serghiou is credited as an author and
the foundational 2021 paper is cited
(citation("rtransparency")).
This 1.0.0 release marks a stable public API: eight transparency
indicators (conflicts of interest, funding, registration, novelty,
replication, data, code, and AI-use disclosure), multilingual
conflict-of-interest and funding detection, plain-text and PMC XML
parity, corpus-scale batch processing with
rt_all_pmc_dir(), and accuracy correction for seven of the
eight indicators.
New rt_ai(), a plain-text detector
for generative-AI-use disclosure, the text counterpart of
rt_ai_pmc(). Because a text file carries no reliable
publication date, rt_ai() applies no 2023 year gate
(is_ai_pred is always TRUE/FALSE,
never NA) and cannot confine the scan to back-matter
sections. Callers should therefore run it only on articles published in 2023 or later.
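The difference between the two entry points can be pictured with a small sketch. The helper apply_year_gate() below is hypothetical (it is not part of the package) and only illustrates the behaviour described: with a known publication year a pre-2023 article yields NA, whereas without a year the prediction stays TRUE/FALSE and the caller has to filter by date.

```r
# Hypothetical helper, not the package's implementation: apply a 2023 year
# gate to a logical prediction when a publication year is available.
apply_year_gate <- function(pred, year = NA_integer_, first_year = 2023L) {
  # No reliable year (plain-text case): return the prediction unchanged.
  if (is.na(year)) return(pred)
  # Known year before the gate (PMC XML case): the indicator is not evaluated.
  if (year < first_year) return(NA)
  pred
}

apply_year_gate(TRUE, year = 2021L)  # gated, not evaluated
apply_year_gate(TRUE, year = 2024L)  # evaluated
apply_year_gate(TRUE)                # no year available, caller must filter
```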
Corrected the foundational citation in
inst/CITATION to the full PLOS Biology author list:
Serghiou, Contopoulos-Ioannidis, Boyack, Riedel, Wallach and
Ioannidis.
Release polish. Removed an unused
oddpub-derived tokenization helper, so no AGPL-licensed
code remains in the package. Corrected the README plain-text workflow
and the rt_read_pdf() documentation:
rt_read_pdf() returns a character string, which must be
written to a .txt file before the text detectors are run on
it. rt_summary() documentation and the startup message now
include AI-use disclosure, and the README validation section reports the
newer-indicator metrics and label provenance.
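A minimal sketch of the corrected plain-text workflow, using base R only. The string below stands in for the value that rt_read_pdf() is described as returning; the final detector call is left as a comment because its exact call shape is an assumption here.

```r
# Stand-in for the character string that rt_read_pdf() is described as
# returning; a real call is omitted because its arguments are not shown here.
article_text <- paste(
  "Funding: This work received no external funding.",
  "Conflicts of interest: none declared.",
  sep = "\n"
)

# Write the string to a .txt file so that a file-based text detector can read it.
txt_path <- tempfile(fileext = ".txt")
writeLines(article_text, txt_path)

# The file now holds the text, one line per newline-separated segment.
readLines(txt_path)

# With the package installed, the text detectors would then be pointed at
# txt_path, for example rt_fund(txt_path) (call shape assumed, not verified).
```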
Citation, documentation, and packaging polish.
Citation. Added inst/CITATION, so
citation("rtransparent") returns the package together with
the foundational Serghiou et al. (2021) paper.
New vignette “Scope and limitations” documenting
what each indicator does and does not capture (disclosure-based
conflicts of interest and AI, data offered “upon request” excluded,
novelty and replication as claim detection, language coverage), the
output schema, and how to pass extracted data- and code-availability
links to FAIR-assessment tooling such as rfair.
Replication is now accuracy-corrected; a fresh validation of replication and AI.
Replication added to rt_accuracy.
Earlier releases left replication out of the accuracy table because its
gold set had too few positives (5) for a stable sensitivity estimate. A
new replication-enriched validation of 250 open-access articles,
selected for external-validation language and hand-labeled (111
positives), gives a stable estimate: sensitivity 92.8 on the enriched
positives, with the representative specificity (98.5) carried over from
the 2023 1000-article sample. rt_summary() now reports an
accuracy-corrected replication prevalence. New benchmark
inst/benchmark/results_replication_enriched.{csv,md} and
labeled set
data-raw/benchmark/labels_replication_enriched.csv.
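To make the accuracy correction concrete, here is a sketch of the Rogan-Gladen estimator, which these notes name as the correction used by rt_summary(). The function name rogan_gladen() and the apparent prevalences are illustrative assumptions; only the sensitivity and specificity come from the figures above.

```r
# Rogan-Gladen estimator: correct an apparent prevalence for a detector's
# sensitivity and specificity. Illustrative sketch, not the package's code.
rogan_gladen <- function(apparent, sensitivity, specificity) {
  denom <- sensitivity + specificity - 1
  if (denom <= 0) stop("sensitivity + specificity must exceed 1")
  corrected <- (apparent + specificity - 1) / denom
  # Truncate to the valid range; sampling noise can push the raw value outside.
  pmin(pmax(corrected, 0), 1)
}

# Replication estimates quoted above, as proportions.
se <- 0.928
sp <- 0.985

# Hypothetical apparent prevalences (detector-positive share of a corpus).
corrected <- rogan_gladen(c(0.01, 0.05, 0.10), se, sp)
corrected
```

An apparent prevalence below the false-positive rate (1 - 0.985 = 1.5%) is truncated to zero, which is one reason a stable specificity estimate matters for rare indicators.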
AI disclosure validated on 2024-2025 articles.
On a random sample of recent open-access articles the
generative-AI-disclosure rate is about two to three percent (far below
curated AI-focused corpora), and the detector’s positives were precise
on inspection. Because that prevalence is too low in unselected
literature for a stable corrected estimate, AI remains uncorrected in
rt_summary() (reported as apparent prevalence).
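For an uncorrected indicator, the uncertainty around an apparent prevalence can still be shown with a Wilson interval, the interval these notes say rt_summary() reports. The sketch below is a plain base-R version with hypothetical counts; wilson_ci() is an illustrative name, not a package function.

```r
# Wilson score interval for a binomial proportion (illustrative sketch).
wilson_ci <- function(x, n, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
  c(estimate = p, lower = centre - half, upper = centre + half)
}

# Hypothetical counts: 5 disclosures detected among 200 articles (2.5%).
ci <- wilson_ci(5, 200)
ci
```

The result can be cross-checked against prop.test(5, 200, correct = FALSE)$conf.int, which uses the same score interval.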
No detector logic changed in this release, so all held-out benchmarks are unchanged.
Conflict-of-interest and funding detection in five more languages.
Multilingual COI and funding. Conflict-of-interest and funding statements are now detected in Spanish, Portuguese, French, German and Italian, not only English. The conflict-of-interest relevance gate and matcher and the funding matcher and no-funding rules gained language-distinctive, accent-tolerant patterns. On 70 open-access articles per language, the conflict-of-interest detection rate rose most for monolingual articles: German 33% to 97%, French 70% to 80%. Funding detection now catches Spanish, Portuguese, French, German and Italian statements (for example Italian 67% to 74%).
The new tokens are language-distinctive and do not occur in English, so the English detectors are unchanged: conflicts of interest stay at 100 / 91.8 on the 2023 sample and the held-out Serghiou et al. (2021) benchmarks are untouched. The multilingual funding patterns also surfaced two Spanish and Portuguese funding statements in the 2023 sample that had been mislabeled as unfunded; those labels were corrected.
Because the text detectors share the PMC detection cores (0.9.8), the new languages are recognized in plain-text input as well.
New multilingual benchmark
(inst/benchmark/results_multilingual.{csv,md}).
Data-availability detection remains English-only for now; multilingual data-sharing detection is planned for a future release.
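The idea of an accent-tolerant, language-distinctive pattern can be sketched in base R. The regular expressions below are simplified stand-ins built from the Spanish and Portuguese no-funding phrases quoted elsewhere in these notes; they are not the package's actual rules, and is_no_funding() is a hypothetical name.

```r
# Illustrative accent-tolerant patterns for no-funding declarations.
# Each accented vowel is matched with or without its accent.
no_funding_patterns <- c(
  es = "no recibi[o\u00f3] financiaci[o\u00f3]n|sin financiaci[o\u00f3]n",
  pt = "n[a\u00e3]o teve fontes de financiamento"
)

is_no_funding <- function(text) {
  pattern <- paste(no_funding_patterns, collapse = "|")
  grepl(pattern, text, ignore.case = TRUE)
}

statements <- c(
  "Este estudio no recibi\u00f3 financiaci\u00f3n externa.",  # Spanish, accented
  "Este estudio no recibio financiacion externa.",            # accents stripped
  "O estudo n\u00e3o teve fontes de financiamento.",          # Portuguese
  "This work was supported by a national research grant."     # English, funded
)
flags <- is_no_funding(statements)
flags
```

Because the tokens do not occur in English text, the English statement is left untouched, which mirrors the reasoning given above for why the English detectors are unchanged.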
The plain-text detectors now share the PMC detection logic.
TXT/PMC parity. rt_coi(),
rt_fund() and rt_register() route their text
through the same detection helpers as rt_coi_pmc(),
rt_fund_pmc() and rt_register_pmc(), replacing
separate and weaker text logic. (rt_novelty() and
rt_replication() already shared their helpers.) Measured on
text extracted from the 1000-article 2023 validation set (sensitivity /
specificity): registration 46.2 / 98.7 to 90.4 / 98.4, conflicts of
interest 88.8 / 86.3 to 88.6 / 90.4, funding 79.1 / 89.5 to 79.3 / 90.5.
The remaining gap to the PMC detectors is the XML-structural routes
(tagged funding groups, footnote types, section titles) that a
plain-text file does not carry.
New TXT-parity benchmark
(data-raw/benchmark/build_txt_parity.R,
inst/benchmark/results_txt_parity.{csv,md}) measures the
TXT detectors against the same hand labels as the PMC
benchmark.
The PMC detectors, the held-out Serghiou et al. (2021) benchmarks and the novelty/replication gold set are unchanged; only the TXT entry points changed.
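The sensitivity / specificity pairs reported throughout these notes follow from a confusion table of detector predictions against hand labels. A minimal sketch with toy data (score_detector() is an illustrative name, and the vectors are invented for the example):

```r
# Score logical predictions against hand labels (illustrative sketch).
score_detector <- function(pred, truth) {
  stopifnot(is.logical(pred), is.logical(truth), length(pred) == length(truth))
  tp <- sum(pred & truth)
  fp <- sum(pred & !truth)
  fn <- sum(!pred & truth)
  tn <- sum(!pred & !truth)
  c(
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp),
    ppv = tp / (tp + fp)
  )
}

# Toy data: 4 labeled positives and 6 labeled negatives.
truth <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE)
pred  <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE)
scores <- round(100 * score_detector(pred, truth), 1)
scores
```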
Corpus-scale batch processing.
New rt_all_pmc_dir(). Processes
every PMC XML in a directory (or a vector of paths) through
rt_all_pmc() in a single call. The run is resumable (when
output is given, results are written to a CSV in chunks and a re-run
skips files already recorded), isolates per-file failures (a malformed
file yields an is_success = FALSE row instead of aborting
the run), shows a progress bar, and can run in parallel via the optional
furrr package and an active
future::plan().
furrr and future are added to Suggests;
they are used only for
rt_all_pmc_dir(parallel = TRUE).
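The resumable, failure-isolating behaviour can be sketched in base R. This is not the implementation of rt_all_pmc_dir(): process_dir() and toy_detector() are hypothetical, and the sketch appends one row per file rather than writing in chunks.

```r
# Sketch of a resumable, failure-isolating batch loop in base R.
# `process_one` is a hypothetical stand-in for a per-file detector call.
process_dir <- function(paths, output, process_one) {
  done <- character(0)
  if (file.exists(output)) {
    done <- read.csv(output, stringsAsFactors = FALSE)$file
  }
  for (path in setdiff(paths, done)) {
    row <- tryCatch(
      data.frame(file = path, is_success = TRUE, value = process_one(path)),
      # A failing file becomes a row instead of aborting the run.
      error = function(e) data.frame(file = path, is_success = FALSE, value = NA)
    )
    # Append to the CSV; write the header only when the file is created.
    new_file <- !file.exists(output)
    write.table(row, output, sep = ",", row.names = FALSE,
                col.names = new_file, append = !new_file, qmethod = "double")
  }
  read.csv(output, stringsAsFactors = FALSE)
}

# Toy detector: counts characters, and fails on a "malformed" file name.
toy_detector <- function(path) {
  if (grepl("bad", path)) stop("malformed file")
  nchar(path)
}

out <- tempfile(fileext = ".csv")
first <- process_dir(c("a.xml", "bad.xml"), out, toy_detector)
# A re-run with one extra path only processes the file not yet recorded.
second <- process_dir(c("a.xml", "bad.xml", "ccc.xml"), out, toy_detector)
second
```

Recording failed files in the CSV as well means a re-run does not retry them; whether the package does the same is not stated here.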
The hand-labeled 2023 validation sample reaches 1000 articles.
Validation sample reaches 1000. The final twenty open-access PMC articles were hand-labeled for all eight indicators and added to the committed sample, bringing it to a round 1000. Metrics (sensitivity / specificity): conflicts of interest 100 / 91.8, funding 94.8 / 95.3, registration 84.6 / 99.2, novelty 90.2 / 93.3, replication 82.4 / 98.5, data 91.1 / 97.8, code 93.9 / 99.0, AI 100 / 100.
Funding. The Portuguese no-funding declaration “os autores nao reportam qualquer financiamento” (“the authors report no funding”) is now read as absence of funding.
The held-out Serghiou et al. (2021) benchmarks and the novelty/replication gold set are unchanged.
The hand-labeled 2023 validation sample is expanded to 980 articles (265 new), with a focused improvement to replication precision and a further funding fix.
Validation sample grows to 980. Eighteen new batches (265 articles) were hand-labeled for all eight indicators and folded into the committed sample. Current metrics (sensitivity / specificity): conflicts of interest 100 / 91.7, funding 94.8 / 95.2, registration 84.6 / 99.2, novelty 90.1 / 93.4, replication 81.2 / 98.5, data 90.8 / 97.8, code 93.8 / 98.9, AI 100 / 100.
Replication precision. The replication detector previously fired on several non-replication contexts. It now suppresses: limitations and strengths discussion paragraphs (“a third limitation concerns the validity of …”), editorial statements about reproducibility as a value (“reproducibility is the cornerstone of scientific integrity”), reviews assessing the “validity of” a method or algorithm, lists of machine-learning evaluation metrics, results reproduced only within the arms of a single trial, and negative results (“not always replicated”). Replication PPV rose from 33.3 to 40.0 on the novelty/replication gold set and to 48.1 on the larger 2023 sample (with specificity 98.5); replication positives are still few, so PPV remains modest.
Funding. “The authors did not receive any external financial support for this work” is now read as absence of funding.
The held-out Serghiou et al. (2021) benchmarks and the novelty gold set are unchanged.
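Why PPV stays modest when positives are few, even at high specificity, follows from Bayes' rule. The sketch below uses the replication sensitivity and specificity quoted above; the prevalences are hypothetical and chosen only to show the trend.

```r
# Positive predictive value from sensitivity, specificity and prevalence
# (Bayes' rule). Illustrative arithmetic, not package code.
ppv <- function(sensitivity, specificity, prevalence) {
  true_pos <- sensitivity * prevalence
  false_pos <- (1 - specificity) * (1 - prevalence)
  true_pos / (true_pos + false_pos)
}

# Replication sensitivity and specificity quoted above, as proportions,
# evaluated at hypothetical prevalences of 2%, 10% and 30%.
ppv_by_prevalence <- round(ppv(0.812, 0.985, c(0.02, 0.10, 0.30)), 3)
ppv_by_prevalence
```

At a prevalence of a few percent, the false positives from the large negative class are comparable in number to the true positives, so PPV sits near one half even with specificity above 98%.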
The hand-labeled 2023 validation sample is expanded to 715 articles (210 new), with three small detector fixes surfaced by the new batches.
Validation sample grows to 715. Fourteen new
batches (210 articles) were hand-labeled for all eight indicators and
folded into data-raw/benchmark/labels_2023_sample.csv and
inst/benchmark/results_2023_sample.md. Current independent
metrics: registration 88.9 / 99.6, novelty 89.1 / 94.5, code 92.0 /
99.7, replication 84.6 / 98.0; detector-adjudicated funding 93.2 / 95.5
and data 90.9 / 97.9.
Funding: more no-funding declarations recognized. “There are no source of support”, “not supported by any organizations”, “no external sources of funding” and “conducted without the receipt of any dedicated grant or financial support” are now read as absence of funding rather than disclosed funding (these otherwise leaked through the funding-title route).
Novelty recall. “previously unobserved” is added to the gap-claim cues (“we identify a previously unobserved …”), and “undertake” to the priority verbs (“the first to undertake a comprehensive review”).
The held-out Serghiou et al. (2021) benchmarks and the novelty/replication gold set are unchanged (the new funding phrases are absence-of-funding declarations that cannot drop a funded positive).
The hand-labeled 2023 validation sample is expanded to 505 articles (120 new), with three small detector fixes surfaced by the new batches.
Validation sample grows to 505. Eight new
batches (120 articles) were hand-labeled for all eight indicators and
folded into data-raw/benchmark/labels_2023_sample.csv and
inst/benchmark/results_2023_sample.md. Current independent
metrics: registration 88.2 / 99.4, novelty 87.7 / 95.8, code 94.1 /
99.6, replication 81.8 / 98.0; detector-adjudicated funding 91.8 / 95.3
and data 88.6 / 97.7.
Funding: more no-funding declarations recognized. “The authors were not financially supported by any funding or institutions” (the adverb “financially” previously broke the match) and non-English declarations (Portuguese “nao teve fontes de financiamento”, Spanish “no recibio financiacion” / “sin financiacion”) are now read as absence of funding rather than disclosed funding.
AI: disclosure-section titles broadened. A section titled “Statement on the use of artificial intelligence” (and similar “… on the use of AI / generative AI / LLMs” headings) is now recognized as an AI-use disclosure, matching the existing “Declaration of generative AI” handling.
The held-out Serghiou et al. (2021) benchmarks are unchanged: the new funding phrases are absence-of-funding declarations (which cannot drop a funded positive and do not occur in that English, pre-2021 set), and the AI indicator is not part of it.
A precision release from the next round of hand-label review (2023 sample grown to 385 articles).
Funding: no-funding declarations no longer leak. The BMJ standard statement “The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors” sits under a section titled “Funding”, so it was counted as disclosed funding. It is now recognized as an absence-of-funding declaration. Three 2023-sample articles were relabeled to FALSE accordingly (their only funding statement is this declaration; one also cites historical NIH funding of unrelated past research, which is not the article’s own funding).
Novelty precision. Active-voice disease surveillance (“the country recorded its first case of COVID-19 on 27 February 2020”) is now suppressed, matching the existing passive-voice rule; genuine case-report novelty (“we report the first case of …”) is preserved.
Novelty recall. The explicit self-assertion “the novelty of our study …” is now recognized.
Measured effect. On the 2023 sample, funding specificity rose from 91.2 to 94.7 and PPV from 94.6 to 96.3 (sensitivity unchanged); novelty holds at 86.7 / 95.1. The held-out Serghiou et al. (2021) benchmarks and the novelty/replication gold set are unchanged (the new funding phrase appears only in modern articles, and an absence-of-funding rule cannot drop a funded positive).
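The novelty suppression rule can be illustrated with a toy cue-plus-veto pair. The patterns below are simplified assumptions written for this example, not the detector's regular expressions, and claims_first_case() is a hypothetical name.

```r
# Toy "first case" rule with a surveillance veto. The patterns are
# simplified illustrations, not the detector's actual regular expressions.
claims_first_case <- function(text) {
  cue <- grepl("\\bfirst case of\\b", text, ignore.case = TRUE, perl = TRUE)
  # Veto: a place or system "recorded/reported/confirmed its first case".
  surveillance <- grepl(
    "\\b(recorded|reported|confirmed) (its|their) first case\\b",
    text, ignore.case = TRUE, perl = TRUE
  )
  cue & !surveillance
}

novelty_flags <- claims_first_case(c(
  "The country recorded its first case of COVID-19 on 27 February 2020.",
  "We report the first case of this syndrome in an adult patient."
))
novelty_flags
```

The surveillance sentence carries the cue but is vetoed, while the author-voice case report is kept.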
This release overhauls the novelty detector for both recall and precision, fixes two long-standing bugs in the public PMC entry points, and corrects mislabeled articles in the 2023 validation sample.
Novelty recall. Fixed a core gap: the “first to <verb>” rule was missing many common verbs (confirm, validate, find, discover, prove, predict and others), so canonical claims such as “the first study to confirm …” went undetected. The relevance pre-filter was also widened to a cheap superset of the pattern cues, so genuine claims placed in results or discussion sections are no longer discarded before the precise rules run. New patterns recognize “first <research object> to <verb>” (technology, technique, approach, method, tool, model and similar), the author-voice idiom “we provide the first evidence that …”, superlative and “fails to”/“no such study” gaps introduced by “to our knowledge” (whether the gap precedes or follows the phrase), and the passive “a novel <object> was developed/detected”.
Novelty precision. Bare “new” is no longer treated as a novelty cue (it is far too frequent in non-priority contexts such as “a new model” or “new insights”); procedural “we first <verb> …, then …” no longer counts as a priority claim; and the weak “this novel <term>” pattern was removed. Gap claims (“previously un…”, “has not been studied”) must now be tied to the present study rather than to background or a cited work. New suppression rules drop firstness that is attributed (“not the first”, an author
Measured effect. On the independent 2023 sample,
novelty rose from sensitivity 77.2 / specificity 89.6 (PPV 71.0) to 86.3
/ 94.9 (PPV 85.4). On the novelty/replication gold set it rose from 76.5
/ 90.8 to 83.8 / 95.2; the rt_accuracy novelty estimate
used by rt_summary() was updated accordingly (0.765/0.908
to 0.838/0.952). Replication is unchanged.
Bug fix: duplicated columns.
rt_novelty_pmc() and rt_replication_pmc()
raised “Column names … must not be duplicated” because their identifier
output duplicated the prediction and text columns supplied by the
internal detector. Both now return a single, well-formed row.
(rt_all_pmc(), which calls the internal detectors directly,
was never affected.)
Validation labels. Corrected eleven novelty labels in the 2023 sample that were assigned in error during fast batch labeling: seven clear author priority claims had been marked FALSE, and four enumeration, ordinal or “new method” mentions with no priority claim had been marked TRUE. The committed benchmark and the novelty/replication gold set were rebuilt from the corrected labels.
The held-out Serghiou et al. (2021) conflicts-of-interest, funding, registration, data and code benchmarks are unchanged; no detector other than novelty was modified.
This is a feature release centered on the novelty and replication detectors and a second, independent validation set.
Independent 2023 validation sample. Added a
held-out set of 370 open-access PMC articles published in 2023,
hand-labeled for all eight transparency indicators
(data-raw/benchmark/labels_2023_sample.csv,
inst/benchmark/results_2023_sample.md). It is a modern
companion to the Serghiou et al. (2021) held-out set, which predates
these indicators and the 2023-era reporting conventions. The
conflicts-of-interest, funding and data labels were reconciled against
the detector’s extracted statement where the author’s back matter was
truncated during labeling, so those three are not independent of the
detector; novelty, replication, registration and code sharing were
labeled independently and are the meaningful test.
Novelty detector improvements. Recall was
broadened to recognize “new” and “innovative” (not only “novel”), a much
wider set of research objects (device, sequence, model, tool, assay,
algorithm, variant, isolate, …), passive claims (“a novel X is
developed”), an adverbial “first” (“our study first provided evidence”),
more “first to .negate_novelty_1) removes firstness attributed to a
cited study (“Smith et al. demonstrated for the first time”),
ordinal/temporal “first” (first-time transplant, first day/week/stage)
while preserving the priority phrase “for the first time, we …”, and
historical dates (“used for the first time in 1993”). On the 2023
sample, novelty sensitivity rose from 72.8% to 77.2% and specificity
from 87.8% to 89.6%.
Replication detector. Future/conditional replication proposed for later work (“this study can be replicated with a larger sample”) is now treated as not performed. The replication gold set remains small (few positives), so its estimates are reported as low-power.
The novelty/replication gold set was expanded from 160 to 370
articles, and the novelty accuracy used by
rt_summary(accuracy = TRUE) was updated
accordingly.
Code sharing: do not mistake a “Web Resources” / “URLs” list for shared code. Genomics papers commonly list the external tools and databases they used as “Name: URL, Name: URL, …” (for example ANNOVAR, BWA, GATK and third-party GitHub tools such as Delly, Lumpy and Manta). Such a resource list cites software the authors used, not code they released, but the GitHub URLs made it register as code sharing. A list of three or more “label: URL” entries is now vetoed. The held-out code benchmark is unchanged (sensitivity 88.1%, specificity 99.5%).
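A rough sketch of such a veto, assuming a simple regular expression for "label: URL" entries. The threshold of three comes from the text above; the pattern, the function names and the example URLs are illustrative assumptions.

```r
# Count "label: URL" entries and veto code sharing when a passage looks like
# a resource list. Simplified illustration, not the package's rule.
count_label_url <- function(text) {
  hits <- gregexpr("[A-Za-z][A-Za-z0-9 ._-]*:\\s*https?://\\S+", text,
                   perl = TRUE)[[1]]
  if (hits[1] == -1L) 0L else length(hits)
}

is_resource_list <- function(text, min_entries = 3L) {
  count_label_url(text) >= min_entries
}

# Hypothetical passages; the URLs are placeholders.
web_resources <- paste(
  "ANNOVAR: http://annovar.example.org,",
  "BWA: http://bwa.example.org,",
  "Delly: https://github.com/example/delly"
)
own_code <- "Analysis code is available at https://github.com/example/our-analysis"

is_resource_list(web_resources)  # three entries, so the passage is vetoed
is_resource_list(own_code)       # no "label: URL" entry, so it is not vetoed
```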
Funding: do not count an open-access publishing arrangement as
research funding. Statements such as “Open Access funding enabled and
organized by Projekt DEAL” (or by CAUL, IReL and similar library
consortia) pay the article-processing charge and are not a
research-funding disclosure, but the “funding … by
<funding-group>. When an article’s funding-group
named a funder (<funding-source>) and award
identifier but carried no narrative
<funding-statement> and no funding section title, the
funding was missed. The named funder is now treated as a funding
disclosure (and returned as the funding text). The held-out funding
benchmark is unchanged (sensitivity 100%, specificity 95.7%). Added
regression tests.
rt_data_code_pmc() and rt_all_pmc() now
also return the identifiers of the shared data and code, not just
whether sharing occurred. New columns open_data_links and
open_code_links hold the DOIs (as doi.org
URLs), repository URLs and database accessions extracted from the
detected availability statements, with accessions normalized to
identifiers.org prefix:accession form (for example
geo:GSE12345, bioproject:PRJEB51269); multiple
identifiers are separated by " ; ". Identifiers are taken
only from the availability statements, so a reused accession cited in
the methods is not collected. Added regression tests.
data-raw/benchmark/labels_novelty_replication.csv, with
the label definitions documented in
run_novelty_replication.R) is scored by
data-raw/benchmark/run_novelty_replication.R; results are
in inst/benchmark/results_novelty_replication.md. Novelty
scores sensitivity 81.0%, specificity 93.2% (n = 160, 42 positives);
replication has too few positives for a stable sensitivity estimate
(specificity 96.8%).
rt_accuracy now includes novelty (sensitivity 0.810,
specificity 0.932), so rt_summary() reports an
error-corrected novelty prevalence. Replication and AI-use disclosure
remain uncorrected.
Fixes for genome data-papers (Darwin Tree of Life and similar), found during the manual validation of 1,000 open-access PMC articles:
rt_all_pmc() now returns all eight transparency
indicators in a single call. It previously returned six (COI, funding,
registration, novelty, replication and AI-use disclosure) and data and
code sharing had to be obtained separately from
rt_data_code_pmc(); the output now also carries
is_open_data, is_open_code and their matched
statements (open_data_statements,
open_code_statements). The detection is the same native
detector as rt_data_code_pmc(), so the two agree exactly.
The change is additive: existing columns are unchanged, and the COI,
funding and registration benchmarks are unaffected. The vignettes are
updated to reflect the single-call workflow.
Documentation and example data, so the package website showcases every indicator:
vignette("ai-disclosure"), on the AI-use
disclosure indicator: what rt_ai_pmc() detects, why it is
gated to 2023 onward, and how to chart its adoption across a
corpus.
rt_demo gains an is_ai_pred column
(NA before 2023) and now spans 2010-2026, so
rt_summary() and rt_plot() examples can show
the AI indicator and its time trend. The data remain simulated.
Further fixes from the manual validation on a fresh sample of 1,000 open-access PMC articles from 2023:
Fixes from a manual validation on a fresh, disjoint sample of 1,000 open-access PMC articles from 2023:
rt_ai_pmc() detects whether an article discloses the use
(or non-use) of generative AI or AI-assisted tools in preparing the
manuscript, as journals have asked of authors since 2023. It recognizes
positive disclosures (“the authors used ChatGPT to improve the
readability of the manuscript”), negative disclosures (“no generative AI
was used in the preparation of this work”) and dedicated “Declaration of
generative AI” sections, while not flagging articles that merely use AI
as their research method. Because the practice did not exist before
2023, the indicator is only evaluated for articles published in 2023 or
later; earlier articles return NA
(is_ai_pred), and the publication year is
reported. The indicator is included in rt_all_pmc() and
recognized by rt_summary(). On the 1,000-article
open-access validation set (almost all published 2024-2026) it flags
about 16% of articles, with high precision on inspection.
rt_accuracy was updated.
The patterns are gated on a language prefix or the word “script” so
non-analysis “codes” (ICD, diagnosis, qualitative) are not matched.
Added regression tests.
Improvements from a large audit: the tool was run over 1,000 cached open-access PMC articles and a sample was hand-checked against the human-labeled benchmark.
rt_accuracy was updated.
Precision and recall fixes from an independent manual review of a sample of open-access PMC articles:
rt_accuracy was updated to these estimates.
@noRd, so the manual and the pkgdown reference present only the public API.
rt_summary() and rt_score() so indicator columns must be logical or numeric 0/1 values, with NA allowed.
data-raw/external-validation/.
GPL-3 + file LICENSE to GPL-3. The package is
plain GPL-3 with no additional terms, so the + file LICENSE
form (which signals extra restrictions in the LICENSE file)
was misleading; the full GPL-3 text is still provided in
LICENSE for reference.
rt_summary() reports each indicator’s prevalence with a
Wilson confidence interval and, by default, a prevalence corrected for
the detector’s sensitivity and specificity (the Rogan-Gladen estimator).
It can summarize within groups via by.
rt_score() adds a per-article count of the openness practices met.
rt_plot() draws a prevalence bar chart or a prevalence-over-time line chart (requires ggplot2).
rt_accuracy (detector sensitivity and specificity estimates, used by rt_summary()) and rt_demo (a small simulated corpus for the examples).
vignette("transparency-summary"),
illustrating the output: from one article to a corpus prevalence table,
an accuracy-corrected prevalence, a practice-count distribution,
subgroup summaries and plots.
oddpub and tokenizers. The native
detector (added in 0.4.0) is the only data and code path;
oddpub, tokenizers and metareadr
have been dropped from Suggests, so the package and its
CRAN-style check no longer reference any GitHub-only packages.
R CMD check note about the undefined . global variable.
DESCRIPTION Title is now in title case and the pkgdown URL carries its trailing slash.
rt_data_code_pmc_list() documentation example.
rt_fund_pmc(). It previously
predicted funding TRUE for no-funding articles with empty
evidence text; it now delegates to the same detection path as
rt_all_pmc() so the two agree, and a positive prediction
always carries evidence. Added regression tests.
rt_meta_pmc() (article metadata from a PMC XML file), which the README advertised but which was not exported.
R CMD check now passes with no errors or warnings.
oddpub / metareadr instructions.
R/data_code.R) and no longer requires the
oddpub package at runtime. On the XML benchmark used at the
time, the native detector scored data 64% sensitivity / 95% specificity
and code 68% sensitivity / 94% specificity (the published paper reports
about 76% and 59% sensitivity). Code detection already exceeded the
paper’s sensitivity and the data precision matched the original
oddpub; data sensitivity was being improved toward
oddpub’s ~84%.
rt_data_code, rt_data_code_pmc and
rt_data_code_pmc_list were rewritten to use the native
detector and return is_open_data /
is_open_code with the matched statement text. They no
longer depend on oddpub or tokenizers.
data-raw/benchmark/run_data_code.R, inst/benchmark/results_data_code.md).
CRD numbers exceed 5 digits), in both the TXT and PMC
detectors. No change on the benchmark (the held-out set has no
PROSPERO-only cases). The fork’s other commits were assessed and
deferred: “coi update” is TXT-only (not exercised by the PMC benchmark)
and “pipe update” is a cosmetic reformat that conflicts with this line’s
changes.
get_fund_acknow_new(). It previously flagged any
acknowledgment that merely named an institution or used the word
“support”, so competing-interest statements, generic thanks,
data-availability statements and affiliations were misread as funding.
It now requires explicit funding language: a funding verb directed at a
funder, an institutional “support/funding of the …”, a grant or award
identifier, or a named award. Sensitivity is unchanged at 100% on the
test set.
data-raw/benchmark/, inst/benchmark/) that
scores the detectors against the human-labeled gold standard of Serghiou
et al. (2021) and reports sensitivity, specificity, PPV, NPV and
accuracy with bootstrap confidence intervals, alongside the published
Fig 2 numbers.
.reroot_xml() to handle bare
<article> and NCBI EFetch
<pmc-articleset> roots. Previously it returned an
empty document for anything other than the PMC OAI-PMH format, which
silently suppressed all detection.
str_detect()/regex() calls in the funding detector that errored on articles lacking a structured funding statement.
oddpub,
tokenizers) are now optional (moved to
Suggests); the package loads and every other indicator runs
without them. The data and code functions raise a clear, actionable
error when these packages are absent.
metareadr, GPL-3).
rt_novelty and rt_novelty_pmc added: detect claims of novelty (“for the first time”) in TXT and PMC XML files.
rt_replication and rt_replication_pmc added: detect replication/validation components in TXT and PMC XML files.
rt_register and rt_register_pmc expanded: now detect registrations on ISRCTN, ANZCTR (ACTRN), DRKS, IRCT, and UMIN in addition to NCT and PROSPERO.
rt_all and rt_all_pmc updated to include novelty and replication indicators.
tests/testthat/).
rt_coi now searches for Conflicts of interest statements within text files.
rt_fund now searches for Funding statements within text files.
rt_register now searches for Registration statements within text files.
rt_all now searches for many indicators within text files.
rt_read_pdf now converts PDF files into TXT using poppler.