cffr: Generate Citation File Format Metadata for R Packages

JOSS paper

Diego Hernangómez¹

2021-11-08

DOI: https://doi.org/10.21105/joss.03900

Summary

The Citation File Format project (Druskat et al. 2021) defines a standardized format for providing citation metadata for software and datasets in plain-text files that are easy for both humans and machines to read.

This metadata format is being adopted by GitHub as the primary format for its built-in citation support (GitHub 2021). Other leading archives for scientific software, including Zenodo and Zotero (Druskat 2021), have also added support for CITATION.cff files in their GitHub integrations.

The cffr package provides utilities to generate and validate these CITATION.cff files automatically for R (R Core Team 2021) packages by parsing the DESCRIPTION file and the native R citation file. The package also includes utilities and examples for parsing components such as persons and additional citations, as well as several vignettes that illustrate both the basic usage of the package and some more technical details of the metadata extraction process.
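To make the extraction step concrete, the following is a minimal sketch of the general idea, written in Python purely for illustration. It is not cffr's implementation: the function names (`parse_dcf`, `description_to_cff`, `to_yaml`), the example package, and the wording of the `message` field are all hypothetical, and the sketch assumes a simplified DESCRIPTION file with a plain `Author` field holding a single "Given Family" name rather than the `Authors@R` field that real packages commonly use.

```python
import json

# A hypothetical, simplified DESCRIPTION file (assumption: plain "Author" field).
EXAMPLE_DESCRIPTION = """\
Package: examplepkg
Title: A Hypothetical Example
    Package
Version: 0.1.0
Author: Jane Doe
"""


def parse_dcf(text):
    """Parse 'Key: value' fields; indented lines continue the previous field."""
    fields = {}
    key = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and key is not None:
            # Continuation line: append to the field currently being read.
            fields[key] += " " + line.strip()
        else:
            key, _, value = line.partition(":")
            key = key.strip()
            fields[key] = value.strip()
    return fields


def description_to_cff(fields):
    """Map a few DESCRIPTION fields onto CFF keys (illustrative subset only)."""
    # Simplification: everything before the last space is the given name.
    given, _, family = fields["Author"].rpartition(" ")
    return {
        "cff-version": "1.2.0",
        "message": 'To cite package "{}" in publications use:'.format(fields["Package"]),
        "type": "software",
        "title": "{}: {}".format(fields["Package"], fields["Title"]),
        "version": fields["Version"],
        "authors": [{"family-names": family, "given-names": given}],
    }


def to_yaml(cff):
    """Render the flat mapping as YAML; JSON strings are valid YAML scalars."""
    lines = []
    for key, value in cff.items():
        if isinstance(value, list):
            lines.append(key + ":")
            for item in value:
                prefix = "  - "
                for k, v in item.items():
                    lines.append(prefix + k + ": " + json.dumps(v, ensure_ascii=False))
                    prefix = "    "
        else:
            lines.append(key + ": " + json.dumps(value, ensure_ascii=False))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_yaml(description_to_cff(parse_dcf(EXAMPLE_DESCRIPTION))))
```

A real extraction has considerably more to handle, for example several authors with roles, ORCID identifiers, repository URLs, and the entries of the native R citation file, which is the work that cffr automates.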

Statement of need

Research projects often omit citations of the software they rely on (Salmon, Chamberlain, and Ram 2021). Among the many reasons why software is not cited, one is the lack of clear citation information from package developers.

Some of the main reasons for citing software used in research are:

  1. Reproducibility: Software and its version are important information to include in any research project. This information helps peers understand and reproduce the results of any work effectively. Including versions is also crucial as a way of recording the context of your manuscript when software changes.
  2. Developer Credit: In the context of Free and Open Source Software (FOSS), many software developers are also researchers. Receiving credit for software development shouldn’t be different from receiving credit for other formats, such as books or articles.

CITATION.cff files provide clear citation rules for software. The format is easily readable by humans and can also be parsed by appropriate software. GitHub's adoption of this format sends a strong message that research software is worthy of citation and therefore deserves credit.

The cffr package allows R software developers to create CITATION.cff files from the metadata already included in the package. Additionally, the package includes validation tools via the jsonvalidate package (FitzJohn et al. 2021), which allow developers to assess the validity of the created file against the latest CFF schema.json.
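As a rough illustration of what validation involves, the sketch below checks only the four top-level keys that version 1.2.0 of the CFF schema lists as required (`authors`, `cff-version`, `message`, `title`) and that `authors` is not empty. It is written in Python for illustration, the function name `check_required` is hypothetical, and it is far weaker than the full JSON schema validation that cffr performs through jsonvalidate.

```python
# Top-level keys marked as required by the CFF 1.2.0 schema.
REQUIRED_KEYS = ("authors", "cff-version", "message", "title")


def check_required(cff):
    """Return a list of problems; an empty list means this minimal check passed."""
    problems = ["missing required key: " + k for k in REQUIRED_KEYS if k not in cff]
    if "authors" in cff and not cff["authors"]:
        # The schema expects at least one author entry.
        problems.append("authors must contain at least one entry")
    return problems


if __name__ == "__main__":
    candidate = {"cff-version": "1.2.0", "title": "examplepkg"}
    for problem in check_required(candidate):
        print(problem)
```

Returning a list of problems instead of raising on the first one mirrors how schema validators typically report every violation at once, so a developer can fix the file in a single pass.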

Acknowledgements

I would like to thank Carl Boettiger, Maëlle Salmon and the rest of the contributors to the codemetar package (Boettiger and Salmon 2021). This package was the primary inspiration for developing cffr and shares the common goal of increasing awareness of the efforts of software developers.

I would also like to thank João Martins and Scott Chamberlain for their thorough reviews, which helped improve the package and the documentation, as well as Emily Riederer for handling the review process.

Citation

Hernangómez, D., (2021). cffr: Generate Citation File Format Metadata for R Packages. Journal of Open Source Software, 6(67), 3900, https://doi.org/10.21105/joss.03900

@article{hernangomez2021,
  doi = {10.21105/joss.03900},
  url = {https://doi.org/10.21105/joss.03900},
  year = {2021},
  publisher = {The Open Journal},
  volume = {6},
  number = {67},
  pages = {3900},
  author = {Diego Hernangómez},
  title = {cffr: Generate Citation File Format Metadata for R Packages},
  journal = {Journal of Open Source Software}
}

References

Boettiger, Carl, and Maëlle Salmon. 2021. codemetar: Generate ’CodeMeta’ Metadata for R Packages.
Druskat, Stephan. 2021. “Making Software Citation Easi(er) - The Citation File Format and Its Integrations.” https://doi.org/10.5281/zenodo.5529914.
Druskat, Stephan, Jurriaan H. Spaaks, Neil Chue Hong, Robert Haines, James Baker, Spencer Bliven, Egon Willighagen, David Pérez-Suárez, and Alexander Konovalov. 2021. “Citation File Format.” https://doi.org/10.5281/zenodo.5171937.
FitzJohn, Rich, Rob Ashton, Alex Hill, Alicia Schep, Ian Lyttle, Kara Woo, Mathias Buus (Author of bundled imjv library), and Evgeny Poberezkin (Author of bundled Ajv library). 2021. jsonvalidate: Validate ’JSON’ Schema. https://CRAN.R-project.org/package=jsonvalidate.
GitHub. 2021. “About CITATION Files.” https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Salmon, Maëlle, Scott Chamberlain, and Karthik Ram. 2021. “Make Your R Package Easier to Cite.” https://ropensci.org/blog/2021/02/16/package-citation/.

  1. Independent Researcher, https://orcid.org/0000-0001-8457-4658