Reference

Last updated on 2026-05-29

Glossary


Bus factor The number of contributors who would need to leave (or be “hit by a bus”) before a project becomes unmaintainable. A bus factor of 1 means the project depends entirely on one person.

CITATION.cff A plain-text file in YAML format, placed in the root of a repository, that provides structured citation metadata such as title, authors, version, and DOI. GitHub reads this file automatically and displays a “Cite this repository” button with ready-made citations in APA and BibTeX formats.
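To make the structure of the file concrete, here is a minimal sketch in Python that assembles a CITATION.cff using only the standard library. The function name build_citation_cff and every metadata value (title, author, DOI, ORCID, date) are hypothetical placeholders, not taken from a real project. The field names follow CFF version 1.2.0.

```python
# Minimal sketch: build the text of a CITATION.cff file (CFF 1.2.0).
# All metadata values used below are hypothetical placeholders.

def build_citation_cff(title, version, doi, date_released, authors):
    """Return CITATION.cff text for the given metadata.

    Values are wrapped in double quotes as-is, so this sketch assumes
    they do not themselves contain double quotes or backslashes.
    """
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        f'doi: "{doi}"',
        f'date-released: "{date_released}"',
        "authors:",
    ]
    for author in authors:
        lines.append(f'  - family-names: "{author["family"]}"')
        lines.append(f'    given-names: "{author["given"]}"')
        if "orcid" in author:
            # CFF expects the ORCID as a full https://orcid.org/... URL.
            lines.append(f'    orcid: "{author["orcid"]}"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_citation_cff(
        title="Example Analysis Toolkit",      # placeholder
        version="1.2.0",                       # placeholder
        doi="10.5281/zenodo.0000000",          # placeholder DOI
        date_released="2026-01-15",            # placeholder
        authors=[{
            "family": "Doe",
            "given": "Jane",
            "orcid": "https://orcid.org/0000-0000-0000-0000",  # placeholder
        }],
    )
    print(text)
```

In practice the printed text would be saved as CITATION.cff in the repository root. For anything beyond a sketch, a YAML library or a dedicated CFF tool is a safer way to produce and validate the file.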

CFF (Citation File Format) The file format used by CITATION.cff. Maintained by the Citation File Format community. Supported by GitHub, Zenodo, and reference managers including Zotero.

CODE Beyond FAIR A 2026 research software roadmap (Di Cosmo et al., Scientific Data) that extends FAIR principles for software. The CODE pillars are: Collaborate, Open, Document, Execute. Recommends Software Heritage archiving alongside DOI-based citation.

DataCite The DOI registration agency for research data and software. When you mint a DOI on Zenodo, DataCite registers it. DataCite metadata is harvested by library catalogs, OpenAlex, and other scholarly discovery systems.

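DOI (Digital Object Identifier) A persistent identifier assigned to a resource; for software archived on Zenodo, each release receives its own version-specific DOI. Unlike a plain URL, a DOI is designed to keep resolving even if the underlying repository moves or the original hosting platform closes, because it points to the archived record rather than to the development repository. Recommended for software citation because a version-specific DOI points to an exact, archived snapshot.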

FAIR4RS The FAIR Principles for Research Software, adapted from the FAIR data principles. Software should be: Findable (has DOI, metadata, CITATION.cff), Accessible (public repository, standard protocols), Interoperable (standard formats, documented dependencies), Reusable (license, README, reproducible environment).

Link rot The failure of a URL-based citation when the target URL changes, moves, or disappears. Common causes include username changes, repository renames, account deletions, and platform shutdowns. Gitorious (2015) and Google Code (2016) are documented examples where platform closures made thousands of citations unreachable.

Lockfile A file generated by an environment manager (e.g., pixi.lock, conda-lock.yml) that records the exact version of every dependency, including transitive dependencies. A lockfile lets the same environment be recreated on another machine or at a later date, instead of resolving to whatever versions happen to be newest at install time.

ORCID Open Researcher and Contributor ID. A persistent identifier for individual researchers, analogous to a DOI for people. Including ORCIDs in CITATION.cff and Zenodo metadata ensures author credit is unambiguous regardless of name changes or institutional affiliations.

pixi A modern, cross-platform environment manager supporting Python, R, and other languages. Uses pixi.toml to declare dependencies and generates a pixi.lock file for reproducibility. See also: conda, mamba, pip/venv, renv (R).

Research software Software created or used in a research context — including analysis scripts, data processing pipelines, simulation models, and tools that support research workflows. Distinct from general-purpose software in that it is often created by researchers rather than professional software developers, and its outputs are part of the scientific record.

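Semantic versioning (SemVer) A versioning convention using MAJOR.MINOR.PATCH (e.g., 1.2.0, often tagged as v1.2.0). A MAJOR increment signals breaking changes; a MINOR increment signals new, backward-compatible features; a PATCH increment signals backward-compatible bug fixes. A version of 0.x.x indicates that the software is in initial development and its interface may still change.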
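The increment rules can be sketched in a few lines of Python. This is an illustrative helper, not part of any particular tool: the names parse_version and bump are hypothetical, and the sketch handles only the MAJOR.MINOR.PATCH core (with an optional leading "v"), not pre-release or build-metadata suffixes.

```python
import re

# Matches the MAJOR.MINOR.PATCH core, with an optional leading "v" as used in tags.
SEMVER_CORE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(text):
    """Return (major, minor, patch) as integers, or raise ValueError."""
    match = SEMVER_CORE.match(text)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return tuple(int(part) for part in match.groups())


def bump(text, part):
    """Return the next version string after incrementing the given part."""
    major, minor, patch = parse_version(text)
    if part == "major":   # breaking change: reset MINOR and PATCH
        return f"{major + 1}.0.0"
    if part == "minor":   # new feature: reset PATCH
        return f"{major}.{minor + 1}.0"
    if part == "patch":   # bug fix
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")


if __name__ == "__main__":
    print(bump("v1.2.0", "patch"))  # 1.2.1
    print(bump("v1.2.0", "minor"))  # 1.3.0
    print(bump("v1.2.0", "major"))  # 2.0.0
    # Comparing parsed tuples orders versions numerically, not alphabetically.
    print(parse_version("1.10.0") > parse_version("1.9.0"))  # True
```

Parsing into integer tuples matters for ordering: compared as plain strings, "1.10.0" would sort before "1.9.0".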

Software Heritage A universal source code archive that continuously crawls GitHub, GitLab, and other forges and preserves the source code it finds together with its development history. Assigns SWHIDs (see below) to every file, directory, commit, and release. Free to use and run as a non-profit initiative (launched by Inria in partnership with UNESCO), it is designed specifically for long-term software preservation.

SWHID (Software Heritage Identifier) A persistent identifier assigned by Software Heritage to an exact, immutable snapshot of source code. Complements a Zenodo DOI: the DOI is for citation (version-level); the SWHID is for long-term preservation and points to the precise code state. Can be added to CITATION.cff as an entry of type swh in the identifiers list.
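A SWHID is derived from the content itself rather than handed out by a registry, which is why it identifies an exact code state. As a minimal sketch, the identifier for a single file's contents (object type "cnt") uses the same hash Git computes for a blob. The function name content_swhid is hypothetical; identifiers for directories, commits, and releases are computed differently and are not covered here.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Return the SWHID for raw file contents (object type 'cnt').

    The hash is the Git blob hash: SHA-1 over the header
    b"blob <length>\\0" followed by the file bytes.
    """
    header = b"blob %d\0" % len(data)
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # An empty file has the well-known Git blob hash e69de29b...
    print(content_swhid(b""))
    # swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the identifier depends only on the bytes, anyone holding a copy of the file can recompute it and confirm that the copy matches the archived version.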

Zenodo A free, open-access repository operated by CERN. Integrates with GitHub to automatically archive each release and mint a version-specific DOI. Zenodo records flow into DataCite and are indexed by Google Scholar and library catalogs.

Zenodo Sandbox A test environment for Zenodo (sandbox.zenodo.org) that works identically to the production service but does not create permanent DOIs. Used in this lesson for practice so learners do not pollute the permanent scholarly record.


References


Foundational Principles

Software Citation

Software Preservation

Licensing

Environment Management

README and Repository Documentation

Community Health Files

Further Reading