leadeR relies on spaCy, a Python NLP library, via the spacyr R package. You will need:
Install spaCy and the English model from a terminal:
Install leadeR from GitHub:
Before using any leadeR function, initialize spaCy and (optionally) set a seed for reproducibility of bootstrap results.
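
As a rough sketch of that setup step, the snippet below starts spaCy through spacyr and fixes the random seed. The model name `en_core_web_sm` and the seed value are assumptions chosen for illustration; substitute whichever English model you installed.

```r
library(spacyr)

# Start the spaCy backend for this R session.
# "en_core_web_sm" is an assumed model name; use the model you installed.
spacy_initialize(model = "en_core_web_sm")

# Optional: fix the random number generator state so that results that
# depend on resampling (such as bootstrap estimates) repeat across runs.
# The value itself is arbitrary.
set.seed(2024)
```

Setting the seed once, before the first call that resamples, is usually enough; calling `set.seed()` again with the same value restarts the same random sequence.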
The package ships with three speeches by John F. Kennedy:
| Dataset | Date | Occasion |
|---|---|---|
| jfk19610120 | January 20, 1961 | Inaugural Address |
| jfk19610925 | September 25, 1961 | Address Before the UN General Assembly |
| jfk19630610 | June 10, 1963 | Commencement Address at American University |
Speech transcripts often contain editorial annotations in brackets,
parentheses, or curly braces. The clean_text() function
removes these and normalizes whitespace.
Users may need additional cleaning steps depending on the source of their text data (e.g., removing headers, footers, or speaker labels).
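
To make the cleaning idea concrete, here is a minimal base-R sketch of this kind of preprocessing. It is not the implementation of clean_text(); `strip_annotations()` and `drop_speaker_label()` are hypothetical helpers written for illustration, and they assume annotations are not nested and that speaker labels are upper-case prefixes ending in a colon.

```r
# Hypothetical helper, NOT leadeR's clean_text(): removes bracketed,
# parenthesized, and curly-braced spans, then normalizes whitespace.
# Assumes the delimiters are not nested.
strip_annotations <- function(text) {
  text <- gsub("\\[[^]]*\\]", "", text)    # [Applause]
  text <- gsub("\\([^)]*\\)", "", text)    # (laughter)
  text <- gsub("\\{[^}]*\\}", "", text)    # {sic}
  text <- gsub("[[:space:]]+", " ", text)  # collapse runs of whitespace
  trimws(text)                             # drop leading/trailing spaces
}

# Hypothetical extra step: drop a leading upper-case speaker label
# such as "THE PRESIDENT: ".
drop_speaker_label <- function(text) {
  sub("^[A-Z][A-Z .]*:[[:space:]]*", "", text)
}

raw <- "THE PRESIDENT: Thank you. [Applause] We choose   to go (laughter) to the moon. {sic}"
cleaned <- strip_annotations(drop_speaker_label(raw))
print(cleaned)
```

Source-specific steps like the speaker-label rule are best kept as separate functions, so they can be applied or skipped depending on where the transcript came from.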