tlda: Tools for Language Data Analysis

Support functions and datasets to facilitate the analysis of linguistic data. The current focus is on the calculation of corpus-linguistic dispersion measures as described in Gries (2021) <doi:10.1007/978-3-030-46216-1_5> and Soenning (2025) <doi:10.3366/cor.2025.0326>. The most commonly used parts-based indices are implemented, including different formulas and modifications that are found in the literature, with the additional option to obtain frequency-adjusted scores. Dispersion scores can be computed based on individual count variables or a term-document matrix.
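To make the idea of a parts-based dispersion index concrete, the following is a minimal Python sketch of one widely used measure, Gries's deviation of proportions (DP), computed first from a single vector of counts and then row by row from a small term-document matrix. It is not the tlda API: the function names `dispersion_dp` and `dispersion_dp_matrix` and the toy data are assumptions made for illustration. The sketch uses the basic formulation of DP, where 0 indicates a perfectly even distribution and values near 1 indicate a very uneven one. The package may apply a different scaling or one of the modified formulas mentioned above.

```python
def dispersion_dp(counts, part_sizes):
    """Basic deviation of proportions (DP) for one item.

    counts     -- occurrences of the item in each corpus part
    part_sizes -- size (e.g. word count) of each corpus part
    Hypothetical helper for illustration; not a tlda function.
    """
    if len(counts) != len(part_sizes):
        raise ValueError("counts and part_sizes must have the same length")
    total_count = sum(counts)
    total_size = sum(part_sizes)
    if total_count == 0 or total_size == 0:
        raise ValueError("item frequency and corpus size must be positive")

    # Observed share of the item's occurrences found in each part
    observed = [c / total_count for c in counts]
    # Expected share if occurrences followed part size exactly
    expected = [s / total_size for s in part_sizes]

    # Half the summed absolute differences between the two distributions
    return 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))


def dispersion_dp_matrix(term_document_matrix, part_sizes):
    """Apply dispersion_dp to every row of a term-document matrix.

    term_document_matrix -- dict mapping each term to its per-part counts
    """
    return {
        term: dispersion_dp(row, part_sizes)
        for term, row in term_document_matrix.items()
    }


# Toy corpus with three parts of unequal size (made-up numbers)
part_sizes = [1000, 1000, 2000]
tdm = {
    "even_item": [5, 5, 10],      # counts proportional to part size
    "clumped_item": [20, 0, 0],   # all occurrences in the first part
}

scores = dispersion_dp_matrix(tdm, part_sizes)
print(scores)
```

With these toy counts, the item whose occurrences follow the part sizes scores 0, and the item confined to the first part scores 0.75.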

Version: 0.1.0
Depends: R (≥ 3.5.0)
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0)
Published: 2025-04-25
DOI: 10.32614/CRAN.package.tlda
Author: Lukas Soenning [aut, cre, cph], German Research Foundation (DFG) [fnd] (Grant number 548274092)
Maintainer: Lukas Soenning <lukas.soenning at uni-bamberg.de>
BugReports: https://github.com/lsoenning/tlda/issues
License: MIT + file LICENSE
URL: https://github.com/lsoenning/tlda
NeedsCompilation: no
Materials: README NEWS
CRAN checks: tlda results

Documentation:

Reference manual: tlda.pdf
Vignettes: Dispersion analysis (source, R code); Frequency-adjusted dispersion scores (source, R code)

Downloads:

Package source: tlda_0.1.0.tar.gz
Windows binaries: r-devel: not available, r-release: tlda_0.1.0.zip, r-oldrel: tlda_0.1.0.zip
macOS binaries: r-release (arm64): tlda_0.1.0.tgz, r-oldrel (arm64): tlda_0.1.0.tgz, r-release (x86_64): tlda_0.1.0.tgz, r-oldrel (x86_64): tlda_0.1.0.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=tlda to link to this page.