pangoling: Access to Large Language Model Predictions

Provides access to word predictability estimates using large language models (LLMs) based on 'transformer' architectures via integration with the 'Hugging Face' ecosystem <https://huggingface.co/>. The package interfaces with pre-trained neural networks and supports both causal/auto-regressive LLMs (e.g., 'GPT-2') and masked/bidirectional LLMs (e.g., 'BERT') to compute the probability of words, phrases, or tokens given their linguistic context. For details on GPT-2 and causal models, see Radford et al. (2019) <https://storage.prod.researchhub.com/uploads/papers/2020/06/01/language-models.pdf>; for details on BERT and masked models, see Devlin et al. (2019) <doi:10.48550/arXiv.1810.04805>. By enabling straightforward estimation of word predictability, the package facilitates research in psycholinguistics, computational linguistics, and natural language processing (NLP).
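
As a rough illustration of what "the probability of a word given its context" means when a model works on sub-word tokens, the sketch below combines per-token conditional probabilities into a word-level log-probability with the chain rule and converts it to surprisal in bits. It is a minimal, standalone Python sketch and not the package's implementation: the function names and the token probabilities are hypothetical, and no model is loaded.

```python
import math

def word_log_prob(token_log_probs):
    """Chain rule: a word's log-probability is the sum of the
    log-probabilities of its tokens, each conditioned on what precedes it."""
    return sum(token_log_probs)

def surprisal_bits(log_prob):
    """Convert a natural-log probability into surprisal measured in bits."""
    return -log_prob / math.log(2)

# Hypothetical conditional probabilities for a word split into two tokens:
# P(token1 | context) = 0.5 and P(token2 | context, token1) = 0.25.
token_probs = [0.5, 0.25]
token_log_probs = [math.log(p) for p in token_probs]

lp = word_log_prob(token_log_probs)
print(round(math.exp(lp), 6))        # word probability: 0.125
print(round(surprisal_bits(lp), 6))  # surprisal in bits: 3.0
```

Summing log-probabilities instead of multiplying raw probabilities avoids numerical underflow for long words or phrases, which is why predictability is commonly reported on the log scale.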

Version: 1.0.3
Depends: R (≥ 4.1.0)
Imports: cachem, data.table, memoise, reticulate, rstudioapi, stats, tidyselect, tidytable (≥ 0.7.2), utils
Suggests: brms, knitr, parallel, rmarkdown, spelling, testthat (≥ 3.0.0), tictoc, covr
Published: 2025-04-07
DOI: 10.32614/CRAN.package.pangoling
Author: Bruno Nicenboim [aut, cre], Chris Emmerly [ctb], Giovanni Cassani [ctb], Lisa Levinson [rev], Utku Turk [rev]
Maintainer: Bruno Nicenboim <b.nicenboim at tilburguniversity.edu>
BugReports: https://github.com/ropensci/pangoling/issues
License: MIT + file LICENSE
URL: https://docs.ropensci.org/pangoling/, https://github.com/ropensci/pangoling
NeedsCompilation: no
Language: en-US
Citation: pangoling citation info
Materials: NEWS
CRAN checks: pangoling results

Documentation:

Reference manual: pangoling.pdf
Vignettes:
Worked-out example: Surprisal from a causal (GPT) model as a cognitive processing bottleneck in reading
Using a Bert model to get the predictability of words in their context
Using a GPT2 transformer model to get word predictability
Troubleshooting the use of Python in R

Downloads:

Package source: pangoling_1.0.3.tar.gz
Windows binaries: r-devel: pangoling_1.0.3.zip, r-release: not available, r-oldrel: not available
macOS binaries: r-devel (arm64): pangoling_1.0.3.tgz, r-release (arm64): pangoling_1.0.3.tgz, r-oldrel (arm64): pangoling_1.0.3.tgz, r-devel (x86_64): pangoling_1.0.3.tgz, r-release (x86_64): pangoling_1.0.3.tgz, r-oldrel (x86_64): pangoling_1.0.3.tgz

Linking:

Please use the canonical form https://CRAN.R-project.org/package=pangoling to link to this page.