Last updated on 2026-02-18 17:53:39 CET.
| Package | ERROR | WARN | NOTE | OK |
|---|---|---|---|---|
| bindata | | | | 14 |
| cclust | | | | 14 |
| chron | | | | 14 |
| clue | | 1 | | 13 |
| date | | | | 14 |
| ISOcodes | | | | 14 |
| mlbench | | 1 | | 13 |
| movMF | | | | 14 |
| NLP | | | | 14 |
| NLPutils | | | | 14 |
| OAIHarvester | | | | 14 |
| openNLP | | | 3 | 11 |
| openNLPdata | | | 3 | 11 |
| oz | | | | 14 |
| relations | | 1 | 3 | 10 |
| RKEA | | | | 14 |
| RKEAjars | | | 3 | 11 |
| Rpoppler | | | 3 | 11 |
| Rsymphony | | | 2 | 12 |
| RWeka | | 1 | 4 | 9 |
| RWekajars | | | 3 | 11 |
| skmeans | | 7 | | 7 |
| slam | | | 5 | 9 |
| tau | | | 5 | 9 |
| textcat | | | | 14 |
| tm | 3 | 2 | 9 | |
| tm.plugin.mail | 14 | |||
| tseries | 1 | 2 | 11 | |
| Unicode | 14 | |||
| W3CMarkupValidator | 14 | |||
| wordnet | 3 | 11 |
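
Each table row carries the same counts as the matching "Current CRAN status" line in the per-package details. The following is a minimal sketch, assuming the status text always has the `LEVEL: count` form seen on this page; the function name `parse_status` is hypothetical and not part of any CRAN tooling.

```python
import re

def parse_status(summary: str) -> dict[str, int]:
    """Turn 'WARN: 1, NOTE: 3, OK: 10' into {'WARN': 1, 'NOTE': 3, 'OK': 10}."""
    # Assumption: only the four levels from the table header can occur.
    pairs = re.findall(r"(ERROR|WARN|NOTE|OK):\s*(\d+)", summary)
    return {level: int(count) for level, count in pairs}

# Status text copied from the page (the entry with WARN, NOTE and OK counts).
relations = parse_status("WARN: 1, NOTE: 3, OK: 10")

# The counts of one package add up to the number of check flavors.
total_flavors = sum(relations.values())
```

Summing the counts is a quick consistency check: every row in the table above should total 14.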
Package bindata
Current CRAN status: OK: 14
Package cclust
Current CRAN status: OK: 14
Package chron
Current CRAN status: OK: 14
Package clue
Current CRAN status: WARN: 1, OK: 13
Version: 0.3-67
Check: Rd files
Result: WARN
GVME.Rd:21: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
GVME_Consensus.Rd:6: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Kinship82.Rd:14: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Kinship82_Consensus.Rd:6: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Phonemes.Rd:14: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
addtree.Rd:30: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_agreement.Rd:45: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_bag.Rd:26: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_consensus.Rd:46: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_dissimilarity.Rd:43: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_fuzziness.Rd:46: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_medoid.Rd:24: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_pam.Rd:20: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_pclust.Rd:59: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
cl_validity.Rd:40: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
kmedoids.Rd:35: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
l1_fit_ultrametric.Rd:35: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
lattice.Rd:81: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
ls_fit_addtree.Rd:45: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
ls_fit_sum_of_ultrametrics.Rd:34: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
ls_fit_ultrametric.Rd:36: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
pclust.Rd:90: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
solve_LSAP.Rd:36: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
sumt.Rd:75: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
problems found in ‘GVME.Rd’, ‘GVME_Consensus.Rd’, ‘Kinship82.Rd’, ‘Kinship82_Consensus.Rd’, ‘Phonemes.Rd’, ‘addtree.Rd’, ‘cl_agreement.Rd’, ‘cl_bag.Rd’, ‘cl_consensus.Rd’, ‘cl_dissimilarity.Rd’, ‘cl_fuzziness.Rd’, ‘cl_medoid.Rd’, ‘cl_pam.Rd’, ‘cl_pclust.Rd’, ‘cl_validity.Rd’, ‘kmedoids.Rd’, ‘l1_fit_ultrametric.Rd’, ‘lattice.Rd’, ‘ls_fit_addtree.Rd’, ‘ls_fit_sum_of_ultrametrics.Rd’, ‘ls_fit_ultrametric.Rd’, ‘pclust.Rd’, ‘solve_LSAP.Rd’, ‘sumt.Rd’
Flavor: r-devel-macos-arm64
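The warning above repeats one two-line pattern per Rd file, so the affected files can be collected mechanically. This is a rough sketch, assuming the log lines have exactly the shape shown above; the names `RD_FAILURE` and `failing_rd_files` are hypothetical helpers, not part of R or CRAN.

```python
import re

# Matches e.g. 'GVME.Rd:21: processing build-stage \Sexpr code failed:'
RD_FAILURE = re.compile(
    r"^(?P<file>[\w.]+\.Rd):(?P<line>\d+): "
    r"processing build-stage \\Sexpr code failed:$"
)

def failing_rd_files(log_lines):
    """Return (file name, line number) pairs for each failed \\Sexpr."""
    hits = []
    for text in log_lines:
        match = RD_FAILURE.match(text.strip())
        if match:
            hits.append((match.group("file"), int(match.group("line"))))
    return hits

# Four lines copied from the check output above.
sample = [
    r"GVME.Rd:21: processing build-stage \Sexpr code failed:",
    'Error in gzfile(file, "rb"): cannot open the connection',
    r"GVME_Consensus.Rd:6: processing build-stage \Sexpr code failed:",
    'Error in gzfile(file, "rb"): cannot open the connection',
]
hits = failing_rd_files(sample)
```

The result should agree with the "problems found in" summary line that closes the warning.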
Package date
Current CRAN status: OK: 14
Package ISOcodes
Current CRAN status: OK: 14
Package mlbench
Current CRAN status: WARN: 1, OK: 13
Version: 2.1-7
Check: Rd files
Result: WARN
BostonHousing.Rd:11: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
BreastCancer.Rd:43: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
DNA.Rd:61: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Glass.Rd:40: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
HouseVotes84.Rd:49: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Ionosphere.Rd:32: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
LetterRecognition.Rd:49: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Ozone.Rd:34: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
PimaIndiansDiabetes.Rd:32: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Satellite.Rd:97: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Servo.Rd:34: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Shuttle.Rd:27: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Sonar.Rd:10: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Soybean.Rd:68: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Vehicle.Rd:60: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Vowel.Rd:26: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Zoo.Rd:24: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
mlbench.friedman1.Rd:13: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
mlbench.friedman2.Rd:16: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
mlbench.friedman3.Rd:16: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
mlbench.threenorm.Rd:25: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
mlbench.waveform.Rd:26: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
problems found in ‘BostonHousing.Rd’, ‘BreastCancer.Rd’, ‘DNA.Rd’, ‘Glass.Rd’, ‘HouseVotes84.Rd’, ‘Ionosphere.Rd’, ‘LetterRecognition.Rd’, ‘Ozone.Rd’, ‘PimaIndiansDiabetes.Rd’, ‘Satellite.Rd’, ‘Servo.Rd’, ‘Shuttle.Rd’, ‘Sonar.Rd’, ‘Soybean.Rd’, ‘Vehicle.Rd’, ‘Vowel.Rd’, ‘Zoo.Rd’, ‘mlbench.friedman1.Rd’, ‘mlbench.friedman2.Rd’, ‘mlbench.friedman3.Rd’, ‘mlbench.threenorm.Rd’, ‘mlbench.waveform.Rd’
Flavor: r-devel-macos-arm64
Package movMF
Current CRAN status: OK: 14
Package NLP
Current CRAN status: OK: 14
Package NLPutils
Current CRAN status: OK: 14
Package OAIHarvester
Current CRAN status: OK: 14
Package openNLP
Current CRAN status: NOTE: 3, OK: 11
Version: 0.2-7
Check: package dependencies
Result: NOTE
Package suggested but not available for checking: ‘openNLPmodels.en’
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package openNLPdata
Current CRAN status: NOTE: 3, OK: 11
Version: 1.5.3-5
Check: installed package size
Result: NOTE
installed size is 7.2Mb
sub-directories of 1Mb or more:
java 1.2Mb
models 6.0Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package oz
Current CRAN status: OK: 14
Package relations
Current CRAN status: WARN: 1, NOTE: 3, OK: 10
Version: 0.6-16
Check: Rd files
Result: WARN
Cetacea.Rd:17: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Felines.Rd:36: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
SVMBench.Rd:38: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
algebra.Rd:54: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
closure.Rd:60: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
components.Rd:69: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
consensus.Rd:43: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
dissimilarity.Rd:53: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
pclust.Rd:43: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
predicates.Rd:242: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
reduction.Rd:84: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
relation.Rd:110: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
scores.Rd:39: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
trace.Rd:25: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
problems found in ‘Cetacea.Rd’, ‘Felines.Rd’, ‘SVMBench.Rd’, ‘algebra.Rd’, ‘closure.Rd’, ‘components.Rd’, ‘consensus.Rd’, ‘dissimilarity.Rd’, ‘pclust.Rd’, ‘predicates.Rd’, ‘reduction.Rd’, ‘relation.Rd’, ‘scores.Rd’, ‘trace.Rd’
Flavor: r-devel-macos-arm64
Version: 0.6-16
Check: package dependencies
Result: NOTE
Package which this enhances but not available for checking: ‘Rcplex’
Flavor: r-oldrel-macos-arm64
Version: 0.6-15
Check: package dependencies
Result: NOTE
Package which this enhances but not available for checking: ‘Rcplex’
Flavors: r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package RKEA
Current CRAN status: OK: 14
Package RKEAjars
Current CRAN status: NOTE: 3, OK: 11
Version: 5.0-4
Check: installed package size
Result: NOTE
installed size is 10.8Mb
sub-directories of 1Mb or more:
java 10.8Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package Rpoppler
Current CRAN status: NOTE: 3, OK: 11
Version: 0.1-3
Check: installed package size
Result: NOTE
installed size is 49.3Mb
sub-directories of 1Mb or more:
libs 49.2Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package Rsymphony
Current CRAN status: NOTE: 2, OK: 12
Version: 0.1-33
Check: Rd cross-references
Result: NOTE
Package unavailable to check Rd xrefs: ‘Rglpk’
Flavor: r-oldrel-macos-arm64
Version: 0.1-33
Check: installed package size
Result: NOTE
installed size is 5.8Mb
sub-directories of 1Mb or more:
libs 5.8Mb
Flavor: r-oldrel-windows-x86_64
Package RWeka
Current CRAN status: WARN: 1, NOTE: 4, OK: 9
Version: 0.4-47
Check: tests
Result: NOTE
Running ‘data_exchange.R’
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...
136c136
< 1 2012-12-12 12:12:12 2012-12-12 12:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-12 12:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-devel-linux-x86_64-fedora-clang
Version: 0.4-46
Check: tests
Result: NOTE
Running ‘data_exchange.R’
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...
136c136
< 1 2012-12-12 12:12:12 2012-12-12 12:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-12 12:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-devel-linux-x86_64-fedora-gcc
Version: 0.4-47
Check: Rd files
Result: WARN
WOW.Rd:23: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_associators.Rd:42: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_classifier_functions.Rd:60: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_classifier_lazy.Rd:50: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_classifier_meta.Rd:66: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_classifier_rules.Rd:57: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_classifier_trees.Rd:69: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_clusterers.Rd:41: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_filters.Rd:38: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_interfaces.Rd:107: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
Weka_stemmers.Rd:23: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
evaluate_Weka_classifier.Rd:53: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
problems found in ‘WOW.Rd’, ‘Weka_associators.Rd’, ‘Weka_classifier_functions.Rd’, ‘Weka_classifier_lazy.Rd’, ‘Weka_classifier_meta.Rd’, ‘Weka_classifier_rules.Rd’, ‘Weka_classifier_trees.Rd’, ‘Weka_clusterers.Rd’, ‘Weka_filters.Rd’, ‘Weka_interfaces.Rd’, ‘Weka_stemmers.Rd’, ‘evaluate_Weka_classifier.Rd’
Flavor: r-devel-macos-arm64
Version: 0.4-47
Check: tests
Result: NOTE
Running ‘data_exchange.R’ [1s/0s]
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...
136c136
< 1 2012-12-12 12:12:12 2012-12-13 01:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-13 01:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-devel-macos-arm64
Version: 0.4-47
Check: tests
Result: NOTE
Running ‘data_exchange.R’ [1s/0s]
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...
136c136
< 1 2012-12-12 12:12:12 2012-12-13 01:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-13 01:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-release-macos-arm64
Version: 0.4-46
Check: tests
Result: NOTE
Running ‘data_exchange.R’ [2s/1s]
Comparing ‘data_exchange.Rout’ to ‘data_exchange.Rout.save’ ...
136c136
< 1 2012-12-12 12:12:12 2012-12-12 07:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
159c159
< 1 2012-12-12 12:12:12 2012-12-12 07:12:12
---
> 1 2012-12-12 12:12:12 2012-12-12 13:12:12
Flavor: r-release-macos-x86_64
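The `data_exchange` diffs above differ only in the hour of the second timestamp, which varies by check flavor. One plausible reading, not confirmed anywhere in the log, is that the same instant is rendered in each check machine's local time zone. The sketch below illustrates that reading; the fixed UTC offsets are assumptions chosen to reproduce the four values seen in the diffs.

```python
from datetime import datetime, timedelta, timezone

# The first timestamp in each diff line, taken here to be in UTC (assumption).
instant = datetime(2012, 12, 12, 12, 12, 12, tzinfo=timezone.utc)

# Hypothetical machine offsets in hours; none of them is stated in the log.
offsets_hours = {"UTC": 0, "UTC+1": 1, "UTC+13": 13, "UTC-5": -5}

# Render the same instant as each machine would print it locally.
rendered = {
    label: instant.astimezone(timezone(timedelta(hours=hours)))
                  .strftime("%Y-%m-%d %H:%M:%S")
    for label, hours in offsets_hours.items()
}
```

Under these assumptions the UTC+1 rendering corresponds to the saved reference output, while the other three correspond to the values reported by the individual flavors.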
Package RWekajars
Current CRAN status: NOTE: 3, OK: 11
Version: 3.9.3-2
Check: installed package size
Result: NOTE
installed size is 10.8Mb
sub-directories of 1Mb or more:
java 10.7Mb
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package skmeans
Current CRAN status: WARN: 7, OK: 7
Version: 0.2-19
Check: Rd files
Result: WARN
prepare_Rd: ./man/skmeans.Rd:53: unknown macro '\bibcitet'
prepare_Rd: ./man/skmeans.Rd:65: unknown macro '\bibcitet'
prepare_Rd: ./man/skmeans.Rd:72: unknown macro '\bibcitet'
prepare_Rd: ./man/skmeans.Rd:75: unknown macro '\bibcitep'
prepare_Rd: ./man/skmeans.Rd:194: unknown macro '\bibcitet'
checkRd: (-1) skmeans.Rd:53: Lost braces
53 | This is the formulation used in \bibcitet{skmeans::Dhillon+Modha:2001}
| ^
checkRd: (-1) skmeans.Rd:65: Lost braces
65 | \bibcitet{skmeans::Krishna+Narasimha_Murty:1999}.}
| ^
checkRd: (-1) skmeans.Rd:72: Lost braces
72 | \bibcitet{skmeans::Dhillon+Guan+Kogan:2002}.}
| ^
checkRd: (-1) skmeans.Rd:75: Lost braces
75 | Karypis \bibcitep{skmeans::Karypis:2002}.}
| ^
checkRd: (-1) skmeans.Rd:194: Lost braces
194 | material to \bibcitet{skmeans::Maitra+Ramler:2010} at
| ^
Flavors: r-release-linux-x86_64, r-release-macos-arm64, r-release-macos-x86_64, r-release-windows-x86_64, r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Version: 0.2-19
Check: package dependencies
Result: NOTE
Package which this enhances but not available for checking: ‘kmndirs’
Flavors: r-oldrel-macos-arm64, r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package slam
Current CRAN status: NOTE: 5, OK: 9
Version: 0.1-55
Check: tests
Result: NOTE
Running ‘abind.R’ [0s/0s]
Comparing ‘abind.Rout’ to ‘abind.Rout.save’ ... OK
Running ‘apply.R’ [0s/0s]
Comparing ‘apply.Rout’ to ‘apply.Rout.save’ ... OK
Running ‘crossprod.R’ [0s/0s]
Comparing ‘crossprod.Rout’ to ‘crossprod.Rout.save’ ... OK
Running ‘dimgets.R’ [0s/0s]
Running ‘extract.R’ [0s/0s]
Comparing ‘extract.Rout’ to ‘extract.Rout.save’ ... OK
Running ‘matrix.R’ [0s/0s]
Comparing ‘matrix.Rout’ to ‘matrix.Rout.save’ ... OK
Running ‘matrix_dimnames.R’ [0s/0s]
Comparing ‘matrix_dimnames.Rout’ to ‘matrix_dimnames.Rout.save’ ... OK
Running ‘rollup.R’ [0s/0s]
Comparing ‘rollup.Rout’ to ‘rollup.Rout.save’ ... OK
Running ‘split.R’ [0s/0s]
Comparing ‘split.Rout’ to ‘split.Rout.save’ ... OK
Running ‘ssa_valid.R’ [0s/0s]
Comparing ‘ssa_valid.Rout’ to ‘ssa_valid.Rout.save’ ... OK
Running ‘stm.R’ [0s/0s]
Comparing ‘stm.Rout’ to ‘stm.Rout.save’ ... OK
Running ‘stm_apply.R’ [0s/0s]
Comparing ‘stm_apply.Rout’ to ‘stm_apply.Rout.save’ ... OK
Running ‘stm_rollup.R’ [0s/0s]
Comparing ‘stm_rollup.Rout’ to ‘stm_rollup.Rout.save’ ...
113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running ‘stm_subassign.R’ [0s/0s]
Comparing ‘stm_subassign.Rout’ to ‘stm_subassign.Rout.save’ ... OK
Running ‘stm_ttcrossprod.R’ [0s/0s]
Comparing ‘stm_ttcrossprod.Rout’ to ‘stm_ttcrossprod.Rout.save’ ... OK
Running ‘stm_valid.R’ [0s/0s]
Comparing ‘stm_valid.Rout’ to ‘stm_valid.Rout.save’ ... OK
Running ‘stm_zeros.R’ [0s/0s]
Comparing ‘stm_zeros.Rout’ to ‘stm_zeros.Rout.save’ ... OK
Running ‘subassign.R’ [0s/0s]
Comparing ‘subassign.Rout’ to ‘subassign.Rout.save’ ... OK
Running ‘util.R’ [0s/0s]
Comparing ‘util.Rout’ to ‘util.Rout.save’ ... OK
Flavor: r-devel-macos-arm64
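The test comparisons on this page report differences with headers such as `113a114,115` and `136c136`, which follow the classic `diff` notation: a line range in the first file, an operation (`a` add, `c` change, `d` delete), and a line range in the second file. The following sketch decodes such headers; `parse_hunk` is a hypothetical helper written for illustration only.

```python
import re

# Left range, operation letter, right range; each range is 'N' or 'N,M'.
HUNK = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")

def parse_hunk(header: str):
    """Split a header like '113a114,115' into (op, left_range, right_range)."""
    match = HUNK.match(header.strip())
    if match is None:
        raise ValueError(f"not a diff hunk header: {header!r}")
    l1, l2, op, r1, r2 = match.groups()
    # A single number stands for a one-line range.
    left = (int(l1), int(l2 or l1))
    right = (int(r1), int(r2 or r1))
    return op, left, right

added = parse_hunk("113a114,115")
changed = parse_hunk("136c136")
```

For `113a114,115` this yields an add operation: the text shown as lines 114 to 115 of the second file has no counterpart after line 113 of the first file.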
Version: 0.1-55
Check: tests
Result: NOTE
Running 'abind.R' [0s]
Comparing 'abind.Rout' to 'abind.Rout.save' ... OK
Running 'apply.R' [0s]
Comparing 'apply.Rout' to 'apply.Rout.save' ... OK
Running 'crossprod.R' [0s]
Comparing 'crossprod.Rout' to 'crossprod.Rout.save' ... OK
Running 'dimgets.R' [0s]
Running 'extract.R' [0s]
Comparing 'extract.Rout' to 'extract.Rout.save' ... OK
Running 'matrix.R' [0s]
Comparing 'matrix.Rout' to 'matrix.Rout.save' ... OK
Running 'matrix_dimnames.R' [0s]
Comparing 'matrix_dimnames.Rout' to 'matrix_dimnames.Rout.save' ... OK
Running 'rollup.R' [0s]
Comparing 'rollup.Rout' to 'rollup.Rout.save' ... OK
Running 'split.R' [0s]
Comparing 'split.Rout' to 'split.Rout.save' ... OK
Running 'ssa_valid.R' [0s]
Comparing 'ssa_valid.Rout' to 'ssa_valid.Rout.save' ... OK
Running 'stm.R' [0s]
Comparing 'stm.Rout' to 'stm.Rout.save' ... OK
Running 'stm_apply.R' [0s]
Comparing 'stm_apply.Rout' to 'stm_apply.Rout.save' ... OK
Running 'stm_rollup.R' [0s]
Comparing 'stm_rollup.Rout' to 'stm_rollup.Rout.save' ...
113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running 'stm_subassign.R' [0s]
Comparing 'stm_subassign.Rout' to 'stm_subassign.Rout.save' ... OK
Running 'stm_ttcrossprod.R' [0s]
Comparing 'stm_ttcrossprod.Rout' to 'stm_ttcrossprod.Rout.save' ... OK
Running 'stm_valid.R' [0s]
Comparing 'stm_valid.Rout' to 'stm_valid.Rout.save' ... OK
Running 'stm_zeros.R' [0s]
Comparing 'stm_zeros.Rout' to 'stm_zeros.Rout.save' ... OK
Running 'subassign.R' [0s]
Comparing 'subassign.Rout' to 'subassign.Rout.save' ... OK
Running 'util.R' [0s]
Comparing 'util.Rout' to 'util.Rout.save' ... OK
Flavor: r-devel-windows-x86_64
Version: 0.1-55
Check: tests
Result: NOTE
Running ‘abind.R’ [0s/0s]
Comparing ‘abind.Rout’ to ‘abind.Rout.save’ ... OK
Running ‘apply.R’ [0s/0s]
Comparing ‘apply.Rout’ to ‘apply.Rout.save’ ... OK
Running ‘crossprod.R’ [0s/0s]
Comparing ‘crossprod.Rout’ to ‘crossprod.Rout.save’ ... OK
Running ‘dimgets.R’ [0s/0s]
Running ‘extract.R’ [0s/0s]
Comparing ‘extract.Rout’ to ‘extract.Rout.save’ ... OK
Running ‘matrix.R’ [0s/0s]
Comparing ‘matrix.Rout’ to ‘matrix.Rout.save’ ... OK
Running ‘matrix_dimnames.R’ [0s/0s]
Comparing ‘matrix_dimnames.Rout’ to ‘matrix_dimnames.Rout.save’ ... OK
Running ‘rollup.R’ [0s/0s]
Comparing ‘rollup.Rout’ to ‘rollup.Rout.save’ ... OK
Running ‘split.R’ [0s/0s]
Comparing ‘split.Rout’ to ‘split.Rout.save’ ... OK
Running ‘ssa_valid.R’ [0s/0s]
Comparing ‘ssa_valid.Rout’ to ‘ssa_valid.Rout.save’ ... OK
Running ‘stm.R’ [0s/0s]
Comparing ‘stm.Rout’ to ‘stm.Rout.save’ ... OK
Running ‘stm_apply.R’ [0s/0s]
Comparing ‘stm_apply.Rout’ to ‘stm_apply.Rout.save’ ... OK
Running ‘stm_rollup.R’ [0s/0s]
Comparing ‘stm_rollup.Rout’ to ‘stm_rollup.Rout.save’ ...
113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running ‘stm_subassign.R’ [0s/0s]
Comparing ‘stm_subassign.Rout’ to ‘stm_subassign.Rout.save’ ... OK
Running ‘stm_ttcrossprod.R’ [0s/0s]
Comparing ‘stm_ttcrossprod.Rout’ to ‘stm_ttcrossprod.Rout.save’ ... OK
Running ‘stm_valid.R’ [0s/0s]
Comparing ‘stm_valid.Rout’ to ‘stm_valid.Rout.save’ ... OK
Running ‘stm_zeros.R’ [0s/0s]
Comparing ‘stm_zeros.Rout’ to ‘stm_zeros.Rout.save’ ... OK
Running ‘subassign.R’ [0s/0s]
Comparing ‘subassign.Rout’ to ‘subassign.Rout.save’ ... OK
Running ‘util.R’ [0s/0s]
Comparing ‘util.Rout’ to ‘util.Rout.save’ ... OK
Flavor: r-release-macos-arm64
Version: 0.1-55
Check: tests
Result: NOTE
Running ‘abind.R’ [0s/0s]
Comparing ‘abind.Rout’ to ‘abind.Rout.save’ ... OK
Running ‘apply.R’ [0s/0s]
Comparing ‘apply.Rout’ to ‘apply.Rout.save’ ... OK
Running ‘crossprod.R’ [0s/0s]
Comparing ‘crossprod.Rout’ to ‘crossprod.Rout.save’ ... OK
Running ‘dimgets.R’ [0s/0s]
Running ‘extract.R’ [0s/0s]
Comparing ‘extract.Rout’ to ‘extract.Rout.save’ ... OK
Running ‘matrix.R’ [0s/1s]
Comparing ‘matrix.Rout’ to ‘matrix.Rout.save’ ... OK
Running ‘matrix_dimnames.R’ [0s/0s]
Comparing ‘matrix_dimnames.Rout’ to ‘matrix_dimnames.Rout.save’ ... OK
Running ‘rollup.R’ [0s/0s]
Comparing ‘rollup.Rout’ to ‘rollup.Rout.save’ ... OK
Running ‘split.R’ [0s/0s]
Comparing ‘split.Rout’ to ‘split.Rout.save’ ... OK
Running ‘ssa_valid.R’ [0s/0s]
Comparing ‘ssa_valid.Rout’ to ‘ssa_valid.Rout.save’ ... OK
Running ‘stm.R’ [0s/0s]
Comparing ‘stm.Rout’ to ‘stm.Rout.save’ ... OK
Running ‘stm_apply.R’ [0s/0s]
Comparing ‘stm_apply.Rout’ to ‘stm_apply.Rout.save’ ... OK
Running ‘stm_rollup.R’ [0s/0s]
Comparing ‘stm_rollup.Rout’ to ‘stm_rollup.Rout.save’ ...
113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running ‘stm_subassign.R’ [0s/1s]
Comparing ‘stm_subassign.Rout’ to ‘stm_subassign.Rout.save’ ... OK
Running ‘stm_ttcrossprod.R’ [0s/1s]
Comparing ‘stm_ttcrossprod.Rout’ to ‘stm_ttcrossprod.Rout.save’ ... OK
Running ‘stm_valid.R’ [0s/1s]
Comparing ‘stm_valid.Rout’ to ‘stm_valid.Rout.save’ ... OK
Running ‘stm_zeros.R’ [0s/1s]
Comparing ‘stm_zeros.Rout’ to ‘stm_zeros.Rout.save’ ... OK
Running ‘subassign.R’ [0s/1s]
Comparing ‘subassign.Rout’ to ‘subassign.Rout.save’ ... OK
Running ‘util.R’ [0s/1s]
Comparing ‘util.Rout’ to ‘util.Rout.save’ ... OK
Flavor: r-release-macos-x86_64
Version: 0.1-55
Check: tests
Result: NOTE
Running 'abind.R' [0s]
Comparing 'abind.Rout' to 'abind.Rout.save' ... OK
Running 'apply.R' [0s]
Comparing 'apply.Rout' to 'apply.Rout.save' ... OK
Running 'crossprod.R' [0s]
Comparing 'crossprod.Rout' to 'crossprod.Rout.save' ... OK
Running 'dimgets.R' [0s]
Running 'extract.R' [0s]
Comparing 'extract.Rout' to 'extract.Rout.save' ... OK
Running 'matrix.R' [0s]
Comparing 'matrix.Rout' to 'matrix.Rout.save' ... OK
Running 'matrix_dimnames.R' [0s]
Comparing 'matrix_dimnames.Rout' to 'matrix_dimnames.Rout.save' ... OK
Running 'rollup.R' [0s]
Comparing 'rollup.Rout' to 'rollup.Rout.save' ... OK
Running 'split.R' [0s]
Comparing 'split.Rout' to 'split.Rout.save' ... OK
Running 'ssa_valid.R' [0s]
Comparing 'ssa_valid.Rout' to 'ssa_valid.Rout.save' ... OK
Running 'stm.R' [0s]
Comparing 'stm.Rout' to 'stm.Rout.save' ... OK
Running 'stm_apply.R' [0s]
Comparing 'stm_apply.Rout' to 'stm_apply.Rout.save' ... OK
Running 'stm_rollup.R' [0s]
Comparing 'stm_rollup.Rout' to 'stm_rollup.Rout.save' ...
113a114,115
> _row_tsums: reduced 1 (3) zeros
> _row_tsums: 0.000s [0.000s/0.000s]
Running 'stm_subassign.R' [0s]
Comparing 'stm_subassign.Rout' to 'stm_subassign.Rout.save' ... OK
Running 'stm_ttcrossprod.R' [0s]
Comparing 'stm_ttcrossprod.Rout' to 'stm_ttcrossprod.Rout.save' ... OK
Running 'stm_valid.R' [0s]
Comparing 'stm_valid.Rout' to 'stm_valid.Rout.save' ... OK
Running 'stm_zeros.R' [0s]
Comparing 'stm_zeros.Rout' to 'stm_zeros.Rout.save' ... OK
Running 'subassign.R' [0s]
Comparing 'subassign.Rout' to 'subassign.Rout.save' ... OK
Running 'util.R' [0s]
Comparing 'util.Rout' to 'util.Rout.save' ... OK
Flavor: r-release-windows-x86_64
Package tau
Current CRAN status: NOTE: 5, OK: 9
Version: 0.0-27
Check: tests
Result: NOTE
Running ‘counting.R’ [0s/0s]
Comparing ‘counting.Rout’ to ‘counting.Rout.save’ ...
26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running ‘counting_useBytes.R’ [0s/0s]
Comparing ‘counting_useBytes.Rout’ to ‘counting_useBytes.Rout.save’ ...
32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavor: r-devel-macos-arm64
Version: 0.0-26
Check: tests
Result: NOTE
Running 'counting.R' [0s]
Comparing 'counting.Rout' to 'counting.Rout.save' ...
26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running 'counting_useBytes.R' [0s]
Comparing 'counting_useBytes.Rout' to 'counting_useBytes.Rout.save' ...
32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavor: r-devel-windows-x86_64
Version: 0.0-27
Check: tests
Result: NOTE
Running ‘counting.R’ [0s/0s]
Comparing ‘counting.Rout’ to ‘counting.Rout.save’ ...
26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running ‘counting_useBytes.R’ [0s/0s]
Comparing ‘counting_useBytes.Rout’ to ‘counting_useBytes.Rout.save’ ...
32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavor: r-release-macos-arm64
Version: 0.0-26
Check: tests
Result: NOTE
Running ‘counting.R’ [0s/0s]
Comparing ‘counting.Rout’ to ‘counting.Rout.save’ ...
26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running ‘counting_useBytes.R’ [0s/0s]
Comparing ‘counting_useBytes.Rout’ to ‘counting_useBytes.Rout.save’ ...
32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavor: r-release-macos-x86_64
Version: 0.0-26
Check: tests
Result: NOTE
Running 'counting.R' [0s]
Comparing 'counting.Rout' to 'counting.Rout.save' ...26a27,28
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
47a50
> counting ... 9 string(s) using 19 nodes [0.00s]
49a53,54
> counting ... 9 string(s) using 19 nodes [0.00s]
> writing ... 16 strings [0.00s]
70a76,77
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
77a85,86
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 4 strings [0.00s]
86a96,97
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
96a108,109
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 5 strings [0.00s]
106a120
> counting ... 2 string(s) using 5 nodes [0.00s]
108a123,124
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Running 'counting_useBytes.R' [0s]
Comparing 'counting_useBytes.Rout' to 'counting_useBytes.Rout.save' ...32a33,34
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
56a59
> counting ... 10 string(s) using 19 nodes [0.00s]
58a62,63
> counting ... 10 string(s) using 19 nodes [0.00s]
> writing ... 19 strings [0.00s]
82a88,89
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
89a97,98
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 5 strings [0.00s]
99a109,110
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
110a122,123
> counting ... 2 string(s) using 6 nodes [0.00s]
> writing ... 6 strings [0.00s]
121a135
> counting ... 2 string(s) using 5 nodes [0.00s]
123a138,139
> counting ... 2 string(s) using 5 nodes [0.00s]
> writing ... 2 strings [0.00s]
Flavor: r-release-windows-x86_64
Package textcat
Current CRAN status: OK: 14
Package tm
Current CRAN status: ERROR: 3, NOTE: 2, OK: 9
Version: 0.7-18
Check: Rd files
Result: WARN
acq.Rd:22: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
crude.Rd:22: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
readRCV1.Rd:28: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
readReut21578XML.Rd:31: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
stemCompletion.Rd:39: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
weightSMART.Rd:75: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
weightTfIdf.Rd:38: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
problems found in ‘acq.Rd’, ‘crude.Rd’, ‘readRCV1.Rd’, ‘readReut21578XML.Rd’, ‘stemCompletion.Rd’, ‘weightSMART.Rd’, ‘weightTfIdf.Rd’
Flavor: r-devel-macos-arm64
Version: 0.7-18
Check: examples
Result: ERROR
Running examples in ‘tm-Ex.R’ failed
The error most likely occurred in:
> ### Name: readPDF
> ### Title: Read In a PDF Document
> ### Aliases: readPDF
> ### Keywords: file
>
> ### ** Examples
>
> uri <- paste0("file://",
+ system.file(file.path("doc", "tm.pdf"), package = "tm"))
> engine <- if(nzchar(system.file(package = "pdftools"))) {
+ "pdftools"
+ } else {
+ "ghostscript"
+ }
> reader <- readPDF(engine)
> pdf <- reader(elem = list(uri = uri), language = "en", id = "id1")
> cat(content(pdf)[1])
Introduction to the tm Package
Text Mining in R
Ingo Feinerer
February 18, 2026
Introduction
This vignette gives a short introduction to text mining in R utilizing the text mining framework provided by
the tm package. We present methods for data import, corpus handling, preprocessing, metadata management,
and creation of term-document matrices. Our focus is on the main aspects of getting started with text mining
in R—an in-depth description of the text mining infrastructure offered by tm was published in the Journal of
Statistical Software (Feinerer et al., 2008). An introductory article on text mining in R was published in R
News (Feinerer, 2008).
Data Import
The main structure for managing documents in tm is a so-called Corpus, representing a collection of text
documents. A corpus is an abstract concept, and there can exist several implementations in parallel. The
default implementation is the so-called VCorpus (short for Volatile Corpus) which realizes a semantics as known
from most R objects: corpora are R objects held fully in memory. We denote this as volatile since once the
R object is destroyed, the whole corpus is gone. Such a volatile corpus can be created via the constructor
VCorpus(x, readerControl). Another implementation is the PCorpus which implements a Permanent Corpus
semantics, i.e., the documents are physically stored outside of R (e.g., in a database), corresponding R objects
are basically only pointers to external structures, and changes to the underlying corpus are reflected to all R
objects associated with it. Compared to the volatile corpus the corpus encapsulated by a permanent corpus
object is not destroyed if the corresponding R object is released.
Within the corpus constructor, x must be a Source object which abstracts the input location. tm provides a
set of predefined sources, e.g., DirSource, VectorSource, or DataframeSource, which handle a directory, a vector
interpreting each component as document, or data frame like structures (like CSV files), respectively. Except
DirSource, which is designed solely for directories on a file system, and VectorSource, which only accepts (character)
vectors, most other implemented sources can take connections as input (a character string is interpreted
as file path). getSources() lists available sources, and users can create their own sources.
The second argument readerControl of the corpus constructor has to be a list with the named components
reader and language. The first component reader constructs a text document from elements delivered by
a source. The tm package ships with several readers (e.g., readPlain(), readPDF(), readDOC(), ...). See
getReaders() for an up-to-date list of available readers. Each source has a default reader which can be
overridden. E.g., for DirSource the default just reads in the input files and interprets their content as text.
Finally, the second component language sets the texts’ language (preferably using ISO 639-2 codes).
In case of a permanent corpus, a third argument dbControl has to be a list with the named components
dbName giving the filename holding the sourced out objects (i.e., the database), and dbType holding a valid
database type as supported by package filehash. Activated database support reduces the memory demand,
however, access gets slower since each operation is limited by the hard disk’s read and write capabilities.
So e.g., plain text files in the directory txt containing Latin (lat) texts by the Roman poet Ovid can be
read in with following code:
> txt <- system.file("texts", "txt", package = "tm")
> (ovid <- VCorpus(DirSource(txt, encoding = "UTF-8"),
+ readerControl = list(language = "lat")))
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 5
1
> VCorpus(URISource(uri, mode = ""),
+ readerControl = list(reader = readPDF(engine = "ghostscript")))
sh: : command not found
Error in `<current-expression>` : error in running command
Calls: VCorpus ... mapply -> <Anonymous> -> <Anonymous> -> pdf_info -> system2
Execution halted
Flavors: r-devel-macos-arm64, r-release-macos-arm64
Version: 0.7-18
Check: package dependencies
Result: NOTE
Packages suggested but not available for checking:
'Rcampdf', 'tm.lexicon.GeneralInquirer'
Flavor: r-oldrel-macos-arm64
Version: 0.7-18
Check: examples
Result: ERROR
Running examples in ‘tm-Ex.R’ failed
The error most likely occurred in:
> ### Name: readPDF
> ### Title: Read In a PDF Document
> ### Aliases: readPDF
> ### Keywords: file
>
> ### ** Examples
>
> uri <- paste0("file://",
+ system.file(file.path("doc", "tm.pdf"), package = "tm"))
> engine <- if(nzchar(system.file(package = "pdftools"))) {
+ "pdftools"
+ } else {
+ "ghostscript"
+ }
> reader <- readPDF(engine)
> pdf <- reader(elem = list(uri = uri), language = "en", id = "id1")
> cat(content(pdf)[1])
Introduction to the tm Package
Text Mining in R
Ingo Feinerer
February 18, 2026
Introduction
This vignette gives a short introduction to text mining in R utilizing the text mining framework provided by
the tm package. We present methods for data import, corpus handling, preprocessing, metadata management,
and creation of term-document matrices. Our focus is on the main aspects of getting started with text mining
in R—an in-depth description of the text mining infrastructure offered by tm was published in the Journal of
Statistical Software (Feinerer et al., 2008). An introductory article on text mining in R was published in R
News (Feinerer, 2008).
Data Import
The main structure for managing documents in tm is a so-called Corpus, representing a collection of text
documents. A corpus is an abstract concept, and there can exist several implementations in parallel. The
default implementation is the so-called VCorpus (short for Volatile Corpus) which realizes a semantics as known
from most R objects: corpora are R objects held fully in memory. We denote this as volatile since once the
R object is destroyed, the whole corpus is gone. Such a volatile corpus can be created via the constructor
VCorpus(x, readerControl). Another implementation is the PCorpus which implements a Permanent Corpus
semantics, i.e., the documents are physically stored outside of R (e.g., in a database), corresponding R objects
are basically only pointers to external structures, and changes to the underlying corpus are reflected to all R
objects associated with it. Compared to the volatile corpus the corpus encapsulated by a permanent corpus
object is not destroyed if the corresponding R object is released.
Within the corpus constructor, x must be a Source object which abstracts the input location. tm provides a
set of predefined sources, e.g., DirSource, VectorSource, or DataframeSource, which handle a directory, a vector
interpreting each component as document, or data frame like structures (like CSV files), respectively. Except
DirSource, which is designed solely for directories on a file system, and VectorSource, which only accepts (character)
vectors, most other implemented sources can take connections as input (a character string is interpreted
as file path). getSources() lists available sources, and users can create their own sources.
The second argument readerControl of the corpus constructor has to be a list with the named components
reader and language. The first component reader constructs a text document from elements delivered by
a source. The tm package ships with several readers (e.g., readPlain(), readPDF(), readDOC(), ...). See
getReaders() for an up-to-date list of available readers. Each source has a default reader which can be
overridden. E.g., for DirSource the default just reads in the input files and interprets their content as text.
Finally, the second component language sets the texts’ language (preferably using ISO 639-2 codes).
In case of a permanent corpus, a third argument dbControl has to be a list with the named components
dbName giving the filename holding the sourced out objects (i.e., the database), and dbType holding a valid
database type as supported by package filehash. Activated database support reduces the memory demand,
however, access gets slower since each operation is limited by the hard disk’s read and write capabilities.
So e.g., plain text files in the directory txt containing Latin (lat) texts by the Roman poet Ovid can be
read in with following code:
> txt <- system.file("texts", "txt", package = "tm")
> (ovid <- VCorpus(DirSource(txt, encoding = "UTF-8"),
+ readerControl = list(language = "lat")))
<<VCorpus>>
Metadata: corpus specific: 0, document level (indexed): 0
Content: documents: 5
1
> VCorpus(URISource(uri, mode = ""),
+ readerControl = list(reader = readPDF(engine = "ghostscript")))
sh: : command not found
Error in system2(gs_cmd, c("-dNODISPLAY -q", sprintf("-sFile=%s", shQuote(file)), :
error in running command
Calls: VCorpus ... mapply -> <Anonymous> -> <Anonymous> -> pdf_info -> system2
Execution halted
Flavor: r-oldrel-macos-arm64
Version: 0.7-17
Check: package dependencies
Result: NOTE
Packages suggested but not available for checking:
'Rcampdf', 'tm.lexicon.GeneralInquirer'
Flavors: r-oldrel-macos-x86_64, r-oldrel-windows-x86_64
Package tm.plugin.mail
Current CRAN status: OK: 14
Package tseries
Current CRAN status: WARN: 1, NOTE: 2, OK: 11
Version: 0.10-60
Check: dependencies in R code
Result: NOTE
Namespace in Imports field not imported from: ‘jsonlite’
All declared Imports should be used.
Flavors: r-devel-linux-x86_64-fedora-clang, r-devel-linux-x86_64-fedora-gcc
Version: 0.10-60
Check: Rd files
Result: WARN
NelPlo.Rd:42: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
USeconomic.Rd:11: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
adf.test.Rd:29: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
arma.Rd:24: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
bds.test.Rd:48: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
bev.Rd:21: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
garch.Rd:51: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
ice.river.Rd:12: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
jarque.bera.test.Rd:31: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
kpss.test.Rd:24: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
po.test.Rd:28: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
portfolio.optim.Rd:55: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
pp.test.Rd:31: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
runs.test.Rd:41: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
summary.garch.Rd:24: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
surrogate.Rd:39: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
terasvirta.test.Rd:50: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
tsbootstrap.Rd:35: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
white.test.Rd:59: processing build-stage \Sexpr code failed:
Error in gzfile(file, "rb"): cannot open the connection
problems found in ‘NelPlo.Rd’, ‘USeconomic.Rd’, ‘adf.test.Rd’, ‘arma.Rd’, ‘bds.test.Rd’, ‘bev.Rd’, ‘garch.Rd’, ‘ice.river.Rd’, ‘jarque.bera.test.Rd’, ‘kpss.test.Rd’, ‘po.test.Rd’, ‘portfolio.optim.Rd’, ‘pp.test.Rd’, ‘runs.test.Rd’, ‘summary.garch.Rd’, ‘surrogate.Rd’, ‘terasvirta.test.Rd’, ‘tsbootstrap.Rd’, ‘white.test.Rd’
Flavor: r-devel-macos-arm64
Package Unicode
Current CRAN status: OK: 14
Package W3CMarkupValidator
Current CRAN status: OK: 14
Package wordnet
Current CRAN status: NOTE: 3, OK: 11
Version: 0.1-18
Check: package dependencies
Result: NOTE
Package suggested but not available for checking: ‘wordnetDicts’
Flavor: r-oldrel-macos-arm64
Version: 0.1-17
Check: package dependencies
Result: NOTE
Package suggested but not available for checking: ‘wordnetDicts’
Flavors: r-oldrel-macos-x86_64, r-oldrel-windows-x86_64