flowml: A Backend for a 'nextflow' Pipeline that Performs Machine-Learning-Based Modeling of Biomedical Data

Provides functionality to perform machine-learning-based modeling in a computation pipeline. Its functions cover the basic steps of machine-learning-based knowledge discovery workflows, including model training and optimization, model evaluation, and model testing. To perform these tasks, the package builds heavily on existing machine-learning packages, such as 'caret' <https://github.com/topepo/caret/> and associated packages. The package can train multiple models, optimize model hyperparameters by performing a grid search or a random search, and evaluate model performance using different metrics. Models can be validated either on a test data set or, when the sample size is small, by k-fold cross-validation or repeated bootstrapping. It also allows for null-hypothesis generation by performing permutation experiments. Additionally, it offers methods of model interpretation and item categorization to identify the most informative features in a high-dimensional data space. The functions of this package can easily be integrated into computation pipelines (e.g. 'nextflow' <https://www.nextflow.io/>) and thereby improve scalability, standardization, and reproducibility in the context of machine learning.
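
The combination of k-fold cross-validation and permutation experiments described above can be illustrated with a small, language-agnostic sketch. The Python code below is not flowml's API and does not use 'caret'; every function name, the toy one-dimensional data, and the nearest-centroid classifier are assumptions made purely for illustration. It scores a model by k-fold cross-validation, then repeats the scoring on label-shuffled copies of the data to build a null distribution of the metric.

```python
import random
from statistics import mean


def k_fold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_accuracy(x, y, train, test):
    """Fit a 1-D nearest-centroid classifier on train and score it on test."""
    centroids = {
        label: mean(x[i] for i in train if y[i] == label)
        for label in set(y[i] for i in train)
    }
    hits = sum(
        min(centroids, key=lambda c: abs(x[i] - centroids[c])) == y[i]
        for i in test
    )
    return hits / len(test)


def cv_accuracy(x, y, k, rng):
    """Mean held-out accuracy over k cross-validation folds."""
    scores = []
    for fold in k_fold_indices(len(x), k, rng):
        held_out = set(fold)
        train = [i for i in range(len(x)) if i not in held_out]
        scores.append(nearest_centroid_accuracy(x, y, train, fold))
    return mean(scores)


def permutation_null(x, y, k, n_perm, rng):
    """Cross-validated accuracy on label-shuffled data (the null distribution)."""
    null_scores = []
    for _ in range(n_perm):
        y_perm = y[:]
        rng.shuffle(y_perm)  # break any real association between x and y
        null_scores.append(cv_accuracy(x, y_perm, k, rng))
    return null_scores


# Toy data (hypothetical): two classes with clearly separated means.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(30)] + [rng.gauss(3, 1) for _ in range(30)]
y = [0] * 30 + [1] * 30

observed = cv_accuracy(x, y, 5, rng)
null_scores = permutation_null(x, y, 5, 200, rng)

# Empirical p-value with the usual +1 correction.
p_value = (1 + sum(s >= observed for s in null_scores)) / (1 + len(null_scores))
```

If the observed score lies far in the upper tail of the null distribution, the model has likely learned a real association rather than noise. In an actual pipeline the same pattern would wrap whichever learner and metric are in use.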

Version: 0.1.3
Depends: R (≥ 3.5.0)
Imports: ABCanalysis, caret, data.table, dplyr, fastshap, furrr, future, magrittr, optparse, parallel, purrr, R6, readr, rjson, rlang, rsample, stats, stringr, tibble, tidyr, utils, vip
Suggests: ada, adabag, arm, bartMachine, bst, C50, caTools, class, Cubist, e1071, earth, elasticnet, evtree, fastICA, foreach, frbs, gam, gbm, ggplot2, glmnet, h2o, hda, ipred, keras, kernlab, kknn, klaR, knitr, kohonen, lars, leaps, LiblineaR, LogicReg, MASS, Matrix, mboost, mda, mgcv, monomvn, neuralnet, nnet, nnls, pamr, partDSA, party, partykit, penalized, pls, plyr, proxy, quantregForest, randomForest, ranger, rFerns, rmarkdown, rpart, rrcov, rrcovHD, RSNNS, RWeka, sda, shapviz, spls, superpc, VGAM, xgboost
Published: 2024-02-16
Author: Sebastian Malkusch [aut, cre], Kolja Becker [aut], Alexander Peltzer [ctb], Neslihan Kaya [ctb], Boehringer Ingelheim Ltd. [cph, fnd]
Maintainer: Sebastian Malkusch <sebastian.malkusch at boehringer-ingelheim.com>
BugReports: https://github.com/Boehringer-Ingelheim/flowml/issues
License: GPL (≥ 3)
URL: https://github.com/Boehringer-Ingelheim/flowml
NeedsCompilation: no
CRAN checks: flowml results

Documentation:

Reference manual: flowml.pdf

Downloads:

Package source: flowml_0.1.3.tar.gz
Windows binaries: r-devel: flowml_0.1.3.zip, r-release: flowml_0.1.3.zip, r-oldrel: flowml_0.1.3.zip
macOS binaries: r-release (arm64): flowml_0.1.3.tgz, r-oldrel (arm64): flowml_0.1.3.tgz, r-release (x86_64): flowml_0.1.3.tgz, r-oldrel (x86_64): flowml_0.1.3.tgz
Old sources: flowml archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=flowml to link to this page.