Package: mikropml 1.6.1.9000

Kelly Sovacool

mikropml: User-Friendly R Package for Supervised Machine Learning Pipelines

An interface to build machine learning models for classification and regression problems. 'mikropml' implements the ML pipeline described by Topçuoğlu et al. (2020) <doi:10.1128/mBio.00434-20> with reasonable default options for data preprocessing, hyperparameter tuning, cross-validation, testing, model evaluation, and interpretation steps. See the website <https://www.schlosslab.org/mikropml/> for more information, documentation, and examples.

Authors: Begüm Topçuoğlu [aut], Zena Lapp [aut], Kelly Sovacool [aut, cre], Evan Snitkin [aut], Jenna Wiens [aut], Patrick Schloss [aut], Nick Lesniak [ctb], Courtney Armour [ctb], Sarah Lucas [ctb]

mikropml_1.6.1.9000.tar.gz
mikropml_1.6.1.9000.zip (r-4.5) | mikropml_1.6.1.9000.zip (r-4.4) | mikropml_1.6.1.9000.zip (r-4.3)
mikropml_1.6.1.9000.tgz (r-4.4-any) | mikropml_1.6.1.9000.tgz (r-4.3-any)
mikropml_1.6.1.9000.tar.gz (r-4.5-noble) | mikropml_1.6.1.9000.tar.gz (r-4.4-noble)
mikropml_1.6.1.9000.tgz (r-4.4-emscripten) | mikropml_1.6.1.9000.tgz (r-4.3-emscripten)
mikropml.pdf | mikropml.html
mikropml/json (API)
NEWS

# Install 'mikropml' in R:
install.packages('mikropml', repos = c('https://schlosslab.r-universe.dev', 'https://cloud.r-project.org'))
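
Once the package is installed, the pipeline steps named in the description (preprocessing, tuning, cross-validation, testing, evaluation) can be tried on the bundled `otu_mini_bin` dataset. The following is a minimal sketch rather than a recommended workflow: it assumes the outcome column of `otu_mini_bin` is named `dx`, and the choice of `"glmnet"` and the seed value are arbitrary illustrations.

```r
library(mikropml)

# Preprocess the bundled example data. By default this centers and scales
# features and removes near-zero-variance columns.
# Assumption: the outcome column in otu_mini_bin is named "dx".
preprocessed <- preprocess_data(otu_mini_bin, outcome_colname = "dx")
dat <- preprocessed$dat_transformed

# Run the full pipeline with L2-regularized logistic regression ("glmnet").
# The seed is arbitrary; it only makes the train/test split reproducible.
results <- run_ml(dat, "glmnet", outcome_colname = "dx", seed = 2019)

# The result is a list; `performance` holds the test-set metrics as a
# one-row tibble and `trained_model` holds the fitted caret model.
names(results)
results$performance
```

Other method names listed in the help manual's example results (random forest, rpart2, svmRadial, xgbTree) can be substituted for `"glmnet"`; they may take noticeably longer to run.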

Bug tracker: https://github.com/schlosslab/mikropml/issues

machine-learning

39 exports 53 stars 3.55 score 87 dependencies 1 mentions 76 scripts 457 downloads

Last updated 1 year ago from: 77669ee3fb. Checks: OK: 3, NOTE: 4. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Sep 16 2024
R-4.5-win | NOTE | Sep 16 2024
R-4.5-linux | NOTE | Sep 16 2024
R-4.4-win | NOTE | Sep 16 2024
R-4.4-mac | NOTE | Sep 16 2024
R-4.3-win | OK | Sep 16 2024
R-4.3-mac | OK | Sep 16 2024

Exports: :=, !!, .data, %>%, bootstrap_performance, calc_balanced_precision, calc_baseline_precision, calc_mean_perf, calc_mean_prc, calc_mean_roc, calc_model_sensspec, calc_perf_metrics, combine_hp_performance, compare_models, contr.ltfr, define_cv, get_caret_processed_df, get_feature_importance, get_hp_performance, get_hyperparams_list, get_outcome_type, get_partition_indices, get_perf_metric_fn, get_perf_metric_name, get_performance_tbl, get_tuning_grid, group_correlated_features, permute_p_value, plot_hp_performance, plot_mean_prc, plot_mean_roc, plot_model_performance, preprocess_data, randomize_feature_order, remove_singleton_columns, replace_spaces, run_ml, tidy_perf_data, train_model

Dependencies: bitops, caret, caTools, class, cli, clock, codetools, colorspace, cpp11, data.table, diagram, digest, dplyr, e1071, fansi, farver, foreach, future, future.apply, generics, ggplot2, glmnet, globals, glue, gower, gplots, gtable, gtools, hardhat, ipred, isoband, iterators, jsonlite, kernlab, KernSmooth, labeling, lattice, lava, lifecycle, listenv, lubridate, magrittr, MASS, Matrix, mgcv, MLmetrics, ModelMetrics, munsell, nlme, nnet, numDeriv, parallelly, pillar, pkgconfig, plyr, pROC, prodlim, progressr, proxy, purrr, R6, randomForest, RColorBrewer, Rcpp, RcppEigen, recipes, reshape2, rlang, ROCR, rpart, scales, shape, SQUAREM, stringi, stringr, survival, tibble, tidyr, tidyselect, timechange, timeDate, tzdb, utf8, vctrs, viridisLite, withr, xgboost

Introduction to mikropml

Rendered from introduction.Rmd using knitr::rmarkdown on Sep 16 2024.

Last update: 2023-02-15
Started: 2020-07-01

mikropml: User-Friendly R Package for Supervised Machine Learning Pipelines

Rendered from paper.Rmd using knitr::rmarkdown on Sep 16 2024.

Last update: 2022-11-01
Started: 2020-10-15

Readme and manuals

Help Manual

Help page | Topics
Calculate a bootstrap confidence interval for the performance on a single train/test split | bootstrap_performance
Calculate balanced precision given actual and baseline precision | calc_balanced_precision
Calculate the fraction of positives, i.e. baseline precision for a PRC curve | calc_baseline_precision
Generic function to calculate mean performance curves for multiple models | calc_mean_perf
Calculate and summarize performance for ROC and PRC plots | calc_mean_prc, calc_mean_roc, calc_model_sensspec, sensspec
Get performance metrics for test data | calc_perf_metrics
Combine hyperparameter performance metrics for multiple train/test splits | combine_hp_performance
Perform permutation tests to compare the performance metric across all pairs of a group variable | compare_models
Define cross-validation scheme and training parameters | define_cv
Get preprocessed dataframe for continuous variables | get_caret_processed_df
Get feature importance using the permutation method | get_feature_importance
Get hyperparameter performance metrics | get_hp_performance
Set hyperparameters based on ML method and dataset characteristics | get_hyperparams_list
Get outcome type | get_outcome_type
Select indices to partition the data into training & testing sets | get_partition_indices
Get default performance metric function | get_perf_metric_fn
Get default performance metric name | get_perf_metric_name
Get model performance metrics as a one-row tibble | get_performance_tbl
Generate the tuning grid for tuning hyperparameters | get_tuning_grid
Group correlated features | group_correlated_features
Mini OTU abundance dataset - preprocessed | otu_data_preproc
Mini OTU abundance dataset | otu_mini_bin
Results from running the pipeline with L2 logistic regression on 'otu_mini_bin' with feature importance and grouping | otu_mini_bin_results_glmnet
Results from running the pipeline with random forest on 'otu_mini_bin' | otu_mini_bin_results_rf
Results from running the pipeline with rpart2 on 'otu_mini_bin' | otu_mini_bin_results_rpart2
Results from running the pipeline with svmRadial on 'otu_mini_bin' | otu_mini_bin_results_svmRadial
Results from running the pipeline with xgbTree on 'otu_mini_bin' | otu_mini_bin_results_xgbTree
Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome | otu_mini_cont_results_glmnet
Results from running the pipeline with glmnet on 'otu_mini_bin' with 'Otu00001' as the outcome column, using a custom train control scheme that does not perform cross-validation | otu_mini_cont_results_nocv
Cross validation on 'train_data_mini' with grouped features | otu_mini_cv
Mini OTU abundance dataset with 3 categorical variables | otu_mini_multi
Groups for otu_mini_multi | otu_mini_multi_group
Results from running the pipeline with glmnet on 'otu_mini_multi' for multiclass outcomes | otu_mini_multi_results_glmnet
Small OTU abundance dataset | otu_small
Calculate a permuted p-value comparing two models | permute_p_value
Plot hyperparameter performance metrics | plot_hp_performance
Plot ROC and PRC curves | plot_curves, plot_mean_prc, plot_mean_roc
Plot performance metrics for multiple ML runs with different parameters | plot_model_performance
Preprocess data prior to running machine learning | preprocess_data
Randomize feature order to eliminate any position-dependent effects | randomize_feature_order
Remove columns appearing in only 'threshold' row(s) or fewer | remove_singleton_columns
Replace spaces in all elements of a character vector with underscores | replace_spaces
Run the machine learning pipeline | run_ml
Tidy the performance dataframe | tidy_perf_data
Train model using 'caret::train()' | train_model