biomarkertmle {biotmle}    R Documentation

Biomarker Evaluation with Targeted Minimum Loss-Based Estimation (TMLE)

Description

Computes the causal target parameter defined as the difference between the biomarker expression values under treatment and those same values under no treatment, using Targeted Minimum Loss-Based Estimation.
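
One common way to formalize the contrast described above is as an average treatment effect. The notation here is an assumption made for illustration and does not come from this help page: Y_b is the expression of biomarker b, A is the binary treatment, W is the set of baseline covariates, and Y_b(a) is the expression that would be observed under treatment level a. The second equality holds only under the usual identification assumptions (no unmeasured confounding and positivity).

```latex
\psi_b
  = \mathbb{E}\left[ Y_b(1) - Y_b(0) \right]
  = \mathbb{E}_W\left[ \mathbb{E}(Y_b \mid A = 1, W) - \mathbb{E}(Y_b \mid A = 0, W) \right]
```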

Usage

biomarkertmle(se, varInt, ngscounts = FALSE, parallel = TRUE,
  bppar_type = NULL, future_param = NULL, family = "gaussian",
  subj_ids = NULL, g_lib = c("SL.glm", "SL.randomForest", "SL.nnet",
  "SL.polymars", "SL.mean"), Q_lib = c("SL.glm", "SL.randomForest", "SL.nnet",
  "SL.mean"))

Arguments

se

(SummarizedExperiment) - containing expression or next-generation sequencing data in the "assays" slot and a matrix of phenotype-level data in the "colData" slot.

varInt

(numeric) - indicating the column of the design matrix corresponding to the treatment or outcome of interest (in the "colData" slot of the "se" argument above).
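
A minimal sketch of looking up such a column index by name, using base R only. The data frame pheno and its columns are hypothetical stand-ins for the phenotype data held in "colData".

```r
# Hypothetical phenotype table standing in for the "colData" slot
pheno <- data.frame(
  age     = c(34, 51, 47, 29),
  benzene = c(1, 0, 1, 0),
  smoking = c(0, 0, 1, 1)
)

# Column position of the variable of interest
varInt_index <- which(names(pheno) == "benzene")

# Guard against a missing or duplicated column name
stopifnot(length(varInt_index) == 1L)
```

Looking the index up by name avoids hard-coding a position that changes when columns are added or reordered.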

ngscounts

(logical) - whether the data are counts generated from a next-generation sequencing (NGS) experiment (e.g., RNA-seq). The default setting assumes continuous expression measures as generated by microarray-type platforms.

parallel

(logical) - whether or not to use parallelization in the estimation procedure. Invoking parallelization happens through a combination of calls to future and BiocParallel. If this argument is set to TRUE, future::multiprocess is used, and if FALSE, future::sequential is used, alongside BiocParallel::bplapply. Other options for evaluation through futures may be invoked by setting the argument future_param.

bppar_type

(character) - specifies the type of backend to be used with the parallelization invoked by BiocParallel. Consult the manual page for BiocParallel::BiocParallelParam for possible types and descriptions of their appropriate uses. The default for this argument is NULL, which silently uses BiocParallel::DoparParam.

future_param

(character) - specifies the type of parallelization to be invoked when using futures for evaluation. For a list of the available types, please consult the documentation for future::plan. The default setting (this argument set to NULL) silently invokes future::multiprocess. Be careful if changing this setting.

family

(character) - specification of error family: "binomial" or "gaussian".

subj_ids

(numeric vector) - subject IDs, one per observation. Observations from the same subject should have the exact same numerical identifier. Coerced to numeric if not provided in the appropriate form.
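
When subject labels are stored as strings, numeric identifiers can be built beforehand so that the mapping is explicit. The sketch below uses base R only; the labels in subj_labels are hypothetical, and this is one possible approach rather than the conversion the function performs internally.

```r
# Hypothetical subject labels; "s01" and "s02" each contribute two samples
subj_labels <- c("s01", "s02", "s01", "s03", "s02")

# Assign one integer code per distinct label, in order of first appearance,
# then store the codes as a numeric vector
subj_ids <- as.numeric(factor(subj_labels, levels = unique(subj_labels)))
```

Samples from the same subject end up with the same numeric value, which is the property the argument description requires.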

g_lib

(char vector) - library of learning algorithms to be used in fitting the "g" step of the standard TMLE procedure.

Q_lib

(char vector) - library of learning algorithms to be used in fitting the "Q" step of the standard TMLE procedure.

Value

S4 object of class biotmle, generated by sub-classing SummarizedExperiment, with additional slots including tmleOut and call. The tmleOut slot contains TMLE-based estimates of the relationship between a biomarker and the exposure or outcome variable. The call slot contains the original call to this function (for user reference).

Examples

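# Load dependencies and the example data set
library(dplyr)
library(biotmleData)
data(illuminaData)
library(SummarizedExperiment)
# Helper: negation of %in% (defined here but not used below)
"%ni%" <- Negate("%in%")

# Dichotomize age at its median (1 = above the median, 0 = otherwise)
colData(illuminaData) <- colData(illuminaData) %>%
     data.frame %>%
     dplyr::mutate(age = as.numeric(age > median(age))) %>%
     DataFrame

# Column index of the exposure of interest
varInt_index <- which(names(colData(illuminaData)) %in% "benzene")

# Estimate on the first two biomarkers only, without parallelization
biomarkerTMLEout <- biomarkertmle(se = illuminaData[1:2, ],
                                  varInt = varInt_index,
                                  parallel = FALSE,
                                  family = "gaussian",
                                  g_lib = c("SL.mean", "SL.glm"),
                                  Q_lib = "SL.mean"
                                 )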


[Package biotmle version 1.4.0 Index]