grooMethy {REMP}    R Documentation

Groom methylation data to fix potential data issues

Description

grooMethy automatically detects and fixes data issues, including zero beta values, missing values, and infinite values.

Usage

grooMethy(methyDat, impute = TRUE, imputebyrow = TRUE,
  mapGenome = FALSE, verbose = FALSE)

Arguments

methyDat

A RatioSet, GenomicRatioSet, DataFrame, data.table, data.frame, or matrix of Illumina BeadChip methylation data (450k or EPIC array).

impute

If TRUE, K-nearest neighbor (KNN) imputation is applied to fill in missing values. Default is TRUE. See Details.

imputebyrow

If TRUE, missing values are imputed using similar values in the same row (i.e., across samples); if FALSE, missing values are imputed using similar values in the same column (i.e., across CpGs). Default is TRUE.

mapGenome

Logical parameter. If TRUE, the function returns a GenomicRatioSet object instead of a RatioSet. Default is FALSE.

verbose

Logical parameter. Should the function be verbose? Default is FALSE.

Details

For methylation data expressed as beta values, any zero value makes the logit transformation from beta to M value return negative infinity. grooMethy therefore replaces zero beta values with the smallest non-zero beta value found in the dataset.

grooMethy can also handle missing values (i.e., NA or NaN) using KNN imputation (see impute.knn). Infinite values are treated as missing values and imputed as well. If the original dataset is in beta values, grooMethy first transforms it to M values before imputation is carried out. If an imputed value falls outside the original range (which is possible when imputebyrow = FALSE), the mean value is used instead.

Warning: imputed values for CpGs with a multimodal distribution (across samples) may not be correct. See the ENmix package to identify CpGs with multimodal distributions.

Note that grooMethy is also embedded in remp, so the user can run remp directly without explicitly running grooMethy.
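The zero-replacement and logit steps described in Details can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the toy matrix, the helper name beta_to_m, and the variable names are hypothetical, and the KNN imputation step itself (impute.knn) is left out.

```r
# Toy beta matrix: rows are CpGs, columns are samples (values are made up).
beta <- matrix(
  c(0.00, 0.20, 0.85,
    0.40,   NA, 0.60,
    0.10, 0.95, 0.30),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("cg", 1:3), paste0("S", 1:3))
)

# Hypothetical helper: logit transformation, M = log2(beta / (1 - beta)).
beta_to_m <- function(b) log2(b / (1 - b))

# A zero beta value maps to negative infinity, which is why it needs fixing.
m_of_zero <- beta_to_m(0)

# Step 1: replace zero betas with the smallest non-zero beta in the dataset.
min_nonzero <- min(beta[beta > 0], na.rm = TRUE)
beta[which(beta == 0)] <- min_nonzero

# Step 2: transform beta values to M values.
m <- beta_to_m(beta)

# Step 3: mark any remaining infinite values as missing, so that they
# would be handled together with NA/NaN by a later imputation step.
m[is.infinite(m)] <- NA
```

After these steps, m holds finite M values except for the one entry that was missing in the input, which stays NA until it is imputed.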

Value

A RatioSet or GenomicRatioSet containing the beta values and M values of the methylation data.

Examples

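library(REMP)
GM12878_450k <- getGM12878('450k') # Get GM12878 methylation data (450k array)
# Groom the RatioSet object directly
grooMethy(GM12878_450k, verbose = TRUE)
# Groom a plain matrix of beta values extracted with minfi
grooMethy(minfi::getBeta(GM12878_450k), verbose = TRUE)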


[Package REMP version 1.4.1 Index]