reduceFunctions {clusterExperiment}R Documentation

Filtering statistics and Dimensionality Reduction Functions

Description

Functions for calculating and manipulating either filtering statistics, stored in rowData, or the dimensionality reduction results, stored in reducedDims.

Usage

## S4 method for signature 'SummarizedExperiment'
makeFilterStats(object,
  filterStats = listBuiltInFilterStats(), transFun = NULL,
  isCount = FALSE)

## S4 method for signature 'matrixOrHDF5'
makeFilterStats(object, ...)

## S4 method for signature 'ClusterExperiment'
makeFilterStats(object,
  whichClusterIgnoreUnassigned = NULL,
  filterStats = listBuiltInFilterStats(), ...)

listBuiltInFilterStats()

## S4 method for signature 'SummarizedExperiment'
filterData(object, filterStats, cutoff,
  percentile, absolute = FALSE, keepLarge = TRUE)

## S4 method for signature 'SingleCellExperiment'
defaultNDims(object, reduceMethod, typeToShow)

## S4 method for signature 'SummarizedExperiment'
filterNames(object)

## S4 method for signature 'SingleCellExperiment'
makeReducedDims(object, reducedDims = "PCA",
  maxDims = 500, transFun = NULL, isCount = FALSE)

## S4 method for signature 'matrixOrHDF5'
makeReducedDims(object, ...)

## S4 method for signature 'SummarizedExperiment'
makeReducedDims(object, ...)

## S4 method for signature 'ClusterExperiment'
makeReducedDims(object, ...)

listBuiltInReducedDims()

Arguments

object

object from which user wants to calculate per-row statistics

filterStats

character vector of statistics to calculate. Must be one of the character values given by listBuildInFilterStats().

transFun

a transformation function to be applied to the data. If the transformation applied to the data creates an error or NA values, then the function will throw an error. If object is of class ClusterExperiment, the stored transformation will be used and giving this parameter will result in an error.

isCount

if transFun=NULL, then isCount=TRUE will determine the transformation as defined by function(x){log2(x+1)}, and isCount=FALSE will give a transformation function function(x){x}. Ignored if transFun=NULL. If object is of class ClusterExperiment, the stored transformation will be used and giving this parameter will result in an error.

...

Values passed on the the 'SingleCellExperiment' method.

whichClusterIgnoreUnassigned

indicates clustering that should be used to filter out unassigned samples from the calculations. If NULL no filtering of samples will be done. See details for more information.

cutoff

numeric. A value at which to filter the rows (genes) for the test statistic

percentile

numeric. Either a number between 0,1 indicating what percentage of the rows (genes) to keep or an integer value indicated the number of rows (genes) to keep

absolute

whether to take the absolute value of the filter statistic

keepLarge

logical whether to keep rows (genes) with large values of the test statistic or small values of the test statistic.

reduceMethod

character. A method or methods for reducing the size of the data, either by filtering the rows (genes) or by a dimensionality reduction method. Must either be 1) must match the name of a built-in method, in which case if it is not already existing in the object will be passed to makeFilterStats or link{makeReducedDims}, or 2) must match a stored filtering statistic or dimensionality reduction in the object

typeToShow

character (optional). If given, should be one of "filterStats" or "reducedDims" to indicate of the values in the reduceMethod vector, only show those corresponding to "filterStats" or "reducedDims" options.

reducedDims

a vector of character values indicating the methods of dimensionality reduction to be performed. Currently only "PCA" is implemented.

maxDims

Numeric vector of integer giving the number of PC dimensions to calculate. maxDims can also take values between (0,1) to indicate keeping the number of dimensions necessary to account for that proportion of the variance. maxDims should be of same length as reducedDims, indicating the number of dimensions to keep for each method (if maxDims is of length 1, the same number of dimensions will be used for each).

Details

whichClusterIgnoreUnassigned is only an option when applied to a ClusterExperiment classs and indicates that the filtering statistics should be calculated based on samples that are unassigned by the designated clustering. The name given to the filter in this case is of the form <filterStats>_<clusterLabel>, i.e. the clustering label of the clustering is appended to the standard name for the filtering statistic.

Note that filterData returns a SingleCellExperiment object. To get the actual data out use either assay or transformData if transformed data is desired.

The PCA method uses either prcomp from the stats package or svds from the RSpectra package to perform PCA. Both are called on t(assay(x)) with center=TRUE and scale=TRUE (i.e. the feature are centered and scaled), so that it is performing PCA on the correlation matrix of the features.

Value

makeFilterStats returns a SummarizedExperiment object with the requested filtering statistics will be added to the DataFrame in the rowData slot and given names corresponding to the filterStats values. Warning: the function will overwrite existing columns in rowData with the same name. Columns in the rowData slot with different names should not be affected.

filterData returns a SingleCellExperiment object with the rows (genes) removed based on filters

defaultNDims returns a numeric vector giving the default dimensions the methods in clusterExperiment will use for reducing the size of the data. If typeToShow is missing, the resulting vector will be equal to the length of reduceMethod. Otherwise, it will be a vector with all the unique valid default values for the typeToShow (note that different dimensionality reduction methods can have different maximal dimensions, so the result may not be of length one in this case).

filterNames returns a vector of the columns of the rowData that are considered valid filtering statistics. Currently any numeric column in rowData is a valid filtering statistic.

makeReducedDims returns a SingleCellExperiment containing the calculated dimensionality reduction in the reduceDims with names corresponding to the name given in reducedDims.

Examples

data(simData)
listBuiltInFilterStats()
scf<-makeFilterStats(simData,filterStats=c("var","mad"))
scf
scfFiltered<-filterData(scf,filterStats="mad",percentile=10)
scfFiltered
assay(scfFiltered)[1:10,1:10]
data(simData)
listBuiltInReducedDims()
scf<-makeReducedDims(simData, reducedDims="PCA", maxDims=3)
scf

[Package clusterExperiment version 2.0.2 Index]