prepareCellData {cydar}    R Documentation

Prepare mass cytometry data

Description

Convert single-cell marker intensities from a mass cytometry experiment into a format for efficient counting.

Usage

prepareCellData(x, naive=FALSE, markers=NULL, ...)

Arguments

x

A named list of numeric matrices, where each matrix corresponds to a sample and contains expression intensities for each cell (row) and each marker (column). Alternatively, an ncdfFlowSet object containing the same information.

naive

A logical scalar specifying whether k-means clustering should be skipped. If FALSE (the default), k-means clustering is performed.

markers

A vector specifying the markers to use in distance calculations.

...

Additional arguments to pass to kmeans.

Details

This function constructs a CyData object from the marker intensities of each cell in one or more samples.

If naive=FALSE, this function performs k-means clustering on all cells based on their marker intensities. The number of clusters is set to the square root of the total number of cells. The cluster centres and cell assignments are then stored for later use in speeding up high-dimensional searches. Intensity matrices from several samples are also merged into a single matrix for greater efficiency.
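The clustering step described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's internal code: the mock data, the variable names (merged, sample.id, ncenters) and the use of ceiling() to round the square root are all hypothetical. The iter.max argument stands in for the kind of extra argument that could be passed to kmeans via "...".

```r
set.seed(100)

# Hypothetical mock data: two samples, 200 cells each, 5 markers.
marker.names <- paste0("X", 1:5)
x <- list(
    A = matrix(rgamma(5 * 200, 2, 2), ncol = 5, dimnames = list(NULL, marker.names)),
    B = matrix(rgamma(5 * 200, 2, 2), ncol = 5, dimnames = list(NULL, marker.names))
)

# Merge the per-sample matrices into one, remembering each cell's sample of origin.
merged <- do.call(rbind, x)
sample.id <- rep(seq_along(x), vapply(x, nrow, integer(1)))

# Number of clusters: square root of the total number of cells.
# (Rounding up with ceiling() is an assumption of this sketch.)
ncenters <- ceiling(sqrt(nrow(merged)))

# k-means on all cells; iter.max is an example of an extra kmeans argument.
km <- kmeans(merged, centers = ncenters, iter.max = 50)

# Cluster centres and per-cell assignments, as would be kept for later searches.
centres <- km$centers
assignments <- km$cluster
```

With 400 cells in total, this sketch requests 20 clusters.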

Note that the value of naive does not change the results of downstream functions, only the computational algorithm with which they are obtained.

If markers is specified, only the specified markers will be used in the distance calculations. This also applies to calculations in downstream functions like countCells and neighborDistances. All other markers will be ignored unless their usage is explicitly requested. By default, markers=NULL, which means that all supplied markers will be used in the calculations.
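The effect of restricting the markers can be illustrated in base R. This is a minimal sketch, not the package's implementation: it assumes Euclidean distances, and the data and the names cells, used, d.sub and d.all are hypothetical.

```r
set.seed(42)

# Hypothetical intensities: 10 cells (rows) by 6 markers (columns).
marker.names <- paste0("X", 1:6)
cells <- matrix(rgamma(6 * 10, 2, 2), ncol = 6, dimnames = list(NULL, marker.names))

# A hypothetical marker selection; NULL would mean "use all markers".
markers <- c("X1", "X3", "X4")
used <- if (is.null(markers)) marker.names else markers

# Distances computed from the selected markers only
# (Euclidean distance is an assumption of this sketch).
d.sub <- as.matrix(dist(cells[, used, drop = FALSE]))

# Distances computed from all markers, for comparison.
d.all <- as.matrix(dist(cells))
```

Because the unused markers contribute nothing to d.sub, each entry of d.sub is no larger than the corresponding entry of d.all.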

Value

A CyData object with marker intensities for each cell stored in the cellIntensities slot. In addition:

Each element of cluster.info is a list containing two items: the zero-indexed column index of the output matrix that specifies the first cell in the cluster, and a numeric vector of distances between each cell in the cluster and the cluster centre. Cells in cellIntensities are arranged in blocks corresponding to the clusters and, within each block, ordered by increasing distance to the centre.
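The layout described above can be mimicked in base R. This is a sketch under stated assumptions rather than the package's own code: the field names first and distances are hypothetical, Euclidean distance is assumed, and cells are kept as rows here, whereas the description above refers to column indices of the output matrix.

```r
set.seed(1)

# Hypothetical data: 100 cells (rows) by 4 markers, clustered into 10 groups.
cells <- matrix(rgamma(4 * 100, 2, 2), ncol = 4)
km <- kmeans(cells, centers = 10, iter.max = 50)

# Euclidean distance from each cell to its assigned cluster centre.
dist.to.centre <- sqrt(rowSums((cells - km$centers[km$cluster, , drop = FALSE])^2))

# Arrange cells in blocks by cluster, with increasing distance within each block.
o <- order(km$cluster, dist.to.centre)
ordered.cells <- cells[o, , drop = FALSE]
ordered.cluster <- km$cluster[o]
ordered.dist <- dist.to.centre[o]

# One list per cluster: zero-indexed position of its first cell in the
# reordered data, plus the sorted distances of its cells to the centre.
cluster.info <- lapply(seq_len(nrow(km$centers)), function(k) {
    members <- which(ordered.cluster == k)
    list(first = members[1] - 1L, distances = ordered.dist[members])
})
```

Storing the start of each block together with sorted distances is what allows a later search to skip whole clusters, or the far end of a cluster, without examining every cell.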

Author(s)

Aaron Lun

See Also

countCells, neighborDistances

Examples

### Mocking up some data: ###
nmarkers <- 20
marker.names <- paste0("X", seq_len(nmarkers))
nsamples <- 8
sample.names <- paste0("Y", seq_len(nsamples))

x <- list()
for (i in sample.names) {
    ex <- matrix(rgamma(nmarkers*1000, 2, 2), ncol=nmarkers, nrow=1000)
    colnames(ex) <- marker.names
    x[[i]] <- ex
}

### Running the function: ###
cd <- prepareCellData(x)
cd

[Package cydar version 1.4.0 Index]