prepareCellData {cydar} | R Documentation |
Convert single-cell marker intensities from a mass cytometry experiment into a format for efficient counting.
prepareCellData(x, naive=FALSE, markers=NULL, ...)
x |
A named list of numeric matrices, where each matrix corresponds to a sample and contains expression intensities for each cell (row) and each marker (column). Alternatively, a ncdfFlowSet object containing the same information. |
naive |
A logical scalar specifying whether k-means clustering should be performed. |
markers |
A vector specifying the markers to use in distance calculations. |
... |
Additional arguments to pass to |
This function constructs a CyData object from the marker intensities of each cell in one or more samples.
If naive=FALSE
, this function performs k-means clustering on all the cells based on their marker intensities.
The number of clusters is set to the square-root of the total number of cells.
The cluster centres and cell assignments are then stored for later use in speeding up high-dimensional searches.
Intensity matrices from several samples are also merged into a single matrix for greater efficiency.
Note that naive
does not change the results of downstream functions, only the computational algorithm with which they are obtained.
If markers
is specified, only the specified markers will be used in the distance calculations.
This also applies to calculations in downstream functions like countCells
and neighborDistances
.
All other markers will be ignored unless their usage is explicitly requested.
By default, markers=NULL
which means that all supplied markers will be used in the calculations.
A CyData object with marker intensities for each cell stored in the cellIntensities
slot.
In addition:
Marker names are stored as the row names of the markerData
slot.
An additional used
field also specifies whether the marker was used or not.
Sample names are stored as the column names of the output object.
cellData
contains sample.id
, an integer vector specifying the element of x
that each cell was taken from;
and cell.id
, an integer vector specifying the row index of each cell in its original matrix.
If naive=FALSE
, the metadata contains cluster.centers
, a numeric matrix containing the centre coordinates of each cluster (column) for each marker (row);
and cluster.info
, a list containing information for each cluster.
Each element of cluster.info
is a list, containing the zero-indexed column index of the output matrix that specifies the first cell in the cluster;
as well as a numeric vector of distances between each cell in the cluster and the cluster centre.
Cells in cellIntensities
are arranged in blocks corresponding to the clusters and ordered such that the distances are increasing.
Aaron Lun
### Mocking up some data: ### nmarkers <- 20 marker.names <- paste0("X", seq_len(nmarkers)) nsamples <- 8 sample.names <- paste0("Y", seq_len(nsamples)) x <- list() for (i in sample.names) { ex <- matrix(rgamma(nmarkers*1000, 2, 2), ncol=nmarkers, nrow=1000) colnames(ex) <- marker.names x[[i]] <- ex } ### Running the function: ### cd <- prepareCellData(x) cd