withinLaneNormalization-methods {EDASeq} | R Documentation |
withinLaneNormalization in Package EDASeq
Within-lane normalization for GC-content (or other lane-specific) bias.
withinLaneNormalization(x, y, which=c("loess","median","upper","full"), offset=FALSE, num.bins=10, round=TRUE)
x |
A numeric matrix representing the counts or a SeqExpressionSet object. |
y |
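A numeric vector representing the covariate to normalize for (if x is a matrix) or a character string naming the covariate in the feature data (if x is a SeqExpressionSet object). |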
which |
Method used to normalize. See the Details section and the reference below. |
offset |
Should the normalized value be returned as an offset leaving the original counts unchanged? |
num.bins |
The number of bins used to stratify the covariate for the median, upper and full methods. |
round |
If TRUE the normalization returns rounded values (pseudo-counts). Ignored if offset=TRUE. |
This method implements four normalizations described in Risso et al. (2011).
The loess normalization transforms the data by regressing the counts on y and subtracting the loess fit from the counts to remove the dependence.
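The loess idea can be illustrated with a minimal base R sketch on simulated data. This is not the EDASeq implementation: the helper name loess_normalize_lane, the log scale, the 0.1 pseudo-count and the re-centering on the median are all assumptions made for illustration.

```r
# Hypothetical helper, not part of EDASeq: loess-based correction of one lane.
# Assumptions: work on the log scale with a small pseudo-count, and add the
# median back so the overall level of the lane is preserved.
loess_normalize_lane <- function(counts, y, pseudo = 0.1) {
  log_counts <- log(counts + pseudo)
  fit <- loess(log_counts ~ y)            # regress the (log) counts on y
  corrected <- log_counts - fitted(fit) + median(log_counts)
  exp(corrected)                          # back to the count scale
}

# Simulated lane in which the expected count increases with GC-content
set.seed(1)
gc <- runif(500, 0.3, 0.7)
counts <- rpois(500, lambda = exp(3 + 4 * (gc - 0.5)))

normalized <- loess_normalize_lane(counts, gc)

# Dependence on the covariate before and after the correction
cor(gc, log(counts + 0.1))
cor(gc, log(normalized))
```

Comparing the two correlations gives a quick check that the dependence on y has been reduced.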
The median, upper and full normalizations are based on stratifying the genes by y. Once the genes are stratified into num.bins strata, the methods work as follows.
median: scales the data to have the same median in each bin.
upper: the same, but with the upper quartile.
full: forces the distribution of each stratum to be the same using a non-linear full-quantile normalization, in the spirit of the one used for microarrays.
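The three stratified methods can be sketched in base R as follows. This is an illustration under stated assumptions, not the EDASeq code: the helpers make_bins, scale_by_bin, upper_q and full_by_bin are hypothetical, the strata are assumed to be equal-frequency bins built from quantiles of y, and the common target for the median and upper variants is assumed to be the median of the per-bin statistics.

```r
# Hypothetical helpers for illustration; none of these are EDASeq functions.

# Assumption: equal-frequency strata from quantiles of a continuous covariate
# (duplicated quantiles would make cut() fail).
make_bins <- function(y, num.bins = 10) {
  breaks <- quantile(y, probs = seq(0, 1, length.out = num.bins + 1))
  cut(y, breaks = breaks, include.lowest = TRUE)
}

# "median"/"upper" style: rescale each bin so that a summary statistic is
# the same in every bin. Assumes the statistic is non-zero in every bin.
scale_by_bin <- function(counts, bins, stat = median) {
  bin_stat <- as.numeric(tapply(counts, bins, stat))
  target <- median(bin_stat)              # common value shared by all bins
  counts * target / bin_stat[as.integer(bins)]
}

upper_q <- function(z) quantile(z, 0.75, names = FALSE)

# "full" style: map every bin onto one reference distribution, taken here
# as the average of the per-bin quantiles. Assumes no bin is empty.
full_by_bin <- function(counts, bins) {
  idx <- split(seq_along(counts), bins)
  n_ref <- max(lengths(idx))
  probs <- (seq_len(n_ref) - 0.5) / n_ref
  per_bin <- sapply(idx, function(i) quantile(counts[i], probs, names = FALSE))
  ref <- rowMeans(per_bin)
  out <- as.numeric(counts)
  for (i in idx) {
    p <- (rank(counts[i]) - 0.5) / length(i)   # within-bin rank as a probability
    out[i] <- approx(probs, ref, xout = p, rule = 2)$y
  }
  out
}

# Simulated lane in which the expected count increases with GC-content
set.seed(1)
gc <- runif(500, 0.3, 0.7)
counts <- rpois(500, lambda = exp(3 + 4 * (gc - 0.5)))
bins <- make_bins(gc, num.bins = 10)

norm_median <- scale_by_bin(counts, bins, median)
norm_upper  <- scale_by_bin(counts, bins, upper_q)
norm_full   <- full_by_bin(counts, bins)

# Per-bin medians before and after the median scaling
tapply(counts, bins, median)
tapply(norm_median, bins, median)
```

The median and upper variants only shift the scale of each bin, whereas the full variant also equalizes the shape of the within-bin distributions.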
signature(x = "matrix", y = "numeric")
It returns a matrix with the normalized counts if offset=FALSE or with the offset if offset=TRUE.
signature(x = "SeqExpressionSet", y = "character")
It returns a SeqExpressionSet with the normalized counts in the normalizedCounts slot and with the offset in the offset slot (if offset=TRUE).
Davide Risso.
D. Risso, K. Schwartz, G. Sherlock and S. Dudoit (2011). GC-Content Normalization for RNA-Seq Data. Manuscript in Preparation.
library(EDASeq)
library(yeastRNASeq)
data(geneLevelData)
data(yeastGC)

# Keep only the genes that have both counts and a GC-content value
sub <- intersect(rownames(geneLevelData), names(yeastGC))
mat <- as.matrix(geneLevelData[sub, ])

# Build a SeqExpressionSet with GC-content stored as feature data
data <- newSeqExpressionSet(mat,
                            phenoData=AnnotatedDataFrame(
                              data.frame(conditions=factor(c("mut", "mut", "wt", "wt")),
                                         row.names=colnames(geneLevelData))),
                            featureData=AnnotatedDataFrame(data.frame(gc=yeastGC[sub])))

# Full-quantile within-lane normalization for the "gc" covariate
norm <- withinLaneNormalization(data, "gc", which="full", offset=FALSE)