scoreMergedBins {RIPSeeker} | R Documentation |
Sum, normalize the read counts, and average the logOdd score over the bins being merged into a single enriced region.
scoreMergedBins(findOverlapsHits, unmergedGRAll, mergedGRAll)
findOverlapsHits |
Output from |
unmergedGRAll |
GRanges before merging. |
mergedGRAll |
GRanges after merging. |
The consecutive RIP-bins predicted by the Viterbi function (See nbh_vit
) are merged into a single RIP region. An aggregate RIPScore as the averaged RIPScores over the associated merged bins is assigned to each merged RIP region. In the RIPSeeker workflow, the averaged RIPScore then becomes the representative score for the region and subject to significance test carried out in seekRIP
.
A merged GRanges each with scores including summed read count, averaged log odd scores, and FPK (fragment per kilobase of region length). The latter score represent a normalized read count.
This function is expected to be called only from logScoreWithoutControl
and logScoreWithControl
.
Yue Li
seekRIP, computeLogOdd logScoreWithControl, logScoreWithoutControl
if(interactive()) { # see example in seekRIP # Retrieve system files extdata.dir <- system.file("extdata", package="RIPSeeker") bamFiles <- list.files(extdata.dir, ".bam$", recursive=TRUE, full.names=TRUE) bamFiles <- grep("PRC2", bamFiles, value=TRUE) # Parameters setting binSize <- 1e5 # use a large fixed bin size for demo only multicore <- FALSE # use multicore strandType <- "-" # set strand type to minus strand ################ run main function for HMM inference on all chromosomes ################ mainSeekOutputRIP <- mainSeek(bamFiles= grep(pattern="SRR039214", bamFiles, value=TRUE, invert=TRUE), binSize=binSize, strandType=strandType, reverseComplement=TRUE, genomeBuild="mm9", uniqueHit = TRUE, assignMultihits = TRUE, rerunWithDisambiguatedMultihits = TRUE, multicore=multicore, silentMain=FALSE, verbose=TRUE) nbhGRRIP <- mainSeekOutputRIP$nbhGRList$chrX logOddScore <- computeLogOdd(nbhGRRIP) values(nbhGRRIP) <- cbind(as.data.frame(values(nbhGRRIP)), logOddScore) enrichIdx <- which(values(nbhGRRIP)$viterbi_state == 2) unmergedRIP <- nbhGRRIP[enrichIdx] mergedRIP <- reduce(unmergedRIP, min.gapwidth = median(width(unmergedRIP) )) overlapIdx <- findOverlaps(mergedRIP, unmergedRIP) # a list with query hits as names and subject hits as items findOverlapsHits <- split(overlapIdx, queryHits(overlapIdx)) # get the score for the first merged region x <- scoreMergedBins(findOverlapsHits[[1]], unmergedRIP, mergedRIP) # get scores for all of the merged regions mergedRIPList <- lapply(split(overlapIdx, queryHits(overlapIdx)), scoreMergedBins, unmergedRIP, mergedRIP) names(mergedRIPList) <- NULL mergedRIP <- do.call(c, mergedRIPList) # logOddScore is the averaged logOddScore across merged bins mergedRIP }