mainSeekSingleChrom {RIPSeeker} | R Documentation |
This is an internal function used by mainSeek
to accomplish three major tasks on a single chromosome: automatically select the bin size, compute read counts within the bins, and obtain optimal HMM parameters.
mainSeekSingleChrom(alignGR, K = 2, binSize = NULL, minReadCount = 10, backupNumBins = 10, minBinSize = 200, maxBinSize = 1200, increment = 5, pathToSavePlotsOfBinSizesVersusCosts, verbose = TRUE, allowSecondAttempt = TRUE, ...)
alignGR |
GRanges containing the alignments on a single chromosome. |
K |
Number of hidden states (Default: 2). By default, state 1 specifies the background and state 2 the RIP regions. The two states are recognized by the means for the two distributions (See |
binSize |
Size to use for binning the read counts across each chromosome. If NULL, optimal bin size within a range (default: minBinSize=200, maxBinSize=1200) will be automatically selected (See |
minReadCount |
Minimum aligned read count needed for the HMM to converge (Default: 10). Note that the HMM may sometimes fail to converge when the majority of the read counts are zero, even if some read counts exceed 10. When that happens, a back-up function |
backupNumBins |
If read count is less than |
minBinSize |
Minimum bin size to start with the bin selection (See |
maxBinSize |
Maximum bin size to stop with the bin selection (See |
increment |
Step-wise increment in bin size selection (See |
pathToSavePlotsOfBinSizesVersusCosts |
Directory used to save the diagnostic plots for bin size selection. |
verbose |
Binary indicator to disable (FALSE) or enable (TRUE) HMM training messages from function |
allowSecondAttempt |
In case the HMM fails to converge due to malformed parameters in an EM iteration, re-iterate the HMM process, each time with a different suboptimal bin size, in an attempt to succeed in some trial. If all attempts yield nothing, fall back to |
... |
Arguments passed to |
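The bin-size arguments above (minBinSize, maxBinSize, increment) describe a scan over candidate bin sizes. A minimal base-R sketch of such a scan is given here. The helper names countReadsInBins and scanBinSizes are hypothetical, the reads are simulated, and the cost formula (a Shimazaki-Shinomoto-style histogram cost) is an assumption for illustration only; this page does not state which cost RIPSeeker evaluates internally.

```r
# Hypothetical helper: count 1-based read start positions in fixed-width bins.
countReadsInBins <- function(starts, chromLength, binSize) {
  nBins <- ceiling(chromLength / binSize)
  idx <- ceiling(starts / binSize)   # positions 1..binSize fall in bin 1
  tabulate(idx, nbins = nBins)
}

# Hypothetical helper: evaluate a cost for every candidate bin size.
# ASSUMPTION: cost = (2 * mean - biased variance) / binSize^2.
scanBinSizes <- function(starts, chromLength,
                         minBinSize = 200, maxBinSize = 1200, increment = 5) {
  binSizes <- seq(minBinSize, maxBinSize, by = increment)
  costs <- vapply(binSizes, function(b) {
    k <- countReadsInBins(starts, chromLength, b)
    (2 * mean(k) - mean((k - mean(k))^2)) / b^2
  }, numeric(1))
  list(binSizes = binSizes, costs = costs,
       best = binSizes[which.min(costs)])
}

# Simulated read starts on a chromosome of length 1e5 (not real data).
set.seed(1)
starts <- sort(sample.int(1e5, 2000, replace = TRUE))
res <- scanBinSizes(starts, chromLength = 1e5)
res$best
```

Plotting res$costs against res$binSizes gives the kind of diagnostic curve that pathToSavePlotsOfBinSizesVersusCosts refers to, under the assumptions stated above.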
nbhGR |
GRanges object containing the optimized HMM parameters (and the Viterbi hidden state sequence) accompanied by the read count vector following the (automatic) binning scheme. |
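To illustrate what a Viterbi hidden state sequence over binned read counts looks like, here is a minimal base-R sketch of Viterbi decoding for a two-state HMM with negative binomial emissions. The function name viterbiNB, the parameter values, and the toy counts are all hypothetical; this is not RIPSeeker's implementation, and in practice the parameters would come from the trained HMM rather than being set by hand.

```r
# Hypothetical sketch: Viterbi decoding with negative binomial emissions.
# counts: integer read counts per bin; mu, size: per-state NB mean and dispersion;
# transMat: K x K transition matrix (rows sum to 1); init: initial state probabilities.
viterbiNB <- function(counts, mu, size, transMat, init) {
  K <- length(mu)
  n <- length(counts)
  # n x K matrix of log emission probabilities
  logEmit <- matrix(
    sapply(seq_len(K), function(k)
      dnbinom(counts, size = size[k], mu = mu[k], log = TRUE)),
    nrow = n, ncol = K)
  logTrans <- log(transMat)
  delta <- matrix(-Inf, n, K)   # best log score of any path ending in state k at bin t
  psi <- matrix(0L, n, K)       # back-pointers to the best previous state
  delta[1, ] <- log(init) + logEmit[1, ]
  for (t in seq_len(n)[-1]) {
    for (k in seq_len(K)) {
      cand <- delta[t - 1, ] + logTrans[, k]
      psi[t, k] <- which.max(cand)
      delta[t, k] <- max(cand) + logEmit[t, k]
    }
  }
  # Trace back the most likely state sequence
  path <- integer(n)
  path[n] <- which.max(delta[n, ])
  for (t in rev(seq_len(n - 1))) path[t] <- psi[t + 1, path[t + 1]]
  path
}

# Toy example: state 1 = background (low mean), state 2 = enriched (high mean).
counts <- c(0, 1, 0, 0, 12, 15, 9, 14, 0, 1, 0)
path <- viterbiNB(counts,
                  mu = c(0.5, 12), size = c(1, 5),
                  transMat = matrix(c(0.9, 0.1,
                                      0.1, 0.9), 2, byrow = TRUE),
                  init = c(0.5, 0.5))
path
```

As in the description of K, the two states are told apart by their means: the state with the larger mean is read as the enriched (RIP) state.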
Unless a highly customized workflow is needed, ripSeek
is the high-level front-end main function that should be used in most cases.
Yue Li
library(RIPSeeker)

# Retrieve system files
extdata.dir <- system.file("extdata", package="RIPSeeker")
bamFiles <- list.files(extdata.dir, ".bam$", recursive=TRUE, full.names=TRUE)
bamFiles <- grep("PRC2", bamFiles, value=TRUE)

# Parameters setting
binSize <- 1e5        # use a large fixed bin size for demo only
minBinSize <- NULL    # min bin size in automatic bin size selection
maxBinSize <- NULL    # max bin size in automatic bin size selection
multicore <- FALSE    # use multicore
strandType <- "-"     # set strand type to minus strand

# Read the alignments and split them by chromosome
alignGal <- getAlignGal(bamFiles[1], reverseComplement=TRUE, genomeBuild="mm9")
alignGR <- as(alignGal, "GRanges")
alignGRList <- GRangesList(as.list(split(alignGR, seqnames(alignGR))))

################ run main function for HMM inference on a single chromosome ################
nbhGR <- mainSeekSingleChrom(alignGR=alignGRList$chrX, K = 2, binSize=binSize,
                             minBinSize = minBinSize, maxBinSize = maxBinSize)

nbhGR