simulateClumpSizeDist {motifcounter} | R Documentation |
This function repeatedly simulates random DNA sequences according to the background model and subsequently counts the number of k-clump occurrences, where k denotes the clump size. It is intended only for benchmarking analysis.
simulateClumpSizeDist(pfm, bg, seqlen, nsim = 10, singlestranded = FALSE)
pfm | An R matrix that represents a position frequency matrix. |
bg | A Background object. |
seqlen | Integer-valued vector that defines the lengths of the individual sequences. For a given DNAStringSet, this information can be retrieved using |
nsim | Integer number of random samples. |
singlestranded | Boolean that indicates whether a single strand or both strands shall be scanned for motif hits. Default: singlestranded = FALSE. |
A list that contains the empirical distribution of the clump sizes.
compoundPoissonDist, combinatorialDist
# Load sequences
seqfile = system.file("extdata", "seq.fasta", package = "motifcounter")
seqs = Biostrings::readDNAStringSet(seqfile)

# Load background
bg = readBackground(seqs, 1)

# Load motif
motiffile = system.file("extdata", "x31.tab", package = "motifcounter")
motif = t(as.matrix(read.table(motiffile)))

# Study the clump size frequencies in one sequence of length 1 Mb
seqlen = 1000000

# scan both strands
simc = motifcounter:::simulateClumpSizeDist(motif, bg, seqlen)

# scan a single strand
simc = motifcounter:::simulateClumpSizeDist(motif, bg, seqlen, singlestranded = TRUE)
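
The simulate-and-count idea from the description can also be sketched in base R, without the package. This is only an illustration under simplifying assumptions and not the package's implementation: the motif is a fixed word matched exactly (instead of a scored position frequency matrix), the background is an order-0 model with user-supplied nucleotide probabilities, only a single strand is scanned, and a clump is taken to be a maximal run of hits in which each hit overlaps the previous one. The names clump_sizes and simulate_clump_size_dist, and the parameters word, probs and max_size, are hypothetical.

```r
# Hypothetical helper: sizes of all clumps of an exact-match word in a sequence.
# Assumption: a clump is a maximal run of hits in which each hit starts
# fewer than nchar(word) positions after the previous hit (i.e. they overlap).
clump_sizes <- function(sequence, word) {
  m <- nchar(word)
  n <- nchar(sequence)
  if (n < m) return(integer(0))
  starts <- seq_len(n - m + 1)
  hits <- starts[substring(sequence, starts, starts + m - 1) == word]
  if (length(hits) == 0) return(integer(0))
  # A new clump begins whenever the gap to the previous hit is at least m
  clump_id <- cumsum(c(TRUE, diff(hits) >= m))
  as.integer(table(clump_id))
}

# Hypothetical sketch: empirical clump size distribution from simulated sequences.
# Assumption: order-0 background given by 'probs'; single strand only.
simulate_clump_size_dist <- function(word, seqlen, nsim = 10,
                                     probs = c(A = 0.25, C = 0.25,
                                               G = 0.25, T = 0.25),
                                     max_size = 10) {
  counts <- numeric(max_size)
  for (i in seq_len(nsim)) {
    for (len in seqlen) {
      s <- paste(sample(names(probs), len, replace = TRUE, prob = probs),
                 collapse = "")
      sizes <- clump_sizes(s, word)
      # Clumps larger than max_size are collected in the last bin
      sizes[sizes > max_size] <- max_size
      counts <- counts + tabulate(sizes, nbins = max_size)
    }
  }
  total <- sum(counts)
  if (total == 0) return(counts)  # no clumps observed at all
  counts / total                  # relative frequency of each clump size
}

# Usage: relative frequencies of clump sizes 1..10 for the word "AAA"
set.seed(1)
dist <- simulate_clump_size_dist("AAA", seqlen = c(200, 300), nsim = 5)
```

Entry k of the returned vector is the fraction of observed clumps that contain exactly k hits, with the last entry also absorbing any larger clumps. A self-overlapping word such as "AAA" is used because words that cannot overlap themselves only ever produce clumps of size 1 on a single strand.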