computeOptimal {ChIPanalyser} | R Documentation |
ChIPanalyser contains a set of functions, some of which require two
parameters known as ScalingFactorPWM and boundMolecules. These two
parameters are not always known. computeOptimal will compute these values
by maximising the correlation and minimising the Mean Squared Error between
a predicted ChIP-seq-like profile and a real ChIP-seq profile for a given
set of loci.
computeOptimal(DNASequenceSet, genomicProfileParameters, LocusProfile,
    setSequence, DNAAccessibility = NULL, occupancyProfileParameters = NULL,
    parameter = "all", peakMethod = "moving_kernel", cores = 1)
DNASequenceSet |
|
genomicProfileParameters |
|
LocusProfile |
|
setSequence |
|
DNAAccessibility |
|
occupancyProfileParameters |
|
parameter |
|
peakMethod |
|
cores |
|
To infer the values of ScalingFactorPWM and boundMolecules from data, use
computeOptimal. Note that this function requires ChIP-seq data as input.
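The idea of scoring candidate parameter values against an observed profile can be illustrated with a small grid search. The sketch below is not the ChIPanalyser implementation: `toyProfile`, the Gaussian shape, the grid values and the simulated `observed` signal are all hypothetical stand-ins, chosen only to show how correlation and Mean Squared Error behave as selection criteria.

```r
# Toy grid search over two parameters. This is NOT how ChIPanalyser
# computes its profiles; toyProfile() is a hypothetical stand-in model.
set.seed(1)
position <- seq_len(200)

# Hypothetical model: lambda controls peak width, boundMolecules peak height
toyProfile <- function(lambda, boundMolecules) {
    boundMolecules * exp(-((position - 100)^2) / (2 * (20 * lambda)^2))
}

# Simulated "real" profile generated with known parameters plus noise
observed <- toyProfile(lambda = 1.5, boundMolecules = 100) +
    rnorm(length(position), sd = 2)

# Every combination of candidate parameter values
grid <- expand.grid(lambda = c(0.5, 1, 1.5, 2),
                    boundMolecules = c(10, 50, 100, 500))

# Score each combination against the observed profile
grid$correlation <- mapply(
    function(l, b) cor(toyProfile(l, b), observed),
    grid$lambda, grid$boundMolecules)
grid$MSE <- mapply(
    function(l, b) mean((toyProfile(l, b) - observed)^2),
    grid$lambda, grid$boundMolecules)

# Highest correlation and lowest MSE are two separate selection criteria
bestByCorrelation <- grid[which.max(grid$correlation), ]
bestByMSE <- grid[which.min(grid$MSE), ]
```

In this toy setting, correlation is insensitive to the overall height of the profile, so it mainly constrains the shape parameter, while the Mean Squared Error also constrains the scale. This is one reason to look at both criteria rather than only one.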
LocusProfile (the ChIP-seq data) should be a named list of ChIP-seq signal
normalised at single base pair resolution. The names should stay consistent
with all other names and should represent the names of the loci of interest.
The same naming should be used in setSequence: each range within the GRanges
should be named (not to be confused with seqnames).
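A quick way to catch naming mismatches before a long run is to compare the two sets of names directly. The helper below is a hypothetical convenience function, not part of ChIPanalyser; it works on plain character vectors so that it runs without Bioconductor installed. With real objects, the two arguments would be `names(LocusProfile)` and `names(setSequence)`, where `names()` on a GRanges returns the range names rather than the seqnames.

```r
# Hypothetical helper (not part of ChIPanalyser): checks that the locus
# names of the ChIP-seq list and of the ranges agree.
checkLocusNames <- function(profileNames, rangeNames) {
    if (is.null(profileNames) || is.null(rangeNames)) {
        stop("Both LocusProfile and setSequence must be named")
    }
    if (anyDuplicated(profileNames) > 0 || anyDuplicated(rangeNames) > 0) {
        stop("Locus names must be unique")
    }
    onlyInProfile <- setdiff(profileNames, rangeNames)
    onlyInRanges <- setdiff(rangeNames, profileNames)
    if (length(onlyInProfile) > 0 || length(onlyInRanges) > 0) {
        stop("Locus names differ: ",
             paste(c(onlyInProfile, onlyInRanges), collapse = ", "))
    }
    invisible(TRUE)
}

# Toy stand-ins: a named list of per-base signal and a vector of range names
toyLocusProfile <- list(eve = runif(10), hb = runif(10))
toyRangeNames <- c("eve", "hb")

namesAgree <- checkLocusNames(names(toyLocusProfile), toyRangeNames)
```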
computeOptimal returns a list containing, in order: the optimal set of
parameters (lambda, i.e. ScalingFactorPWM, and boundMolecules), the optimal
matrix (a matrix of accuracy estimates that depends on the chosen
parameter), and the chosen parameter. If the chosen parameter was "all",
then each element of this list will contain the optimal set of parameters
and the optimal matrices for "correlation", "Mean Squared Error" and "theta".
Patrick C. N. Martin <pm16057@essex.ac.uk>
Zabet NR, Adryan B (2015) Estimating binding properties of transcription factors from genome-wide binding profiles. Nucleic Acids Res., 43, 84–94.
#Data extraction
data(ChIPanalyserData)

# path to Position Frequency Matrix
PFM <- file.path(system.file("extdata", package = "ChIPanalyser"), "BCDSlx.pfm")

#As an example of genome, this example will run on the Drosophila genome
if (!require("BSgenome.Dmelanogaster.UCSC.dm3", character.only = TRUE)) {
    source("https://bioconductor.org/biocLite.R")
    biocLite("BSgenome.Dmelanogaster.UCSC.dm3")
}
library(BSgenome.Dmelanogaster.UCSC.dm3)
DNASequenceSet <- getSeq(BSgenome.Dmelanogaster.UCSC.dm3)

#Building data objects
GPP <- genomicProfileParameters(PFM = PFM, BPFrequency = DNASequenceSet)
OPP <- occupancyProfileParameters()

#Computing Optimal set of Parameters
optimalParam <- computeOptimal(DNASequenceSet = DNASequenceSet,
    genomicProfileParameters = GPP,
    LocusProfile = eveLocusChip,
    setSequence = eveLocus,
    DNAAccessibility = Access,
    occupancyProfileParameters = OPP,
    parameter = "all",
    peakMethod = "moving_kernel")