pairwiseAlignment {Biostrings} | R Documentation |
Solves the (Needleman-Wunsch) global alignment and (Smith-Waterman) local alignment problems.
pairwiseAlignment(pattern, subject, patternQuality = 22L, subjectQuality = 22L, type = "global", substitutionMatrix = NULL, gapOpening = -10, gapExtension = -4, scoreOnly = FALSE)
pattern |
a character vector of length 1, an XString, or an
XStringSet object. |
subject |
a character vector of length 1 or an XString object. |
patternQuality, subjectQuality |
respective quality scores for pattern and
subject that are used in a quality-based method for generating a substitution
matrix. These scores must be represented by [0 - 99] integer vectors, character
vectors, BString objects, or, in the case of patternQuality,
BStringSet objects. Characters are interpreted as [0 - 99]
quality measures by subtracting 33 from their ASCII decimal representation
(e.g. ! = 0, " = 1, # = 2, ...). These two arguments are ignored if
!is.null(substitutionMatrix). |
type |
type of alignment ("global", "local", "overlap"). |
substitutionMatrix |
constant substitution matrix for the alignment. Do not
use substitutionMatrix in conjunction with patternQuality and
subjectQuality arguments. |
gapOpening |
penalty for opening a gap in the alignment. |
gapExtension |
penalty for extending a gap in the alignment. |
scoreOnly |
logical indicating whether to return only the scores of the optimal pairwise alignments. (See Value section below.) |
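The character encoding described for patternQuality and subjectQuality can be illustrated in base R. This is only a sketch of the "subtract 33 from the ASCII code" rule stated above; decode_quality is a hypothetical helper name and is not part of Biostrings.

```r
# Hypothetical helper (not a Biostrings function): decode a quality string
# into integer scores by subtracting 33 from each character's ASCII code.
decode_quality <- function(qual) {
  stopifnot(is.character(qual), length(qual) == 1L)
  utf8ToInt(qual) - 33L
}

decode_quality("!\"#")  # the characters !, ", # decode to 0, 1, 2
decode_quality("7")     # ASCII 55 - 33 = 22, the default quality in the usage above
```

Under this rule, a quality string of n characters yields an integer vector of length n, one score per sequence position.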
The general implementation is based on Chapter 2 of Haubold and Wiehe (2006). The quality-based method for generating a substitution matrix is based on the Bioinformatics article by Ketil Malde listed in the references below.
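For readers unfamiliar with the underlying dynamic programming, the global alignment score can be sketched in a few lines of base R. This is a simplified illustration, not the package's implementation: it assumes a single linear gap penalty (gap) rather than separate gapOpening and gapExtension values, uses a constant match/mismatch score instead of a substitution matrix, and nw_score is a hypothetical name.

```r
# Simplified global (Needleman-Wunsch style) alignment score.
# Assumptions: linear gap penalty, constant match/mismatch scores.
# nw_score is a hypothetical name; it is not a Biostrings function.
nw_score <- function(x, y, match = 1, mismatch = -3, gap = -2) {
  a <- strsplit(x, "")[[1]]
  b <- strsplit(y, "")[[1]]
  n <- length(a)
  m <- length(b)
  # H[i + 1, j + 1] holds the best score for aligning a[1..i] with b[1..j]
  H <- matrix(0, nrow = n + 1, ncol = m + 1)
  H[, 1] <- gap * (0:n)  # prefix of x aligned entirely against gaps
  H[1, ] <- gap * (0:m)  # prefix of y aligned entirely against gaps
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      s <- if (a[i] == b[j]) match else mismatch
      H[i + 1, j + 1] <- max(H[i, j] + s,        # align a[i] with b[j]
                             H[i, j + 1] + gap,  # a[i] against a gap in y
                             H[i + 1, j] + gap)  # b[j] against a gap in x
    }
  }
  H[n + 1, m + 1]
}

nw_score("ACGT", "ACGT")  # four matches: 4
nw_score("ACGT", "AGT")   # three matches and one gap: 3 - 2 = 1
```

A local alignment variant would additionally clamp each cell at zero and return the maximum over the whole matrix; that variant is not shown here.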
If scoreOnly == FALSE, an instance of class XStringAlign is returned.
If scoreOnly == TRUE, a numeric vector containing the scores for the optimal pairwise alignments is returned.
Patrick Aboyoun and Herve Pages.
B. Haubold, T. Wiehe, Introduction to Computational Biology, Birkhauser Verlag 2006, Chapter 2.
K. Malde, The effect of sequence quality on sequence alignment, Bioinformatics, Feb 23, 2008.
XStringAlign-class, substitution.matrices
## Nucleotide global, local, and overlap alignments
s1 <- DNAString("ACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGTTTTCAAG")
s2 <- DNAString("GTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATC")

# First use a constant substitution matrix
mat <- matrix(-3, nrow = 4, ncol = 4)
diag(mat) <- 1
rownames(mat) <- colnames(mat) <- DNA_ALPHABET[1:4]
globalAlign <-
  pairwiseAlignment(s1, s2, substitutionMatrix = mat,
                    gapOpening = -5, gapExtension = -2)
localAlign <-
  pairwiseAlignment(s1, s2, type = "local", substitutionMatrix = mat,
                    gapOpening = -5, gapExtension = -2)
overlapAlign <-
  pairwiseAlignment(s1, s2, type = "overlap", substitutionMatrix = mat,
                    gapOpening = -5, gapExtension = -2)

# Then use quality-based method for generating a substitution matrix
pairwiseAlignment(s1, s2,
                  patternQuality = rep(c(22L, 12L), times = c(36, 18)),
                  subjectQuality = rep(c(22L, 12L), times = c(40, 20)),
                  scoreOnly = TRUE)

## Amino acid global alignment
pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"),
                  substitutionMatrix = "BLOSUM50",
                  gapOpening = 0, gapExtension = -8)