pairwiseAlignment {Biostrings}                R Documentation

Optimal Pairwise Alignment

Description

Solves the global (Needleman-Wunsch), local (Smith-Waterman), and overlap pairwise alignment problems.

Usage

pairwiseAlignment(pattern, subject, patternQuality = 22L, subjectQuality = 22L, type = "global",
                  substitutionMatrix = NULL, gapOpening = -10, gapExtension = -4,
                  scoreOnly = FALSE)

Arguments

pattern: a character vector of length 1, an XString, or an XStringSet object.

subject: a character vector of length 1 or an XString object.

patternQuality, subjectQuality: quality scores for pattern and subject, respectively, used by the quality-based method for generating a substitution matrix. The scores must be given as integer vectors with values from 0 to 99, character vectors, BString objects, or, for patternQuality only, BStringSet objects. Characters are interpreted as quality measures from 0 to 99 by subtracting 33 from their ASCII decimal representation (e.g. ! = 0, " = 1, # = 2, ...). These two arguments are ignored if !is.null(substitutionMatrix).

type: type of alignment ("global", "local", or "overlap").

substitutionMatrix: constant substitution matrix for the alignment. Do not use substitutionMatrix in conjunction with the patternQuality and subjectQuality arguments.

gapOpening: penalty for opening a gap in the alignment.

gapExtension: penalty for extending a gap in the alignment.

scoreOnly: logical indicating whether to return only the scores of the optimal pairwise alignments. (See the Value section below.)
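
The character encoding of quality scores described for patternQuality and subjectQuality can be illustrated with a minimal base R sketch. The helper name decodeQuality is hypothetical and is not part of Biostrings; it only mirrors the "subtract 33 from the ASCII code" rule stated above.

```r
# Hypothetical helper (not a Biostrings function): decode a quality string
# into integer scores by subtracting 33 from each character's ASCII code.
decodeQuality <- function(qualityString) {
  utf8ToInt(qualityString) - 33L
}

decodeQuality("!\"#")  # 0 1 2
decodeQuality("7")     # 22, the default quality shown in Usage
```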

Details

The general implementation is based on Chapter 2 of Haubold and Wiehe (2006). The quality-based method for generating a substitution matrix is based on the Bioinformatics article by Ketil Malde listed in the References section.
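
As a rough illustration of the dynamic programming idea behind global and local alignment scores, the following base R sketch may help. It is not the Biostrings implementation: the function alignScore and its arguments are hypothetical, it returns only a score, it uses a fixed match/mismatch score instead of a substitution matrix, and it assumes a single per-position gap penalty rather than separate gapOpening and gapExtension penalties.

```r
# Hypothetical helper (not part of Biostrings): optimal alignment score
# under a simplified LINEAR gap penalty, for illustration only.
alignScore <- function(pattern, subject, match = 1, mismatch = -3,
                       gap = -2, type = c("global", "local")) {
  type <- match.arg(type)
  p <- strsplit(pattern, "")[[1]]
  s <- strsplit(subject, "")[[1]]
  n <- length(p)
  m <- length(s)

  # scores[i + 1, j + 1] is the best score for p[1..i] against s[1..j].
  scores <- matrix(0, nrow = n + 1, ncol = m + 1)
  if (type == "global") {
    # Leading gaps are penalised in a global alignment.
    scores[, 1] <- gap * (0:n)
    scores[1, ] <- gap * (0:m)
  }

  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      sub <- if (p[i] == s[j]) match else mismatch
      best <- max(scores[i, j] + sub,      # align p[i] with s[j]
                  scores[i, j + 1] + gap,  # p[i] against a gap in subject
                  scores[i + 1, j] + gap)  # s[j] against a gap in pattern
      # A local alignment may restart from zero at any cell.
      scores[i + 1, j + 1] <- if (type == "local") max(0, best) else best
    }
  }

  # Global: score of aligning both full strings; local: best cell anywhere.
  if (type == "global") scores[n + 1, m + 1] else max(scores)
}

alignScore("ACGT", "AGT")                        # 1: three matches, one gap
alignScore("TTACGTT", "GGACGGG", type = "local") # 3: the shared "ACG"
```

The only differences between the two modes in this sketch are the initialisation of the first row and column, the floor at zero, and where the final score is read from.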

Value

If scoreOnly == FALSE, an instance of class XStringAlign is returned. If scoreOnly == TRUE, a numeric vector containing the scores for the optimal pairwise alignments is returned.

Author(s)

Patrick Aboyoun and Herve Pages.

References

B. Haubold, T. Wiehe, Introduction to Computational Biology, Birkhauser Verlag 2006, Chapter 2.

K. Malde, The effect of sequence quality on sequence alignment, Bioinformatics, Feb 23, 2008.

See Also

XStringAlign-class, substitution.matrices

Examples

  library(Biostrings)

  ## Nucleotide global, local, and overlap alignments
  s1 <-
    DNAString("ACTTCACCAGCTCCCTGGCGGTAAGTTGATCAAAGGAAACGCAAAGTTTTCAAG")
  s2 <-
    DNAString("GTTTCACTACTTCCTTTCGGGTAAGTAAATATATAAATATATAAAAATATAATTTTCATC")

  # First use a constant substitution matrix
  mat <- matrix(-3, nrow = 4, ncol = 4)
  diag(mat) <- 1
  rownames(mat) <- colnames(mat) <- DNA_ALPHABET[1:4]
  globalAlign <-
    pairwiseAlignment(s1, s2, substitutionMatrix = mat, gapOpening = -5, gapExtension = -2)
  localAlign <-
    pairwiseAlignment(s1, s2, type = "local", substitutionMatrix = mat, gapOpening = -5, gapExtension = -2)
  overlapAlign <-
    pairwiseAlignment(s1, s2, type = "overlap", substitutionMatrix = mat, gapOpening = -5, gapExtension = -2)

  # Then use quality-based method for generating a substitution matrix
  pairwiseAlignment(s1, s2,
                    patternQuality = rep(c(22L, 12L), times = c(36, 18)),
                    subjectQuality = rep(c(22L, 12L), times = c(40, 20)),
                    scoreOnly = TRUE)

  ## Amino acid global alignment
  pairwiseAlignment(AAString("PAWHEAE"), AAString("HEAGAWGHEE"), substitutionMatrix = "BLOSUM50",
                    gapOpening = 0, gapExtension = -8)

[Package Biostrings version 2.8.18]