XStringAlign-class {Biostrings} | R Documentation |
The XStringAlign class is a container for storing an alignment between 2 XString objects of the same subtype.
Before we define the notion of alignment, we introduce the notion of "filled-with-gaps supersequence". A "filled-with-gaps supersequence" of a string s1 is a string S1 that is obtained by inserting zero or more gaps in s1. For example, L-A--ND is a "filled-with-gaps supersequence" of LAND.
An alignment between 2 strings s1 and s2 is made of 2 strings (align1 and align2) that are "filled-with-gaps supersequences" of s1 and s2, respectively, and that have the same length. Note that this common length must be greater than or equal to the lengths of both s1 and s2: nchar(align1) = nchar(align2) >= max(nchar(s1), nchar(s2))
For example, this is an alignment between LAND and LEAVES:
  L-A--ND
  LEAVES-
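The definition above can be checked mechanically in base R. The following is a minimal sketch, assuming "-" is the gap letter; the helper name is_gapped_superseq is hypothetical and is not part of Biostrings.

```r
# Hypothetical helper (not part of Biostrings): TRUE if 'aligned' is a
# "filled-with-gaps supersequence" of 's', assuming "-" is the gap letter.
is_gapped_superseq <- function(aligned, s) {
  identical(gsub("-", "", aligned, fixed = TRUE), s)
}

s1 <- "LAND"
s2 <- "LEAVES"
align1 <- "L-A--ND"
align2 <- "LEAVES-"

is_gapped_superseq(align1, s1)   # removing the gaps from align1 gives back s1
is_gapped_superseq(align2, s2)   # removing the gaps from align2 gives back s2

# Both aligned strings share one length, at least max(nchar(s1), nchar(s2)).
nchar(align1) == nchar(align2)
nchar(align1) >= max(nchar(s1), nchar(s2))
```

All four expressions evaluate to TRUE for this example. Plain character strings are used here instead of XString objects so that the sketch does not depend on any package.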
An alignment can be seen as a compact representation of one set of basic operations that transforms s1 into s2. There are 3 different kinds of basic operations: "insertions" (gaps in align1), "deletions" (gaps in align2), and "replacements" (positions where align1 and align2 hold different letters). The above alignment represents the following basic operations:
  insert E at pos 2
  insert V at pos 4
  insert E at pos 5
  replace by S at pos 6 (N is replaced by S)
  delete at pos 7 (D is deleted)
Note that "insert X at pos i" means that all letters at a position >= i are moved 1 place to the right before X is actually inserted.
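The operations listed above can be read directly off the two aligned strings, column by column. The sketch below does this in base R; the helper name alignment_ops is hypothetical (not part of Biostrings), "-" is assumed to be the gap letter, and positions are taken to be alignment columns, which matches the convention used above.

```r
# Hypothetical helper (not part of Biostrings): list the basic operations
# encoded by two aligned strings, assuming "-" is the gap letter.
alignment_ops <- function(align1, align2) {
  a <- strsplit(align1, "", fixed = TRUE)[[1]]
  b <- strsplit(align2, "", fixed = TRUE)[[1]]
  stopifnot(length(a) == length(b))
  ops <- character(0)
  for (i in seq_along(a)) {
    if (a[i] == "-" && b[i] != "-") {
      # Gap in align1: a letter of s2 is inserted.
      ops <- c(ops, sprintf("insert %s at pos %d", b[i], i))
    } else if (a[i] != "-" && b[i] == "-") {
      # Gap in align2: a letter of s1 is deleted.
      ops <- c(ops, sprintf("delete at pos %d (%s is deleted)", i, a[i]))
    } else if (a[i] != b[i]) {
      # Two different letters: a replacement.
      ops <- c(ops, sprintf("replace by %s at pos %d (%s is replaced by %s)",
                            b[i], i, a[i], b[i]))
    }
  }
  ops
}

ops <- alignment_ops("L-A--ND", "LEAVES-")
writeLines(ops)
```

For the LAND/LEAVES alignment this prints the same five operations as the list above. Columns where both strings hold the same letter produce no operation.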
There are many possible alignments between 2 given strings s1 and s2, and a common problem is to find the one (or the ones) with the highest score, i.e. with the lowest total cost in terms of basic operations.
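To make the idea of "total cost" concrete, here is a toy sketch in base R. It assumes that every insertion, deletion, or replacement costs 1, which is an assumption made for illustration only; pairwiseAlignment scores alignments with a substitution matrix and gap penalties instead. The helper name unit_cost is hypothetical.

```r
# Toy cost (an assumption for illustration): every insertion, deletion, or
# replacement costs 1, so the cost is the number of non-identical columns.
unit_cost <- function(align1, align2) {
  a <- strsplit(align1, "", fixed = TRUE)[[1]]
  b <- strsplit(align2, "", fixed = TRUE)[[1]]
  stopifnot(length(a) == length(b))
  sum(a != b)
}

cost_a <- unit_cost("L-A--ND", "LEAVES-")  # 3 insertions, 1 replacement, 1 deletion
cost_b <- unit_cost("L-AND-", "LEAVES")    # 2 insertions, 2 replacements
c(cost_a, cost_b)
```

Under this toy cost the second alignment (cost 4) is cheaper than the first (cost 5), which shows that two valid alignments of the same pair of strings can differ in cost.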
In the code snippets below, x is an XStringAlign object.
align1(x) and align2(x): The "filled-with-gaps supersequences" of the original strings to align. Note that align1(x) and align2(x) are XString objects of the same subtype and length.
type(x): The type of the alignment ("global", "local", or "overlap").
score(x): The score of the alignment (integer).
length(x) or nchar(x): The length of the alignment, i.e. the common length of align1(x) and align2(x).
alphabet(x): Equivalent to alphabet(align1(x)) (or alphabet(align2(x))).
as.character(x): Converts x to a named character vector of length 2.
H. Pages
pairwiseAlignment, XString-class
library(Biostrings)

s1 <- AAString("LAND")
s2 <- AAString("LEAVES")

## Alignment with non-zero gap penalties
nw1 <- pairwiseAlignment(s1, s2, substitutionMatrix = "BLOSUM50",
                         gapOpening = -3, gapExtension = -1)
nw1
length(nw1)

## Alignment with gap penalties set to 0
nw0 <- pairwiseAlignment(s1, s2, substitutionMatrix = "BLOSUM50",
                         gapOpening = 0, gapExtension = 0)
nw0
length(nw0)  ## Low gap penalties tend to produce longer alignments!
as.character(nw0)