Section 8.16.1.3: |
This page defines two generic abstract classes named $TEXT_STRING which differ in the number of class arguments they take.
This abstract class defines a state component which is a set of all instantiations of objects of any class sub-typing from this class in addition to the vdm model types used wherever this class name is used. Note that SAME has to be an instantiated class, not an abstract one.
NOTE: See the important note about vdm state in the notes on vdm-sl usage in this specification.
This abstract class characterises the concept of all forms of simple string whether binary, text or other as sequences of the argument class (elements) which must sub-type from $IS_EQ. Classes which sub-type from this shall have immutable semantics!
This feature is the culture and coding which is associated with the string. It need not be the default culture and coding for the environment in which the program is executing, since a program may manipulate culture objects independently of local textual representations.
Since the string has to exist then so does this component. The pre-condition, therefore, is vacuously true.
This is also vacuously true, since it is a component of every string of text.
This feature provides access to all of the cultural and environment dependencies relating to this character string.
This abstract class defines a state component which is a set of all instantiations of objects of any class sub-typing from this class in addition to the vdm model types used wherever this class name is used. Note that SAME has to be an instantiated class, not an abstract one.
NOTE: See the important note about vdm state in the notes on vdm-sl usage in this specification.
This abstract class characterises the concept of a text string as a sequence of the argument class (elements) which must sub-type from $IS_EQ. The second and third class arguments are the 'corresponding' mutable ($FTEXT_STRINGS{ELT}) and immutable (sub-typing from $TEXT_STRINGS{ELT}) string classes. Classes which sub-type from this class shall have immutable semantics!
The provision of this feature is required to permit sub-typing classes to convert binary data into a string of text. It is complemented by the binstr feature defined below.
build(cursor : BIN_CURSOR, lib : LIBCHARS) : SAME
This routine builds the result string from the binary string indicated using the encoding and repertoire defined by lib. If the string indicated by cursor does not contain an integral number of character codes in the given repertoire and encoding then void is returned and the cursor is not moved.
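The all-or-nothing behaviour of build may be illustrated informally in Python (not Sather or vdm-sl). This is a sketch under stated assumptions only: the BinCursor class, its data and pos fields, and the fixed code_width parameter (standing in for the encoding supplied by lib) are hypothetical, as is the assumption that a successful build advances the cursor to the end of the binary string.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BinCursor:
    """Hypothetical stand-in for BIN_CURSOR: a byte buffer and a read position."""
    data: bytes
    pos: int = 0


def build(cursor: BinCursor, code_width: int) -> Optional[List[int]]:
    """Decode the remaining bytes as fixed-width big-endian character codes.

    Returns None (modelling void) and leaves the cursor untouched when the
    remaining bytes do not form a whole number of codes.
    """
    remaining = cursor.data[cursor.pos:]
    if len(remaining) % code_width != 0:
        return None  # cursor.pos is deliberately left unchanged
    codes = [
        int.from_bytes(remaining[i:i + code_width], "big")
        for i in range(0, len(remaining), code_width)
    ]
    cursor.pos = len(cursor.data)  # assumption: success consumes the whole string
    return codes
```

A real encoding may use variable-width codes, in which case the integral-number test would have to be made while decoding rather than by a simple length check.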
This predicate tests whether the string consists entirely of upper-case letters (those defined by the current execution environment cultural specification as being in the class 'upper'). Note that where a script does not define any upper-case letters, or has no case distinction at all, the result will be identically false, even though the characters are letters.
This predicate returns true if and only if every element of self is upper-case, otherwise false. Where there is no case distinction in the script concerned then this returns identically false.
This predicate tests whether the string consists entirely of lower-case letters (those defined by the current execution environment cultural specification as being in the class 'lower'). Note that where a script does not define any lower-case letters, or has no case distinction at all, the result will be identically false, even though the characters are letters.
This predicate returns true if and only if every element of self is lower-case, otherwise false. Where there is no case distinction in the script concerned then this returns identically false.
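The two case predicates may be illustrated informally in Python. This sketch assumes that Python's Unicode character properties stand in for the cultural specification of the execution environment; the function names are chosen for the example. Note that for an empty string the "every element" condition holds vacuously, so the sketch returns true, which this page does not explicitly address.

```python
def is_upper(s: str) -> bool:
    # str.isupper() on one character is true only for cased upper-case
    # characters, so letters from scripts without case give False.
    return all(ch.isupper() for ch in s)


def is_lower(s: str) -> bool:
    # Mirror image of is_upper for the class 'lower'.
    return all(ch.islower() for ch in s)
```

For a string of Han characters, for example, both predicates return false even though every element is a letter.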
Note that the index in this post-condition is incremented by one to take account of the indexing difference between Sather and vdm.
This routine returns the element to be found at the indicated position in self.
This routine creates a copy of self in which all lower case letters are replaced by an upper case equivalent if one exists. Note that there are scripts (e.g. Armenian) which have lower case letters to which there is no corresponding upper case letter. If no upper case equivalent exists then no change is made to a letter code. Non-letter codes are not changed.
upper : SAME
This routine returns a copy of self in which every lower case character has been converted to its upper case equivalent provided one exists.
This routine creates a copy of self in which all upper case letters are replaced by a lower case equivalent. Non-letter codes are not changed.
lower : SAME
This routine returns a copy of self in which every upper case character has been converted to its lower case equivalent.
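The element-by-element nature of the two case conversions may be illustrated informally in Python. The sketch assumes that "an equivalent exists" means a one-to-one mapping to a single character; where Python's own mapping would expand a character into several (as with the German sharp s), the element is left unchanged. That interpretation is an assumption of the example, not a statement of this specification.

```python
def upper(s: str) -> str:
    out = []
    for ch in s:
        up = ch.upper()
        # Convert only lower-case characters with a single-character equivalent.
        out.append(up if ch.islower() and len(up) == 1 else ch)
    return "".join(out)


def lower(s: str) -> str:
    out = []
    for ch in s:
        lo = ch.lower()
        # Convert only upper-case characters with a single-character equivalent.
        out.append(lo if ch.isupper() and len(lo) == 1 else ch)
    return "".join(out)
```

Python strings are immutable, so each function necessarily returns a new string, matching the "copy of self" wording above.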
This routine creates a copy of self in which the first character of each word is converted to its upper case equivalent (if one exists). A word starts either at the first character in the string, unless that is white space or punctuation, or at the first character following a white space or punctuation character, unless that character is itself white space or punctuation.
capitalize : SAME
This routine returns a copy of self in which the first character of every word (from the beginning of the string or after punctuation or a whitespace) is converted to its upper case equivalent if one exists.
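The word-start rule may be illustrated informally in Python. The sketch assumes that punctuation means the ASCII punctuation characters in Python's string.punctuation and that white space is whatever str.isspace() reports; the cultural specification in force would determine both classes in practice.

```python
import string


def capitalize(s: str) -> str:
    out = []
    at_word_start = True  # the first character of the string may start a word
    for ch in s:
        is_break = ch.isspace() or ch in string.punctuation
        if at_word_start and not is_break:
            up = ch.upper()
            # Leave the character alone when no single upper-case equivalent exists.
            out.append(up if len(up) == 1 else ch)
        else:
            out.append(ch)
        at_word_start = is_break
    return "".join(out)
```

Unlike Python's built-in str.title(), the sketch leaves every character other than the first of each word unchanged. Because an apostrophe is punctuation, the character following it is treated as the start of a word.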
This feature returns a text string which is the concatenation of self with itself the given number of times.
This routine returns a new string which contains the contents of self concatenated cnt times.
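As an informal Python illustration, repeated concatenation corresponds to string multiplication. The function name repeat and the rejection of a negative count are assumptions made for the example.

```python
def repeat(s: str, cnt: int) -> str:
    if cnt < 0:
        raise ValueError("cnt must be non-negative")
    # A count of zero gives the empty string.
    return s * cnt
```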
This feature enables arbitrary element substitution to be made over the entire text string.
replace(old_elt : ELT, new_elt : ELT) : SAME
This routine returns a new string which is a copy of self except that each occurrence of old_elt has been replaced by new_elt.
This second variant of this feature enables simple set substitution to be made: any element of self which occurs in the string argument (treated as if it were a set of elements) is replaced by the given replacement element.
replace(test_set : STP, new_elt : ELT) : SAME
This routine returns a copy of self in which all occurrences of any element in test_set are replaced by new_elt.
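The two variants of replace may be illustrated informally in Python. Python has no overloading on argument type, so the sketch uses two hypothetical names, replace_elt and replace_set, for what is a single overloaded feature name in this class.

```python
def replace_elt(s: str, old_elt: str, new_elt: str) -> str:
    # Substitute every occurrence of one element.
    return "".join(new_elt if ch == old_elt else ch for ch in s)


def replace_set(s: str, test_set: str, new_elt: str) -> str:
    # The string argument is used purely as a set of elements.
    members = set(test_set)
    return "".join(new_elt if ch in members else ch for ch in s)
```

The order and multiplicity of elements in test_set have no effect on the result, since only membership is tested.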
This feature returns a copy of self in which every occurrence of elt has been deleted.
remove(elt : ELT) : SAME
This routine returns a copy of self from which all occurrences of elt have been removed.
This feature returns a copy of self in which every occurrence of an element which is in the test_set argument has been deleted. The string argument is treated as if it were a set of elements.
remove(test_set : STP) : SAME
This routine returns a copy of self from which all elements contained in test_set have been removed.
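Both variants of remove may be illustrated informally in Python. As Python cannot overload on argument type, the names remove_elt and remove_set are hypothetical stand-ins for the single overloaded feature name.

```python
def remove_elt(s: str, elt: str) -> str:
    # Drop every occurrence of one element.
    return "".join(ch for ch in s if ch != elt)


def remove_set(s: str, test_set: str) -> str:
    # Drop every element which is a member of the set formed from test_set.
    members = set(test_set)
    return "".join(ch for ch in s if ch not in members)
```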
This routine provides a facility to convert a text string into one with escape elements inserted. This is frequently useful when it is necessary to process the string by some external service which may treat the elements in elist specially unless preceded by an escape element. The elist argument is treated as if it were a set of elements. Note that the elist argument may be empty, in which case the only change which occurs is the duplication of every escape element.
escape(escape : ELT, elist : STP) : SAME
This routine returns a text string which is a copy of self in which all elements occurring in elist - and the escape element itself - are preceded by the escape element.
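The escaping rule may be illustrated informally in Python. The parameter is named esc rather than escape only to avoid clashing with the function name in the example.

```python
def escape(s: str, esc: str, elist: str) -> str:
    # The escape element itself is always treated as special.
    special = set(elist) | {esc}
    out = []
    for ch in s:
        if ch in special:
            out.append(esc)  # precede the special element with the escape
        out.append(ch)
    return "".join(out)
```

With an empty elist the only effect is that each escape element already present is doubled.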
The structure of text consists of a sequence of pages within each of which there are one or more lines of text, any number of which may be empty. This routine strips from the end of the string any number of contiguous line marks (more than one, it will be remembered, denoting blank lines).
strip : SAME
For the purposes of specification a line mark is considered to be a single element in the text string. Where an implementation uses two or more elements then they shall appear as being one for the purposes of addition/removal from a string.
This routine returns a copy of self from the end of which all contiguous line marks have been removed.
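The removal of trailing line marks may be illustrated informally in Python. The sketch assumes that the line mark is the single element "\n", in keeping with the convention stated above that a line mark counts as one element.

```python
LINE_MARK = "\n"  # assumption: one element stands for the line mark


def strip(s: str) -> str:
    end = len(s)
    # Step back over every contiguous line mark at the end of the string.
    while end > 0 and s[end - 1] == LINE_MARK:
        end -= 1
    return s[:end]
```

Note that this differs from Python's built-in str.strip(), which removes all kinds of white space from both ends; here only line marks at the end are removed and trailing spaces are kept.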
This feature returns a copy of self from which the first occurrence (if any) of str has been removed.
minus(str : STP) : SAME
This routine returns a copy of self from which the first (if any) occurrence of str has been deleted.
This variant of the minus feature returns a copy of self from which the first occurrence after the given index position (if any) of str has been removed.
This routine returns a copy of self from which the first (if any) occurrence of str after the starting index has been deleted.
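Both variants of minus may be illustrated informally in Python with one function taking an optional starting index. The sketch assumes a zero-based index and assumes that the search begins at the starting index itself; whether the indexed position is included is not settled by the wording above.

```python
def minus(s: str, sub: str, start: int = 0) -> str:
    idx = s.find(sub, start)
    if idx < 0 or not sub:
        return s  # no occurrence: the copy is identical to self
    # Splice out only the first occurrence found.
    return s[:idx] + s[idx + len(sub):]
```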
This feature corresponds to the elt! feature. This one yields the values of the individual elements of self starting with the one with the highest index and thereafter successively lower indices.
rev! : ELT
Note that the formal name of the iter has been changed, replacing the exclamation mark iter symbol, to give a name acceptable to vdm tools.
This post-condition makes use of the history concept from vdm++ (see the vdm dialect notes).
For quit actions see the specification of the quit statement.
This iter yields the elements of self in reverse order of the indices.
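A Sather iter corresponds loosely to a Python generator, which permits an informal illustration of the reverse traversal. The name rev (without the exclamation mark) is used because Python identifiers cannot contain that symbol.

```python
from typing import Iterator


def rev(s: str) -> Iterator[str]:
    # Start at the highest index and work down to the lowest.
    for i in range(len(s) - 1, -1, -1):
        yield s[i]
```

When the elements are exhausted the generator simply finishes, which plays the part of the quit action in this illustration.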
A text string consists of text elements which may have one or more codes per element (in Telugu or Vietnamese, for example). One of the necessary features of internationalising the required library, therefore, has resulted in the concept of a character code - the class CHAR_CODE. The routines in this section are provided to manipulate these when doing such things as code/character conversion/substitution operations.
All forms of text string require this form of creation operation in order that composition of characters may be effected. This merely returns a text string containing the element denoted by the single code. Note that this code may not be a combining code (see the class UNICODE for further information on this).
Since the code can have any value and the string takes its encoding from that, the pre-condition is vacuously true.
This creation routine returns a single element string formed from the encoding given.
This is the first of a pair of code yielding iters. Do not assume that the number of codes yielded will correspond to the number of elements in the text string. That is only true for text strings in which all elements happen to have a single code!
Note that the formal name of the iter has been changed, replacing the exclamation mark iter symbol, to give a name acceptable to vdm tools.
This post-condition makes use of the history concept from vdm++ (see the vdm dialect notes).
For quit actions see the specification of the quit statement.
This iter yields each individual character encoding in self in sequence using the repertoire and encoding of the text string.
Note that the formal name of the iter has been changed, replacing the exclamation mark iter symbol, to give a name acceptable to vdm tools.
This post-condition makes use of the history concept from vdm++ (see the vdm dialect notes).
For quit actions see the specification of the quit statement.
This iter yields individual character encodings in self in sequence beginning with the first code in the element at the given index in the string.
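The distinction between elements and codes may be illustrated informally in Python. The sketch models a text string as a sequence of elements, each of which is a tuple of one or more integer codes; this representation, the name codes, and the zero-based starting index are all assumptions made for the example.

```python
from typing import Iterator, Sequence, Tuple

Element = Tuple[int, ...]  # hypothetical model: one element, one or more codes


def codes(text: Sequence[Element], start: int = 0) -> Iterator[int]:
    # Yield every code of every element from the element at 'start' onwards.
    for element in text[start:]:
        yield from element
```

For a three-element string whose second element consists of a base code followed by a combining code, the iter yields four codes, which shows why the number of codes must not be assumed to equal the number of elements.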
Comments or enquiries should be made to Keith Hopper. Page last modified: Thursday, 25 May 2000.