Internationalisation

Introduction

The formal specification of the Sather language refers to the need for an implementation to provide facilities for internationalising and localising programs. The model specified includes the implementation of two standards:

ISO/IEC 14651 - International String Ordering - Method for comparing Character Strings and Description of the Common Template Tailorable Ordering

ISO/IEC 14652 - Information Technology - Specifications for Cultural Conventions

These standards are available from national standards bodies (but see below), a list of whose members may be obtained from ISO.

The need to be able to run a program without alteration in whatever cultural environment it may be has meant that the Required Library design includes a number of classes specifically for use where a variety of cultures may be in use. Any full implementation of the internationalised Sather language must therefore include facilities for converting a cultural specification (as given in the appropriate standards and as may be registered with national standards bodies) into a form suitable for use when running Sather programs.

The notes given below are guidelines for programmers wishing to produce fully internationalised programs for properly localised execution environments.

Culture

Throughout the world there are many hundreds of different cultures, the diversity of which is essential to the continuing existence of mankind. In order to ensure that all cultures can make use of computing technology according to their needs, the ability to execute programs in environments and cultures different from the original programmer's becomes a major direction for standardisation.

The International Organization for Standardization (ISO), recognising this challenge, has a number of working groups trying to develop acceptable standards for this purpose. The specific working group developing internationalisation standards for programming is ISO/IEC JTC1/SC22/WG20. This group has a number of projects among which are -

The last of these is designed to provide a method for enabling national (and other) standards bodies to specify their own cultural conventions in accordance with the 14652 standard - and then register that specification for use internationally. It is this which provides a mechanism for cultural standards for which the API (15435 standard) specifies an implementation interface.

The first standard in this list is a very comprehensive mechanism for ordering strings of text independently of the coding which may be used for individual text components (characters, ideographs, etc.). It is a very important part of the whole group of standards and needs some care in reading!
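To illustrate why ordering should not depend on character codes, the following Python sketch compares a plain code-point sort with a toy two-level key. It is only an illustration: the weights here are derived from Unicode decomposition, not from the ISO/IEC 14651 Common Template Table, and the name `toy_sort_key` is hypothetical.

```python
import unicodedata

def toy_sort_key(text):
    """Return a two-level key: base letters first, accents as a tie-breaker.

    This is NOT the ISO/IEC 14651 algorithm, only a toy with a similar shape.
    """
    decomposed = unicodedata.normalize("NFD", text)
    # Level 1: base characters, case-folded, with combining marks removed.
    primary = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    # Level 2: the combining marks, consulted only when level 1 is equal.
    secondary = tuple(ch for ch in decomposed if unicodedata.combining(ch))
    return (primary, secondary)

# "\u00c4pfel" is the word "Äpfel" written with an explicit escape.
words = ["zebra", "\u00c4pfel", "apple"]

by_code_point = sorted(words)                # order follows the character codes
by_toy_key = sorted(words, key=toy_sort_key) # order follows the letters
```

Sorting by code point places the accented word after "zebra" simply because its first character has a larger code; the toy key places it beside the other words beginning with "a", whatever encoding was used to store the text.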

From the point of view of a Sather programmer, all of these put together have the following consequences for a program which is designed to be portable across cultures :-

Resources

The resources of a class consist of the list of messages which it needs in interacting with the program user. For the purposes of the Sather implementation, features specified in the standards are built into the Required Library. The resource feature is designed around the concept that a resource file will contain an entry giving a class name followed by the name of the file containing these messages. The programmer must, of course, produce the 'message' file (the technical details are given in the specification's Resource section).

At some point during execution, when it is first necessary to emit such a message (say using the class REPORTER), the file needs to be read. This is done using the RESOURCES::read feature, which takes a class name in the form of RUNES (see that class specification) and a count of the number of messages expected, as a simple cross-check. It returns an array of format strings.
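The lookup described above can be sketched in Python as follows. This is not the Sather Required Library interface: the function name `read_messages` and both file layouts (one 'CLASS_NAME file_name' entry per line in the resource file, one format string per line in the message file) are assumptions made for illustration; the real formats are those given in the specification.

```python
import os
import tempfile

def read_messages(resource_index_path, class_name, expected_count):
    """Look up class_name in the resource file and load its message file.

    Assumed (hypothetical) layouts: the resource file holds
    'CLASS_NAME file_name' per line; the message file holds one
    format string per line.
    """
    base_dir = os.path.dirname(resource_index_path)
    with open(resource_index_path, encoding="utf-8") as index:
        for line in index:
            parts = line.split(maxsplit=1)
            if len(parts) == 2 and parts[0] == class_name:
                message_path = os.path.join(base_dir, parts[1].strip())
                break
        else:
            raise KeyError(f"no resource entry for class {class_name}")
    with open(message_path, encoding="utf-8") as messages:
        formats = [line.rstrip("\n") for line in messages]
    # The count is only a cross-check that program and file agree.
    if len(formats) != expected_count:
        raise ValueError(
            f"{class_name}: expected {expected_count} messages, "
            f"found {len(formats)}"
        )
    return formats

# Usage with throwaway files standing in for a real installation.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "resources"), "w", encoding="utf-8") as f:
        f.write("REPORTER reporter.msg\n")
    with open(os.path.join(tmp, "reporter.msg"), "w", encoding="utf-8") as f:
        f.write("File not found\nCannot open file\n")
    loaded = read_messages(os.path.join(tmp, "resources"), "REPORTER", 2)
```

Passing the expected count lets a stale or truncated message file be detected when it is first read, rather than when a missing message is first needed.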

Since the syntactic order of expression in natural languages may differ markedly for the same meaning, it is essential to use the 'index' mechanism so that translations may re-order the values to be plugged in while the program remains unaltered.
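A minimal Python sketch of the idea follows. The `{0}` and `{1}` slots are Python's own indexed placeholders, used here only as a stand-in; the index syntax of Sather format strings may differ, and the messages and the helper `emit` are hypothetical.

```python
# Two versions of the same message. The second stands in for a translation
# whose grammar needs the values in the opposite order.
original = "{0} could not open {1}"
reordered = "{1} could not be opened by {0}"

def emit(format_string, *values):
    """Fill the indexed slots; the caller's argument order never changes."""
    return format_string.format(*values)

# The program always supplies (class name, file name) in the same order.
first = emit(original, "REPORTER", "data.txt")
second = emit(reordered, "REPORTER", "data.txt")
```

Because each slot names the value it wants by index, only the message file changes between the two versions; the call in the program is identical.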

Coding

Since these message files may be needed not only in a variety of languages, but also in a range of character encodings for different machines and/or operating systems, it is necessary to have the ability to convert from one encoding to another. This requires an implementation to provide such a re-encoding utility (see the auxiliary program section of these notes for discussion).
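What such a re-encoding step does can be sketched in Python. This is not the auxiliary program referred to above; the function name `reencode` and the choice of ISO 8859-1 and UTF-8 as example encodings are assumptions for illustration.

```python
def reencode(data, source_encoding, target_encoding):
    """Decode bytes from one encoding, then encode the text in another."""
    text = data.decode(source_encoding)
    return text.encode(target_encoding)

# "\u00e9" is the letter e with an acute accent, written as an escape.
message = "Fichier non trouv\u00e9"
latin1_bytes = message.encode("iso-8859-1")   # accented letter is one byte
utf8_bytes = reencode(latin1_bytes, "iso-8859-1", "utf-8")  # now two bytes
```

The text of the message is unchanged; only its byte representation differs, which is why the same message file can serve machines that expect different encodings once it has been converted.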


Comments or enquiries should be made to Keith Hopper.
Page last modified: Tuesday, 17 October 2000.