From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Progress down under
Date: Thu, 27 May 1999 19:46:34 +1200
Message-ID: <4908aa6d48asgard@waikato.ac.nz>
I must apologise to those to whom I indicated that we would be putting 'pen
to paper' to let you in on what will be/is/has been going on down here -
other teaching commitments have made life a bit tricky. However, I have
now produced three articles (and a couple of spreadsheets) which will appear
separately.
To those who are unaware of what we have been doing I have written a 'New
Image' article to give you background to the scheme of work here, why and
for what reason, etc.
Our work has almost reached the critical stage at which we can release some
beta-test code - we intend to be very thorough, however, and test
everything until we all have the proverbial blue faces! Please don't,
therefore, ask for code until we can give a better forecast. The new Required
Library, Required Resources and boot compiler are unlikely to be available
before August - given the size of the alpha testing task.
Irrespective of what we have done here there has been absolutely no change
to the Sather language as specified in the existing documentation - except
for making visible the 'bit' class which was discussed in the original
documentation - and which we have found to be essential to meet our
objectives.
Have fun reading!
kh
-----------------------------------------------------------------------------
From: Keith Hopper <asgard@wave.co.nz>
Subject: Sather - a New Image?
Date: Thu, 27 May 1999 19:50:56 +1200
Original Ideas
--------------
In order to carry out some research into the problems of
internationalisation and localisation of software we required a
high-quality, clean, safe O-O language. We found Sather which, at first sight,
seemed adequate to our needs.
When we began serious evaluation we discovered a number of significant
short-comings in the confusion between language specification and
implementation. Study of some of the early notes revealed that the
original concept was to define immutable objects using AVAL - specifically
AVAL{BIT}.
Since we required a program which could handle multiple character encodings
in the same program - encodings which could vary from 8 to 32 bits for one
encoding and from one to six encodings per character, we obviously had to
carry out some re-design work on both compiler and library. We therefore
needed to look clearly at every form of conversion between internal values
and external representations (whether text or binary).
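As a minimal sketch of the variation described above, the following Python fragment reports the code-unit width and the number of code units per character under three encodings. The choice of UTF-8, UTF-16 and UTF-32, the sample characters and the name unit_counts are assumptions made for illustration only; the article does not name particular encodings.

```python
# Sample characters (an assumption for illustration):
# 'A', 'a' with macron (U+0101) and the euro sign (U+20AC).
SAMPLES = ["A", "\u0101", "\u20ac"]

def unit_counts(ch):
    """Return {encoding: (bits per code unit, code units used)} for one character."""
    result = {}
    for name, unit_bytes in (("utf-8", 1), ("utf-16-le", 2), ("utf-32-le", 4)):
        encoded = ch.encode(name)
        result[name] = (unit_bytes * 8, len(encoded) // unit_bytes)
    return result

for ch in SAMPLES:
    print(repr(ch), unit_counts(ch))
```

The same character can therefore occupy one 32-bit unit in one encoding and several 8-bit units in another, which is why every conversion between internal values and external representations has to be examined.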
The Plan
--------
In the course of doing this we discovered that there were no facilities for
input/output of binary data except as pseudo-characters which, because of
our requirements, were no longer 'simple' 8-bit codes - there are even
bit-patterns which are not valid character encodings!
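The point about invalid bit-patterns can be sketched in a few lines of Python. UTF-8 is used here purely as an assumed example encoding, and is_valid_text is a hypothetical helper name.

```python
def is_valid_text(data: bytes, encoding: str = "utf-8") -> bool:
    """Report whether the octets form a valid character sequence in the encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False

# Arbitrary binary data: the octet 0xFF can never occur in valid UTF-8.
raw = bytes([0x48, 0x69, 0xFF, 0xFE])
print(is_valid_text(raw))
print(is_valid_text(b"Hi"))
```

Binary data pushed through a character channel may thus be rejected or altered, which is the reason separate binary input/output facilities are wanted.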
We further realised that the existing class INT actually incorporated four
incompatible groups of semantics - those for integers (the abstract number
set Z), those for cardinals (the abstract number set N), those for closed
field arithmetic (a Galois field as used in the C programming language) and
- believe it or not - bit-pattern manipulation!
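To make the four readings concrete, here is a small Python sketch which applies each of them to the same stored pattern. The 32-bit width and all of the function names are assumptions for illustration; they are not taken from the Sather library.

```python
BITS = 32                      # assumed storage width
MASK = (1 << BITS) - 1

def as_integer(pattern):
    """Read the pattern as a two's-complement integer (a member of Z)."""
    pattern &= MASK
    return pattern - (1 << BITS) if pattern & (1 << (BITS - 1)) else pattern

def as_cardinal(pattern):
    """Read the pattern as a non-negative cardinal (a member of N)."""
    return pattern & MASK

def wrapping_add(a, b):
    """Fixed-width arithmetic which silently wraps around, as in C unsigned types."""
    return (a + b) & MASK

def bit_is_set(pattern, index):
    """Treat the pattern as a row of bits and test one of them."""
    return (pattern >> index) & 1 == 1

p = 0xFFFFFFFF
print(as_integer(p), as_cardinal(p), wrapping_add(p, 1), bit_is_set(p, 31))
```

One stored value yields -1, 4294967295, a wrap-around to 0 and a 'set' top bit, depending entirely on which group of semantics is applied.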
After assessing all of these (and a few other) factors we realised that, in
order to use Sather in the way we desired, a considerable part of the
distribution library would need revision and that it would need
considerable extension in relation to strings, binary (bit-pattern)
handling and those features already in the course of being standardised
(such things as money, time and dates). Naturally some modification to the
compiler would be necessary to modify/add built-in features in accordance
with the altered requirements - in practice of course the compiler will
eventually need to be rewritten in order to take account of the new
internationalisation features and, itself, make use of the new library.
A New Compiler
--------------
It was realised that the compiler back end, generating C source text, was
not the most effective way to generate an executable image, particularly
when many of the (contorted) C expressions themselves result in bloated
code. It was therefore decided - as a completely separate exercise
initially - that an attempt would be made to generate assembly language
source text in a parameterised compiler back end. This work is under way,
although it does not yet incorporate every code generation feature (all but
two are done, apart from the parallel code features).
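What 'parameterised' might mean in practice can be sketched as follows. This is a toy illustration under stated assumptions and not the Waikato design: the target tables, the stack-machine strategy and the name generate are all hypothetical, and only small integer constants and addition are handled.

```python
# Hypothetical target descriptions: abstract operations mapped to assembly
# templates. Doubled braces are literal braces for str.format.
TARGETS = {
    "x86": {
        "push_const": ["pushl ${value}"],
        "add": ["popl %ebx", "popl %eax", "addl %ebx, %eax", "pushl %eax"],
    },
    "arm": {
        "push_const": ["mov r0, #{value}", "push {{r0}}"],
        "add": ["pop {{r1}}", "pop {{r0}}", "add r0, r0, r1", "push {{r0}}"],
    },
}

def generate(expr, target):
    """Emit assembly lines for an int or a nested tuple such as ("add", 2, 3)."""
    templates = TARGETS[target]
    if isinstance(expr, int):
        return [line.format(value=expr) for line in templates["push_const"]]
    op, left, right = expr
    lines = generate(left, target) + generate(right, target)
    return lines + [line.format() for line in templates[op]]

for name in TARGETS:
    print(name, generate(("add", 2, 3), name))
```

The tree walk is shared by every target; porting to another CPU then amounts, in this sketch, to supplying another table.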
A New run-time engine
---------------------
The enormous weight of overhead involved in the Sather run-time engine
support, particularly for concurrent programs, has also led to an off-shoot
project - a complete re-design for efficiency - and eventually as a very
small engine indeed which could be run on bare hardware - thus enabling an
operating system to eventually be built in Sather.
Current Status
--------------
The current state of work here at Waikato is as follows :-
a. The new compiler back end is running and only a few things need
to be tidied up before booting the compiler. Entry to Alpha testing
after this is expected to be some time in August.
b. All 3Mbytes of the Required Library source (501 classes
altogether!) and about 1Mbyte of Required Resource files (another 313
files) have at last started Alpha testing (there have been eight
re-writes of all of the string and character code to improve
efficiency!). The move to Beta testing is at the moment uncertain,
but some time in August looks good for this too. This testing
includes of course the new 'Required Compiler' boot version.
c. A new run-time engine (really a very small OS-type kernel code -
expected size much less than 20kbytes for stand-alone and about
40kbytes when interfacing to an OS) protocode should be complete by
the end of June. Work on a full implementation will depend on
availability of student-power!
To Do
-----
Our to-do list includes not only those components of the work which will
probably need doing here (until we have a new compiler completely booted!),
but also a number of sub-projects we would welcome from other contributors.
[If anyone would like a winter in NZ we are quite happy to host academic
staff who may be eligible for sabbatical leave]
Things which we would like to do here (effort permitting) in addition to
fully distributable versions of the above are :-
a. Port the parameterised back-end to another CPU - probably a
(Strong)ARM to assess the effort involved.
b. Design and implement a tree-walking data flow analyser and code
evaluator for the Sather compiler for inclusion in a new production
compiler as a combination of the new back-end and new Required Library
enhancements, using the new library.
c. Re-implement the boot version of the cultural compiler using the
new Required Library. The program takes text form descriptions of
cultural features in the form specified in ISO/IEC 14651 & 14652 and
produces culture independent binary configuration files for use by
Sather (and other?) installations.
d. Produce a free standing code converter utility for re-coding
resource files where natural language translation is not involved.
e. Produce a resource editor for use by a translator when preparing
Sather resources for use in a culture using a different natural
language (and optionally a different character code standard).
Thinking about other tools/enhancements which also need work has led us to
produce skeleton designs for a number of separate libraries - and ideas for
others which other contributors may care to take up (using the boot version
of the Required Library compiler and the beta test version of the Required
Library itself!) :-
a. Windows - not built on already overweight sub-strata! This has
to be an absolute priority design and implementation issue - anyone
wishing to do this work should ask for our skeleton design (which
integrates neatly into the Required Library architecture). We see this
as needing to sit directly on Win32 primitive DLLs, X server
directly(?), RISCOS/NCOS WIMP, etc.
b. Comms/Network library - fully portable of course but including
facilities for serial-port/modem, IrDA, Ethernet, ASTM, etc.
c. CORBA - some people have already indicated possible interest in
doing this.
d. Add full 3-D geometry services to the Required Library Geometric
sub-section.
e. "Compiler" - most programs are involved in extracting meaning
from keyed input or text stream - and providing precisely defined
output. Although these facilities are, indeed, perfectly general,
they are most often associated only with a compiler - hence the
place-holder name!
f. Devices - of any kind!
g. Vector Drawing
h. Painting (pixel map manipulation)
i. Maths - building on top of the few classes in the original Sather
distribution which found themselves in this library!
j. Accounting.
k. Simulation.
Where Next!
-----------
A number of suggestions have been made about modifying the Sather language.
Since the language is one of the most type-safe languages yet devised, we
at Waikato are opposed to any development which would endanger/weaken
this. We foresee that it should be ideally suited to operate in
safety-critical environments and applications; to this end we are looking
carefully at existing loopholes in both language and implementation.
A further area which we believe is much more pressing is that of Library
name spaces and the ability to provide a 'semi-binary' filed form of
generic classes so that new libraries may be produced for use with Sather
by commercial enterprises wishing to retain the rights to not distribute
their source text in order to protect their market!
kh
-----------------------------------------------------------------------------
From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Sather Required Resources
Date: Thu, 27 May 1999 20:19:01 +1200
The task of producing fully internationalised software requires the
elimination of all text literals (strings and individual characters) from
program source. All text needs must be defined in that environment in
which the program is executing.
In addition to this, there is an increasing number of intellectual concepts
(eg date, number, money, address) the textual representation of which is
being parameterised as part of international standardisation efforts.
In addition to these factors, there will inevitably be domain specific
conversions relevant to an application domain, rather than just a specific
program package.
All three of these features have had to be implemented in culturally
independent classes, with or without parameters.
As standards are amended from time to time and new domain specific
libraries are created for Sather so it must be relatively simple to amend
the basic Required Library cultural classes and the concomitant Resources.
Three kinds of resource are provided :-
a. The source and binary forms of culture-dependent specifications
as specified in the particular national cultures in accordance with
ISO/IEC 14652 & 14651.
b. A group of Required Library formatting classes and the
corresponding Required Resources. Note that domain libraries will need
to add their special requirements to this mechanism following the
structure of the Required Library and Required Resources.
c. Lexical token and message formatting data produced in the
language, culture and form of representation (including character
encodings which may even be machine specific!!).
The culture source specification needs compilation into binary form for
internationalised use of the localised culture specification. For this
purpose an auxiliary Cultural compiler called 'build' is provided as an
adjunct to the Required Resources.
The compilation only involves a handful of files. The major cultural
resource requirements are in terms of two kinds of resource :-
a. Value domains which have a discrete domain of values which are
representable by a simple lexical token (ie are not constructed
dynamically as number representations are) - such concepts as 'days of
week', 'standardised character encoding names' (and aliases too!).
b. Messages which contain value formatting expressions for use when
composing responses/requests to a human user.
The data of both of these groups needs to be represented using a local
character set, encoding and appropriate natural language lexical tokens.
Files containing such text need to be transformed for each culture and
coding.
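The two kinds of resource can be sketched in Python as follows. Everything here is an assumption for illustration: the table layout, the keys, the helper names token and message, and the use of de_DE as a second culture (chosen only because its tokens are easy to check). In the scheme described, such data would be held in per-culture resource files rather than in program source.

```python
# Hypothetical in-memory stand-in for per-culture resource files.
RESOURCES = {
    "en_NZ": {
        "domains": {"bool": ["false", "true"], "weekday": ["Monday", "Tuesday"]},
        "messages": {"file_missing": "Cannot find file {name}"},
    },
    "de_DE": {
        "domains": {"bool": ["falsch", "wahr"], "weekday": ["Montag", "Dienstag"]},
        "messages": {"file_missing": "Datei {name} nicht gefunden"},
    },
}

def token(culture, domain, index):
    """Look up the lexical token for a member of a discrete value domain."""
    return RESOURCES[culture]["domains"][domain][index]

def message(culture, key, **values):
    """Compose a message from its culture-specific format expression."""
    return RESOURCES[culture]["messages"][key].format(**values)

print(token("en_NZ", "weekday", 0), "/", token("de_DE", "weekday", 0))
print(message("de_DE", "file_missing", name="a.txt"))
```

The program itself refers only to domain names, indices and message keys; all visible text comes from the resources of the culture in which it is executing.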
The Required Library contains some seventy classes with corresponding
resource files for each culture.
When Required Library testing has been completed, files will be provided
for the international standard (see ISO/IEC 14652) default culture and the
two cultures en_NZ and mi_NZ used in New Zealand. For the US, of course,
there will generally only need to be minor spelling changes and,
potentially, a translation of encodings.
A code conversion facility is part of the Required Library and a simple
code conversion utility will also be distributed.
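What such a code conversion amounts to can be sketched in Python. The function name recode and the pair of encodings are assumptions for illustration; this is not the utility which will be distributed.

```python
def recode(data: bytes, source: str, target: str) -> bytes:
    """Re-code octets from one character encoding to another without translation."""
    return data.decode(source).encode(target)

# 'cafe' with an acute accent on the final letter, held as Latin-1 octets.
latin1 = "caf\u00e9".encode("latin-1")
utf8 = recode(latin1, "latin-1", "utf-8")
print(latin1.hex(), "->", utf8.hex())
```

The text is unchanged; only its representation in octets differs, which is the case of a resource file needing re-coding but no natural language translation.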
kh
-----------------------------------------------------------------------------
From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Language vis a vis Required Library
Date: Thu, 27 May 1999 20:17:05 +1200
The nomenclature in the title refers to the revised implementation of the
compiler and libraries which are being worked on here at Waikato University
- not the 23rd variant of Sather-1.2.
Our work on the compiler has, in fact produced four interim variants :-
a. A version which runs under Win32 and Linux. We have not been
able yet to run distributed programs on these hosts. At the time of
writing it is not intended to pursue this any further.
b. A version of (a) above which runs under RISCOS on a StrongARM
processor, modified for OS file path naming and to circumvent problems
with the heap-based execution stacks!
c. A version of (a) which is having its back end replaced (and a few
necessary fixes to its middle!) by a version producing x86 assembler
source text for the gasm assembler.
d. A version of (a) which has been modified to include the semantics
of AVAL{BIT} and other changes needed to handle multi-octet codes.
This has involved the revision of CHAR to be a 32-bit entity and the
revision of STR to include width and culture references.
Once all of these versions are fully-tested then they will be merged into a
more portable version using the new Required Library, a compiler which it
is hoped will be of production quality. It is intended that this will be
done by specifying each feature in vdm-sl then using that as the basis for
pre/post conditions and implementation and then rigorously testing
according to a formal specification of the language (again using vdm-sl
with the IFAD VDM toolkit to help!).
[Incidentally, that Toolkit will generate C++ - but why couldn't it
generate Sather instead?]
The library design, together with the design of the cultural Resources
mechanism has led to a need to separate the implementation from the
language as far as possible. We have therefore separated those classes
necessary to the language specification from those in the Required Library.
In so doing we have identified a lot of Required Library classes. The
classes in this library are required to implement the basic O-O language
functionality in the presence of culturally dependent specifications of
features. This has been done in accordance with the developing
international standards in that field.
Language Classes
----------------
The Sather language is therefore defined as having the following classes :-
a. Abstractions
(1) $OB - the parent of all classes.
(2) $LOCK - the parent of concurrent synchronisation classes.
(3) $PORT - the parent of all device/channel mechanism classes.
(4) $EXT_IDENT - the parent of all references/handles used in
the execution environment name space.
b. Implementations
(1) BIT - the component out of a sequence of one or more of
which a value may be created. It has two conversion routines
from and to a numeric value in which 'set' is equated to the
numeric value 1, 'clear' to the numeric value 0 - irrespective of
their actual representation!
(2) BOOL - this class is required to implement a stored program
of any kind involving choices. It has language defined values
'true', 'false'. The string representations of these values yield
corresponding names in the culture in which a program is
executing. An object or returned value of this class may be
allocated arbitrary bit-patterns for storage purposes. These are
implementation dependent and need not be the same storage
patterns as those used in other places or on other occasions
which might arise.
(3) REFERENCE - this class derives from $EXT_IDENT and is an
implementation of handle provided by the OS environment in which
a program runs. Apart from this semantic association a value of
type REFERENCE is otherwise an uninterpretable bit-pattern.
c. Constructors
(1) AVAL - constructs a value as a contiguous bit-pattern from
the indicated number of immutable objects - eg BIT (but also,
say, INT - but NOT BOOL!!).
(2) AREF - construct an array of objects of the kind indicated
by the element class. The array so created is stored in an
implementation-defined manner.
(3) "TUP" - we propose to retain the existing class for now, but
are not very happy with it. We would rather envisage a language
specification which stated that in any tuple class the attributes
will be in adjacent locations in the order of occurrence in the
source text, separated only by any padding needed to satisfy
implementation-defined storage alignment requirements. This
means in practice that AVAL{BIT} with 8, 16 and 32 - even 64? -
bits could be placed contiguously. This is very important when
handling external device chip interfaces.
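The layout wished for in the note on tuples can be illustrated with Python's struct module. The field widths follow the 8, 16 and 32 bits mentioned above; the field names and the helper pack_register_block are hypothetical, and the native size printed depends on the platform on which the code is run.

```python
import struct

FIELDS = "BHI"   # unsigned 8-, 16- and 32-bit fields, in source order

# "<" packs the fields contiguously with no padding at all;
# "@" inserts whatever padding the native alignment rules require.
packed_size = struct.calcsize("<" + FIELDS)
native_size = struct.calcsize("@" + FIELDS)

def pack_register_block(status, control, data):
    """Lay the three fields out adjacently, little-endian, in declaration order."""
    return struct.pack("<" + FIELDS, status, control, data)

block = pack_register_block(0x01, 0x0203, 0x04050607)
print(packed_size, native_size, block.hex())
```

Any difference between the two sizes is exactly the alignment padding referred to above; a device interface needs the order and offsets of the fields to be predictable.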
Library Classes
---------------
The Required Library is divided into 18 sections as follows :-
Basic       Binary      Codes       Concurrent
Containers  Cultural    Date-Time   FileSys
Geometric   IO          LowLevel    NonNumeric
Numeric     Opsys       Represent   SatherRT
Strings     Text
- a grand total of 501 classes in 243 files!
An Excel spreadsheet of inheritance and another for inclusions are
available on request - but, be warned - they each occupy over six square
metres of wall space at half scale!
The following are a few brief explanatory notes about the library
sections :-
a. Basic contains the definition of the language classes for
interfacing purposes plus the abstractions which were in the original
distribution library Base section.
b. Binary - probably self-explanatory - BITs, OCTETs, HEXTETs along
with BINSTR, FBINSTR, etc.
c. Codes - all to do with character codes and code strings and code
mapping.
d. Concurrent - all of the Sather 1.2 concurrent library updated for
cultural and other changes.
e. Containers - divided into sub-sections!
f. Cultural - value formatting descriptors, ordering dependencies,
repertoires and other cultural resources.
g. Date-Time - dates, times, elapsed, time-stamps, etc.
h. FileSys - directories, labels, file and search paths.
i. Geometric - included to permit page sizes to be specified as a
SIZE value for internationalisation purposes.
j. LowLevel - for trap/svc/swi, etc interfacing to an OS, etc.
k. NonNumeric - an enumeration facility, bit-sets, bool address,
etc.
l. Numeric - including ranges, rat and money!
m. Opsys - A portable interface abstraction.
n. Represent - all value conversions to and from text strings -
principally as partial classes.
o. SatherRT - SYS only at the moment.
p. Strings - Abstract and partial classes.
q. Text - what it says - but also RUNE, RUNES, FRUNES!!
kh
-----------------------------------------------------------------------------
From: nobbi@cheerful.com
Newsgroups: comp.lang.sather
Subject: Re: Learning Sather
Date: Mon, 31 May 1999 06:56:21 +0200
Michael Dingler <mdingler@mindless.com> wrote:
> Hi folks,
>
> I just wanted to ask -- with the ongoing transition to
> a 'GNU Sather' -- whether learning Sather now seems
> sensible.
> Having to learn yet another OO language when the next
> compiler version considerably breaks even the syntax
> isn't what I was thinking of (C++ scarred me enough ;)
The syntax will definitely not be broken. Maybe there'll be changes to it
somewhere in the future (not in the next few versions!), but they'll only
affect details. You'll not have to unlearn anything you know about the
language.
Only drawback in learning Sather: You may have a hard time going back to C++
or any other "OO"-language because you'll miss many of the features Sather
offers. :-)
On the other hand - even your C++-Programming may improve because
you learn to *think* real OOP.
> So how much will be changed when going from Sather 1.2b
> to GNU Sather? Just the implementation and libraries or
> the whole language? How long do I have to wait for a
> stable version? (of the language, not the compiler)
ICSI 1.2b will become GNU 1.2.0 soon and the 1.2.x-line will be used only for
stabilization (even though 1.2b is very usable already). Parallel to that,
the next release is being prepared down under. It is hard to say how long we
will stay in beta with that. Maybe sometime in fall this year? But as I said
before - this should not affect anybody learning the language.
Ciao,
Nobbi