From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Progress down under
Date: Thu, 27 May 1999 19:46:34 +1200
Message-ID: <4908aa6d48asgard@waikato.ac.nz>

I must apologise to those to whom I indicated that we would be putting 'pen to paper' to let you into what will be/is/has been going on down here - other teaching commitments have made life a bit tricky. However, I have now produced three articles (and a couple of spreadsheets) which will appear separately.

To those who are unaware of what we have been doing I have written a 'New Image' article to give you background to the scheme of work here, why and for what reason, etc.

Our work has almost got past the critical stage, after which we can release some beta-test code - we intend to be very thorough, however, and test everything until we all have the proverbial blue faces! Please don't, therefore, ask for code until we can give a better forecast. The new Required Library, Required Resources and boot compiler are unlikely to be available before August - given the size of the alpha testing task.

Irrespective of what we have done here there has been absolutely no change to the Sather language as specified in the existing documentation - except for making visible the 'bit' class which was discussed in the original documentation - and which we have found to be essential to meet our objectives.

Have fun reading!

kh

-----------------------------------------------------------------------------

From: Keith Hopper <asgard@wave.co.nz>
Subject: Sather - a New Image?
Date: Thu, 27 May 1999 19:50:56 +1200

Original Ideas
--------------

In order to carry out some research into the problems of internationalisation and localisation of software we required a high-quality, clean, safe O-O language. We found Sather which, at first sight, seemed adequate to our needs. When we began serious evaluation we discovered a number of significant shortcomings stemming from the confusion between language specification and implementation.

Study of some of the early notes revealed that the original concept was to define immutable objects using AVAL - specifically AVAL{BIT}. Since we required that a single program be able to handle multiple character encodings - encodings which could vary from 8 to 32 bits for one encoding and from one to six encodings per character - we obviously had to carry out some re-design work on both compiler and library. We therefore needed to look clearly at every form of conversion between internal values and external representations (whether text or binary).

The Plan
--------

In the course of doing this we discovered that there were no facilities for input/output of binary data except as pseudo-characters which, because of our requirements, were no longer 'simple' 8-bit 'codes' - there are even bit-patterns which are not valid character encodings!

We further realised that the existing class INT actually incorporated four incompatible groups of semantics - those for integers (the abstract number set Z), those for cardinals (the abstract number set N), those for closed field arithmetic (a Galois field as used in the C programming language) and - believe it or not - bit-pattern manipulation!

After assessing all of these (and a few other) factors we realised that, in order to use Sather in the way we desired, a considerable part of the distribution library would need revision and that it would need considerable extension in relation to strings, binary (bit-pattern) handling and those features already in the course of being standardised (such things as money, time and dates).

Naturally some modification to the compiler would be necessary to modify/add built-in features in accordance with the altered requirements - in practice, of course, the compiler will eventually need to be re-written in order to take account of the new internationalisation features and, itself, make use of the new library.
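
The four groups of semantics attributed to INT in The Plan can be seen by reading one and the same 32-bit pattern in four different ways. What follows is only an illustrative sketch in Python, not Sather and not anything from the Required Library: the 32-bit width and every function name are assumptions made for the example, and 'closed' arithmetic is taken here to mean wrap-around modulo 2**32, as with unsigned arithmetic in C.

```python
# Illustrative sketch only: the width and all names below are assumptions,
# not part of Sather or of the Required Library.
WIDTH = 32
MASK = (1 << WIDTH) - 1


def as_integer(pattern):
    """Read the pattern as a signed integer (member of Z), two's complement."""
    pattern &= MASK
    if pattern & (1 << (WIDTH - 1)):
        return pattern - (1 << WIDTH)
    return pattern


def as_cardinal(pattern):
    """Read the same pattern as a non-negative count (member of N)."""
    return pattern & MASK


def wrapping_add(a, b):
    """Closed arithmetic: the sum wraps modulo 2**WIDTH and never overflows."""
    return (a + b) & MASK


def rotate_left(pattern, count):
    """Pure bit-pattern manipulation; the result has no numeric meaning."""
    count %= WIDTH
    pattern &= MASK
    return ((pattern << count) | (pattern >> (WIDTH - count))) & MASK


pattern = 0xFFFFFFFF
print(as_integer(pattern))               # -1
print(as_cardinal(pattern))              # 4294967295
print(wrapping_add(pattern, 1))          # 0
print(hex(rotate_left(0x80000001, 1)))   # 0x3
```

The same stored pattern yields -1, 4294967295, a wrapped sum of 0, or a rotated pattern depending on which reading is applied, which is one way to see why a single class carrying all four readings is hard to specify cleanly.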

A New Compiler
--------------

It was realised that the compiler back end, generating C source text, was not the most effective way to generate an executable image, particularly when many of the (contorted) C expressions themselves result in bloated code. It was therefore decided - as a completely separate exercise initially - that an attempt would be made to generate assembly language source text in a parameterised compiler back end. This work is under way, although the back end does not yet incorporate every code generation (all but two are done, apart from the parallel code features).

A New run-time engine
---------------------

The enormous weight of overhead involved in the Sather run-time engine support, particularly for concurrent programs, has also led to an off-shoot project - a complete re-design for efficiency - and eventually a very small engine indeed which could be run on bare hardware - thus enabling an operating system eventually to be built in Sather.

Current Status
--------------

The current state of work here at Waikato is as follows :-

a. The new compiler back end is running and only a few things need to be tidied up before booting the compiler. Entry to Alpha testing after this is expected to be some time in August.

b. All 3Mbytes of the Required Library source (501 classes altogether!) and about 1Mbyte of Required Resource files (another 313 files) have at last started Alpha testing (there have been eight re-writes of all of the string and character code to improve efficiency!). The move to Beta testing is at the moment uncertain, but some time in August looks good for this too. This testing includes, of course, the new 'Required Compiler' boot version.

c. A new run-time engine (really a very small OS-type kernel - expected size much less than 20kbytes for stand-alone and about 40kbytes when interfacing to an OS): the prototype code should be complete by the end of June. Work on a full implementation will depend on availability of student-power!

To Do
-----

Our to-do list includes not only those components of the work which will probably need doing here (until we have a new compiler completely booted!), but also a number of sub-projects we would welcome from other contributors. [If anyone would like a winter in NZ we are quite happy to host academic staff who may be eligible for sabbatical leave]

Things which we would like to do here (effort permitting) in addition to fully distributable versions of the above are :-

a. Port the parameterised back-end to another CPU - probably a (Strong)ARM - to assess the effort involved.

b. Design and implement a tree-walking data flow analyser and code evaluator for the Sather compiler, for inclusion in a new production compiler as a combination of the new back-end and new Required Library enhancements, using the new library.

c. Re-implement the boot version of the cultural compiler using the new Required Library. The program takes text form descriptions of cultural features in the form specified in ISO/IEC 14651 & 14652 and produces culture independent binary configuration files for use by Sather (and other?) installations.

d. Produce a free-standing code converter utility for re-coding resource files where natural language translation is not involved.

e. Produce a resource editor for use by a translator when preparing Sather resources for use in a culture using a different natural language (and optionally a different character code standard).

Thinking about other tools/enhancements which also need work has led us to produce skeleton designs for a number of separate libraries - and ideas for others which other contributors may care to take up (using the boot version of the Required Library compiler and the beta test version of the Required Library itself!) :-

a. Windows - not built on already overweight sub-strata!
   This has to be an absolute priority design and implementation issue - anyone wishing to do this work should ask for our skeleton design (which integrates neatly into the Required Library architecture). We see this as needing to sit directly on Win32 primitive DLLs, X server directly(?), RISCOS/NCOS WIMP, etc.

b. Comms/Network library - fully portable of course but including facilities for serial-port/modem, IrDA, Ethernet, ASTM, etc.

c. CORBA - some people have already indicated possible interest in doing this.

d. Add full 3-D geometry services to the Required Library Geometric sub-section.

e. "Compiler" - most programs are involved in extracting meaning from keyed input or a text stream - and providing precisely defined output. Although these facilities are, indeed, perfectly general, they are most often associated only with a compiler - hence the place-holder name!

f. Devices - of any kind!

g. Vector Drawing

h. Painting (pixel map manipulation)

i. Maths - building on top of the few classes in the original Sather distribution which found themselves in this library!

j. Accounting.

k. Simulation.

Where Next!
-----------

A number of suggestions have been made about modifying the Sather language. Since the language is one of the most type-safe languages yet devised, we at Waikato are opposed to any development which would endanger/weaken this. We foresee that it should be ideally suited to operate in safety-critical environments and applications; to this end we are looking carefully at existing loopholes in both language and implementation.

A further area which we believe is much more pressing is that of Library name spaces and the ability to provide a 'semi-binary' filed form of generic classes, so that new libraries may be produced for use with Sather by commercial enterprises wishing to retain the right not to distribute their source text in order to protect their market!

kh

-----------------------------------------------------------------------------

From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Sather Required Resources
Date: Thu, 27 May 1999 20:19:01 +1200

The task of producing fully internationalised software requires the elimination of all text literals (strings and individual characters) from program source. All text must be defined in the environment in which the program is executing. In addition to this, there is an increasing number of intellectual concepts (eg date, number, money, address) the textual representation of which is being parameterised as part of international standardisation efforts. In addition to these factors, there will inevitably be domain specific conversions relevant to an application domain, rather than just a specific program package.

All three of these features have had to be implemented in culturally independent classes, with or without parameters. As standards are amended from time to time and new domain specific libraries are created for Sather, it must be relatively simple to amend the basic Required Library cultural classes and the concomitant Resources.

Three kinds of resource are provided :-

a. The source and binary forms of culture-dependent specifications as specified in the particular national cultures in accordance with ISO/IEC 14652 & 14651.

b. A group of Required Library formatting classes and the corresponding Required Resources. Note that domain libraries will need to add their special requirements to this mechanism following the structure of the Required Library and Required Resources.

c. Lexical token and message formatting data produced in the language, culture and form of representation (including character encodings which may even be machine specific!!).

The culture source specification needs compilation into binary form for internationalised use of the localised culture specification.
For this purpose an auxiliary Cultural compiler called 'build' is provided as an adjunct to the Required Resources. The compilation only involves a handful of files.

The major cultural resource requirements are in terms of two kinds of resource :-

a. Value domains which have a discrete domain of values which are representable by a simple lexical token (ie are not constructed dynamically as number representations are) - such concepts as 'days of week', 'standardised character encoding names' (and aliases too!).

b. Messages which contain value formatting expressions for use when composing responses/requests to a human user.

The data of both of these groups needs to be represented using a local character set, encoding and appropriate natural language lexical tokens. Files containing such text need to be transformed for each culture and coding. The Required Library contains some seventy classes with corresponding resource files for each culture.

When Required Library testing has been completed, files will be provided for the international standard (see ISO/IEC 14652) default culture and the two cultures en_NZ and mi_NZ used in New Zealand. For the US, of course, there will generally only need to be minor spelling changes and, potentially, a translation of encodings. A code conversion facility is part of the Required Library and a simple code conversion utility will also be distributed.

kh

-----------------------------------------------------------------------------

From: Keith Hopper <asgard@wave.co.nz>
Newsgroups: comp.lang.sather
Subject: Language vis a vis Required Library
Date: Thu, 27 May 1999 20:17:05 +1200

The nomenclature in the title refers to the revised implementation of the compiler and libraries which are being worked on here at Waikato University - not the 23rd variant of Sather-1.2.

Our work on the compiler has, in fact, produced four interim variants :-

a. A version which runs under Win32 and linux.
   We have not yet been able to run distributed programs on these hosts. At the time of writing it is not intended to pursue this any further.

b. A version of (a) above which runs under RISCOS on a StrongARM processor, modified for OS file path naming and to circumvent problems with the heap-based execution stacks!

c. A version of (a) which is having its back end replaced (and a few necessary fixes to its middle!) by a version producing x86 assembler source text for the gasm assembler.

d. A version of (a) which has been modified to include the semantics of AVAL{BIT} and other changes needed to handle multi-octet codes. This has involved the revision of CHAR to be a 32-bit entity and the revision of STR to include width and culture references.

Once all of these versions are fully tested they will be merged into a more portable version using the new Required Library, a compiler which it is hoped will be of production quality. It is intended that this will be done by specifying each feature in vdm-sl, then using that as the basis for pre/post conditions and implementation, and then rigorously testing according to a formal specification of the language (again using vdm-sl with the IFAD VDM toolkit to help!). [Incidentally, that Toolkit will generate C++ - but why couldn't it generate Sather instead?]

The library design, together with the design of the cultural Resources mechanism, has led to a need to separate the implementation from the language as far as possible. We have therefore distinguished those classes necessary to the language specification from those in the Required Library. In so doing we have identified a lot of Required Library classes. The classes in this library are required to implement the basic O-O language functionality in the presence of culturally dependent specifications of features. This has been done in accordance with the developing international standards in that field.

Language Classes
----------------

The Sather language is therefore defined as having the following classes :-

a. Abstractions

   (1) $OB - the parent of all classes.

   (2) $LOCK - the parent of concurrent synchronisation classes.

   (3) $PORT - the parent of all device/channel mechanism classes.

   (4) $EXT_IDENT - the parent of all references/handles used in the execution environment name space.

b. Implementations

   (1) BIT - the component out of a sequence of one or more of which a value may be created. It has two conversion routines, from and to a numeric value, in which 'set' is equated to the numeric value 1 and 'clear' to the numeric value 0 - irrespective of their actual representation!

   (2) BOOL - this class is required to implement a stored program of any kind involving choices. It has language defined values 'true' and 'false'. The string representations of these values yield corresponding names in the culture in which a program is executing. An object or returned value of this class may be allocated arbitrary bit-patterns for storage purposes. These are implementation dependent and need not be the same storage patterns as those used in other places or on other occasions which might arise.

   (3) REFERENCE - this class derives from $EXT_IDENT and is an implementation of a handle provided by the OS environment in which a program runs. Apart from this semantic association a value of type REFERENCE is otherwise an uninterpretable bit-pattern.

c. Constructors

   (1) AVAL - constructs a value as a contiguous bit-pattern from the indicated number of immutable objects - eg BIT (but also, say, INT - but NOT BOOL!!).

   (2) AREF - constructs an array of objects of the kind indicated by the element class. The array so created is stored in an implementation-defined manner.

   (3) "TUP" - we propose to retain the existing class for now, but are not very happy with it.
       We would rather envisage a language specification which stated that in any tuple class the attributes will be in adjacent locations in the order of occurrence in the source text, separated only by any padding needed to satisfy implementation-defined storage alignment requirements. This means in practice that AVAL{BIT} with 8, 16 and 32 - even 64? - bits could be placed contiguously. This is very important when handling external device chip interfaces.

Library Classes
---------------

The Required Library is divided into 18 sections as follows :-

    Basic       Binary      Codes       Concurrent  Containers  Cultural
    Date-Time   FileSys     Geometric   IO          LowLevel    NonNumeric
    Numeric     Opsys       Represent   SatherRT    Strings     Text

- a grand total of 501 classes in 243 files! An Excel spreadsheet of inheritance and another for inclusions are available on request - but, be warned - they each occupy over six square metres of wall space at half scale!

The following are a few brief explanatory notes about the library sections :-

a. Basic contains the definition of the language classes for interfacing purposes plus the abstractions which were in the original distribution library Base section.

b. Binary - probably self-explanatory - BITs, OCTETs, HEXTETs along with BINSTR, FBINSTR, etc.

c. Codes - all to do with character codes and code strings and code mapping.

d. Concurrent - all of the Sather 1.2 concurrent library updated for cultural and other changes.

e. Containers - divided into sub-sections!

f. Cultural - value formatting descriptors, ordering dependencies, repertoires and other cultural resources.

g. Date-Time - dates, times, elapsed, time-stamps, etc.

h. FileSys - directories, labels, file and search paths.

i. Geometric - included to permit page sizes to be specified as a SIZE value for internationalisation purposes.

j. LowLevel - for trap/svc/swi, etc interfacing to an OS, etc.

k. NonNumeric - an enumeration facility, bit-sets, bool address, etc.

l. Numeric - including ranges, rat and money!

m. Opsys - a portable interface abstraction.

n. Represent - all value conversions to and from text strings - principally as partial classes.

o. SatherRT - SYS only at the moment.

p. Strings - abstract and partial classes.

q. Text - what it says - but also RUNE, RUNES, FRUNES!!

kh

-----------------------------------------------------------------------------

From: nobbi@cheerful.com
Newsgroups: comp.lang.sather
Subject: Re: Learning Sather
Date: Mon, 31 May 1999 06:56:21 +0200

Michael Dingler <mdingler@mindless.com> wrote:

> Hi folks,
>
> I just wanted to ask -- with the ongoing transition to
> a 'GNU Sather' -- whether learning Sather now seems
> sensible.
> Having to learn yet another OO language when the next
> compiler version considerably breaks even the syntax
> isn't what I was thinking of (C++ scarred me enough ;)

The syntax will definitely not be broken. Maybe there'll be changes to it somewhere in the future (not in the next few versions!), but they'll only affect details. You'll not have to unlearn anything you know about the language.

Only drawback in learning Sather: you may have a hard time going back to C++ or any other "OO" language because you'll miss many of the features Sather offers. :-)

On the other hand - even your C++ programming may improve because you learn to *think* real OOP.

> So how much will be changed when going from Sather 1.2b
> to GNU Sather? Just the implementation and libraries or
> the whole language? How long do I have to wait for a
> stable version? (of the language, not the compiler)

ICSI 1.2b will become GNU 1.2.0 soon and the 1.2.x line will be used only for stabilization (even though 1.2b is very usable already). Parallel to that, the next release is being prepared down under. It is hard to say how long we will stay in beta with that. Maybe sometime in fall this year? But as I said before - this should not affect anybody learning the language.

Ciao,
Nobbi