Smalltalk/X does not support these; instead primitive code is entered as C-code into the methods. This allows the programmer to code everything he/she likes as C-code (beside the traditional primitives).
C-Primitives is included right into the method code and
surrounded by the special character sequences
'%{'
and '%}'
(the quotes are not part of it).
This character sequence has been choosen since it does not conflict with
existing Smalltalk programs.
Inline code can be placed wherever a smalltalk statement may be placed.
Notice, that not all systems support entering primitive code
in the SystemBrowser:
examples:
myPrimitive
%{
printf("hello world\m");
%}
!
anotherPrimitive:aFlag
aFlag ifTrue:[
%{
printf("it is true\n");
%}
] ifFalse:[
%{
printf("no, it is not\n");
%}
]
!
yetAnotherPrimitive:something
|aVariable|
aVariable := something.
%{
if (aVariable == true)
aVariable = nil;
%}
. "<- notice the period - %{ ... %} is syntactically a statement
which must be separated by '.'-characters "
aVariable print
!
accessing Smalltalk objects from within primitives
object representation
Smalltalk objects consist of a header, followed by the
named instance variables (if any), followed by the indexed instance variables (if any).
The header includes the size, the class and some additional information
required for memory management.
The instance variables are actually pointers to the objects.
Since the header is always required, the minimum size
is
3 * sizeof(char *)bytes (call it overhead). On 32bit machines, this is 12 bytes.
"true"
object looks like:
+-------------------------+
| class-pointer --------------> True-class
+-------------------------+
| size (12) |
+-------------------------+
| additional info +
+-------------------------|
The size field is actually only required for objects with indexed instance
variables; for other objects, the size could be fetched from its class
object (which also knows how big its instances are).
Right after the header, any named instance variables are stored.
For example, the object "(100.0 @ 200.0)"
is in memory:
Since most numbers in the system are SmallIntegers,
representing those as above would lead to many small objects.
To avoid this (and also to avoid indirect memory references),
a special encoding is used to represent these:
a bit (the so called TAG bit)
marks object pointers from smallInteger values.
Since pointers are aligned on word boundaries anyway (they are always even),
putting the TAG bit into the low bit
does not limit the address space for objects.
+-------------------------+
| class-pointer --------------> Point-class
+-------------------------+
| size (20) |
+-------------------------+
| additional info |
+-------------------------|
| x instvar --------------> 100.0 Float object
+-------------------------+
| y instvar --------------> 200.0 Float object
+-------------------------+
However, only 31 bits are available to encode a smallIntegers value.
Do not depend on the TAG bit being in a particular bit position;
on some machines, this bit is stored in the sign bit, to make the
smallInteger-check faster.
Use the check/conversion macros described below.
Although, on 32 bit machines there are actually 2 bits for use as tag bits (alignment is really at least a 4byte one), currently only one is used to tag small integers. Future versions may encode other types (for example, short floats) as tagged type to reduce memory requireements. These changes will be transparent to your primitives source code, if access macros are used everywhere.
To summarize, the object "(100 @ 200)"
is in memory:
BTW: this TAG representation is the reason
why
+-------------------------+
| class-pointer --------------> Point-class
+-------------------------+
| size (20) |
+-------------------------+
| additional info |
+-------------------------|
|TAG| x instvar 100 |
+-------------------------+
|TAG| y instvar 200 |
+-------------------------+
"SmallInteger allInstances"
does not
return any useful result and the SmallInteger
class cannot be subclassed since
instances do not have a class field.
Instead, the _Class()
macro described below
(and therefore the class
message sent to any object
'mimics' a class field pointing to
SmallInteger
.
The one other special object is nil
which is represented by a NULL pointer.
The UndefinedObject class is also not subclassable for that reason.
Indexed instance variables are stored after the named instance variables
(or after the header if there are none).
For example, the array "#('foo' true 30 nil)"
looks in memory:
and an instance of a subclass of Array with one named instvar would look
like:
+-------------------------+
| class-pointer --------------> Array-class
+-------------------------+
| size (24) |
+-------------------------+
| additional info |
+-------------------------|
| instvar (1) --------------> 'foo' object
+-------------------------+
| instvar (2) --------------> true object
+-------------------------+
|TAG| instvar (3) 30 |
+-------------------------+
| instvar (4) NULL |
+-------------------------+
ByteArrays, Strings, FloatArrays and DoubleArrays are variable sized too,
but do not store pointers to the elements. Instead, the elements are stored
as non objects.
+-------------------------+
| class-pointer |
+-------------------------+
| size |
+-------------------------+
| additional info |
+-------------------------|
| named instvar |
+-------------------------+
| instvar (1) |
+-------------------------+
| instvar (2) |
+-------------------------+
...
+-------------------------+
| instvar (N) |
+-------------------------+
Strings and ByteArrays store bytes in the variable part.
For example, the string "'hello'"
looks in memory:
Strings are delimited by a zero-byte ('\0') to make interfacing C string
functions (which expect this 0-byte) easier.
+-------------------------+
| class-pointer --------------> String-class
+-------------------------+
| size (18) |
+-------------------------+
| additional info |
+-------------------------|
| 'h' | 'e' | 'l' | 'l' |
+-------------------------+
| 'o' | '\0'| '\0' | '\0' |
+-------------------------+
The size field reflects the actual number of bytes
- however, the memory system
will always align memory in full word increments and
on a full-word boundary when allocating objects.
As a final example, the doubleArray "#(1.0 2.0 3.0) asDoubleArray"
looks in memory:
now, it should be clear why FloatArray and DoubleArray are more efficient
when storing large amounts of these: there is no overhead in object headers
for all the elements.
+-------------------------+
| class-pointer --------------> FloatArray-class
+-------------------------+
| size (36) |
+-------------------------+
| additional info |
+-------------------------|
| |
| 1.0 as a C double |
| |
+-------------------------+
| |
| 2.0 as a C double |
| |
+-------------------------+
| |
| 3.0 as a C double |
| |
+-------------------------+
variables
Within a primitive, a Smalltalk variable "xyz"
is accessable in C as:
true
, false
or nil
nil
' is actually a macro for a constant (usually 0).
@global(xyz)
xyz
is a global variable (other than above) or class variable.
(this is a bad example, globals should start with an
upper case letter, as in 'Xyz'
)
_INST(xyz)
'xyz'
is an instance variable.
_CINST(xyz)
'xyz'
is a class instance variable.
(do not confuse class variables with class instance variables.)
xyz
'xyz'
is a method variable, method argument,
block variable or block argument.
'_'
is prepended
':'
are replaced by underscores;
'+' -> 'pl' '-' -> 'mi'
'*' -> 'mu' '/' -> 'di' '\' -> 'mo'
',' -> 'co' '@' -> 'at' '&' -> 'am'
'<' -> 'le' '>' -> 'gr'
'=' -> 'eq' '~' -> 'ne'
'|' -> 'pi'
and a double underscore '__' prepended
(instead of the single underscore; to distinguish these from symbols named
#pl
etc.).
#'hello world'
becomes '_hello_40world'
)
'#one:two:three:'
is called
'_one_two_three_'
in C,
'#+'
is referred to as '__pl'
,
'#,,'
as '__coco'
,
'#<='
as '__leeq'
"
and finally '#=='
as '_eqeq'
.
Avoid the (mental-) association of 'le'
with 'less-equal'.
The above names are simply the first two alphabetic letter of their
spelled out (english) name.
Since this translation is somewhat inconvenient for the primitive writer
(and also, to make future changes possible), stc supports selector
translation. A construct of the form:
where
@symbol(...)
'...'
is a selector in smalltalks form, will be replaced by the
corresponding c name. (the '@'-syntax has been chosen, since it does not
conflict with existing C lexical elements (also, a similar mechanism is
found in existing objective-c comilers).
PLEASE: do use this feature, since it allows future changes made to the translation mechanism without making existing primitive codes source incompatible.
macros
the following macros are defined for C-primitives:
_Class(x)
'x'
_qClass(x)
'x'
must be non-Nil, non-SmallInteger.
Use only if you are certain about this (slightly faster than _Class
).
_Size(x)
_qSize(x)
'x'
may not be nil or a SmallInteger
Use only if you are certain. (slighly faster than the _Size
macro)
(x == nil)
__isObject(x)
'x'
is not a SmallInteger
__isNonNilObject(x)
(macro)
'x'
is neither a SmallInteger nor nil
(i.e. a 'real' object for which the _q-macros are allowed).
__isSmallInteger(x)
'x'
is a SmallInteger
__bothSmallInteger(x, y)
'x'
and 'y'
are SmallIntegers
__isArray(x)
'x'
is a Array
__isByteArray(x)
'x'
is a ByteArray
__isFloat(x
)
'x'
is a Float
__isFraction(x)
'x'
is a Float
__isString(x)
'x'
is a String
__isSymbol(x)
'x'
is a Symbol
__isCharacter(x)
'x'
is a Character
__isPoint(x)
'x'
is a Point
__isBlock(x)
'x'
is a Block
__isBytes(x)
'x'
is ByteArray-like (i.e. ByteArray or subclass)
__isSymbolLike(x)
'x'
is an instance of Symbol or of a subclass
__isMethodLike(x)
'x'
is an instance of Method or of a subclass
__isFloatLike(x)
'x'
is an instance of Float or of a subclass
__isClassLike(x)
'x'
is an instance of Behavior or of a subclass
__intVal(x)
'x'
must be a SmallInteger
__floatVal(x)
'x'
must be a Float
__stringVal(x)
'x'
must be a String
__symbolVal(x)
'x'
must be a Symbol
__characterVal(x)
'x'
must be a Character
__point_X(p)
'x'
must be a Point
__point_Y(p)
'x'
must be a Point
__stringSize(s)
's'
must be a String
__arraySize(a)
'a'
must be an Array
__byteArraySize(a)
'a'
must be a ByteArray
_MKSMALLINT(i)
'i'
(int)
_MKFLOAT(d)
(function)
'd'
(double) [1]
_qMKFLOAT(d)
_MKSTRING(s)
(function)
's'
(char *) [1]
_MKSYMBOL(s)
(function)
's'
(char *) [1]
_MKCHARACTER(c)
(macro)
'c'
(unsigned char)
'c'
must be in the range [0 .. 255].
_MKLARGEINT(v)
(function)
'v'
(int) [1]
_MKULARGEINT(u)
(function)
'u'
(unsigned int) [1]
_STORE(dst, val)
STORE
macro is not needed
for local variables (method locals, block locals).
Currently (and maybe forever), pointers in ST/X are DIRECT object pointers, meaning that the values of these variables point directly to the underlying objects structure (some of which can be found in the stc.h include file).
Since I cannot guarantee that this statement remains true in the future
(indirection makes things like the garbage collector or the become:-operation
much much simpler), you should always use access macros such as
"_InstPtr(o)->field"
instead of "o->field"
.
These access macros are also defined in stc.h:
_objPtr(o)
"_objPtr(o)->field"
instead of
"o->field"
!
_InstPtr(o)
_PointInstPtr(o)
and so on ...
If there will ever be a switch to indirect pointers, only those macros have to be changed instead of all primitive code.
late news:
some compilers seemed to get confused by a definition of both a macro named"_XXX"
and a global variable with the same name.In the current ST/X release, this was true for the
_isBlock
macro, where a corresponding'_isBlock'
c-variable exists (for theisBlock
symbol).The macro has been renamed to
__isBlock
, which will be done for all and every other macro. It has already been done for some of the above macros. Some of the macros have already been renamed - more will follow. Be prepared for more changes in that area.
"RETURN(value)"
macro is provided by the runtime system,
which does all the housekeeping. For primitives without a context,
the macro will expand to a simple return.
example:
myMethodReturningOne
%{
/* return a SmallInteger */
RETURN ( _MKSMALLINT(1) );
%}
!
mySpecialTrigMethod:arg
%{
if (__isFloat(arg)) {
/* return a Float */
RETURN ( _MKFLOAT(sin(exp(arg) * 1.2345) );
}
%}
.
self primitiveFailed
!
myMethodReturningSymbol
%{
/* return a Symbol */
RETURN ( @symbol(fooBar) );
}
%}
!
myMethodReturningString
%{
/* return a Symbol */
RETURN ( _MKSTRING("hello world") );
}
%}
!
myDestructiveUpperCaseToLowerCase:aString
%{
char *cp;
char c;
/* check if argument is a string */
if (__isString(aString)) {
/* get the C-character pointer to the characters */
cp = _stringVal(aString);
/* walk over string till end (0-byte) is reached */
while ((c = *cp)) {
if (isUpper(c)) {
*cp = toLower(c);
}
cp++;
}
RETURN ( aString );
}
%}
.
self primitiveFailed
!
local storage in primitives
primitives involving local storage which hold smalltalk objects AND
call other methods and/or allocate new objects, MUST be written with
great care, since the garbage collector may run at any time. The garbage
collector will move objects around so that your pointers become invalid.
The garbage collector will of course update all reachable pointers,
to be able to update your pointers, it must know them.
The easiest way of handling this situation is by declaring these locals as method locals (in contrast to c-variables) to these be placed into the context which is updated by the garbage collector:
instead of:
it is better to write:
aMethodName
%{
OBJ aLocal;
...
do something with aLocal
...
%}
In the later case, the smalltalk compiler produces code which protects
the local variable from beeing garbage collected (by creating a context,
which will be fixed by the garbage collector).
aMethodName
|aLocal|
%{
...
do something with aLocal
...
%}
Of course, this protection is only needed if your primitive code calls other methods and/or allocates storage - otherwise there is no danger since the garbage collector will only run when new objects are allocated (which is always possible when calling other methods).
For coding examples, see the primitives in "libbasic/SmallInt.st"
or
"libbasic/Float.st"
. Also, even though a bit more complex, looking into
"libview/XWorkstation.st"
also gives a lot of insight.
stack requirements
With the addition of multiple processes to ST/X, care must be taken when
calling c library functions which use big stack frames (i.e. which use
alloca, or declare big automatic arrays).
ST/X usually prepares a stack big enough for most functions (usually 4k), but
some functions need more
(to name some: printf
, scanf
, popen
and some Xlib
functions).
Since no documentation exists on C-library stack requirements,
you have to guess, try or otherwise find out what requirements exist.
If your system crashes after execution of a primitive, stack violations
are first class candidates for being responsible.
To tell the stc compiler, that a primitive needs more stack, a stack
declaration may be added - this is done by inserting a comment of the form:
or
/* STACK:n */
into the first line of the primitive (i.e. right after the opening
/* UNLIMITEDSTACK */
"{"
).
Detecting the first declaration, stc will make certain that n bytes of
stack are available for the method containing that primitive.
You may have to guess on what a good value for n is. Taking a bigger
value may be less performant, but give you more security.
The second declaration will produce code to switch to the unlimited c-stack
for the execution of the method containing that primitive. This stack
is grown by the operating system and unlimited (not really, but the limit is
typically some 8 or 16 megabytes).
There is one catch in using UNLIMITEDSTACK:
these methods may NOT send other
smalltalk messages. The reason is that all messages might eventually lead to
a process switch into another thread and the c-stack cannot hold frames for
multiple processes in a non first-in/last-out order.
To summarize things, your primitive might now look like:
Do not fear estimating the stack need -
if your estimate is low, there is still a 4k save area;
while no memory is lost or runtime penalty to be payed if you estimate too
high: after all its just stack memory, which is released with the return of
the method.
myMethod
%{ /* STACK:3100 */
char aBigBuffer[3000];
...
...
%}
A rule of thumb is the size of local data arrays (i.e. 3000 bytes of
aBigBuffer) plus some 100 bytes for the context and other housekeeping locals.
Better estimate too high than too low.
ommiting the context setup
For primitive code which does not send smalltalk messages, it is possible to
save the context setup altogether, effectively producing a very simple
(and speedy) c function.
This is done by adding a comment of the form "/* NOCONTEXT */"
to the first
line of the primitive code.
Since those methods do not have any provision for updating either arguments,
self or any locals, much care must be taken to not loose any pointers in case
of a garbage collect.
NOCONTEXT primitives should only be written by experienced users or for methods not allocating memory and not sending other smalltalk messages (i.e. save from entering the garbage collector).
For the curious:
the macrosPROTECT(ptr)
andUNPROTECT(ptr)
can be used to tell the garbage collector about variable values to be updated - see some primitives in libbasic for examples. Thus, with careful use of thesePROTECT
macros, it is possible to define NOCONTEXT methods even if there is a possibility of garbage collection.
These macros are used as:
{ OBJ myRef; ... PROTECT(myRef); ... do something which may lead to a GC this invalidates myRef ... UNPROTECT(myRef); ... myRef valid again }
/* NOREGISTER */
comment - which forces all
method/block locals to be allocated as auto-variables, so that an
address can be taken.
Not all C-compilers complain when the address is taken of a register variable;
some silently make the variable a non register one.
For portability of your code, please use the NOREGISTER
pragma
even if your compiler is a tolerant one iff your primitive takes the address of the variable.
Multiple of these kludge comment pragmas may be in one comment as in:
...
however, stc does not check if your combination makes sense
- using the last one if any conflicts arise.
%{ /* STACK:4096 NOREGISTER */
sending messages from within primitive code
To send a message to an object from within primitive code, use the message
send-function:
where i is is the number of arguments to the method,
selector is the selector which is a symbol (see above and below),
and arg<i> are the arguments. See below for 'XXX'
static struct inlineCache dummy = _DUMMYILC<i>;
val = _SEND<i>(receiver, sel, nil, XXX &dummy, arg1, arg2, ... argi)
For the garbage collector to be able to track all references and for the
debugger to be able to follow the calling chain,
there must
be a chain of contexts where the locals and arguments of methods and blocks
can be found (BTW: this is also what you get with 'thisContext'
).
The current context is usually kept in a global variable (or fixed
register if gnu-cc is used). Thus, on entry into the methods corresponding
c function, the value of this global has to be saved in the sender field
of the new context, and the global must be set to point to the new context.
The compiler can optionally generate code,
where the sending context is passed as an argument.
(this may be slightly faster on some machines)
To faciliate this situation of having different calling conventions,
a bunch of macros has been defined in
"stc.h"
,
which define things like SENDER
, SENDER_COMMA
etc.
depending on how passing of the sender is to be done.
Use of these macros finally makes the above message send look like:
If your message send is often performed, you may like to use the inline
caching facility, which keeps the target of the last send and speeds up
future sends - effectively making the next send an indirect
function call.
For hispeed sends, use:
val = _SEND<i>(receiver, sel COMMA_SENDER, nil &dummy, arg...);
here the target will be cached for a quick indirect call.
Examples:
static struct inlineCache ilc = _ILC<i>;
val = (*ilc.ilc_func)(receiver, sel COMMA_SENDER, nil, &ilc, args);
is equivalent to:
myMethod:argument
|local|
%{
static struct inlineCache dummy = _DUMMYILC0;
local =_SEND0(argument, @symbol(redraw) COMMA_SENDER, nil, &dummy);
%}
!
If you dont use the
myMethod:argument
|local|
local := argument redraw
!
@(symname)
construct, you have to make certain,
that the selector is known and defined somewhere. In this case, you should
include a "{ Symbol: redraw }"
compiler directive; see below for details.
The @(..)
construct will do this automatically for you.
Notice:
Since the SEND-code produced by the compiler is as good as handwritten code you should avoid writing primitives which send messages, if possible.
(actually the compiler produces the indirect call version, which is MUCH faster than the SEND code and also adds some hints to the cache management which further speeds up sends to self, constants and classes).
Primitives should be written for things which cannot be written in Smalltalk (for example: interface to databases, interface to graphics etc) or which are very time-consuming and can be considerable tuned in c (for example: copying arrays, String search, Image rotation etc.)
A good strategy is to do all nescessary smalltalk stuff before entering
the primitive code, and not sending anything from inside.
(See examples in XWorkstation.st
, Array.st
and especially the image manipulation methods in DepthXImage.st
etc.)
Also, keep in mind, that the interface will be changed if I get an idea
of a faster send, and YOU have to update the code in this case; whereas all
high level smalltalk code will not be affected by these changes.
allocating object memory in primitives
In general, direct object allocation (i.e. allocating some bytes of storage
and setting the class & instance fields 'manually') should be avoided and
banned from all primitives,
since direct allocation makes redefinition of the 'new'
method invisible to
your primitive and also opens the door for many possible errors (setting
fields/instvars wrong or forgetting to define or nil-out any field/instvar).
In most cases, it is possible to do the allocation outside of the primitive code
as in:
someMethod
|localBytes|
localBytes := ByteArray new:10000.
%{
...
do something with localBytes in the primitive
...
%}
If you really have to allocate in a primitive, here is how its done using a
call to the "new" function as:
(size is the number of bytes,
newObject = new(size, SENDER);
SENDER
is a macro from stc.h
- it MUST be present here)
This returns space for an object with (size - OHDR_SIZE) bytes, where OHDR_SIZE is the space (overhead) required for an object header. The object header includes the size field, the class, and some flags needed by the garbage collector.
Notice, that this function may return nil in case the memory manager has problems allocating the memory. This may happen only in one situation: if the object memory is full, and the operating system is not willing to satisfy a request for more memory. I.e. if the memory requirements hit any virtual memory size limits of the operating system).
Since this does not happen in normal situations forgetting the non-nil test is a common mistake, which gets unnoticed for quite some time.
Since the header is always required, the call is better written as:
newObject = new(bytesWanted + OHDR_SIZE, SENDER);
Passing the current context as SENDER
is nescessary since new()
might need
to perform a garbage collect and needs a handle to the context chain to
find references and update pointers.
Notice that both sending new
to a class and
the new()
function may fail and return nil if the system
is running out of memory. You MUST check the value returned
and handle the nil case.
The returned space is not initialized - not even cleared. To avoid a later
crash of your code or in the
garbage collector, you MUST set the class-field and nil-out/or correctly set
the instance fields of the new object (for ByteArrays, Strings, Float- and
DoubleArray the nilling can be omitted as done in the
ByteArrays #uninitializedNew:
method).
example: allocate a ByteArray.
example: allocate a Point.
{
...
/*
* notice, there may be a garbage collect here ...
* thus invalidating local pointers, which are neither
* in the context, nor have been PROTECTED
*/
newObject = new(100 + OHDR_SIZE, SENDER);
if (newObject != nil) {
/*
* MUST set the class
*/
_objPtr(newObject)->o_class = ByteArray;
/*
* except for ByteArrays and Strings, MUST nil-out instvars
* but care for the first OHDR_SIZE header bytes - dont clear
* those.
*/
bzero((char *)_objPtr(newObject) + OHDR_SIZE, 100);
}
RETURN ( newObject );
}
For the experts: there are also macros for very hi-speed allocation (without
a function call). These macros will directly manipulate the storage managers
free-pointers, thus allowing an object to be allocated with just a few
machine instructions.
The macros are:
{
...
newPoint = new(sizeof(OBJ)*2 + OHDR_SIZE, SENDER);
if (newPoint != nil) {
/*
* MUST set the class
*/
_objPtr(newPoint)->o_class = Point;
/*
* must set the fields
*/
_InstPtr(newPoint)->i_instvars[0] = _MKSMALLINT(1); /* p x:1 */
_InstPtr(newPoint)->i_instvars[1] = _MKSMALLINT(0); /* p y:0 */
}
RETURN ( newPoint );
}
quickNew(newObject, size, SENDER)
new()
for a garbage collect.
canDoQuickNew(size)
quickCheckedNew(newObject, size)
canDoQuickNew()
which returned true.
You should start to write your primitive using the normal new()
and later
decide if its worth to tune the allocation.
Notice, that stc does generate high performance code for some
Classes new operations. For example, it may decide to generate inline
allocation code for a point or array creation.
allocating C memory in primitives
If C memory is allocated (i.e. malloc
or
calloc
are called either directly or indirectly)
care must be taken to cleanup this memory
if the primitive performs other sends or
if the there is a possibility that the method gets interrupted.
Otherwise, there is a chance for a memory leak, by never freeing this
C memory.
The method:
possibly creates this memory leak (and may crash, since it does not
check the return value of malloc for being nil).
dangerousMethod
%{
char *cp;
cp = malloc(100);
SEND(...);
...
free(cp);
%}
Due to the message send in this method (the SEND
call),
it is possible, that the free()
call is never executed.
This happens if either a signal raise (with a handler and unwind)
or a block return
to some upper method, or a process termination
occurs in or below the method reached via the send.
Of course, this will not lead to a crash, but instead to more and more memory being allocated over time. This memory will never be freed. These errors are especially hard to find, since smalltalk does not know about it - it will not be shown in the MemoryMonitor or other tool.
There are two possible solutions to fix the above:
fixedMethod
%{
static char *saveHandle;
char *cp;
if (saveHandle) {
free(saveHandle);
}
cp = saveHandle = malloc(100);
if (cp) {
SEND(...);
...
free(cp); saveHandle = (char *)0;
}
%}
this will free the memory when the method is called the next time
and the previous free was not executed for some reason.
!MyClass class primitiveVariables!
%{
static char *saveHandle;
%}
! !
...
cleanupMethod
%{
if (saveHandle) {
free(saveHandle);
saveHandle = (char *)0;
}
%}
!
fixedMethod
%{
char *cp;
if (saveHandle) {
free(saveHandle);
}
cp = saveHandle = malloc(100);
if (cp) {
SEND(...);
...
free(cp); saveHandle = (char *)0;
}
%}
Therefore, every store of a non-SmallInteger, non-Nil object into another
object must be checked for these special situations. To do so, after every
store a macro of the form:
must be placed (this macro checks for those situations and calls a GC
function in these cases.)
dest->i_inst[3] = value; /* the store */
_STORE(dest, value)
Omitting the STORE
macro may leave you with a perfectly running program for
a while or longer and lead to a crash at a later time. (sometimes it may
even go totally unnoticed up to the time when storage allocation patterns
change)
Never forget this macro for pointer stores. Of course, it can be omitted if you can prove that the stored value is either nil or a SmallInteger; also for all bit-arrays (i.e. ByteArray. FloatArray, DoubleArray and String) this store-check is not needed.
Notice: the above example is a very bad one if 'dest ~~ self'
- see note below
storing into the receiver
The only store that is actually legal in Smalltalk is a store into instance
variables of the current receiver (of course, the bad guy can store into any
object from primitive code).
To update an instance variable named 'xyz'
in the current receiver, the
following code is needed:
As a rule: this should normally NOT be done from primitive code; better
separate the code into a primitive part and a smalltalk part - let stc handle
all these internals.
_INST(xyz) = value;
_STORE(self, value);
example:
Final note: a store check is also needed when storing into a method variable
of the home context from within a block context. But this is too tricky to
be done by a primitive-writer (I would not do that !) so you better write
your blocks in smalltalk.
myMethod
myInstave := someThing.
%{
/* other stuff */
%}
.
myInstance := somethingElse
storing into globals and class variables
A store check is also needed when updating a value into a global variable,
or classVariable.
After modifying global XYZ, you should add:
to your code.
_STORE(Smalltalk, XYZ);
For the curious: these macros are found in "stc.h"
; all of them can be used
in expressions and evaluate to the stored value.
interrupting primitive code
Interrupts (i.e. handling operating system signals like timers or CTRL-C)
are not handled immediately in a primitive, but instead simply set a flag
and are processed later (with the next send or when the method returns).
For most primitives, this behavior is correct and simplifies the writing of primitive code.
However, primitive code in which a blocking I/O or wait operation occurs, should tell the interrupt system to do an immediate interrupt. Otherwise there was no chance to get out of a blocking read (for example, reading from a pipe to which noone is writing) - not even by pressing CTRL-C or a timer interrupt.
To tell ST/X that interrupts should be handled immediately,
a global C-variable can be set before doing the blocking C-call.
As in:
Notice, that without this flag being set, timer interrupts will not
be handled while waiting for input to arrive. Therefore no switching
to other (smalltalk-) processes is done.
Beside the obvious
....
extern int __immediateInterrupt__;
...
__immediateInterrupt__ = 1;
n = read(fd, buffer, count);
__immediateInterrupt__ = 0;
...
read()
and write()
many
other C library functions do possibly block.
Of course, also CTRL-C processing is done using signals - therefore even
pressing CTRL-C has no effect and there is no chance to get into the debugger.
On the other hand, if an interrupt occurs and is handled immediately,
you have to be careful in coding your primitive. Anything can happen there,
especially recursive entry into this method,
a garbage collect, context unwinds, long returns or even process
termination have to be considered.
triggering interrups from C code
As mentioned above, interrupts (i.e. signals) are handled by setting
a flag and checking this flag at regular times (with the next message send,
when the current context returns or at a loops head).
These interrupt flags can be accessed from C code.
Therefore, interrupts can be triggered from C code by setting the corresponding
flag to nonNil (typically, they are set to true).
Although possible, you shuld not play around with other interrupt flags
than the customInterrupt
, which was specially designed for
this purpose.
For example, a C signal handler function may trigger this interrupt
by calling:
to set the flag.
and
stxRegisterCustomInterrupt();
to have the runtime system check for immediate interrupts being
enabled and perform the interrupt processing sequence as required.
stxHandleInterrupt();
In the smalltalk world, this will send customInterrupt
to the CustomInterruptHandler
.
Using custom interrupts is especially useful for callBacks and other
c functions which want to interrupt smalltalk processing and get
immediate response.
problems
%{
struct abc {
int field1;
char field2;
};
%}
method
|local1 field2|
...
will lead to an error, since the name "field2" is used both in a c-structure
and as a method local. This may also happen with other C-names (i.e. typedefs,
structure names, enum values etc.)
Workaround: rename the local variables.
"stc.h"
. (as in version 2.10.3). So, you have to avoid names such as
'byteArray', 'array' etc.
This will be changed soon - at least for names in "stc.h"
.
Compiling code with such conflicts will usualy lead to errors in the
C-compilation phase. Since stc does not parse (and understand) the structure
of primitive code, it will not notice this conflict.
hints & tips for writing primitives
The following is a list of common bugs, and can be used as a check
list in case of 'mysterous' behavior.
action:
action:
action:
_isString
- which is the C variable of the isString symbol)
action:
the memory layout of subclass instances is not the same as for instances of the class where your primitive is defined. Therefore access to the variable part (the indexed instvars) is wrong (the indexed instvars are stored right behind the named instvars).
action:
Array
and String
,
of which some are prepared to handle subclass instances, others simply
fall back into a super send.
you probably passed the value of the string using the _stringVal()
macro.
This macro returns a (char *)
to the characters of the string object.
At the time you pass this to the c function, everything runs fine and looks ok.
However, if the C function remembers that pointer, the next GC may move the string around,
leaving the (remembered) c pointer pointing to nowhere (actually, it will point to some
random object).
The next access from c will access a totally random object and may lead to a bus error
or segmentation violation.
action:
Copyright © Claus Gittinger Development & Consulting, all rights reserved
(cg@ssw.de)