prev back next

How to write inline C code

Contents

Introduction

Standard smalltalk implementations define a set of primitive operations which are built into a virtual (Smalltalk-) machine; these primitives do the kind of work which cannot be expressed as smalltalk code (for example integer addition) or which is reimplemented for performance reasons (for example copying Arrays).

Smalltalk/X does not support these; instead primitive code is entered as C-code into the methods. This allows the programmer to code everything he/she likes as C-code (beside the traditional primitives).

C-Primitives is included right into the method code and surrounded by the special character sequences '%{' and '%}' (the quotes are not part of it). This character sequence has been choosen since it does not conflict with existing Smalltalk programs.

Inline code can be placed wherever a smalltalk statement may be placed.

Notice, that not all systems support entering primitive code in the SystemBrowser:

examples:

       myPrimitive
       %{
	   printf("hello world\m");
       %}
       !

       anotherPrimitive:aFlag
	   aFlag ifTrue:[
       %{
	       printf("it is true\n");
       %}
	   ] ifFalse:[
       %{
	       printf("no, it is not\n");
       %}
	   ]
       !

       yetAnotherPrimitive:something
	   |aVariable|

	   aVariable := something.
       %{
	   if (aVariable == true)
	       aVariable = nil;
       %}
       . "<- notice the period - %{ ... %} is syntactically a statement
	     which must be separated by '.'-characters "

	   aVariable print
       !

accessing Smalltalk objects from within primitives

object representation

Smalltalk objects consist of a header, followed by the named instance variables (if any), followed by the indexed instance variables (if any). The header includes the size, the class and some additional information required for memory management. The instance variables are actually pointers to the objects.
Since the header is always required, the minimum size is
3 * sizeof(char *)
bytes (call it overhead). On 32bit machines, this is 12 bytes.
For example, the memory used by the "true" object looks like:
	+-------------------------+
	|       class-pointer   --------------> True-class
	+-------------------------+
	|       size  (12)        |
	+-------------------------+
	|       additional info   +
	+-------------------------|
The size field is actually only required for objects with indexed instance variables; for other objects, the size could be fetched from its class object (which also knows how big its instances are).
However, to tune the memory system, all objects include that size info. Later versions may optionally support object without a size field, which would save about 120k-150k bytes of object space in the initial image (there are about 35000-40000 objects initially) and result in a noticable reduction in the memory requirements if many small objects are handled in an application.
Therefore, never access the size field (or any other header field) directly. Instead, use the access macros which are described below. Upcoming systems may be delivered with changed access macros.

Right after the header, any named instance variables are stored.
For example, the object "(100.0 @ 200.0)" is in memory:

	+-------------------------+
	|       class-pointer   --------------> Point-class
	+-------------------------+
	|       size  (20)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|       x instvar       --------------> 100.0 Float object
	+-------------------------+
	|       y instvar       --------------> 200.0 Float object
	+-------------------------+
Since most numbers in the system are SmallIntegers, representing those as above would lead to many small objects. To avoid this (and also to avoid indirect memory references), a special encoding is used to represent these: a bit (the so called TAG bit) marks object pointers from smallInteger values. Since pointers are aligned on word boundaries anyway (they are always even), putting the TAG bit into the low bit does not limit the address space for objects.
However, only 31 bits are available to encode a smallIntegers value.
Do not depend on the TAG bit being in a particular bit position; on some machines, this bit is stored in the sign bit, to make the smallInteger-check faster. Use the check/conversion macros described below.

Although, on 32 bit machines there are actually 2 bits for use as tag bits (alignment is really at least a 4byte one), currently only one is used to tag small integers. Future versions may encode other types (for example, short floats) as tagged type to reduce memory requireements. These changes will be transparent to your primitives source code, if access macros are used everywhere.

To summarize, the object "(100 @ 200)" is in memory:

	+-------------------------+
	|       class-pointer   --------------> Point-class
	+-------------------------+
	|       size  (20)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|TAG|   x instvar 100     |
	+-------------------------+
	|TAG|   y instvar 200     |
	+-------------------------+
BTW: this TAG representation is the reason why "SmallInteger allInstances" does not return any useful result and the SmallInteger class cannot be subclassed since instances do not have a class field. Instead, the _Class() macro described below (and therefore the class message sent to any object 'mimics' a class field pointing to SmallInteger.

The one other special object is nil which is represented by a NULL pointer. The UndefinedObject class is also not subclassable for that reason.

Indexed instance variables are stored after the named instance variables (or after the header if there are none). For example, the array "#('foo' true 30 nil)" looks in memory:

	+-------------------------+
	|       class-pointer   --------------> Array-class
	+-------------------------+
	|       size  (24)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|       instvar (1)     --------------> 'foo' object
	+-------------------------+
	|       instvar (2)     --------------> true object
	+-------------------------+
	|TAG|   instvar (3)  30   |
	+-------------------------+
	|       instvar (4)  NULL |
	+-------------------------+
and an instance of a subclass of Array with one named instvar would look like:
	+-------------------------+
	|       class-pointer     |
	+-------------------------+
	|       size              |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|       named instvar     |
	+-------------------------+
	|       instvar (1)       |
	+-------------------------+
	|       instvar (2)       |
	+-------------------------+
		   ...
	+-------------------------+
	|       instvar (N)       |
	+-------------------------+
ByteArrays, Strings, FloatArrays and DoubleArrays are variable sized too, but do not store pointers to the elements. Instead, the elements are stored as non objects.
Strings and ByteArrays store bytes in the variable part. For example, the string "'hello'" looks in memory:
	+-------------------------+
	|       class-pointer   --------------> String-class
	+-------------------------+
	|       size  (18)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	| 'h' | 'e' |  'l' |  'l' |    
	+-------------------------+
	| 'o' | '\0'| '\0' | '\0' | 
	+-------------------------+
Strings are delimited by a zero-byte ('\0') to make interfacing C string functions (which expect this 0-byte) easier.
The size field reflects the actual number of bytes - however, the memory system will always align memory in full word increments and on a full-word boundary when allocating objects.
As a final example, the doubleArray "#(1.0 2.0 3.0) asDoubleArray" looks in memory:
	+-------------------------+
	|       class-pointer   --------------> FloatArray-class
	+-------------------------+
	|       size  (36)        |
	+-------------------------+
	|       additional info   |
	+-------------------------|
	|                         |    
	|     1.0 as a C double   |
	|                         | 
	+-------------------------+
	|                         |    
	|     2.0 as a C double   |
	|                         | 
	+-------------------------+
	|                         |    
	|     3.0 as a C double   |
	|                         | 
	+-------------------------+
now, it should be clear why FloatArray and DoubleArray are more efficient when storing large amounts of these: there is no overhead in object headers for all the elements.

variables

Within a primitive, a Smalltalk variable "xyz" is accessable in C as:

true, false or nil
to access the corresponding smalltalk object.
'nil' is actually a macro for a constant (usually 0).

@global(xyz)
if xyz is a global variable (other than above) or class variable. (this is a bad example, globals should start with an upper case letter, as in 'Xyz')

_INST(xyz)
if 'xyz' is an instance variable.

_CINST(xyz)
if 'xyz' is a class instance variable. (do not confuse class variables with class instance variables.)

xyz
if 'xyz' is a method variable, method argument, block variable or block argument.

symbols

For selectors (i.e. any symbol), the following name conversion is done: Thus, the selector '#one:two:three:' is called '_one_two_three_' in C,
'#+' is referred to as '__pl', '#,,' as '__coco', '#<=' as '__leeq'" and finally '#==' as '_eqeq'.

Avoid the (mental-) association of 'le' with 'less-equal'. The above names are simply the first two alphabetic letter of their spelled out (english) name.

Since this translation is somewhat inconvenient for the primitive writer (and also, to make future changes possible), stc supports selector translation. A construct of the form:

	@symbol(...)
where '...' is a selector in smalltalks form, will be replaced by the corresponding c name. (the '@'-syntax has been chosen, since it does not conflict with existing C lexical elements (also, a similar mechanism is found in existing objective-c comilers).

PLEASE: do use this feature, since it allows future changes made to the translation mechanism without making existing primitive codes source incompatible.

macros

the following macros are defined for C-primitives:
_Class(x)
returns the class of 'x'

_qClass(x)
same, but 'x' must be non-Nil, non-SmallInteger. Use only if you are certain about this (slightly faster than _Class).

_Size(x)
the size of the object in bytes (0 for nil or integers)

_qSize(x)
same, but 'x' may not be nil or a SmallInteger Use only if you are certain. (slighly faster than the _Size macro)

(x == nil)
to check for nil

__isObject(x)
1 (in C) if 'x' is not a SmallInteger

__isNonNilObject(x) (macro)
1 if 'x' is neither a SmallInteger nor nil (i.e. a 'real' object for which the _q-macros are allowed).

__isSmallInteger(x)
1 if 'x' is a SmallInteger

__bothSmallInteger(x, y)
1 if both 'x' and 'y' are SmallIntegers

__isArray(x)
1 if 'x' is a Array

__isByteArray(x)
1 if 'x' is a ByteArray

__isFloat(x)
1 if 'x' is a Float

__isFraction(x)
1 if 'x' is a Float

__isString(x)
1 if 'x' is a String

__isSymbol(x)
1 if 'x' is a Symbol

__isCharacter(x)
1 if 'x' is a Character

__isPoint(x)
1 if 'x' is a Point

__isBlock(x)
1 if 'x' is a Block

__isBytes(x)
1 if 'x' is ByteArray-like (i.e. ByteArray or subclass)

__isSymbolLike(x)
1 if 'x' is an instance of Symbol or of a subclass

__isMethodLike(x)
1 if 'x' is an instance of Method or of a subclass

__isFloatLike(x)
1 if 'x' is an instance of Float or of a subclass

__isClassLike(x)
1 if 'x' is an instance of Behavior or of a subclass

__intVal(x)
the integer value; 'x' must be a SmallInteger

__floatVal(x)
the double value; 'x' must be a Float

__stringVal(x)
the character-pointer; 'x' must be a String

__symbolVal(x)
the character-pointer; 'x' must be a Symbol

__characterVal(x)
the ascii-value; 'x' must be a Character

__point_X(p)
the x-instance; 'x' must be a Point

__point_Y(p)
the y-instance; 'x' must be a Point

__stringSize(s)
the size of the string-object; 's' must be a String

__arraySize(a)
the size of the array-object; 'a' must be an Array

__byteArraySize(a)
the size of the byteArray-object; 'a' must be a ByteArray

_MKSMALLINT(i)
makes a SmallInteger object with value 'i' (int)

_MKFLOAT(d) (function)
makes a Float object with value 'd' (double) [1]

_qMKFLOAT(d)
same, but faster. However, increases code size somewhat.

_MKSTRING(s) (function)
makes a String object with value 's' (char *) [1]

_MKSYMBOL(s) (function)
makes a Symbol object with name 's' (char *) [1]

_MKCHARACTER(c) (macro)
makes a Character object with asciiValue 'c' (unsigned char)
'c' must be in the range [0 .. 255].

_MKLARGEINT(v) (function)
makes a LargeInteger object with value 'v' (int) [1]

_MKULARGEINT(u) (function)
makes a LargeInteger object with value 'u' (unsigned int) [1]

_STORE(dst, val)
to tell the garbage collector, that a store of val was made into dst; the STORE macro is not needed for local variables (method locals, block locals).
[1] may lead into garbage collector - see below.
After calling any of these [1]-functions, any pointers referencing smalltalk objects or pointing into smalltalk objects (_stringVal) are void. You have to take care of this, either by using the PROTECT macros, or by putting those references into the current context.

Currently (and maybe forever), pointers in ST/X are DIRECT object pointers, meaning that the values of these variables point directly to the underlying objects structure (some of which can be found in the stc.h include file).

Since I cannot guarantee that this statement remains true in the future (indirection makes things like the garbage collector or the become:-operation much much simpler), you should always use access macros such as "_InstPtr(o)->field" instead of "o->field".
These access macros are also defined in stc.h:

_objPtr(o)
same as o, to hide indirection if ever implemented, always use "_objPtr(o)->field" instead of "o->field" !

_InstPtr(o)
to access instvars by index [0 .. instSize-1]

_PointInstPtr(o)
to access x and y of a point

and so on ...

If there will ever be a switch to indirect pointers, only those macros have to be changed instead of all primitive code.

late news:

some compilers seemed to get confused by a definition of both a macro named "_XXX" and a global variable with the same name.

In the current ST/X release, this was true for the _isBlock macro, where a corresponding '_isBlock' c-variable exists (for the isBlock symbol).

The macro has been renamed to __isBlock, which will be done for all and every other macro. It has already been done for some of the above macros. Some of the macros have already been renamed - more will follow. Be prepared for more changes in that area.

returning a value from a primitive

Since some cleanup is required before a method is left, a simple C-return from a primitive will not be sufficient in most cases (consider for example a pending reference to the current context). To deal with these situations correctly, a "RETURN(value)" macro is provided by the runtime system, which does all the housekeeping. For primitives without a context, the macro will expand to a simple return.

example:

	myMethodReturningOne
	%{
	    /* return a SmallInteger */
	    RETURN ( _MKSMALLINT(1) );   
	%}
	!

	mySpecialTrigMethod:arg
	%{
	    if (__isFloat(arg)) {
		/* return a Float */
		RETURN ( _MKFLOAT(sin(exp(arg) * 1.2345) );
	    }
	%}
	.
	    self primitiveFailed
	!

	myMethodReturningSymbol
	%{
	    /* return a Symbol */
	    RETURN ( @symbol(fooBar) );
	    }
	%}
	!

	myMethodReturningString
	%{
	    /* return a Symbol */
	    RETURN ( _MKSTRING("hello world") );
	    }
	%}
	!

	myDestructiveUpperCaseToLowerCase:aString
	%{
	    char *cp;
	    char c;

	    /* check if argument is a string */
	    if (__isString(aString)) {
		/* get the C-character pointer to the characters */
		cp = _stringVal(aString);
		/* walk over string till end (0-byte) is reached */
		while ((c = *cp)) {
		    if (isUpper(c)) {
			*cp = toLower(c);
		    }
		    cp++;
		}
		RETURN ( aString );
	    }
	%}
	.
	    self primitiveFailed
	!

local storage in primitives

primitives involving local storage which hold smalltalk objects AND call other methods and/or allocate new objects, MUST be written with great care, since the garbage collector may run at any time. The garbage collector will move objects around so that your pointers become invalid. The garbage collector will of course update all reachable pointers, to be able to update your pointers, it must know them.

The easiest way of handling this situation is by declaring these locals as method locals (in contrast to c-variables) to these be placed into the context which is updated by the garbage collector:

instead of:

	  aMethodName
	  %{
	      OBJ aLocal;

	       ...
	      do something with aLocal
	       ...
	  %}
it is better to write:
	  aMethodName
	      |aLocal|
	  %{
		...
	      do something with aLocal
		...
	  %}
In the later case, the smalltalk compiler produces code which protects the local variable from beeing garbage collected (by creating a context, which will be fixed by the garbage collector).

Of course, this protection is only needed if your primitive code calls other methods and/or allocates storage - otherwise there is no danger since the garbage collector will only run when new objects are allocated (which is always possible when calling other methods).

For coding examples, see the primitives in "libbasic/SmallInt.st" or "libbasic/Float.st". Also, even though a bit more complex, looking into "libview/XWorkstation.st" also gives a lot of insight.

stack requirements

With the addition of multiple processes to ST/X, care must be taken when calling c library functions which use big stack frames (i.e. which use alloca, or declare big automatic arrays).

ST/X usually prepares a stack big enough for most functions (usually 4k), but some functions need more (to name some: printf, scanf, popen and some Xlib functions). Since no documentation exists on C-library stack requirements, you have to guess, try or otherwise find out what requirements exist. If your system crashes after execution of a primitive, stack violations are first class candidates for being responsible.

To tell the stc compiler, that a primitive needs more stack, a stack declaration may be added - this is done by inserting a comment of the form:

	/* STACK:n */
or
	/* UNLIMITEDSTACK */
into the first line of the primitive (i.e. right after the opening "{").

Detecting the first declaration, stc will make certain that n bytes of stack are available for the method containing that primitive. You may have to guess on what a good value for n is. Taking a bigger value may be less performant, but give you more security.
The second declaration will produce code to switch to the unlimited c-stack for the execution of the method containing that primitive. This stack is grown by the operating system and unlimited (not really, but the limit is typically some 8 or 16 megabytes).

There is one catch in using UNLIMITEDSTACK: these methods may NOT send other smalltalk messages. The reason is that all messages might eventually lead to a process switch into another thread and the c-stack cannot hold frames for multiple processes in a non first-in/last-out order.

To summarize things, your primitive might now look like:

	myMethod

	%{ /* STACK:3100 */

	    char aBigBuffer[3000];
	    ...
	    ...
	%}
Do not fear estimating the stack need - if your estimate is low, there is still a 4k save area; while no memory is lost or runtime penalty to be payed if you estimate too high: after all its just stack memory, which is released with the return of the method.

A rule of thumb is the size of local data arrays (i.e. 3000 bytes of aBigBuffer) plus some 100 bytes for the context and other housekeeping locals. Better estimate too high than too low.

ommiting the context setup

For primitive code which does not send smalltalk messages, it is possible to save the context setup altogether, effectively producing a very simple (and speedy) c function.
This is done by adding a comment of the form "/* NOCONTEXT */" to the first line of the primitive code. Since those methods do not have any provision for updating either arguments, self or any locals, much care must be taken to not loose any pointers in case of a garbage collect.

NOCONTEXT primitives should only be written by experienced users or for methods not allocating memory and not sending other smalltalk messages (i.e. save from entering the garbage collector).

For the curious:

the macros PROTECT(ptr) and UNPROTECT(ptr) can be used to tell the garbage collector about variable values to be updated - see some primitives in libbasic for examples. Thus, with careful use of these PROTECT macros, it is possible to define NOCONTEXT methods even if there is a possibility of garbage collection.
These macros are used as:
	{
	    OBJ myRef;
	    ...
	    PROTECT(myRef);
	    ...
	    do something which may lead to a GC
	    this invalidates myRef
	    ...
	    UNPROTECT(myRef);
	    ...
	    myRef valid again
	}

Register variables / Method locals

The newest sparc version of stc can now put locals into the registers (and other versions will also support this feature in the future). Since stc does not look-into or parse primitive code, you have to tell when register locals are not wanted (for example when the address of such a variable is taken in the primitive).
This is now done using a /* NOREGISTER */ comment - which forces all method/block locals to be allocated as auto-variables, so that an address can be taken.

Not all C-compilers complain when the address is taken of a register variable; some silently make the variable a non register one.
For portability of your code, please use the NOREGISTER pragma even if your compiler is a tolerant one iff your primitive takes the address of the variable.

Multiple of these kludge comment pragmas may be in one comment as in:

	  %{ /* STACK:4096  NOREGISTER */ 
... however, stc does not check if your combination makes sense - using the last one if any conflicts arise.

sending messages from within primitive code

To send a message to an object from within primitive code, use the message send-function:
	static struct inlineCache dummy = _DUMMYILC<i>;

	val = _SEND<i>(receiver, sel, nil, XXX &dummy, arg1, arg2, ... argi)
where i is is the number of arguments to the method, selector is the selector which is a symbol (see above and below), and arg<i> are the arguments. See below for 'XXX'

For the garbage collector to be able to track all references and for the debugger to be able to follow the calling chain, there must be a chain of contexts where the locals and arguments of methods and blocks can be found (BTW: this is also what you get with 'thisContext').

The current context is usually kept in a global variable (or fixed register if gnu-cc is used). Thus, on entry into the methods corresponding c function, the value of this global has to be saved in the sender field of the new context, and the global must be set to point to the new context.
The compiler can optionally generate code, where the sending context is passed as an argument. (this may be slightly faster on some machines)

To faciliate this situation of having different calling conventions, a bunch of macros has been defined in "stc.h", which define things like SENDER, SENDER_COMMA etc. depending on how passing of the sender is to be done. Use of these macros finally makes the above message send look like:

	val = _SEND<i>(receiver, sel COMMA_SENDER, nil &dummy, arg...);
If your message send is often performed, you may like to use the inline caching facility, which keeps the target of the last send and speeds up future sends - effectively making the next send an indirect function call. For hispeed sends, use:
	static struct inlineCache ilc = _ILC<i>;

	val = (*ilc.ilc_func)(receiver, sel COMMA_SENDER, nil, &ilc, args);
here the target will be cached for a quick indirect call. Examples:
      myMethod:argument
	  |local|
      %{
	  static struct inlineCache dummy = _DUMMYILC0;

	  local =_SEND0(argument, @symbol(redraw) COMMA_SENDER, nil, &dummy);
      %}
      !
is equivalent to:
      myMethod:argument
	  |local|

	  local := argument redraw
      !
If you dont use the @(symname) construct, you have to make certain, that the selector is known and defined somewhere. In this case, you should include a "{ Symbol: redraw }" compiler directive; see below for details. The @(..) construct will do this automatically for you.

Notice:

Since the SEND-code produced by the compiler is as good as handwritten code you should avoid writing primitives which send messages, if possible.

(actually the compiler produces the indirect call version, which is MUCH faster than the SEND code and also adds some hints to the cache management which further speeds up sends to self, constants and classes).

Primitives should be written for things which cannot be written in Smalltalk (for example: interface to databases, interface to graphics etc) or which are very time-consuming and can be considerable tuned in c (for example: copying arrays, String search, Image rotation etc.)

A good strategy is to do all nescessary smalltalk stuff before entering the primitive code, and not sending anything from inside. (See examples in XWorkstation.st, Array.st and especially the image manipulation methods in DepthXImage.st etc.)

Also, keep in mind, that the interface will be changed if I get an idea of a faster send, and YOU have to update the code in this case; whereas all high level smalltalk code will not be affected by these changes.

allocating object memory in primitives

In general, direct object allocation (i.e. allocating some bytes of storage and setting the class & instance fields 'manually') should be avoided and banned from all primitives, since direct allocation makes redefinition of the 'new' method invisible to your primitive and also opens the door for many possible errors (setting fields/instvars wrong or forgetting to define or nil-out any field/instvar).

In most cases, it is possible to do the allocation outside of the primitive code as in:

	someMethod
	    |localBytes|

	    localBytes := ByteArray new:10000.
	    %{
		...
		do something with localBytes in the primitive
		...
	    %}

If you really have to allocate in a primitive, here is how its done using a call to the "new" function as:

	newObject = new(size, SENDER);
(size is the number of bytes, SENDER is a macro from stc.h - it MUST be present here)

This returns space for an object with (size - OHDR_SIZE) bytes, where OHDR_SIZE is the space (overhead) required for an object header. The object header includes the size field, the class, and some flags needed by the garbage collector.

Notice, that this function may return nil in case the memory manager has problems allocating the memory. This may happen only in one situation: if the object memory is full, and the operating system is not willing to satisfy a request for more memory. I.e. if the memory requirements hit any virtual memory size limits of the operating system).

Since this does not happen in normal situations forgetting the non-nil test is a common mistake, which gets unnoticed for quite some time.

Since the header is always required, the call is better written as:

	newObject = new(bytesWanted + OHDR_SIZE, SENDER);

Passing the current context as SENDER is nescessary since new() might need to perform a garbage collect and needs a handle to the context chain to find references and update pointers.

Notice that both sending new to a class and the new() function may fail and return nil if the system is running out of memory. You MUST check the value returned and handle the nil case.

The returned space is not initialized - not even cleared. To avoid a later crash of your code or in the garbage collector, you MUST set the class-field and nil-out/or correctly set the instance fields of the new object (for ByteArrays, Strings, Float- and DoubleArray the nilling can be omitted as done in the ByteArrays #uninitializedNew: method).

example: allocate a ByteArray.

	{
	    ...
	    /*
	     * notice, there may be a garbage collect here ...
	     * thus invalidating local pointers, which are neither
	     * in the context, nor have been PROTECTED
	     */
	    newObject = new(100 + OHDR_SIZE, SENDER);
	    if (newObject != nil) {
		/* 
		 * MUST set the class
		 */
		_objPtr(newObject)->o_class = ByteArray;
		/*
		 * except for ByteArrays and Strings, MUST nil-out instvars
		 * but care for the first OHDR_SIZE header bytes - dont clear
		 * those.
		 */
		bzero((char *)_objPtr(newObject) + OHDR_SIZE, 100);
	    }
	    RETURN ( newObject );
	}
example: allocate a Point.
	{
	    ...
	    newPoint = new(sizeof(OBJ)*2 + OHDR_SIZE, SENDER);
	    if (newPoint != nil) {
		/* 
		 * MUST set the class
		 */
		_objPtr(newPoint)->o_class = Point;
		/*
		 * must set the fields
		 */
		_InstPtr(newPoint)->i_instvars[0] = _MKSMALLINT(1);  /* p x:1 */
		_InstPtr(newPoint)->i_instvars[1] = _MKSMALLINT(0);  /* p y:0 */
	    }
	    RETURN ( newPoint );
	}
For the experts: there are also macros for very hi-speed allocation (without a function call). These macros will directly manipulate the storage managers free-pointers, thus allowing an object to be allocated with just a few machine instructions. The macros are:
quickNew(newObject, size, SENDER)
tries to do a quick (inline) new; if no memory is available, calls the general new() for a garbage collect.

canDoQuickNew(size)
returns 1 if a quick new is possible WITHOUT a garbage collect. (i.e. returns zero if a quick new operation would do a GC)

quickCheckedNew(newObject, size)
does the quick new; is only allowed after a canDoQuickNew() which returned true.
Since use of these macros can also lead to mysterious errors, they should not be used in normal situations. (the saving is NOT spectacular under normal circumstances.)

You should start to write your primitive using the normal new() and later decide if its worth to tune the allocation.

Notice, that stc does generate high performance code for some Classes new operations. For example, it may decide to generate inline allocation code for a point or array creation.

allocating C memory in primitives

If C memory is allocated (i.e. malloc or calloc are called either directly or indirectly) care must be taken to cleanup this memory if the primitive performs other sends or if the there is a possibility that the method gets interrupted. Otherwise, there is a chance for a memory leak, by never freeing this C memory.
The method:
	dangerousMethod
	%{
	    char *cp;

	    cp = malloc(100);
	    SEND(...);
	    ...
	    free(cp);
	%}
possibly creates this memory leak (and may crash, since it does not check the return value of malloc for being nil).
Due to the message send in this method (the SEND call), it is possible, that the free() call is never executed. This happens if either a signal raise (with a handler and unwind) or a block return to some upper method, or a process termination occurs in or below the method reached via the send.

Of course, this will not lead to a crash, but instead to more and more memory being allocated over time. This memory will never be freed. These errors are especially hard to find, since smalltalk does not know about it - it will not be shown in the MemoryMonitor or other tool.

There are two possible solutions to fix the above:

modifying objects from within primitive code

Care must also be taken, when an object is stored somewhere within the primitive. To give you some background, the GC must know which objects have references to some special things like contexts. Also cross space references (i.e. old objects referencing new objects or any object referencing a stack-context) must be detected and GC given a chance to remember these. (this is called a write-barrier or store-check)

Therefore, every store of a non-SmallInteger, non-Nil object into another object must be checked for these special situations. To do so, after every store a macro of the form:

	dest->i_inst[3] = value;     /* the store */
	_STORE(dest, value)
must be placed (this macro checks for those situations and calls a GC function in these cases.)

Omitting the STORE macro may leave you with a perfectly running program for a while or longer and lead to a crash at a later time. (sometimes it may even go totally unnoticed up to the time when storage allocation patterns change)

Never forget this macro for pointer stores. Of course, it can be omitted if you can prove that the stored value is either nil or a SmallInteger; also for all bit-arrays (i.e. ByteArray. FloatArray, DoubleArray and String) this store-check is not needed.

Notice: the above example is a very bad one if 'dest ~~ self' - see note below

storing into the receiver

The only store that is actually legal in Smalltalk is a store into instance variables of the current receiver (of course, the bad guy can store into any object from primitive code). To update an instance variable named 'xyz' in the current receiver, the following code is needed:
	_INST(xyz) = value;
	_STORE(self, value);
As a rule: this should normally NOT be done from primitive code; better separate the code into a primitive part and a smalltalk part - let stc handle all these internals.
example:
	myMethod
	    myInstave := someThing.
	%{
	    /* other stuff */
	%}
	.
	    myInstance := somethingElse
Final note: a store check is also needed when storing into a method variable of the home context from within a block context. But this is too tricky to be done by a primitive-writer (I would not do that !) so you better write your blocks in smalltalk.

storing into globals and class variables

A store check is also needed when updating a value into a global variable, or classVariable. After modifying global XYZ, you should add:
	_STORE(Smalltalk, XYZ);
to your code.

For the curious: these macros are found in "stc.h"; all of them can be used in expressions and evaluate to the stored value.

interrupting primitive code

Interrupts (i.e. handling operating system signals like timers or CTRL-C) are not handled immediately in a primitive, but instead simply set a flag and are processed later (with the next send or when the method returns).

For most primitives, this behavior is correct and simplifies the writing of primitive code.

However, primitive code in which a blocking I/O or wait operation occurs, should tell the interrupt system to do an immediate interrupt. Otherwise there was no chance to get out of a blocking read (for example, reading from a pipe to which noone is writing) - not even by pressing CTRL-C or a timer interrupt.

To tell ST/X that interrupts should be handled immediately, a global C-variable can be set before doing the blocking C-call. As in:

	....
	extern int __immediateInterrupt__;
	...
	__immediateInterrupt__ = 1;
	n = read(fd, buffer, count);
	__immediateInterrupt__ = 0;
	...
Notice, that without this flag being set, timer interrupts will not be handled while waiting for input to arrive. Therefore no switching to other (smalltalk-) processes is done. Beside the obvious read() and write() many other C library functions do possibly block.
Of course, also CTRL-C processing is done using signals - therefore even pressing CTRL-C has no effect and there is no chance to get into the debugger.

On the other hand, if an interrupt occurs and is handled immediately, you have to be careful in coding your primitive. Anything can happen there, especially recursive entry into this method, a garbage collect, context unwinds, long returns or even process termination have to be considered.

triggering interrups from C code

As mentioned above, interrupts (i.e. signals) are handled by setting a flag and checking this flag at regular times (with the next message send, when the current context returns or at a loops head).
These interrupt flags can be accessed from C code. Therefore, interrupts can be triggered from C code by setting the corresponding flag to nonNil (typically, they are set to true).
Although possible, you shuld not play around with other interrupt flags than the customInterrupt, which was specially designed for this purpose.
For example, a C signal handler function may trigger this interrupt by calling:
	stxRegisterCustomInterrupt();
to set the flag. and
	stxHandleInterrupt();
to have the runtime system check for immediate interrupts being enabled and perform the interrupt processing sequence as required.

In the smalltalk world, this will send customInterrupt to the CustomInterruptHandler.

Using custom interrupts is especially useful for callBacks and other c functions which want to interrupt smalltalk processing and get immediate response.

problems

conflicting names of local variables and structures/typedefs

Names of C-Structures, structure fields and typedefs may not conflict with the names of method or block local variables. stc will produce wrong code, leading to a syntax error in the C-compilation phase.
Example:
	%{
	    struct abc {
		int field1;
		char field2;
	    };
	%}

	method
	    |local1 field2|
	    ...
will lead to an error, since the name "field2" is used both in a c-structure and as a method local. This may also happen with other C-names (i.e. typedefs, structure names, enum values etc.)

Workaround: rename the local variables.

other name conflicts with structures, macros and typedefs

STC does not always handle instance- and local variable names correctly, which have the same name as existing C-structures or C-typedefs. This also applies to structures defined in stc's standard header file "stc.h". (as in version 2.10.3). So, you have to avoid names such as 'byteArray', 'array' etc.

This will be changed soon - at least for names in "stc.h". Compiling code with such conflicts will usualy lead to errors in the C-compilation phase. Since stc does not parse (and understand) the structure of primitive code, it will not notice this conflict.

hints & tips for writing primitives

The following is a list of common bugs, and can be used as a check list in case of 'mysterous' behavior.