prev back next

Working with processes

Contents

Introduction

Smalltalk/X provides facilities for lightweight processes (also called threads). These are implemented within smalltalk itself i.e. the implementation does not depend on any particular operating system support.
Doing so has some advantages:

If the system is running on Unix you should keep in mind that:

Lightweight processes should not be confused with Unix's processes. Unix processes have separate address-spaces, meaning that once created, such a process can no longer access and/or modify objects from the parent process. Communication must be done explicit via interprocess mechanisms, such as shared memory, sockets, pipes or files.
Also, once forked, files or resources opened by a Unix subprocess are not automatically visible and/or accessable from the parent process. (This is also true for other child processes).

In contrast, smalltalk processes all run in one address space, therefore communication is possible via objects. They share open files, window system- and other operating system resources.

Each smalltalk process has its own automatically growing stack. There is no need for the programmer to preallocate or otherwise predefine the stack size. For security (and to make your life easier), the stack size is limited by a soft limit, which can be set at any time by aProcess setMaximumStackSize:limit.
When hit, this limit will raise a smalltalk signal, which can be cought and handled by the smalltalk exception mechanism. It is even possible to change the limit within the exception handler and continue execution of the process with more stack (see examples in "doc/coding").

Shared access to objects

As usual, some danger is involved in modifying shared data, this is especially true, since in general, no interlocks have been built into the normal classes. For example, concurrent add/removal to an orderedCollection can lead to invalid contents (since its internal information may become corrupted).

For the curious:

Instances of OrderedCollection hold some internal indices to keep track of first and last elements inside its container array. If a process switch occurs during update of these values, the stored value could be invalid.
However, to support multiple process synchronization, there are some classes available, which do handle concurrent access. Of special interrest are: To fix the above problem, the following code-fragment will help:

setup:

	|sema sharedCollection|

	...
	sema := Semaphore forMutualExclusion.
	sharedCollection := OrderedCollection new.
	...
writer:
	...
	sema critical:[
	    sharedCollection addLast:something
	].
	...
reader:
	...
	sema critical:[
	    something := sharedCollection removeFirst
	].
	...
Simple reader/writer applications are best implemented using SharedQueues, (which are basically implemented much like above code).
Thus, the above becomes:
	|queue|

	...
	queue := SharedQueue new.
	...
writer:
	...
	queue nextPut:something.
	...
reader:
	...
	something := queue next
	...
The smalltalk classes could be rewritten to add interlocks at every possible access in those containers (actually, many other classes such as the complete View hierarchy must be rewritten too).
This has not been done, mainly for performance reasons. The typical case is that most objects are not used concurrently - thus the overhead involved by locking would hurt the normal case and only simplify some special cases.

Process priorities & scheduling

Each process has associated with it a priority (usually some small number between 1 and 30). Whenever more than one process is runnable, the scheduler selects the highest priority process to execute. The currently running process is running until either: If two or more processes with the same priority are runnable, NO automatic rescheduling is done by the scheduler. To pass control to other processes within this group, the running process must give up control explicitly; either by performing a Processor>>yield or by suspending itself when waiting on some semaphore.

This is called "preemtive scheduling WITHOUT round robin".

The reason for not doing automatic round-robin lies in the above mentioned non-thread-safeness of the class library. (adding locks and semaphores at every possible spot in the class library would both slow down the system in the normal case, AND blow up the size of the smalltalk system considerably.)

adding round-robin

If you want to do round-robin, there is a PD implementation of a round-robin scheme. This implementation is very simple:
it creates a new high-priority process, whose execution is controlled by the timer, and which forces a yield of the running process with every clock-tick.
To try it, fileIn "goodies/Timeslice.st" using the fileBrowser or by evaluating:
	Smalltalk fileIn:'goodies/Timeslice.st'
then start round-robin with:
	ProcessorScheduler startTimeSlicing
(have a look at the files contents for more info)

Notice, that there is NO warranty concerning the actual usability of this goody; for above reasons, the system could behave strange when the timeslicer is running.
To see the effect of timeslicing, open a ProcessMonitor from the Launchers utility menu and start some processes which do some long time computation. For example, you can evaluate (in a workspace):

    [
      100 timesRepeat:[3000 factorial]
    ] forkAt:4
some 3 or 4 times and watch the processes in the monitor.

views and processes

In Smalltalk/X, typically one process is created for each topview. This topview and all of its subviews are grouped together in a so called WindowGroup. To get a picture of it, first start a ProcessMonitor then open a new workspace.

You will see some info about the workspaces process in the process monitors list.

Now, in the workspace evaluate:

	Processor activeProcess suspend
The workspace will no longer respond to keyboard or any other events. (select the corresponding entry in the processMonitor, and use the debug function from its popup-menu, to see what the workspace is currently doing)

You can continue execution of the workspaces process with the processMonitos resume-function (or in the debugger, by pressing the continue button).

All events (keyboard, mouse etc.) are read by a separate process (called the event dispatcher), which reads the event from the operating system, puts it into a per-windowgroup event-queue, and notifies the view process about the arrival of the event (which is sitting on a semaphore, waiting for this arrival).
Modal boxes create a new windowgroup, and enter a new dispatch loop on this. Thus, the original views eventqueue (although still being filled with arriving events) is not handled while the modalbox is active (*).

The following pictures should make this interaction clear:

event dispatcher:

	   +->-+
	   ^   |
	   |   V
	   |   waits for any input (from keyboard & mouse)
	   |   from device
	   |   |
	   |   V
	   |   put event into windowgroups queue        
	   |   |                                             
	   |   V                                             
	   |   wakeup windowgroups semaphore            >*****
	   |   |                                             *
	   +-<-+                                             *
							     * Wakeup !
							     *
    each window-group process:                               *
							     *
	   +->-+                                             *
	   ^   |                                             *
	   |   V                                             *
	   |   wait for event arrival (on my semaphore)  <****
	   |   |
	   |   V
	   |   send the event to the corrsponding view
	   |   |               |
	   +-<-+               |
			    View>>keyPress:...
			or: View>>expose...
			etc.
modal boxes (and popup-menus) start an extra event loop:

	   +->-+
	   ^   |
	   |   V
	   |   wait for event arrival (on my semaphore)
	   |   |
	   |   V
	   |   send the event to the corrsponding view
	   |   |   ^       |
	   +-<-+   |       |
		   |       V
		   |    ModalBox open
		   |    create a new windowgroup (for box)
		   |       |
		   |       V
		   |       +->-+
		   |       ^   |
		   |       |   V
		   |       |   wait for event arrival (boxes group)
		   |       |   |
		   |       |   V
		   |       |   send the event to the corrsponding (boxes) view
		   |       |   |               |
		   +--- done ? |               |
			   +-<-+               |
					    keyPress:...

views and priorities

Initially, all view-processes are created at the same priority (called UserSchedulingPriority, which is typically 8). This means, that a running user process will block all other view processes (except, if it does a yield from time to time).

Try evaluating (in a workspace) ...

	[true] whileTrue:[1000 factorial]
... the system seems dead (read the next paragraphs, before doing this).

Only processes with a higher priority will get control; since the event dispatcher is running at UserInterruptPriority (which is typically 24), it will still read events and put them into the views event queues. However, all view processes run at 8 which is why they never get a chance to actually process the event.

There is one type of event, which is handled by the event dispatcher itself: a keypress of Cntl-C in a view will be recognized by the dispatcher, and start a debugger on the corresponding view-process.
(actually, a signal is raised, which could in theory be cought by the view process).

Thus, to break out of the above execution, press Cntl-C in the workspace, and get a debugger for its process.
In the debugger, press either abort (to abort the doItevaluation), or terminate to kill the process and shut down the workspace completely.

If you have long computations to be done, AND you dont like the above behavior, you can of course perform this computation at a lower priority. Try evaluating (in the above workspace):

	Processor activeProcess priority:4.
	[true] whileTrue:[1000 factorial]
Now, the system is still responding to your input in other views, since those run at a higher priority (8), therefore suspending the workspace-process whenever they want to run. You can also think of the the low-prio processing as being performed in the background - only running when no higher prio process is runnable (which is the case whenever all other views are inactively waiting for some input).

Some views do exactly the same, when performing long operations. For example, the fileBrowser lowers its priority while reading directories (which can take a long time - especially when directories are NFS-mounted). Therefore, you can still work with other views (even other filebrowsers) while reading directories. Try it with a large directory (such as "/usr/bin").

It is a good idea, to do the same in your programs, if operations take longer than a few seconds - the user will be happy about it. Use the filebrowsers code as a guide.

For your convenience, there is a short-cut method provided by Process, which evaluates a block at a lower priority (and changes the priority back to the old value when done with the evaluation).
Thus, long evaluations should be done using a construct as:

	Processor activeProcess withPriority:4 do:[
	    10 timesRepeat:[2000 factorial]
	]
You should avoid hardcoding priority numbers into your code, since these may change (PPS users noticed that Parcplaces new release 2 uses priorities between 1 and 100),
To avoid breaking your code in case this is changed in ST/X, the above is better written as:
	Processor activeProcess 
	    withPriority:(Processor userBackgroundPriority)
	    do:[
		10 timesRepeat:[2000 factorial]
	    ]

background processes

The above example did its computation in the workspace process, thus the workspace did no longer respond to update- or any other events. To get around this behavior, you can also start a new process, to do this computation. Try to evaluate:
	[
	    10 timesRepeat:[
		2000 factorial
	    ]
	] forkAt:4
in a workspace, and watch the process monitor. You will notice, that the workspace is not blocked, but a separate process has been created. Since it runs at a lower priority, all other views continue to react as usual.

There is one possible problem with the above background process:

Cntl-C pressed in the workspace will no longer affect the computation
To stop/debug a runaway background process, you have to open a processMonitor and use its terminate or debug menu functions.

suggested priorities & hints

To keep the system responsive, use the following priorities in your programs:

blocking interrupts

If you ever want to change things in Delay, Semaphore or ProcessorScheduler, never forget the possibility of external-world interrupts (especially: timer interrupts). These can in occur at any time, bringing the system into the scheduler, which could switch to another process as a consequence of the interrupt.
Whenever you are modifying data which is related to process scheduling (i.e. counters in semaphores, process lists in the scheduler etc), you should therefore block these interrupts for a while.
This is done by:
	OperatingSystem blockInterrupts
	...
	   modify the critical data
	...
	OperatingSystem unblockInterrupts
Since this basic block/unblock does not handle nested calls, you should only unblock interrupts, if they have NOT been blocked in the first place. To do so, OperatingSystem>>blockInterrupts returns the previous blocking state - i.e. true, if they have been already blocked. Thus, to be certain, always use:
	|wasBlocked|

	...
	wasBlocked := OperatingSystem blockInterrupts
	...
	   modify the critical data
	...
	wasBlocked ifFalse:[OperatingSystem unblockInterrupts]
For your convenience, blocks offer an easier to use interface for this operation:
	[
	   ...
	   modify the critical data
	   ...
	] valueUninterruptably.
See the code in Semaphore, Delay and ProcessorScheduler for more examples.

Notice, that no event processing, timer handling or process switching is done when interrupts are blocked. Thus you should be very careful in coding these critical regions. For example, an endless loop in such a region will certainly lock up the smalltalk system. Also, do not spend too much time in such a region, any processing which takes longer than (say) 50 milliseconds will have a noticable impact on the user.
(Usually, it is almost always an indication of a bad design, if you have to block interrupts for such a long time).

While interrupts are blocked, incoming interrupts will be registered by the runtime system and processed (i.e. delivered) at unblock-time.
Be prepared to get the interrupt(s) right after (or even within) the unblock call.

Also, process switches will restore the blocking state back to how it was when the process was last suspended. Thus, a yield within a blocked interrupt section will usually reenable interrupts in the switched-to process.

It is also possible to enable/disable individual interrupts. See OperatingSystem's disableXXX and enableXXX methods.

interrupting a process

Beside the above external interrupts, you can also manually force a process to be interrupted and evaluate something. To do so, use:
	...
	anotherProcess interruptWith:[ some action to be evaluated ]
	...
This forces anotherProcess to evaluate the block passed to interruptWith:. If the process is suspended, it will be resumed for the evaluation. The evaluation will be performed by the interrupted process, on top of the running or suspended context (thus a signal-raise, long return, restart or context walkback is possible).

BTW: the event dispatchers Cntl-C processing is implemented using exactly this mechanism.

Try:

	|p|

	p :=[
		[true] whileTrue:[
		     1000 factorial
		]
	    ] forkAt:4.
	"
	 to find it easier in the process monitor
	"
	p name:'myProcess'.

	"
	 make it globally known
	"
	Smalltalk at:#myProcess put:p.
then:
	myProcess interruptWith:[Transcript showCr:'hello'].
or (see the output on the xterm-window, where ST/X has been started):
	myProcess interruptWith:[thisContext fullPrintAll].
or:
	myProcess interruptWith:[Object abortSignal raise]
	"this will bring the process into the debugger)
finally cleanup with:
	myProcess terminate.
	Smalltalk removeKey:#myProcess.
As another example, we can catch some signal in the process, as in:
	|p|

	p :=[
		Object abortSignal catch:[
		     [true] whileTrue:[
			 1000 factorial
		    ]
		 ].
		 Transcript showCr:'process finished'
	    ] forkAt:4.
	"
	 to find it easier in the process monitor
	"
	p name:'myProcess'.

	"
	 make it globally known
	"
	Smalltalk at:#myProcess put:p.
then send it the signal with:
	myProcess interruptWith:[Object abortSignal raise]

timeouts

Based on the above interrupt scheme, ProcessorScheduler offers methods to schedule timeout-actions. These will interrupt the execution of the process and force evaluation of a block after some time.

this kind of timed blocks are installed (for the current process) with:

	Processor addTimeBlock:aBlock afterSeconds:someTime
to interrupt other processes after some time, use:
	Processor addTimeBlock:aBlock for:aProcess afterSeconds:someTime
there are alternative methods which expect millisecond arguments for short time delays.

For example, the autorepeat feature of buttons is done using this mechanism. Here a timed block is installed with:

	Processor addTimeBlock:[self repeat] afterSeconds:0.1
Also, animations can be implemented with this feature (by scheduling a block to draw the next picture in the view after some time delay).

See working with timers & delays for more information.

terminating a process

Processes are terminated with the terminate message. Technically, this does not really terminate the process, but instead raise the ProcessTermination signal. Of course, this signal could be cought or otherwise handled by the process; especially to allow for the execution of cleanup actions (see Process>>startup).
This is called soft termination of a process.

A hard termination (i.e. immediate death of the process without any cleanup) can be done by sending it terminateNoSignal. Except for emergency situations, there should never be a need for this.

An interresting feature of soft termination is that all unwind blocks (see Block>>valueOnUnwindDo:) are executed - in contrast to a hard terminate, which will immediately kill the process.

Interrupting a runaway process

In case of emergency (for example, when a process with a priority higher than UserInterruptPriority loops endless), you can press Cntl-C in the xterm window where Smalltalk/X was started.
This will stop the system from whatever it is doing (even the event dispatcher) and enter a debugger.

If the scheduler was hit with this interrupt, all other process activities are stopped, which implies that other existing or new views will not be handled while in this debugger (i.e. the debuggers inspect functions will not work, since they open new inspector views).
If your runaway process was hit, the debugger behaves as if the Cntl-C was pressed in a view (however, it will run at the current priority, so you may want to lower it by evaluating:

	Processor activeProcess priority:8

In this debugger, either terminate the current process (if you where lucky, and the interrupt occured while running in the runaway process) or try to terminate the bad process by evaluating some expression like:

    Process allInstances do:[:p |
	p priority > 24 ifTrue:[
	    p id > 1 ifTrue:[      "/ do not kill the scheduler
		p terminate
	    ]
	]
    ]
Your runaway process is of course easier to locate, if you gave it a distinct name before; in this case, use:
    Process allInstances do:[:p |
	p name = 'nameOfBadProcess'ifTrue:[
	    p terminate
	]
    ]
A somwehat less drastic fix is to send it an abortSignal:
    Process allInstances do:[:p |
	p name = 'nameOfBadProcess'ifTrue:[
	    p interruptWith:[Object abortSignal raise]
	]
    ]
Most processes provide a handler for this signal at some save place, where they are prepared to continue execution. Those without a handler will terminate. Therefore, a workspace will return to its event loop, while other processes may terminate upon receipt of this signal.

In some situations, the system may bring you into a non graphical MiniDebugger (instead of the graphical DebugView). This happens, if the active process at interrupt time was a DebugView, or if any unexpected error occurs within the debuggers startup sequence.
The MiniDebugger too supports expression evaluation, abort and terminate functions, however, these have to be entered via the keyboard in the xterm window (where you pressed the Cntl-C before). Type ? (question mark) at the MiniDebuggers prompt to get a list of available commands.

On some keyboards, the interrupt key is labelled different from Cntl-C. Try DEL or INTR or have a look at the output of the stty unix command.

Notes:

(*) this is not fully correct: the modalbox peeks into the other windowgroups eventQueue periodically and handles redraw requests. Thus, the original groups views will still redraw themselfes when exposed. However, no input events (keyboard and/or mouse) are handled while a modalBox is active.


Copyright © Claus Gittinger Development & Consulting, all rights reserved

(cg@ssw.de)