mu TECH: an unhurried digression on C pointers vs. arrays

From: Alfie Costa (agcosta@gis.net)
Date: Fri Apr 28 2000 - 09:51:20 CEST


An old joke:

"How do you like this cake I baked?"
"Tastes awful."
"That's impossible, the cookbook says it's perfectly delicious!"

On 26 Apr 00, at 9:15, Michele Andreoli <mulinux@sunsite.auc.dk> wrote:

> but array are indentical to pointers, in C. Isn't true?

This turns out to be a very educational question, unfortunately more so than I
wished... after looking up stuff around the net, what I now believe to be the
correct answer is: Maybe?

To be precise, the mulinux answer to your question for 'gcc' on 386+ chips,
according to a few tests I've just run, is: Yes it's true. Or almost yes;
sadly, pointers even seem to be a little slower. Mah!

It depends on what chip is doing the computing, and what compiler is producing
the code. My error was thinking that the innards of C were like some sort of
'C virtual machine' that was the same, (or similar), for every C.

Attached to this e-mail is a little benchmark program in C which times in
seconds this lack of improvement or slowness. I've tested it with both Turbo
C, (compiled for 8086s), and mu's own 'gcc'. Both compilers showed no
advantages for pointers.

A Dejanews search on usenet reveals that most everybody already knows that C's
pointers aren't generally faster that arrays. Attached to this message are
some of the more dismaying messages on the topic. (Zipped)

In one of these usenet messages, from a 1997 comp.os.os2.programmer.misc post,
Rudi Ziegaus says:

<quote>

That's why I don't like to use pointer for these purposes. The "pointer access
is faster than array access" is a fairytale which is unfortunately ever told. A
good compiler can generate at least a equal fast code for arrays, maybe even
better.

<end-quote>

...And here's some text from the book where I read about pointers being faster
than arrays, perhaps the folktale's source, as it were. 'The C Companion' by
Allen I. Holub, 1987 Prentice Hall, (Prentice Hall Software Series, Brian W.
Kernighan(!), advisor), p. 146...

<snip>

If all these statements are equivalent, why use pointers instead of square
bracket options? the answer lies in efficiency considerations. Think about
what you are actually instructing the compiler to do when x=p[i++] is executed.

(1) Fetch the contests of i.

(2) Multiply it by the size of one object-pointed-to. Store the result in an
     anonymous scratch variable.

(3) Then, increment i and put it back.

(4) Fetch the contents of p.

(5) Then add the results of the multiply to it.

(6) Finally, fetch the contents of the cell whose address we just computed,
     and put in into x.

That's three fetches and one multiply. Now consider the expression x=*p++,
which performs the same operation. This second expression causes the compiler
to generate code to do the following:

(1) Fetch the contests of p.

(2) Fetch the object at that address, and put it into x.

(3) Add to p the size of an object-pointed-to.

(4) Store the result of the addition in p.

Here there are only 2 fetches and there's no multiply. That multiply is the
major inefficiency. As we saw in Chapter 2, a multiply is an expensive
operation and we should do anything we can to avoid it. This is especially
true for machines like the 8085 that don't support a hardware multiply; even
so, a hardware multiply in most machines is going to take more time than a
hardware addition. Another factor is the actual size of the object.
Multiplication by a power of 2 is just a shift. a multiply by 6 (which you'll
get with a pointer to a structure or an array) is not. On the other hand, many
compilers don't have the sense to shift rather than call a multiplication
subroutine, even for a power of 2.

You should note that, in terms of overhead, there's no difference between p[i]
and *(p+i). There is a run-time multiply in either case. However, if _i_ is a
constant -- p[1] or *(p+1) -- then the compiler can do the multiply at
_compile_ time rather than at run time. Array accesses that use constants are
much faster than those that use variables. Accessing an array using brackets
and constants is a reasonable action. If the array is declared static, the
actual address of the desired cell can be determined at compile time.

By the same token, if you need to access an array randomly, then use the
brackets; they're easier to read. An array hardly ever has to be accessed
randomly, however. Most often you're going through it in some orderly fashion:
row by row, column by column, diagonally, and so forth. In these cases,
pointers will be faster.

<snip>

Well, all this is perhaps more than most readers wish to know, but it's here in
the hope that it will help other C novices avoid this same peril...

The following section of this message contains a file attachment
prepared for transmission using the Internet MIME message format.
If you are using Pegasus Mail, or any another MIME-compliant system,
you should be able to save it or view it from within your mailer.
If you cannot, please ask your system administrator for assistance.

   ---- File information -----------
     File: Ptrtest.zip
     Date: 28 Apr 2000, 2:53
     Size: 5349 bytes.
     Type: ZIP-archive


---------------------------------------------------------------------
To unsubscribe, e-mail: mulinux-unsubscribe@sunsite.auc.dk
For additional commands, e-mail: mulinux-help@sunsite.auc.dk




This archive was generated by hypermail 2.1.6 : Sat Feb 08 2003 - 15:27:14 CET