
Re: [oc] Beyond Transmeta...



> I don't believe that you could reach more performance by simplifying
> instructions... I suppose you have a computer for desktop
> applications in mind (like Transmeta). Did you calculate how
> many cycles your add takes (the most common instruction in GP
> applications)? The clock ratio would be much lower.

Well, it depends on the kind of performance you are looking for. Serial 
performance is fine with CISC-style processors, because internally they do a 
number of small things in parallel, and with new SIMD instructions they reach 
further into doing more parallel work. But with a simpler processor like the 
one I describe you can achieve better parallel performance.

I calculate that with 4 1-bit processors you can add 1 bit per clock, so a 
32-bit addition takes 32 clocks. But when you take into account that those are 
only 4 1-bit processors versus a whole 32-bit processor, you can see that the 
1-bit processors should be more scalable, with less heat and complication, and 
more than likely better stability at higher clock speeds, so you can run them 
faster. With 32 1-bit processors you can do 8 32-bit additions simultaneously 
in 32 clocks, which averages out to 4 clocks per add. With 128 1-bit processors 
you get down to 1 clock per add, and going beyond 128 gets you less than 1 
clock per add. That is probably the nature of a network: each individual thing 
gets done more slowly, but so many things get done simultaneously.
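
To make the 32-clock figure concrete, here is a minimal C sketch of a 
bit-serial add, where one bit position is resolved per clock; the loop body is 
roughly the work a small group of 1-bit processors would do each cycle (the 
function name and the use of C are just my illustration, not part of the 
design itself):

#include <stdint.h>
#include <stdio.h>

/* Bit-serial addition sketch: one bit position is resolved per "clock",
 * so a 32-bit add takes 32 steps.  The sum and carry logic below is what
 * a small group of 1-bit processors would evaluate each cycle. */
static uint32_t bit_serial_add(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    unsigned carry = 0;

    for (int clock = 0; clock < 32; clock++) {
        unsigned abit = (a >> clock) & 1;
        unsigned bbit = (b >> clock) & 1;
        unsigned s = abit ^ bbit ^ carry;                 /* sum bit   */
        carry = (abit & bbit) | (carry & (abit ^ bbit));  /* carry out */
        sum |= (uint32_t)s << clock;
    }
    return sum;
}

int main(void)
{
    printf("%u\n", (unsigned)bit_serial_add(1234567u, 7654321u)); /* 8888888 */
    return 0;
}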

The other thing that can be done with this is to only update what needs to be 
updated: only bits that change cause other bits to change. In a persistent 
network, when one bit changes, its connected bits are sent instructions to 
recalculate; if the change affects their values, then their connected bits do 
the same, and this all causes a chain reaction.
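
Here is a small C sketch of that kind of event-driven update. The particular 
nodes, the fixed fan-out table, and the NOT/AND functions are assumptions I 
made up for illustration; the point is only that a change ripples through a 
work queue and stops wherever a value does not actually change:

#include <stdio.h>

#define NODES 4

/* nodes 0,1: inputs;  node 2 = NOT(node 0);  node 3 = AND(node 1, node 2) */
static int value[NODES];
static const int fanout[NODES][2] = { {2,-1}, {3,-1}, {3,-1}, {-1,-1} };

static int evaluate(int n)
{
    switch (n) {
    case 2:  return !value[0];
    case 3:  return value[1] & value[2];
    default: return value[n];           /* inputs keep their value */
    }
}

static void propagate(int changed)
{
    int queue[16], head = 0, tail = 0;
    queue[tail++] = changed;

    while (head != tail) {
        int n = queue[head++];
        for (int i = 0; i < 2 && fanout[n][i] >= 0; i++) {
            int dep = fanout[n][i];
            int v = evaluate(dep);
            if (v != value[dep]) {      /* only changes keep the chain going */
                value[dep] = v;
                queue[tail++] = dep;
            }
        }
    }
}

int main(void)
{
    value[0] = 0; value[1] = 1; value[2] = 1; value[3] = 1;
    value[0] = 1;                       /* flip an input bit ...          */
    propagate(0);                       /* ... and let the change ripple  */
    printf("node3 = %d\n", value[3]);   /* now 0, because NOT(1) = 0      */
    return 0;
}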

Future applications and systems are likely to be more complex, and to include 
speech recognition, machine vision, timing sensitivity, and many other things 
to detect how the user feels about a particular operation and how to optimize 
for a particular user, plus self-maintenance and self-organization so that 
having a computer is more automated.

> I suppose those kinds of programs would be pretty large. And you haven't
> got any addressing here. How would you know where to get operands
> and where to put them?

There are different ways of possibly handling connections. But one thing I 
have mentioned is the ability for self-modification, which makes things even 
more difficult: the networks must have access to their own connections.

One way to handle connections could be a multidimensional space (mapped from 
flat memory space) for the bits to exist in, with the rule that bits can 
connect to bits within a certain distance of themselves. A 3D or 4D space may 
be the best solution. One of the dimensions could represent the instructions, 
so you have layers of networks that modify one another; these modifications 
could probably be used to dynamically change the nature of a network as 
needed, so that one network can change another from an add operation to a 
subtraction operation. But then there is the issue of how to know which bit 
uses which instruction; it is definitely very complicated. But I think it can 
be figured out, and more than likely it will be figured out by looking at 
things differently, because right now we are used to doing things in a serial 
kind of way.
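
As a rough illustration of the mapping idea (the 8x8x8 grid size and the 
per-axis distance test are assumptions of mine, not a worked-out scheme), 
here is a C sketch that turns a flat bit index into 3D coordinates and checks 
whether two bits are close enough to connect:

#include <stdio.h>
#include <stdlib.h>

#define DIM 8                       /* grid is DIM x DIM x DIM bits */

/* map a flat memory index into a point in the 3-D grid */
static void index_to_xyz(int idx, int *x, int *y, int *z)
{
    *x =  idx % DIM;
    *y = (idx / DIM) % DIM;
    *z =  idx / (DIM * DIM);
}

/* 1 if the two flat indices are within 'dist' of each other on every axis */
static int within_reach(int a, int b, int dist)
{
    int ax, ay, az, bx, by, bz;
    index_to_xyz(a, &ax, &ay, &az);
    index_to_xyz(b, &bx, &by, &bz);
    return abs(ax - bx) <= dist && abs(ay - by) <= dist && abs(az - bz) <= dist;
}

int main(void)
{
    /* bit 0 sits at (0,0,0); bit 73 sits at (1,1,1) in an 8x8x8 grid */
    printf("%d\n", within_reach(0, 73, 1));   /* 1: they may connect     */
    printf("%d\n", within_reach(0, 146, 1));  /* 0: (2,2,2) is too far   */
    return 0;
}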

Bandwidth is definitely an issue, but if the 1-bit processor is small enough 
it may be incorporated into the memory itself, and the bandwidth would be made 
up for by having the processors run at the same clock rate as the memory.

Or, on the other hand, memory size may be made up for by transferring higher 
level instructions... In our own minds we attach words, names, and labels to 
complex ideas to help us understand things better... This may very well be 
what the network would do: perform complex logic in certain parts of the 
network, like how a certain part of the human brain does math while another 
does words. In this way it creates a processing network, like a CISC CPU 
turned into or emulated by a network, but with this method you can create 
multiple processing networks that can be localized, the equivalent of having 
multiple CISC processors. Like having one processor per pixel, to do real-time 
ray tracing...

> Do you have some particular algorithm in mind?

Not one in particular... but I have a few ideas of how it would work. The 
biggest factor is the organization of the networks. I've done experiments with 
treating mathematical equations as networks, and using certain properties to 
shift the network around so that you get different but equivalent equations. 
If done right you can optimize a network to be more parallel, to take 
advantage of a certain number of 1-bit processors. The other aspect is how 
things are mapped into the network. There are many different ways for separate 
networks to connect with each other; by using the same shifting you can 
achieve various changes to the network, which allow kinds of connections that 
were not possible before. These changes can be either helpful or bad for a 
particular situation: if a change is good it is kept, and if it is bad it is 
shifted back. So you shift the network around to achieve various effects. This 
is still partly theory, mostly from my networked-variable experiments; I have 
yet to see how to shift around binary networks.
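
As a toy example of what shifting a network for parallelism could mean (this 
is my own illustration in C, not the networked-variable experiments 
themselves), the same sum of 8 terms can be a serial chain of 7 dependent 
additions, or it can be reshuffled into a balanced tree that needs only 3 
rounds of independent additions, which is exactly the kind of form a pool of 
small processors could exploit:

#include <stdio.h>

#define TERMS 8

int main(void)
{
    int vals[TERMS] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    int n = TERMS, levels = 0;

    /* balanced (parallel) form: each level halves the number of terms,
     * and every addition within a level is independent of the others */
    while (n > 1) {
        for (int i = 0; i < n / 2; i++)
            vals[i] = vals[2 * i] + vals[2 * i + 1];
        if (n & 1)
            vals[n / 2] = vals[n - 1];     /* odd leftover carries up a level */
        n = (n + 1) / 2;
        levels++;
    }

    printf("sum = %d in %d parallel levels (vs %d serial steps)\n",
           vals[0], levels, TERMS - 1);    /* sum = 36, 3 levels vs 7 steps */
    return 0;
}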

Leyland Needham