Re: multi/tasking/processing
On Wed, 16 Aug 1995, Penio Penev wrote:
> On Wed, 16 Aug 1995, Eugen Leitl wrote:
>
> > I was not proposing multiple register files or stacks, only two
> > sets, not even symmetrical (OS stack being much more shallow).
> > If these two banks are on chip, they need not to be swapped out
> > to main store.
>
> If there are two banks and three tasks, they need to.
But at least the OS, which gets called very often, does not need
to be swapped out. Instead of flushing the entire stack,
each task itself should be responsible for transferring its
variables into store, leaving the stack clean...
And heck, why not use several (4-8) stack frames on chip,
thus eliminating store accesses altogether, as Novix did?
One should not elevate the transistor resource into something
sacred. We have to run our code with it after all, don't we?
(Where does all this wood and straw come from? Wait, don't
do that.. Ouch! Put that out, will you? Ouch!)
> [ ... ]
>
> > But still I am of the opinion that having zero-latency context
> > switch between two tasks (possibly, protecting the address space
> > of one task) is valuable. A lot of program code is OS call code.
>
> How would you define an "OS"?
The interface between the hardware and the software. An insulating
layer, providing a consistent surface. Plus a resource manager.
A special object with fixed ID, an instance in each node.
> > The OS is a very special task, requiring dedicated resources.
>
> It is not special at all in general. It is very special in particular in
> the U*X world.
Gods help me, I did not model it upon Unix! I was thinking more
along the lines of Taos, a nanokernel (12 kByte), networked,
hardware-independent etc. etc. OOP OS. The great chance
of threaded code is expandable microprogramming. This way, an
elaborate VM can be squeezed into a tiny (on-chip) (SRAM) volume,
a highly important fact for small (<<1 MWord) nodes.
The VM opcodes might be interpreted as CALL address on some
machines and as OPCODE on others. A convenient means to provide
compatibility with future machines while not losing efficiency.
E.g. imagine we are calling OS.SendMessage() all the time (in
an OO environment, probably the most often used OS call).
On some machines the opcode would mean CALL OS.SendMessage(),
on others a dedicated instruction OS.SendMessage(). No need to
recompile.
> > The OS can be implemented as a VM with zero context switch.
> > The OS supervisor task gets called very often particularly in
> > realtime machines.
>
> If the "OS supervisor task" works 100ns worth at 1ms intervals, would you
> dedicate half the chip resources to make it work 50ns at 1ms intervals?
Though 1 ms is probably too long, you are right here. But how about
interrupts? OS calls? Your code will be seething with OS calls. Every 10
opcodes there will be one.
> > tens to hundreds of separate monotask nodes on single chip, a reentrant
> > multitasking OS with memory protection is a must.
>
> Reentrancy we need. But why do we need memory protection? A program, that
> needs memory protection is a buggy program. I'd rather have a simple
There is no such thing as a bug-free program. Particularly, a large
bug-free program. Look way into the future, where there will be
extensive grassroots computational infrastructure. Now you run your
distributed application and bring half of the western hemisphere's
economy to a crashing halt ;)
I am overdoing it a bit, but protection/security is a must for
large networks. Software protection without hardware support is
pointless. I wouldn't want to run a 100-node box only to find out
that somewhere along the way the whole thing crashed, without
me being able to find out what went wrong, and where. No thanks.
> system, that I understand in full and am able to debug, than a MMU ten
> times the size of the processor, that adds complexity without helping me
> kill bugs.
I can tell you quite a story here. Before the golden age of the
68030/40/60, the Commodore Amigas came equipped with a reentrant
multitasking OS (microkernel, OO, typed), yet without memory
protection, on the 68000/68020. The development language was mostly C.
Software was buggy; running a day without having to reboot
several times was a miracle. Instead of a harmless core dump, the whole
machine came to a grinding halt, the irate user staring at the
photogenically blinking "Guru Meditation xxxxx yyyyy" message.
After several debuggers with MMU support became available
(Mungwall/Enforcer), software quality suddenly ballooned. Now
"Enforcer hit-free" software became a quality seal. Drastically
improved code quality due to an MMU. Funny?
And no, I was not proposing a full-blown virtual memory
MMU. I thought about a binary mask protecting certain
memory areas, settable only in OS (supervisor,
gotcha again, but still it ain't Unix) mode. A mask and
a comparator, generating an interrupt. Surely this
won't take more than 200 transistors in toto?
(And one could map the HD into the address space at high memory
addresses, using MMU address faults to call a swap-in/out
routine; really simple.. but skip that.)
Simply knowing that this task just tried to r/w into OS
code, or beyond its memory window, at a certain address helps
a lot in debugging. (And one doesn't have to reboot the box.)
>
> > What if the task is much too small, shall we waste the
> > rest of local node memory resources?
>
> The solution is to make the nodes respectively smaller (and cheaper, and
> more). But we are not there with the technology yet :-( We need better
> dynamical allocation schemes of on-chip resources. Not very likely with
> the silicon (semiconductor) technology.
Look, we still have to have a tiny OS in each grain. I simply can't
see how I can do anything sensible with fewer than 64 kWords/node,
minus the OS demand.
We all know that the future is evolvable hardware (soft circuits)
on CAMs. We could build them even now with current semiconductor
technology. However, who is going to buy (and to program) them,
while even MISC will have trouble selling, as it is highly unusual
already? I might have funny notions, but not _that_ funny.
>
> But while we are at the silicon level now, we need to experiment with
> concepts and identify our needs better. As we write finer and
> finer-grained masspar apps, we will learn what (hardware/software/wetware)
> we need to solve bigger problems in adequate time (and budget).
These are my aims, exactly. I do not like featuritis and
rarely propose things just for the fun of it. 32/64 bits,
a second (shallow) stack/register bank for the OS, primitive
memory protection, 2-3 index registers with corresponding opcodes
(A+), (B+), (C+), a small on-chip SRAM appearing in the address space and
a multiple-link network processor are all I wish for. (Now that I think
of it... Did I just say I don't like featuritis?)
-- Eugene
> --
> Penio Penev <Penev@venezia.Rockefeller.edu> 1-212-327-7423