OK... before I explain this, let's review the basic general idea behind RISC, Reduced Instruction Set Computing.
  • Microchips with smaller instruction sets can have simpler and thus generally more efficient chip designs.
  • Complex instructions are unnecessary, since such instructions can generally be broken down into a series of smaller, simpler instructions.
RISC chips attempt to reduce the number of processor instructions down to a mere handful of general instructions that can easily be combined to serve more complex tasks quickly. But think: because of Turing equivalence, even after you've thrown out the obviously unnecessary instructions, a few of the handful left over will always still be, in a technical sense, redundant. For practical reasons, RISC chips still do not have a truly minimal set; logically it would always be possible to reduce the set even further, finding an instruction somewhere in the few you have left that can be approximated and replaced by combining other instructions in certain ways.

So it came to pass that, in the tradition of such fine programming languages as INTERCAL and unlambda, a few computing visionaries were sitting around when the idea struck for the creation of an OISC:

One Instruction Set Computer.

Just think of the benefits! Clumsy lookup tables can be eliminated. Opcodes become completely unnecessary, meaning both that you conserve space (a pyrrhic victory, to be sure, since any task will take many, many more instructions under OISC) and that the OISC can simply snarf through its registers without stopping to check what it's doing. Pipelining can be taken to a truly ludicrous extreme. (On the other hand, of course, practically all known forms of internal processor optimisation, VLIW for example, become roughly impossible, meaning the creator of any OISC chip would have to completely reinvent the wheel as to how the chip operates. But who's to say the new wheel wouldn't be faster?)

The OISC's single instruction is "subtract and branch if negative". It takes three operands at a go, and does this:

If "." is the address of the current instruction:
take the register whose address is the contents of .,
subtract from it the contents of the register whose address is the contents of .+1,
storing the result back into that first register,
and if the result is negative,
jump to the address given by the contents of .+2.

(In other words, an instruction "A B C" means: the contents of register A become the contents of A minus the contents of B, and execution goes to C if the result came out negative.)
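Spelled out as code, one execution step of that single instruction might look like this (a minimal sketch in Python, assuming memory is a flat array of signed integers and the current register "." is an index into it):

```python
def step(mem, pc):
    """One cycle of the single instruction: subtract and branch if negative.

    The three operands live at pc, pc+1 and pc+2:
      mem[mem[pc]] -= mem[mem[pc+1]]
    and if the result is negative, jump to mem[pc+2];
    otherwise fall through to the next instruction at pc+3.
    """
    a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
    mem[a] = mem[a] - mem[b]
    return c if mem[a] < 0 else pc + 3
```

The returned value is simply the next value of ".".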

For obvious reasons, no OISC chips have ever actually been manufactured; however, you can find an excellent interpreter/emulator of one (along with all known OISC software) in the Retrocomputing Museum's archive, downloadable at ftp://locke.ccil.org/retro/oisc.shar.gz

RISC designs tend to make things as simple and easy to optimise as possible for those creating compilers, while making the life of assembly coders much more difficult. OISC, as the logical extreme of RISC, takes this trade-off to its logical extreme as well. Writing OISC assembly can be hellish, and is pretty much impossible without the use of self-modifying code. This would not be a problem-- since the entire point of OISC is to put all the burden on the compiler-- except that unfortunately, perhaps as a result of the lack of any OISC computers in existence, there are currently no compilers capable of emitting OISC code. The creators of the OISC interpreter have gotten around this in the few OISC programs they have written by use of assembler macros, which let you predefine inline function-like blocks for commonly done operations such as addition and subtraction, as well as use the "." method of specifying register addresses relative to the current address instead of absolutely.
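To make the macro idea concrete, here is one way an ADD macro could expand into three copies of the single instruction, using a scratch register Z assumed to contain zero (the addresses and layout here are invented for illustration; the real interpreter's macro syntax differs). A triple "A B C" means: subtract the contents of B from the contents of A, branch to C if negative; each C just points at the next triple, so execution falls through either way. Sketched and executed in Python:

```python
def step(mem, pc):
    # the single instruction: mem[mem[pc]] -= mem[mem[pc+1]], branch if negative
    a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
    mem[a] = mem[a] - mem[b]
    return c if mem[a] < 0 else pc + 3

# Memory layout (hypothetical): addresses 0..8 hold the three-triple
# expansion of "ADD A, B"; 9 and 10 hold the operands; 11 is the zero scratch.
A, B, Z = 9, 10, 11
mem = [
    Z, B, 3,    # Z -= B  ->  Z = -B   (branch target is just "next")
    A, Z, 6,    # A -= Z  ->  A = A + B
    Z, Z, 9,    # Z -= Z  ->  Z = 0 again, so the macro is reusable
    20, 7, 0,   # A = 20, B = 7, Z = 0
]
pc = 0
while pc < 9:
    pc = step(mem, pc)
print(mem[A])   # 27
```

Note that the middle triple adds because subtracting a negated value is addition; this is exactly the kind of bookkeeping the macros hide.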

(Notice one odd benefit of OISC is that, since it's basically impossible to figure out before runtime which instructions are going to be Jumps, writing an OISC virus would be almost impossible by current methods!)

Here is a nicely commented example of a simple (11-register) OISC program, taken from the Read Me of the shar linked above. It reads in a number and then prints it back out. (Note the use of the emulator's three "special" registers at 32765-32767, respectively named WRITE, READ and STOP: subtract from 32765 (which is always zero) and the result is printed to the screen; read from 32766 and the value is taken from keyboard input; branch to 32767 and the program terminates.)

9 32766 3   ; Address 0: Read a number, subtract it from
            ;   address 9.  Go to 3 if negative.
32765 9 6   ; Address 3: Print address 9 out.  Go to
            ;   6 if negative.
10 11 32767 ; Address 6: Subtract contents of address 11 from
            ;   address 10.  Go to STOP if negative (which it
            ;   will be, since address 11 contains 0, and
            ;   address 10 contains -1).
0           ; Address 9 - storage for the number
-1          ; Address 10 - negative one for unconditional branch
0           ; Address 11 - zero for unconditional branch
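If you want to see that program actually run without fetching the shar, here is a toy emulator (a Python sketch; the special-register behaviour is inferred from the description above, not taken from the real interpreter's source):

```python
WRITE, READ, STOP = 32765, 32766, 32767   # the emulator's special registers

def run(program, inputs):
    mem = dict(enumerate(program))
    mem[WRITE] = 0                      # "which is always zero"
    inputs = iter(inputs)
    outputs = []
    pc = 0
    while True:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        operand = next(inputs) if b == READ else mem[b]
        result = mem.get(a, 0) - operand
        if a == WRITE:
            outputs.append(result)      # subtracting from WRITE prints the result
        else:
            mem[a] = result
        if result < 0:
            if c == STOP:
                return outputs          # branching to STOP terminates
            pc = c
        else:
            pc += 3

echo = [9, 32766, 3,     # read a number, subtract it from address 9
        32765, 9, 6,     # print address 9 (negated back to the input)
        10, 11, 32767,   # -1 - 0 is negative: unconditional jump to STOP
        0, -1, 0]        # storage, minus one, zero
print(run(echo, [42]))   # [42]
```

Note the trick at address 0: a positive input leaves a negative number in address 9 and branches to 3, while a non-positive input just falls through to 3 anyway; the print then negates it back, so the program echoes either way.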

Keep in mind that while for now all this is good for is obfuscated party tricks, in pure logic terms this really is by no means a Turing tarpit. Were someone to write a real compiler for the OISC and a slightly more full-featured emulator (say, one capable of inputting or outputting characters, or maybe even with some form of video memory), there is no reason an OISC could not run, say, Linux, just as easily as an x86 does. The only difficulty would be in getting the compiler to optimise well. Turing completeness is neat!

Given that IBM, Motorola and Intel have recently announced laboratory tests of extremely simple niobium-based superconducting chips that can run at 750 gigahertz (!), this starts to look less ridiculous. Although many, many more instructions would be needed to run any given program on an OISC, a chip containing only one instruction is, thanks to its simplicity, much more likely to be able to run at these speeds. Since that's around 500 times as fast as the fastest Pentium chip on the market right now, as long as the OISC is less than 500 times as inefficient it's still viable. Also, if you could parallelize it at all, it would probably be much, much easier to multiprocess using this simple logic than using CISC or even RISC (although I could be talking out of my a-- on that one).

Hmm. An interesting idea, no matter what.

I've been trying to think about how we could extend the OISC design (maybe better described as an OIC, for One Instruction Computer -- it's not just that there's only one instruction set, it's that the instruction set contains only one instruction). I experimented with character I/O in addition to just numeric I/O and made a little memory-debugger shell. But what about segmentation and/or paging with an MMU? We could even use an OIC as our MMU, running special code for it. We can extend it to use wider addressing, but what would a sensible memory map look like?

If we had the ability to give an offset to memory accesses with an MMU, we could create quasi-PIC with a memory map. E.g.:
segment 0-999: "local" code segment, always mapped to the physical address of our program (via a base-address register on the MMU); this is where PIC would always assume itself to be loaded
segment 1000: special (I/O, MMU access)
segments 1001-999999: physical segments in RAM
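That memory map could be sketched as a translation function (a Python sketch; the segment boundaries and the base-address register are the assumptions laid out above, nothing more):

```python
def translate(addr, local_base):
    """Hypothetical MMU translation for the segment map sketched above.

    0..999   -> "local" code segment, relocated by the MMU's base register
    1000+    -> special address and physical RAM, mapped straight through
    """
    if addr < 1000:
        return local_base + addr   # quasi-PIC: programs assume they load at 0
    return addr

print(translate(5, 4000))      # 4005 -- instruction 5 of a program based at 4000
print(translate(1000, 4000))   # 1000 -- the special segment is shared by everyone
```

The point is that the same program image produces the same accesses no matter where the base register points, which is the whole quasi-PIC trick.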

The situation improves with a layer of paging. Suppose we create a similar system where there are special "local" pages (set by a flag in the page table?) which can treat themselves as always located at the same address. With that finer-grained MMU control, our quasi-PIC actually becomes somewhat reusable, and then we can have things like loadable libraries for our code (as long as you load at a page boundary and set up your page table properly, with the special pages flagged). You can even remove some of the need for the special "local" pages by using a relocation table for library code to rewrite addresses when the library is loaded (although then shared libraries become slightly less useful, because every program using a shared library would have its own separate copy loaded).
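Here is a hypothetical sketch of the "local" page idea (page size, table format and all the numbers are invented): the translation itself is ordinary paging, but two processes that both flag the same virtual page as local and point it at the same physical frame can share PIC library code that uses absolute addresses within that page.

```python
PAGE = 1024   # hypothetical page size

def translate_page(vaddr, page_table):
    """Ordinary page translation; the 'local' flag is policy, not mechanics."""
    page, offset = divmod(vaddr, PAGE)
    frame, is_local = page_table[page]
    return frame * PAGE + offset

# A shared library lives in physical frame 7.  Both processes flag virtual
# page 3 as "local" and point it at that frame, so the library's internal
# absolute addresses (all within page 3) work unchanged in either process.
proc_a = {0: (12, False), 3: (7, True)}
proc_b = {0: (30, False), 3: (7, True)}
assert translate_page(3 * PAGE + 100, proc_a) == translate_page(3 * PAGE + 100, proc_b)
```

The flag buys you a convention: flagged pages must sit at the same virtual address in every address space, which is what lets the code inside them hard-code addresses.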

Add swapping and you have a full virtual memory system, but then you'd need a kernel which could be invoked by the MMU in order to handle a page fault... but more on adding interrupts in a minute.

We can also implement a permissions system on segments, but that gets way more complicated, because a lot of code has to be self-modifying (but, there might be workarounds for that with an MMU and memory-accessible program counter?) so we have to allow code to overwrite itself. And, in order to have something like interrupt-driven I/O or entrypoints/gates to kernel mode, we'd need to be really careful about how we did that. We might need an I/O controller which is able to read and write the processor state in order to make that happen (and the I/O controller could just be another OIC running special code...)

Multiprocessing on a single CPU is feasible if the kernel can use the MMU or I/O controller to read/set the processor state, but if we have multiple CPUs, the self-modifying code becomes a problem, because then we are filled with non-reentrant stuff. If we don't mind using more RAM (it's cheap nowadays), we could use Copy-on-Write for multi-CPU, at the expense of a more complicated multi-CPU MMU. But there might be another way to create true reentrant PIC.

I am wondering if all self-modifying code could be replaced with more complex code which references the program counter in very direct ways, based on the ".+N" system used in the OISC emulator's built-in assembler. The downside is that this either needs to be added to the CPU (which effectively gives it more than one instruction, because you'd have a SUBTRACT-absolute-addressing and one or more SUBTRACT-relative-addressing instructions, although you could just call them "modes" of a single instruction and maybe get away with still calling it a single instruction), or to the MMU (which creates a lot of complex interplay between the MMU and the exact state of the CPU, but we'd already need that anyway for the ability to do multiprocessing). This would work to create fully PIC, but I'm not entirely sure all self-modifying code can be replaced this way. There may be special functions which would otherwise need self-modifying code and would have to be implemented by yet another coprocessor (an OIC running its own special code, perhaps)... but having an extra coprocessor isn't such a bad thing, because we could get a memory-mapped FPU out of it too, and speed up math operations greatly.
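One way to picture the relative-addressing mode (a purely hypothetical encoding -- a per-operand flag -- not anything the real emulator does at runtime; it resolves "." at assembly time): the CPU adds the current "." to flagged operands before using them, so a block of code behaves identically wherever it is loaded.

```python
def step(mem, pc, rel_flags=(False, False, False)):
    """Subtract-and-branch-if-negative with an optional PC-relative mode.

    rel_flags marks which of the three operands are ".+N" style offsets
    from the current instruction address rather than absolute addresses.
    (Hypothetical encoding, for illustration only.)
    """
    ops = [mem[pc + i] + (pc if rel_flags[i] else 0) for i in range(3)]
    a, b, c = ops
    mem[a] = mem[a] - mem[b]
    return c if mem[a] < 0 else pc + 3

code = [3, 4, 999]   # ".+3 -= .+4": operate on the two data cells after the triple
data = [10, 7]

for base in (0, 20):
    mem = [0] * base + code + data + [0] * 5
    step(mem, base, rel_flags=(True, True, False))
    assert mem[base + 3] == 3   # same result wherever the block sits in memory
```

This is exactly the position independence we want; whether every self-modifying idiom can be rewritten this way is the open question.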

Writing an OS to handle all of that sounds rather complicated, and I'm not sure how security would work; but if you can make code which is fully PIC and fully reentrant, plus eliminate the need for self-modifying code, that's already half the battle.
