Note-form on ISAMUX (aka "ISANS")

A fixed number of additional (hidden) bits, conceptually a "namespace", that go directly and non-optionally into the instruction decode phase, extending (in each implementation) the opcode length to 16+N, 32+N, 48+N, where N is a hard fixed quantity on a per-implementor basis.

Where the opcode is normally loaded from the location at the PC, the extra bits, set via a CSR, are mandatorially appended to every instruction: hence why they are described as "hidden" opcode bits, and as a "namespace".

The parallels with c++ "using namespace" are direct and clear.

Hypothetical Format

Note that this is a hypothetical format, yet TBD, where particular attention needs to be paid to the fact that there is an "immediate" version of CSRRW (with 5 bits of immediate) that could save a lot of space in binaries.

   3                   2                   1
|1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0|
|------------------------------ |-------|---------------------|-|
|reserved reserved reserved reserved reserved   | foreignarch |1|
|custom         | reserved    |           official|B| rvcpage |0|

RV Mode

  • when bit 0 is 0, "RV" mode is selected.
  • in RV mode, bits 1 thru 5 provide up to 16 possible alternative meanings (namespaces) for 16 Bit opcodes. "pages" if you will. The top bit indicates custom meanings. When set to 0, the top bit is for official usage.
  • Bits 15 thru 23 are reserved.
  • Bits 24 thru 31 are for custom usage.
  • bit 6 ("B") is LE/BE

16 bit page examples:

  • 0b0000 STANDARD (2019) RVC
  • 0b0001 RVCv2
  • 0b0010 RV16
  • 0b0011 RVCv3
  • ...
  • 0b1000 custom 16 bit opcode meanings 1
  • 0b1001 custom 16 bit opcode meanings 2
  • .....

Foreign Arch Mode

  • when bit 0 is 1, "Foreign arch" mode is selected.
  • Bits 1 thru 7 are a table of foreign arches.
  • when the MSB is 1, this is for custom use.
  • when the MSB is 0, bits 1 thru 6 are reserved for 64 possible official foreign archs.

Foreign archs could be (examples):

  • 0b0000000 x86_32
  • 0b0000001 x86_64
  • 0b0000010 MIPS32
  • 0b0000011 MIPS64
  • ....
  • 0b0010000 Java Bytecode
  • 0b0010001 N.E.Other Bytecode
  • ....
  • 0b1000000 custom foreign arch 1
  • 0b1000001 custom foreign arch 2
  • ....

Note that "official" foreign archs have a binary value where the MSB is zero, and custom foreign archs have a binary value where the MSB is 1.

Namespaces are permitted to swap to new state

In each privilege level, on a change of ISANS (whether through manual setting of ISANS or through trap entry or exit changing the ISANS CSR), an implementation is permitted to completely and arbitrarily switch not only the instruction set, it is permitted to switch to a new bank of CSRs (or a subset of the same), and even to switch to a new PC.

This to occur immediately and atomically at the point at which the change in ISANS occurs.

The most obvious application of this is for Foreign Archs, which may have their own completely separate PC. Thus, foreign assembly code and RISCV assembly code need not be mixed in the same binary.

Further use-cases may be envisaged however great care needs to be taken to not cause massive complications for JIT emulation, as the RV ISANS is unary encoded (231 permutations).

In addition, the state information of all namespaces has to be saved and restored on a context-switch (unless the SP is also switched as part of the state!) which is quite severely burdensome and getting exceptionally complex.

Switching CSR, PC (and potentially SP) and other state on a NS change in the RISCV unary NS therefore needs to be done wisely and responsibly, i.e. minimised!

To be discussed. Context https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/x-uFZDXiOxY/27QDW5KvBQAJ

Privileged Modes / Traps

An additional WLRL CSR per priv-level named "LAST-ISANS" is required, and another called "TRAP-ISANS" These mirrors the ISANS CSR, and, on a trap, the current ISANS in that privilege level is atomically transferred into LAST-ISANS by the hardware, and ISANS in that trap is set to TRAP-ISANS. Hardware is only then permitted to modify the PC to begin execution of the trap.

On exit from the trap, LAST-ISANS is copied into the ISANS CSR, and LAST-ISANS is set to TRAP-ISANS. Only then is the hardware permitted to modify the PC to begin execution where the trap left off.

This is identical to how xepc is handled.

Note 1: in the case of Supervisor Mode (context switches in particular), saving and changing of LAST-ISANS (to and from the stack) must be done atomically and under the protection of the SIE bit. Failure to do so could result in corruption of LAST-ISANS when multiple traps occur in the same privilege level.

Note 2: question - should the trap due to illegal (unsupported) values written into LAST-ISANS occur when the software writes to LAST-ISANS, or when the trap (on exit) writes into LAST-ISANS? this latter seems fraught: a trap, on exit, causing another trap??

Per-privilege-level pseudocode (there exists UISANS, UTRAPISANS, ULASTISANS, MISANS, MTRAPISANS, MLASTISANS and so on):

trap_entry()
{
    LAST-ISANS = ISANS // record the old NS
    ISANS = TRAP_ISANS // traps are executed in "trap" NS
}

and trap_exit:

trap_exit():
{
    ISANS = LAST-ISANS
    LAST-ISANS = TRAP_ISANS
}

Why not have TRAP-ISANS as a vector table, matching mtvec?

Use case to be determined. Rather than be a global per-priv-level value, TRAP-ISANS is a table of length exactly equal to the mtvec/utvec/stvec table, with corresponding entries that specify the assembly-code namespace in which the trap handler routine is written.

Open question: see https://groups.google.com/a/groups.riscv.org/d/msg/isa-dev/IAhyOqEZoWA/BM0G3J2zBgAJ

trap_entry(x_cause)
{
    LAST-ISANS = ISANS // record the old NS
    ISANS = TRAP_ISANS_VEC[xcause] // traps are executed in "trap" NS
}

and trap_exit:

trap_exit(x_cause):
{
    ISANS = LAST-ISANS
    LAST-ISANS = TRAP_ISANS_VEC[x_cause]
}

Is this like MISA?

No.

  • MISA's space is entirely taken up (and running out).
  • There is no allocation (provision) for custom extensions.
  • MISA switches on and off entire extensions: ISAMUX/NS may be used to switch multiple opcodes (present and future), to alternate meanings.
  • MISA is WARL and is inaccessible from everything but M-Mode (not even readable).

MISA is therefore wholly unsuited to U-Mode usage; ISANS is specifically permitted to be called by userspace to switch (with no stalling) between namespaces, repeatedly and in quick succession.

What happens if this scheme is not adopted? Why is it better than leaving things well alone?

At the first sign of an emergency non-backwards compatible and unavoidable change to the frozen RISCV official Standards, the entire RISCV community is fragmented and divided into two:

  • Those vendors that are hardware compatible with the legacy standard.
  • Those that are compatible with the new standard.

These two communities would be mutually exclusively incompatible. If a second emergency occurs, RISCV becomes even less tenable.

Hardware that wished to be "compatible" with either flavour would require JIT or offline static binary recompilation. No vendor would willingly accept this as a condition of the standards divergence in the first place, locking up decision making to the detriment of RISCV as a whole.

By providing a "safety valve" in the form of a hidden namespace, at least newer hardware has the option to implement both (or more) variations, and still apply for Certification.

However to also allow "legacy" hardware to at least be JIT soft compatible, some very strict rules must be adhered to, that appear at first sight not to make any sense.

It's complicated in other words!

Surely it's okay to just tell people to use 48-bit encodings?

Short answer: it doesn't help resolve conflicts, and costs hardware and redesigns to do so. Softcores in cost-sensitive embedded applications may even not actually be able to fit the required 48 bit instruction decode engine into a (small, ICE40) FPGA. 48-bit instruction decoding is much more complex than straight 32-bit decoding, requiring a queue.

Second answer: conflicts can still occur in the (unregulated, custom) 48-bit space, which could be resolved by ISAMUX/ISANS as applied to the 48 bit space in exactly the same way. And the 64-bit space.

Why not leave this to individual custom vendors to solve on a case by case basis?

The suggestion was raised that a custom extension vendor could create their own CSR that selects between conflicting namespaces that resolve the meaning of the exact same opcode. This to be done by all and any vendors, as they see fit, with little to no collaboration or coordination towards standardisation in any form.

The problems with this approach are numerous, when presented to a worldwide context that the UNIX Platform, in particular, has to face (where the embedded platform does not)

First: lack of coordination, in the proliferation of arbitrary solutions, has to primarily be borne by gcc, binutils, LLVM and other compilers.

Secondly: CSR space is precious. With each vendor likely needing only one or two bits to express the namespace collision avoidance, if they make even a token effort to use worldwide unique CSRs (an effort that would benefit compiler writers), the CSR register space is quickly exhausted.

Thirdly: JIT Emulation of such an unregulated space becomes just as much hell as it is for compiler writers. In addition, if two vendors use conflicting CSR addresses, the only sane way to tell the emulator what to do is to give the emulator a runtime commandline argument.

Fourthly: with each vendor coming up with their own way of handling conflicts, not only are the chances of mistakes higher, it is against the very principles of collaboration and cooperation that save vendors money on development and ongoing maintenance. Each custom vendor will have to maintain their own separate hard fork of the toolchain and software, which is well known to result in security vulnerabilities.

By coordinating and managing the allocation of namespace bits (unary or binary) the above issues are solved. CSR space is no longer wasted, compiler and JIT software writers have an easier time, clashes are avoided, and RISCV is stabilised and has a trustable long term future.

Why ISAMUX / ISANS has to be WLRL and mandatory trap on illegal writes

The namespaces, set by bits in the CSR, are functionally directly equivalent to c++ namespaces, even down to the use of braces.

WARL, by allowing implementors to choose the value, prevents and prohibits the critical and necessary raising of an exception that would begin the JIT process in the case of ongoing standards evolution.

Without this opportunity, an implementation has no reliable guaranteed way of knowing when to drop into full JIT mode, which is the only guaranteed way to distinguish any given conflicting opcode. It is as if the c++ standard was given a similar optional opportunity to completely ignore the "using namespace" prefix!

--

Ok so I trust it's now clear why WLRL (thanks Allen) is needed.

When Dan raised the WARL concern initially a situation was masked by the conflict, that if gone unnoticed would jeapordise ISAMUX/ISANS entirely. Actually, two separate errors. So thank you for raising the question.

The situation arises when foreign archs are to be given their own NS bit. MIPS is allocated bit 8, x86 bit 9, whilst LE/BE is given bit 0, RVCv2 bit 1 andso on. All of this potential rather than actual, clearly.

Imagine then that software tries to write and set not just bit 8 and bit 9, it also tries to set bit 0 and 1 as well.

This IS on the face of it a legitimate reason to make ISAMUX/ISANS WARL.

However it masks a fundamental flaw that has to be addressed, which brings us back much closer to the original design of 18 months ago, and it's highlighted thus:

x86 and simultaneous RVCv2 modes are total nonsense in the first place!

The solution instead is to have a NS bit (bit0) that SPECIFICALLY determines if the arch is RV or not. If 0, the rest of the ISAMUX/ISANS is very specifically RV only, and if 1, the ISAMUX/ISANS is a binary table of foreign architectures and foreign architectures only.

Exactly how many bits are used for the foreign arch table, is to be determined. 7 bits, one of which is reserved for custom usage, leaving a whopping 64 possible "official" foreign instruction sets to be hardware-supported/JIT-emulated seems to be sufficiently gratuitous, to me.

One of those could even be Java Bytecode!

Now, it could hypothetically be argued that the permutation of setting LE/BE and MIPS for example is desirable. A simple analysis shows this not to be the case: once in the MIPS foreign NS, it is the MIPS hardware implementation that should have its own way of setting and managing its LE/BE mode, because to do otherwise drastically interferes with MIPS binary compatibility.

Thus, it is officially Not Our Problem: only flipping into one foreign arch at a time makes sense, thus this has to be reflected in the ISAMUX/ISANS CSR itself, completely side-stepping the (apparent) need to make the NS CSR WARL (which would not work anyway, as previously mentioned).

So, thank you, again, Dan, for raising this. It would have completely jeapordised ISAMUX/NS if not spotted.

The second issue is: how does any hardware system, whether it support ISANS or not, and whether any future hardware supports some Namespaces and, in a transitive fashion, has to support more future namespaces, through JIT emulation, if this is not planned properly in advance?

Let us take the simple case first: a current 2019 RISCV fully compliant RV64GC UNIX capable system (with mandatory traps on all unsupported CSRs).

Fast forward 20 years, there are now 5 ISAMUX/NS unary bits, and 3 foreign arch binary table entries.

Such a system is perfectly possible of software JIT emulating ALL of these options because the write to the (illegal, for that system) ISAMUX/NS CSR generates the trap that is needed for that system ti begin JIT mode.

(This again emphasises exactly why the trap is mandatory).

Now let us take the case of a hypothetical system from say 2021 that implements RVCv2 at the hardware level.

Fast forward 20 years: if the CSR were made WARL, that system would be absolutely screwed. The implementor would be under the false impression that ignoring setting of "illegal" bits was acceptable, making the transition to JIT mode flat-out impossible to detect.

When this is considered transitively, considering all future additions to the NS, and all permutations, it can be logically deduced that there is a need to reserve a full set of bits in the ISAMUX/NS CSR in advance.

i.e. that right now, in the year 2019, the entire ISAMUX/NS CSR cannot be added to piecemeal, the full 32 (or 64) bits has to be reserved, and reserved bits set at zero.

Furthermore, if any software attempts to write to those reserved bits, it must be treated just as if those bits were distinct and nonexistent CSRs, and a trap raised.

It makes more sense to consider each NS as having its own completely separate CSR, which, if it does not exist, clearly it should be obvious that, as an unsupported CSR, a trap should be raised (and JIT emulation activated).

However given that only the one bit is needed (in RV NS Mode, not Foreign NS Mode), it would be terribly wasteful of the CSRs to do this, despite it being technically correct and much easier to understand why trap raising is so essential (mandatory).

This again should emphasise how to mentally get one's head round this mind-bendingly complex problem space: think of each NS bit as its own totally separate CSR that every implementor is free and clear to implement (or leave to JIT Emulation) as they see fit.

Only then does the mandatory need to trap on write really start to hit home, as does the need to preallocate a full set of reserved zero values in the RV ISAMUX/NS.

Lastly, I think it's ok to only reserve say 32 bits, and, in 50 years time if that genuinely is not enough, start the process all over again with a new CSR. ISAMUX2/NS2.

Subdivision of the RV NS (support for RVCv3/4/5/RV16 without wasting precious CSR bits) best left for discussion another time, the above is a heck of a lot to absorb, already.

Alternative RVC 16 Bit Opcode meanings

Ok, here is appropriate to raise an idea how to cover RVC and future variants, including RV16.

Just as with foreign archs, and you quite rightly highlight above, it makes absolutely no sense to try to select both RVCv1, v2, v3 and so on, all simultaneously. An unary bit vector for RVC modes, changing the 16 BIT opcode space meaning, is wasteful and again has us believe that WARL is the "solution".

The correct thing to do is, again, just like with foreign archs, to treat RVCs as a binary namespace selector. Bits 1 thru 3 would give 8 possible completely new alternative meanings, just like how the Z80 and the 286 and 386 used to do bank switching.

All zeros is clearly reserved for the present RVC. 0b001 for RVCv2. 0b010 for RV16 (look it up) and there should definitely be room reserved here for custom reencodings of the 16 bit opcode space.

Why WARL will not work and why WLRL is required

WARL requires a follow-up read of the CSR to ascertain what heuristic the hardware might have applied, and if that procedure is followed in this proposal, performance even on hardware would be severely compromised.

In addition when switching to foreign architectures, the switch has to be done atomically and guaranteed to occur.

In the case of JIT emulation, the WARL "detection" code will be in an assembly language that is alien to hardware.

Support for both assembly languages immediately after the CSR write is clearly impossible, this leaves no other option but to have the CSR be WLRL (on all platforms) and for traps to be mandatory (on the UNIX Platform).

Is it strictly necessary for foreign archs to switch back?

No, because LAST-ISANS handles the setting and unsetting of the ISANS CSR in a completely transparent fashion as far as the foreign arch is concerned. Supervisor or Hypervisor traps take care of the context switch in a way that the user mode (or guest) need not be aware of, in any way.

Thus, in e.g. Hypervisor Mode, the foreign guest arch has no knowledge or need to know that the hypervisor is flipping back to RV at the time of a trap.

Note however that this is not the same as the foreign arch executing foreign traps! Foreign architecture trap and interrupt handling mechanisms are out of scope of this document and MUST be handled by the foreign architecture implementation in a completely transparent fashion that in no way interacts or interferes with this proposal.

Can we have dynamic declaration and runtime declaration of capabilities?

Answer: don't know (yet). Quoted from Rogier:

"A SOC may have several devices that one may want to directly control with custom instructions. If independent vendors use the same opcodes you either have to change the encodings for every different chip (not very nice for software) or you can give the device an ID which is defined in some device tree or something like that and use that."

dynamic detection wasn't originally planned: static compilation was envisaged to solve the need, with a table of mvendorid-marchid-isamux/isans being maintained inside gcc / binutils / llvm (or separate library?) that, like the linux kernel ARCH table, requires a world-wide atomic "git commit" to add globally-unique registered entries that map functionality to actual namespaces.

where that goes wrong is if there is ever a pair (or more) of vendors that use the exact same custom feature that maps to different opcodes, a statically-compiled binary has no hope of executing natively on both systems.

at that point: yes, something akin to device-tree would be needed.

Open Questions

This section from a post by Rogier Bruisse http://hands.com/~lkcl/gmail_re_isadev_isamux.html

is the ISANS CSR a 32 or XLEN bit value?

This is partly answered in another FAQ above: if 32 bits is not enough for a full suite of official, custom-with-atomic-registration and custom-without then a second CSR group (ISANS2) may be added at a future date (10-20 years hence).

32 bits would not inconvenience RV32, and implementors wishing to make significant altnernative modifications to opcodes in the RV32 ISA space could do so without the burden of having to support a split 32/LO 32/HI CSR across two locations.

is the ISANS a flat number space or should some bits be reserved for use as flags?

See 16-bit RV namespace "page" concept, above. Some bits have to be unary (multiple simultaneous features such as LE/BE in one bit, and augmented Floating-point rounding / clipping in another), whilst others definitely need to be binary (the most obvious one being "paging" in the space currently occupied by RVC).

should the ISANS space be partitioned between reserved, custom with registration guaranteed non clashing, custom, very likely non clashing?

Yes. Format TBD.

should only compiler visible/generated constant setting with CSRRWI and/or using a clearly recognisable LI/LUI be accommodated or should dynamic setting be accommodated as well?

This is almost certainly a software design issue, not so much a hardware issue.

How should the ISANS be (re)stored in a trap and in context switch?

See section above on privilege mode: LAST-ISANS has been introduced that mirrors (x)CAUSE and (x)EPC pretty much exactly. Context switches change uepc just before exit from the trap, in order to change the user-mode PC to switch to a new process, and ulast-isans can - must - be treated in exactly the same way. When the context switch sets ulast-isans (and uepc), the hardware flips both ulast-isans into uisans and uepc into pc (atomically): both the new NS and the new PC activate immediately, on return to usermode.

Quite simple.

Should the mechanism accommodate "foreign ISA's" and if so how does one restore the ISA.

See section above on LAST-ISANS. With the introduction of LAST-ISANS, the change is entirely transparent, and handled by the Supervisor (or Hypervisor) trap, in a fashion that the foreign ISA need not even know of the existence of ISANS. At all.

Where is the default ISA stored and what is responsible for what it is after

Options: * start up * starting a program * calling into a dynamically linked library * taking a trap * changing privilege levels

These first four are entirely at the discretion of (and the responsibility of) the software. There is precedent for most of these having been implemented, historically, at some point, in relation to LE/BE mode CSRs in other hardware (MIPSEL vs MIPS distros for example).

Traps are responsible for saving LAST-ISANS on the stack, exactly as they are also responsible for saving other context-sensitive information such as the registers and xEPC.

The hardware is responsible for atomically switching out ISANS into the relevant xLAST-ISANS (and back again on exit). See Privileged Traps, above.

If the ISANS is just bits of an instruction that are to be prefixed by the cpu, can those bits contain immediates? Register numbers?

The concept of a CSR containing an immediate makes no sense. The concept of a CSR containing a register number, the contents of which would, presumably, be inserted into the NS, would immediately make that register a permanent and irrevocably reserved register that could not be utilised for any other purpose.

This is what the CSRs are supposed to be for!

It would be better just to have a second CSR - ISANS2 - potentially even ISANS3 in 60+ years time, rather than try to use a GPR for the purposes for which CSRs are intended.

How does the system indicate a namespace is not recognised? Does it trap or can/must a recoverable mechanism be provided?

It doesn't "indicate" that a namespace is not recognised. WLRL fields only hold supported values. If the hardware cannot hold the value, a trap MUST be thrown (in the UNIX platform), and at that point it becomes the responsibility of software to deal with it.

What are the security implications? Can some ISA namespaces be set by user space?

Of course they can. It becomes the responsibility of the Supervisor Mode (the kernel) to treat ISANS in a fashion orthogonal to the PC. If the OS is not capable of properly context-switching securely by setting the right PC, it's not going to be capable of properly looking after changes to ISANS.

Does the validity of an ISA namespace depend on privilege level? If so how?

The question does not exactly make sense, and may need a re-reading of the section on how Privilege Modes, above. In RISC-V, privilege modes do not actually change very much state of the system: the absolute minimum changes are made (swapped out) - xEPC, xSTATUS and so on - and the privilege mode is expected to handle the context switching (or other actions) itself.

ISANS - through LAST-ISANS - is absolutely no different. The trap and the kernel (Supervisor or Hypervisor) are provided the mechanism by which ISA Namespace may be set: it is up to the software to use that mechanism correctly, just as the software is expected to use the mechanisms provided to correctly implement context-switching by saving and restoring register files, the PC, and other state. The NS effectively becomes just another part of that state.