As a bit of a thought exercise, I started wondering what a new machine language design would look like if we could dump the current "legacy" of x86/x64 and C's dominance over machine language design.
By C's dominance I mean that char is always 8 bits (technically it doesn't need to be, but everyone assumes it), that a char* must exist, that any pointer can be converted to it and back, and that all pointers are the same size.
These things require that CPUs have byte operations and that all pointers are granular to a single byte even if alignment restrictions would disallow that.
However, per-byte operations are only used (off the top of my head) in 5 places: the machine code itself, string processing, and the trio of IO, compression and encryption. Everything else can easily work with 32 or 64 bit integers.
Unless I'm very much mistaken, these per-byte algorithms could easily function if all per-byte operations were SIMD over a 32 bit register (turning off the carry after every 8 bits). The code would become more complicated (as SIMD code tends to) but the naive solution of splatting out each byte over its own 32 bit register would still work.
That would be the first set of operations: compare, and integer addition and subtraction with full carry (one 32 bit lane), without carry across bits 8, 16 and 24 (four 8 bit lanes), and without carry across bit 16 (two 16 bit lanes). Multiply and divide would be restricted to 32 bit granularity.
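To make the "no carry across lane boundaries" idea concrete, here is a minimal sketch in plain C of how such an addition behaves on a 32 bit register. The function names `add_packed8` and `add_packed16` are made up for illustration; on the hypothetical machine each would be a single instruction rather than a masking trick.

```c
#include <stdint.h>

/* Add four independent 8 bit lanes packed in a 32 bit word.
   The top bit of each lane is masked off before the add so that no
   carry can leave a lane, then restored with XOR (add without carry). */
static inline uint32_t add_packed8(uint32_t a, uint32_t b) {
    uint32_t low = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    return low ^ ((a ^ b) & 0x80808080u);
}

/* Same idea with two independent 16 bit lanes. */
static inline uint32_t add_packed16(uint32_t a, uint32_t b) {
    uint32_t low = (a & 0x7FFF7FFFu) + (b & 0x7FFF7FFFu);
    return low ^ ((a ^ b) & 0x80008000u);
}
```

Each lane wraps around on its own: adding 0x01 to a lane holding 0xFF gives 0x00 in that lane and leaves its neighbour untouched.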
The real bonus would be that the dereference operation can be 4 byte granular, which means that a 32 bit address space covers 16 gigs (2^32 words of 4 bytes each) and would have bought a few more years before the switch to 64 bit.
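The arithmetic behind the 16 gigs can be sketched in C, assuming a pointer is simply an index into an array of 32 bit words. Both helper names are hypothetical, and picking a byte out of a loaded word is shown as a shift and mask, since the machine would have no byte-granular load.

```c
#include <stdint.h>

/* A pointer on the hypothetical machine is a 32 bit word index.
   The byte offset it corresponds to needs up to 34 bits. */
static inline uint64_t word_to_byte_address(uint32_t word_addr) {
    return (uint64_t)word_addr * 4u;
}

/* Pick one 8 bit lane (0 = least significant) out of a loaded word. */
static inline uint32_t extract_byte(uint32_t word, unsigned lane) {
    return (word >> (8u * (lane & 3u))) & 0xFFu;
}
```

The highest word address, 0xFFFFFFFF, maps to byte offset 0x3FFFFFFFC, which is the last word below the 16 GiB mark.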
The only problem remaining is the machine code: 4 bytes for each operand would be massive and a bit of a waste. 16 bit operands are a bit more sane but would require some extra consideration with jumps. One solution is to make the program counter 33 bits and to only allow absolute jumps to even locations, with relative jumps remaining arbitrary.
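One way to read the 33 bit scheme, sketched in C under the assumption that the program counter counts 16 bit units and that an absolute jump takes an ordinary 32 bit word address. The names `PC_MASK`, `jump_absolute` and `jump_relative` are invented for this sketch.

```c
#include <stdint.h>

/* 33 bit program counter, counting 16 bit units. */
#define PC_MASK ((UINT64_C(1) << 33) - 1)

/* Absolute jump: the target is a 32 bit word address, so the
   resulting program counter is always even. */
static inline uint64_t jump_absolute(uint32_t word_addr) {
    return ((uint64_t)word_addr << 1) & PC_MASK;
}

/* Relative jump: signed offset in 16 bit units, may land on an odd
   location; the result wraps modulo 2^33. */
static inline uint64_t jump_relative(uint64_t pc, int32_t offset_halfwords) {
    return (pc + (uint64_t)(int64_t)offset_halfwords) & PC_MASK;
}
```

With this reading, absolute jumps can reach any word in the full 16 gig space, while only relative jumps can reach the second half of a word.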
A simpler solution would be to just disallow code in the top 8 gigs of virtual memory; the program counter simply cannot point to the top half of memory. This requires more care with run-time code generation and with where code can be loaded.
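For run-time code generation, this restriction boils down to a range check before code is placed. A rough sketch in C, assuming a 32 bit program counter that counts 16 bit units (so it spans 2^31 words); `CODE_LIMIT_WORDS` and `code_region_ok` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32 bit program counter in 16 bit units covers 2^32 * 2 bytes,
   i.e. the bottom 8 GiB, which is 2^31 words. */
#define CODE_LIMIT_WORDS (UINT32_C(1) << 31)

/* True if a region of length_words words starting at start_word lies
   entirely in the half of memory the program counter can reach. */
static inline bool code_region_ok(uint32_t start_word, uint32_t length_words) {
    return start_word < CODE_LIMIT_WORDS &&
           length_words <= CODE_LIMIT_WORDS - start_word;
}
```

A loader or JIT would run a check like this on any buffer it intends to execute and fall back to a different allocation if it fails.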
I do not think that such a machine would be compliant with C, although if you set the size of char to 32 bits it would still work. It would, however, require rewriting all programs that assume a char is 8 bits (which is everything).