Bytecode VMs in surprising places (2024)

107 points - last Friday at 2:12 PM

Comments

drob518 today at 1:22 PM
On one hand, all these mini interpreters and compilers are cool. I have a soft spot for extensible systems. On the other hand, all these things are a huge security problem. When every subsystem and data format is carrying around its own Turing-complete bytecode and JIT, they all need to be secure and bug-free for the system to be secure and bug-free. And that's far more code surface to keep clean.
jaen today at 3:00 PM
References for the Quake virtual machines:

Quake 1 had QuakeC: [1] https://en.wikipedia.org/wiki/QuakeC [2] Hello world in QuakeC - https://www.leonrische.me/pages/quakec_bytecode_hello_world....

Quake 2 moved to native binaries.

Quake 3 had a new VM that enabled compiling regular C using LCC: [1] https://fabiensanglard.net/quake3/qvm.php [2] Spec - https://www.icculus.org/~phaethon/q3mc/q3vm_specs.html

raddan today at 5:41 PM
I was told by an engineer at Microsoft that Excel's formula interpreter is essentially a kind of bytecode-based stack machine. This came up in the context of a bug I found (while working on a project with Microsoft) that revealed that not only was there a small floating-point bug in some calculations, but (improbably, to me) that Excel preserved this inaccuracy across architectures for decades. So the bytecode interpreter made sense. That said, I've never seen this implementation myself, so it may still be rumor.
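
To make the "bytecode-based stack machine" idea concrete, here is a minimal sketch of evaluating a formula that has already been compiled to postfix tokens. It is not Excel's implementation (which, as noted, I have not seen); the token layout, the `eval_postfix` name, and the `cells` dictionary are all assumptions made for illustration.

```python
import operator

# Binary operators the toy machine understands (an assumed, tiny subset).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_postfix(tokens, cells):
    """Evaluate a postfix token list with an operand stack.

    tokens: numbers, cell names (str), or operator symbols from OPS.
    cells:  hypothetical mapping of cell name -> numeric value.
    """
    stack = []
    for tok in tokens:
        if isinstance(tok, str) and tok in OPS:
            right = stack.pop()          # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        elif isinstance(tok, str):
            stack.append(cells[tok])     # cell reference: push its value
        else:
            stack.append(tok)            # numeric literal
    if len(stack) != 1:
        raise ValueError("malformed token stream")
    return stack[0]


# "=A1 + B1 * 2" compiled by hand to postfix: A1 B1 2 * +
result = eval_postfix(["A1", "B1", 2, "*", "+"], {"A1": 1.5, "B1": 4})
print(result)
```

Because the token stream and the evaluation order are fixed, a machine like this performs the same floating-point operations in the same order everywhere, which would fit the observation that a rounding quirk could survive across architectures.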
superjan today at 11:30 AM
How about the infamous iOS hack with a VM implemented in a JBIG2 PDF? https://projectzero.google/2021/12/a-deep-dive-into-nso-zero...
chirsz today at 11:22 AM
SBus peripherals use the Forth language in their PROMs to initialize themselves[1].

[1] https://docs.oracle.com/cd/E19957-01/802-3239-10/sbusandfc.h...

magnat today at 11:13 AM
Some other examples:

- ACPI configuration for power management and platform stuff [1]

- Bitcoin transactions [2]

- TrueType fonts [3]

[1] https://wiki.osdev.org/AML

[2] https://en.bitcoin.it/wiki/Script

[3] https://learn.microsoft.com/en-us/typography/opentype/spec/t...

tptacek today at 5:36 PM
More surprising to me than the BPF VM itself is the optimizing compiler for it that lives in libpcap.
pratikdeoghare today at 9:39 AM
There is one in golang regular expressions https://swtch.com/~rsc/regexp/regexp2.html

I guess that is why you say regexp.Compile.
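
For the curious, the linked article describes regex programs built from four instructions: char, match, jmp and split. Below is a minimal Python sketch of the recursive backtracking flavor of such a VM. It is not Go's actual implementation; the tuple encoding, the `backtrack` name, and the hand-compiled program for `a+b` are assumptions for illustration.

```python
def backtrack(prog, text, pc=0, sp=0):
    """Run a tiny regex program against text, starting at instruction pc
    and string position sp. Returns True if a prefix of text[sp:] matches."""
    while True:
        op = prog[pc]
        if op[0] == "char":
            # Consume one character or fail this thread.
            if sp >= len(text) or text[sp] != op[1]:
                return False
            pc += 1
            sp += 1
        elif op[0] == "match":
            return True
        elif op[0] == "jmp":
            pc = op[1]
        elif op[0] == "split":
            # Try the first branch recursively, then fall back to the second.
            if backtrack(prog, text, op[1], sp):
                return True
            pc = op[2]
        else:
            raise ValueError(f"unknown instruction: {op!r}")


# Hand-compiled program for the regex a+b
A_PLUS_B = [
    ("char", "a"),    # 0: match 'a'
    ("split", 0, 2),  # 1: loop back for another 'a', or continue
    ("char", "b"),    # 2: match 'b'
    ("match",),       # 3: success
]

print(backtrack(A_PLUS_B, "aaab"))
print(backtrack(A_PLUS_B, "b"))
```

The article goes on to replace the recursion with a set of simultaneously running threads, which avoids exponential backtracking; this sketch only shows the instruction set.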

majorbugger today at 5:53 PM
Does it mean we can play Doom on WinRar?
pervasif today at 2:49 PM
These little VMs in applications are everywhere. Apple Mach-O binaries have built-in opcodes for binding and rebasing symbols, interpreted by (numerous) little VMs in dyld:

https://github.com/apple-oss-distributions/dyld/blob/e9da5ae...

https://github.com/apple-oss-distributions/dyld/blob/e9da5ae...

Their use is less common now since the introduction of the Mach-O load command LC_DYLD_CHAINED_FIXUPS, but these opcodes still have to be supported for older binaries. Also, some popular compilers, including Zig, still emit these opcodes for LC_DYLD_INFO and LC_DYLD_INFO_ONLY.

kazinator today at 2:39 PM
Busicom 141 PF calculator (1971). This was a product built on the Intel 4004 processor. It was not programmed using Intel 4004 machine language directly, but using a more powerful machine language for which the 4004 ran an interpreter included in the image.
ivankelly today at 9:50 AM
Quake had its own VM also
twic today at 2:36 PM
The Python pickle format is a bytecode [1], although not a Turing-complete one, I think.

[1] https://formats.kaitai.io/python_pickle/
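
You can see the opcodes yourself with the standard library's pickletools module. A small sketch follows; the helper name `pickle_opcodes` and the choice of protocol 2 are mine, not anything canonical.

```python
import pickle
import pickletools


def pickle_opcodes(obj, protocol=2):
    """Return the names of the opcodes in the pickle stream for obj."""
    data = pickle.dumps(obj, protocol=protocol)
    # genops yields (opcode_info, argument, byte_position) triples.
    return [op.name for op, arg, pos in pickletools.genops(data)]


ops = pickle_opcodes([1, 2])
print(ops)

# pickletools.dis prints an annotated disassembly of the same stream.
pickletools.dis(pickle.dumps([1, 2], protocol=2))
```

The stream is a program for a small stack machine: opcodes push values, build containers from what is on the stack, and a final STOP returns the top of the stack as the unpickled object.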

dlojudice today at 1:23 PM
The game Another World (Out of This World) had its own bytecode [1]

[1] https://github.com/fabiensanglard/Another-World-Bytecode-Int...

self_awareness today at 10:13 AM
RarVM was used in a previous version of the format; the newest RAR has removed it, and RAR5 doesn't have a VM.
omeid2 today at 9:36 AM
This list is entirely incomplete without mentioning Java Card.

There is a tiny Java bytecode VM in an insanely large list of places; you can find some of them here:

https://github.com/crocs-muni/javacard-curated-list https://en.wikipedia.org/wiki/Java_Card

ignoramous today at 9:31 AM
TikTok shipping XOR cipher'd bytecode & interp is right up there: https://news.ycombinator.com/item?id=34109771
anthk today at 10:20 AM
yt-dlp's jsinterp.py

https://jxself.org/compiling-the-trap.shtml

I've got subleq+eforth (https://github.com/howerj/muxleq) running in JS which is dead simple to do. No input but I could output ASCII mapping values to an array.

https://esolangs.org/wiki/Subleq
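
To show how little a Subleq machine needs, here is a minimal Python sketch (not the muxleq code, and not my JS version). Each instruction is three cells a, b, c: subtract mem[a] from mem[b], then jump to c if the result is less than or equal to zero. Treating b == -1 as "output mem[a] as a character" and a negative program counter as halt are conventions I am assuming here; the `run_subleq` name and the step limit are also just for illustration.

```python
def run_subleq(program, max_steps=10_000):
    """Run a Subleq program and return whatever it printed as a string."""
    mem = list(program)          # work on a copy; code and data share memory
    out = []
    pc = 0
    steps = 0
    while 0 <= pc and pc + 2 < len(mem) and steps < max_steps:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        if b == -1:
            # Assumed output convention: emit mem[a] as a character.
            out.append(chr(mem[a]))
            pc += 3
        else:
            mem[b] -= mem[a]
            pc = c if mem[b] <= 0 else pc + 3
        steps += 1
    return "".join(out)


# Prints "Hi", then halts by jumping to address -1.
HI = [
    9, -1, 3,     # 0: output mem[9]  ('H')
    10, -1, 6,    # 3: output mem[10] ('i')
    11, 11, -1,   # 6: mem[11] -= mem[11] -> 0, so jump to -1 (halt)
    72, 105, 0,   # 9: data: 'H', 'i', and a scratch zero cell
]

print(run_subleq(HI))
```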

So, yes. yt-dlp runs proprietary YouTube JS code, defying the original purpose.
