A collection of novel architectural implementations within Witness.
01
Binary Impersonation
The protected binary's PE structure is reshaped to match the forensic
fingerprint of a different commercial protector. Automated classifiers,
YARA rules, and triage tools misidentify the protection technology
and direct analysts toward the wrong toolchain.
VMProtect: VM Protector
Themida: Packer + VM
Denuvo: DRM
CodeVirtualizer: VM Protector
Obsidium: Packer
UPX: Compressor
ASPack: Packer
MPRESS: Compressor
Enigma: Protector
Petite: Compressor
No other protector can impersonate 10+ different products on demand.
The analyst's first step, identifying the protection technology,
yields the wrong answer. Every tool, script, and heuristic they apply
afterward is calibrated for a product that isn't present.
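To see why a reshaped PE structure can mislead triage, it helps to recall how simple many fingerprints are. The sketch below is a hypothetical classifier: `SIGNATURES` and `classify` are illustrative names, not taken from any real tool, and the table only uses section-name prefixes commonly associated with a few packers. Real classifiers also look at entry-point bytes, imports, and entropy, so this is a simplification.

```python
# Toy triage classifier: maps section-name prefixes to a product label.
# The table and function are illustrative assumptions, not a real tool's logic.
SIGNATURES = {
    "UPX": "UPX",
    ".vmp": "VMProtect",
    ".themida": "Themida",
    ".aspack": "ASPack",
    ".MPRESS": "MPRESS",
}

def classify(section_names):
    """Return the first product whose prefix matches a section name."""
    for name in section_names:
        for prefix, product in SIGNATURES.items():
            if name.startswith(prefix):
                return product
    return "unknown"

genuine_upx = ["UPX0", "UPX1", ".rsrc"]
# A build protected by something else, but carrying VMProtect-style names.
impersonating_build = [".vmp0", ".vmp1", ".rsrc"]

print(classify(genuine_upx))
print(classify(impersonating_build))
```

A classifier of this kind reports whatever the section table advertises, which is the behavior impersonation relies on.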
02
Anti-AI Analysis
Modern reverse engineering increasingly relies on neural decompilers,
LLM-assisted analysis, and GNN-based program understanding. Multiple
adversarial primitives are injected specifically to degrade
AI-assisted reverse engineering.
Exploits the quadratic scaling of transformer self-attention.
Semantically meaningless operations exhaust the model's attention
budget, crowding actual logic beyond the attention horizon.
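A rough back-of-the-envelope model of this effect, under stated assumptions: attention cost is taken as the square of the token count, padding is assumed to be interleaved uniformly with real code, and the function names and numbers are hypothetical.

```python
def attention_cost(tokens: int) -> int:
    """Number of pairwise attention scores for a sequence of this length."""
    return tokens * tokens

def real_fraction_in_window(real_tokens: int, padding_per_real: int, window: int) -> float:
    """Fraction of the real tokens that still fits in a fixed context window,
    assuming padding is spread evenly between real tokens."""
    total = real_tokens * (1 + padding_per_real)
    visible = min(total, window)
    return visible / (1 + padding_per_real) / real_tokens

# 2,000 real tokens, 9 padding tokens injected per real token, 8,000-token window.
cost_ratio = attention_cost(20_000) // attention_cost(2_000)
visible_share = real_fraction_in_window(2_000, 9, 8_000)
```

With these illustrative numbers the attention cost grows by a factor of 100 while only 40% of the real logic remains inside the window.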
Primitive 02
Graph Poisoning
Targets: GNN-based program analysis, data flow recovery
Injects phantom dependency edges that appear structurally identical
to real data flows but carry no semantic meaning. Graph neural
networks trained on data flow edges are poisoned with
high-confidence false information.
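A minimal sketch of a phantom dependency, assuming a simple XOR cancellation as the injected pattern (the actual patterns Witness uses are not described here). The variable `k` appears in the extracted def-use graph as an input to the result, yet the result never depends on it.

```python
def run(x: int, k: int) -> int:
    t1 = x ^ k      # phantom: mixes k into the value
    t2 = t1 ^ k     # phantom: cancels k out again
    return t2 + 1   # real computation: x + 1

# Def-use edges a syntactic analyzer would extract from the statements above.
edges = {("x", "t1"), ("k", "t1"), ("t1", "t2"), ("k", "t2"), ("t2", "ret")}

def reaches(src, dst, edges):
    """Depth-first search: is there a data-flow path from src to dst?"""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(b for a, b in edges if a == node)
    return False
```

The graph claims `k` flows into the return value, while every choice of `k` yields the same output.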
Primitive 03
Context Window Exhaustion
Targets: Sequence models, RNN/LSTM decompilers
Forces models to maintain long-range context across function
boundaries for dependencies that ultimately contribute nothing.
Hidden-state capacity is consumed by algebraic noise, degrading
accuracy on real dependencies.
Coordinated sub-strategies that corrupt the model's internal
representation at the token, function, and structural levels
simultaneously. Embedding vectors for real code are shifted
toward adversarial regions of the representation space.
These primitives are injected before virtualization. The VM then encrypts
and fragments them further. An LLM analyzing the final binary faces
adversarial inputs designed specifically for neural
architectures, wrapped inside a custom encrypted ISA it has
never seen in training data.
Combined with binary impersonation, every LLM tested to date
misidentifies the protection technology 100% of the time,
directing its entire analysis toward the wrong product.
03
Per-Build Algorithm Synthesis
Traditional protectors use fixed algorithms with per-build keys.
Witness synthesizes the algorithms themselves. The cipher, the PRF,
and the ISA encoding are all generated fresh on every compilation.
Component | Traditional Protectors | Witness
Stream Cipher | Fixed algorithm, per-build key | Entire construction synthesized per build. Internal structure, round count, and data routing are all unique
PRF Construction | Fixed structure (HMAC or similar) | Internal wiring synthesized per build. Not a parameterization of a known PRF
Hash Function | Fixed (SHA-256, CRC32, etc.) | Per-build constants and internal structure. A different hash on every compilation
PRNG | Fixed (Mersenne Twister, etc.) | Per-build parameterization that changes the algorithm's behavior, not just its seed
Opcode Encoding | Fixed opcode table, possibly permuted | Multiple encoding transforms applied per opcode. The bytecode format itself is unique
Dispatch Table | Fixed structure, encrypted entries | Self-mutating. Re-encrypted after every dispatch with per-build seeding
An attacker who fully reverse-engineers one build learns nothing
transferable to the next build. The algorithms themselves are different.
There is no "Witness cipher" to study. Every binary contains
a cipher that has never existed before and will never exist again.
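The difference between a per-build key and a per-build algorithm can be illustrated with a toy generator. This is a sketch under assumptions, not Witness's synthesizer: `synthesize_cipher`, the three-operation menu, and the use of Python's non-cryptographic `random` module are all stand-ins chosen for brevity, and the result has no security value.

```python
import random

MASK = 0xFFFFFFFF

def rotl(value: int, r: int) -> int:
    """Rotate a 32-bit word left by r bits (1 <= r <= 31)."""
    return ((value << r) | (value >> (32 - r))) & MASK

def synthesize_cipher(build_seed: int):
    """Return a toy XOR-stream cipher whose round structure depends on the seed."""
    rng = random.Random(build_seed)
    rounds = rng.randint(4, 12)
    # Per build: which operation each round uses, its constant, and its rotation.
    plan = [
        (rng.choice(["add", "xor", "mul"]), rng.getrandbits(32) | 1, rng.randint(1, 31))
        for _ in range(rounds)
    ]

    def keystream_word(key: int, counter: int) -> int:
        state = (key ^ counter) & MASK
        for op, const, rot in plan:
            if op == "add":
                state = (state + const) & MASK
            elif op == "xor":
                state ^= const
            else:
                state = (state * const) & MASK
            state = rotl(state, rot)
        return state

    def crypt(key: int, data: bytes) -> bytes:
        """XOR data with the keystream; applying it twice restores the input."""
        return bytes(b ^ (keystream_word(key, i) & 0xFF) for i, b in enumerate(data))

    return crypt

build_a = synthesize_cipher(1234)
ciphertext = build_a(0xDEADBEEF, b"protected bytecode")
```

Two different seeds yield functions with different round counts, operations, and constants, so analysis of one generated function says little about the next.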
04
Language-Agnostic Protection
Operating at the LLVM IR level means any language that compiles through
LLVM is natively supported. No source rewriting, no language-specific
hooks, no binary rewriting.
C: clang
C++: clang++
Rust: rustc (LLVM)
Zig: zig cc
D: ldc2
Fortran: flang
Objective-C: clang
Comparison with other protectors
VMProtect / Themida: Binary-level. Post-compilation, no language awareness
Denuvo: Binary-level. Post-compilation instrumentation and VM injection
Obfuscator-LLVM: LLVM-level, but no virtualization (native-code transforms only). Systematically defeated by D810, Miasm, and angr. Forks reach LLVM 17-21
Witness: LLVM 22.1.x plugin. Any LLVM frontend, full ISA, C++ exceptions, atomics, SIMD
05
Cryptographic Entanglement
Protection doesn't end at function boundaries. Cryptographic state
flows between functions, between invocations, and along execution
paths, creating dependencies that cannot be severed without
breaking everything at once.
Cross-Function Binding
When multiple functions are protected, they share a persistent
cryptographic accumulator. Each function's execution folds a
fingerprint into this shared state. Tampering with function A
corrupts the decryption keys of functions B, C, and every other
entangled function. Silently.
No known prior art
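The binding can be sketched with a hash chain. Everything below is an assumption made for illustration: the `Accumulator` class, the use of SHA-256, and the byte strings standing in for function code are not drawn from Witness itself.

```python
import hashlib

class Accumulator:
    """Shared state that every protected function folds a fingerprint into."""

    def __init__(self, build_secret: bytes):
        self.state = hashlib.sha256(build_secret).digest()

    def fold(self, function_code: bytes) -> None:
        fingerprint = hashlib.sha256(function_code).digest()
        self.state = hashlib.sha256(self.state + fingerprint).digest()

    def key_for(self, function_name: str) -> bytes:
        """Decryption key for a function, derived from the current shared state."""
        return hashlib.sha256(self.state + function_name.encode()).digest()

def key_for_b_after_running_a(code_of_a: bytes) -> bytes:
    acc = Accumulator(b"build-secret")   # hypothetical per-build secret
    acc.fold(code_of_a)                  # function A executes and is fingerprinted
    return acc.key_for("B")

original_key = key_for_b_after_running_a(b"\x55\x8b\xec\x5d\xc3")
tampered_key = key_for_b_after_running_a(b"\x90\x8b\xec\x5d\xc3")  # one byte patched in A
```

Function B's key changes as soon as a single byte of A differs, and nothing in B itself signals why.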
Path-Sensitive State Binding
The execution path is folded into the decryption key stream.
Reaching the same point via different paths produces different
keys. Forcing an alternate path doesn't just skip a check.
It makes every subsequent instruction decrypt to garbage.
Advances environmental keying
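A minimal sketch of path folding, assuming each basic block has a numeric ID and SHA-256 serves as the mixing function (both are illustrative choices, as is the name `path_key`).

```python
import hashlib

def path_key(seed: bytes, block_ids) -> bytes:
    """Fold the sequence of executed basic-block IDs into a running key."""
    key = seed
    for block_id in block_ids:
        key = hashlib.sha256(key + block_id.to_bytes(4, "little")).digest()
    return key

seed = b"per-function-seed"                  # hypothetical starting value
legit_path = path_key(seed, [1, 2, 4])       # entry -> license check -> body
forced_path = path_key(seed, [1, 3, 4])      # entry -> patched jump -> body
```

Both paths arrive at block 4, but with different keys, so code encrypted for the legitimate path does not decrypt correctly on the forced one.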
Cross-Invocation Mutation
After each function invocation, the entire bytecode segment is
re-encrypted with a fresh key. A memory dump captured during one
call produces bytecode that is invalid for the next call. The
in-memory bytecode is different on every call.
No known prior art
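The re-encryption cycle can be sketched as follows. The SHA-256 counter-mode keystream and the `MutatingSegment` class are assumptions used to keep the example self-contained; they are not the cipher or data structure Witness uses.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Expand a key into `length` bytes using SHA-256 in counter mode."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "little")).digest()
        counter += 1
    return out[:length]

def xor_crypt(key: bytes, data: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

class MutatingSegment:
    """Bytecode that is re-encrypted under a fresh key after every invocation."""

    def __init__(self, bytecode: bytes):
        self.key = secrets.token_bytes(32)
        self.blob = xor_crypt(self.key, bytecode)

    def invoke(self) -> bytes:
        bytecode = xor_crypt(self.key, self.blob)   # decrypt for this call
        self.key = secrets.token_bytes(32)          # rotate to a fresh key
        self.blob = xor_crypt(self.key, bytecode)   # re-encrypt for the next call
        return bytecode

program = bytes(range(32))
segment = MutatingSegment(program)
dump_before = segment.blob
first_run = segment.invoke()
dump_after = segment.blob
```

A dump taken before the call no longer matches the stored bytes afterward, even though the plaintext program is unchanged.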
Load-Bearing Integrity
Integrity verification is not a removable check. The values
produced during verification are mathematically required for
correct decryption of subsequent instructions. Patching out
a check doesn't skip it. It corrupts all downstream
computation with no error messages.
No known prior art
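One way to picture a load-bearing check is to derive the decryption masks from the verification value itself. The sketch below assumes SHA-256 as the integrity function and a simple XOR mask; the function names are hypothetical.

```python
import hashlib

def masks(code_region: bytes, count: int) -> bytes:
    """Derive per-instruction masks from the integrity digest of a code region."""
    digest = hashlib.sha256(code_region).digest()   # the "check" value
    stream, i = b"", 0
    while len(stream) < count:
        stream += hashlib.sha256(digest + i.to_bytes(4, "little")).digest()
        i += 1
    return stream[:count]

def crypt(ops: bytes, code_region: bytes) -> bytes:
    """XOR opcodes with masks; the same call encrypts and decrypts."""
    return bytes(o ^ m for o, m in zip(ops, masks(code_region, len(ops))))

ops = bytes(range(16))
code = b"\x55\x48\x89\xe5\xc3"
patched = b"\x90" + code[1:]          # attacker overwrites one byte

encrypted = crypt(ops, code)
```

There is no comparison to patch out: decrypting against the patched region simply produces different bytes, with no error raised.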
06
Algebraic Bytecode Encoding
Values in the VM don't just get XOR-masked. They pass through multiple
layers of algebraic encoding, each operating in a different mathematical
domain. Stripping one layer reveals another that requires entirely
different analysis techniques.
Outermost Layer
Memory Layout Randomization
The VM register file is scattered across disjoint memory regions
with a per-invocation random permutation. The memory layout is
different every time the function is called.
Value Domain Encoding
Every value in the data segment is encrypted with a position-dependent
block cipher. The same value at different offsets produces different
ciphertext. Keys rotate on every instruction dispatch via a one-way
ratchet, defeating black-box synthesis attacks that need multiple
I/O pairs per bijection.
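A simplified sketch of position-dependent encoding with a one-way ratchet. Note the substitution: the text describes a block cipher, whereas this example uses a SHA-256-derived XOR mask purely to stay short and dependency-free. The names `encode` and `ratchet` are hypothetical.

```python
import hashlib

def ratchet(key: bytes) -> bytes:
    """One-way key update: the previous key cannot be recovered from the new one."""
    return hashlib.sha256(b"ratchet" + key).digest()

def encode(value: int, offset: int, key: bytes) -> int:
    """Mask a 64-bit value with a pad that depends on both the key and the offset.
    XOR masking is its own inverse, so the same call decodes."""
    pad = hashlib.sha256(key + offset.to_bytes(8, "little")).digest()
    return value ^ int.from_bytes(pad[:8], "little")

key0 = hashlib.sha256(b"segment-key").digest()   # hypothetical initial key
at_offset_0 = encode(42, 0, key0)
at_offset_8 = encode(42, 8, key0)

key1 = ratchet(key0)                             # after one instruction dispatch
after_ratchet = encode(42, 0, key1)
```

The same plaintext value yields a different encoding at each offset and again after each ratchet step, which limits how many input/output pairs an observer can collect for any single mapping.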
Secret-Shared Bytecode
Bytecode is split into multiple secret shares using a threshold
scheme. Shares are proactively re-randomized at execution
boundaries without changing the underlying secret. Phantom reads
ensure uniform memory access patterns.
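Share re-randomization can be shown with the simplest possible scheme. The sketch below assumes n-of-n XOR sharing rather than a general threshold scheme, and the function names are illustrative. Refreshing works by XORing in a fresh sharing of zero, which changes every share without changing the secret.

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split(secret: bytes, n: int):
    """Split into n shares; all n are needed to reconstruct."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]

def combine(shares) -> bytes:
    return reduce(xor_bytes, shares)

def refresh(shares):
    """Re-randomize shares by XORing in a fresh sharing of all-zero bytes."""
    zero_shares = split(bytes(len(shares[0])), len(shares))
    return [xor_bytes(s, z) for s, z in zip(shares, zero_shares)]

bytecode = bytes(range(16))
shares = split(bytecode, 3)
refreshed = refresh(shares)
```

An observer who captures one share before a refresh and another after it cannot combine them, since they belong to different sharings of the same secret.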
Algebraic Matrix Encoding
Arithmetic operations are encoded as matrix-vector multiplications
in an algebraic number system. Consecutive operations are composed
into a single matrix multiply, making it impossible to identify
individual instructions from the bytecode.
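The composition property can be illustrated with affine maps over integers modulo 2^32. This is an assumption: the text does not specify the number system, and the helper names are hypothetical. Each operation x -> a*x + b becomes a 2x2 matrix acting on the vector (x, 1), and a chain of operations collapses into one matrix.

```python
MOD = 1 << 32

def affine(a: int, b: int):
    """Matrix for x -> a*x + b acting on the column vector (x, 1)."""
    return [[a % MOD, b % MOD], [0, 1]]

def matmul(m, n):
    """2x2 matrix product modulo 2**32 (applies n first, then m)."""
    return [
        [sum(m[i][k] * n[k][j] for k in range(2)) % MOD for j in range(2)]
        for i in range(2)
    ]

def apply(m, x: int) -> int:
    return (m[0][0] * x + m[0][1]) % MOD

add7 = affine(1, 7)
mul3 = affine(3, 0)
sub5 = affine(1, -5)

# Three instructions (add 7, multiply by 3, subtract 5) fused into one matrix.
fused = matmul(sub5, matmul(mul3, add7))
print(fused)            # [[3, 16], [0, 1]]
print(apply(fused, 10)) # 46, the same as (10 + 7) * 3 - 5
```

The fused matrix only records the net effect 3*x + 16; the three original instructions are no longer individually visible.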
Access Pattern Uniformization
Every real memory access is accompanied by phantom read-writes at
unpredictable offsets. Both read and write patterns are uniformized:
phantom writes perform a deterministic decrypt-reencrypt round-trip,
producing identical ciphertext. Cache-timing and memory-trace side
channels see only uniform access patterns.
Innermost Layer (plaintext exists only during handler execution)
Each layer operates in a different mathematical domain. Stripping one
reveals another that requires entirely different analysis techniques.
Values are only meaningful within the context of a specific execution path.
07
Architectural Innovations
Novel structural techniques in the VM runtime that break assumptions
made by dynamic analysis tools, debuggers, and memory forensics.
Novel
Split-Context VM Execution
The VM is split into two cooperative execution contexts: one that
fetches and pre-decrypts bytecode, and one that executes it. The two
contexts alternate. Single-stepping one shows no operations from the
other. Neither context alone reveals the program's behavior.
Novel
White-Box Key Encoding
The root secret key is encoded in multi-layer lookup tables with
affine encoding. A hardware-derived random mask is applied at every
lookup, preventing Differential Power Analysis from correlating
lookup indices with key bytes. The key never exists as a contiguous
plaintext array in memory.
Novel
Call Stack Spoofing
The VM rewrites its call chain so stack unwinders see a clean,
legitimate-looking backtrace. Synthetic unwind metadata is generated
so the spoofed stack passes both debugger inspection and OS
exception dispatch.
Novel
Memory-Hard Key Derivation
Key derivation uses a memory-hard function with a large working
buffer and data-dependent access patterns. This forces hardware
side-channel attackers to contend with unpredictable memory behavior
that defeats cache-line monitoring.
Novel
Recursive Self-Virtualization
Functions can be virtualized multiple times, creating nested VMs.
Each layer has independent encryption keys, independent dispatch
tables, and independent handler variants. An attacker who defeats
the outer VM faces a completely new inner VM with a different ISA,
different algorithms, and different keys.
Novel
Surgical Opcode Subsetting
The VM instruction set is pruned per-build. Handlers for opcodes
the protected code never uses are completely stripped from the
binary. Each build contains a unique, minimal handler set. There
is no universal handler catalog to study, because every binary's
VM has a different instruction set.
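Pruning reduces to a set intersection between the opcodes the bytecode uses and the handlers available. The sketch assumes a toy format where every byte is an opcode with no operands; the handler table and names are hypothetical.

```python
# Hypothetical full handler catalog for a toy VM.
FULL_HANDLERS = {
    0x01: "push",
    0x02: "pop",
    0x03: "add",
    0x04: "xor",
    0x05: "jmp",
    0x06: "call",
}

def used_opcodes(bytecode: bytes) -> set:
    """Toy assumption: each byte of the bytecode is a single opcode."""
    return set(bytecode)

def prune(handlers: dict, bytecode: bytes) -> dict:
    """Keep only the handlers the protected code actually dispatches."""
    used = used_opcodes(bytecode)
    return {op: name for op, name in handlers.items() if op in used}

build_handlers = prune(FULL_HANDLERS, bytes([0x01, 0x03, 0x01, 0x02]))
print(build_handlers)   # {1: 'push', 2: 'pop', 3: 'add'}
```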
Novel
Full C++ Exception Virtualization
C++ try/catch/throw execute entirely within the VM. Every other
commercial protector exits to native code for exception handling,
exposing control flow at exception boundaries. Witness keeps
exception dispatch, stack unwinding, and catch matching inside the
encrypted interpreter. No native exception frames are visible.
Novel
Variadic Function Virtualization
Variadic calling conventions are virtualized entirely inside the
VM. The argument list is captured at function entry and stored in
the encrypted data segment. Argument reads go through
double-indirection within the interpreter. Every other protector
exits to native code for variadic calls, exposing arguments and
calling conventions on the real stack.
Novel
Load-Bearing Phantom Handlers
The dispatch table contains handlers that are never the target of
valid bytecode but are cryptographically entangled with the real
dispatch chain. Removing or modifying a phantom handler corrupts
the encryption state for all real handlers. They cannot be
identified via execution tracing (never dispatched) or stripped
via dead code elimination (load-bearing).
Novel
Self-Mutating Dispatch Table
Handler positions in the dispatch table are physically shuffled
at runtime after every dispatch. A memory dump captured at one
point in execution shows a completely different table layout than
a dump captured moments later. All entries are re-encrypted with
fresh per-position keys periodically. No other protector mutates
the physical layout of its handler table during execution.
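A toy model of a table that shuffles after every dispatch, leaving out the encryption of entries. The class name, the use of Python's `random` module as a stand-in for per-build seeding, and the three sample handlers are all assumptions.

```python
import random

class MutatingDispatchTable:
    def __init__(self, handlers: dict, seed: int):
        self.rng = random.Random(seed)            # stand-in for per-build seeding
        self.slots = list(handlers.values())      # physical handler table
        self.slot_of = {op: i for i, op in enumerate(handlers)}

    def dispatch(self, opcode: int, operand: int) -> int:
        result = self.slots[self.slot_of[opcode]](operand)
        self._shuffle()                           # layout changes after every dispatch
        return result

    def _shuffle(self) -> None:
        perm = list(range(len(self.slots)))
        self.rng.shuffle(perm)                    # old slot i moves to perm[i]
        moved = [None] * len(self.slots)
        for old, new in enumerate(perm):
            moved[new] = self.slots[old]
        self.slots = moved
        self.slot_of = {op: perm[i] for op, i in self.slot_of.items()}

table = MutatingDispatchTable(
    {0x01: lambda v: v + 1, 0x02: lambda v: v * 2, 0x03: lambda v: v ^ 0xFF},
    seed=2024,
)
```

Dispatch stays correct because the opcode-to-slot map is permuted together with the table, while a snapshot of `slots` taken at one moment need not match the next.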
Novel
Page-Granular Key Derivation
Each 4KB code page is encrypted with its own unique key derived
from a master key and the page index. Pages are decrypted
on-demand via exception handler when first accessed. The derived
key is never stored after use. Compromising one decrypted page
reveals nothing about the keys of other pages. No other protector
combines per-page key derivation with on-demand exception-driven
decryption.
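Per-page derivation can be sketched with HMAC-SHA256 keyed by the master key. The choice of HMAC, the `b"page"` label, and the function names are assumptions; the text only states that each key is derived from a master key and the page index.

```python
import hashlib
import hmac

PAGE_SIZE = 4096

def page_index_for(address: int, image_base: int) -> int:
    """Index of the 4KB page containing the faulting address."""
    return (address - image_base) // PAGE_SIZE

def page_key(master_key: bytes, page_index: int) -> bytes:
    """Derive a page key on demand; nothing is cached after use."""
    message = b"page" + page_index.to_bytes(8, "little")
    return hmac.new(master_key, message, hashlib.sha256).digest()

master = hashlib.sha256(b"hypothetical master key").digest()
faulting_page = page_index_for(0x403010, 0x401000)
key = page_key(master, faulting_page)
```

Because HMAC is one-way in the key, holding the derived key for one page does not help with computing the key for another.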
08
Post-Quantum Cryptography
The key derivation pipeline uses both classical and post-quantum
primitives. Even a quantum computer capable of breaking elliptic-curve
cryptography cannot extract the root key without also breaking
a lattice-based KEM.
Classical
Elliptic Curve
Ephemeral key exchange for hardware-bound per-function
share encryption. Private key material never leaves
the trusted execution boundary.
ECDH + KDF
Memory-Hard
Memory-Hard KDF
Large working buffer with data-dependent access patterns.
Resists GPU/ASIC acceleration, and complicates cache-timing side
channels and DCA/DFA attacks aimed at white-box key extraction.
Data-dependent access · Explicit zeroing
Post-Quantum
Lattice-Based KEM
A post-quantum key encapsulation mechanism augments the root
key after memory-hard stretching. Provides security against
quantum adversaries at the highest standardized level.
NIST PQC Standard
The root key derivation pipeline chains multiple cryptographic
stages. An attacker must defeat all of them, in the correct order:
white-box encoding, memory-hard stretching, and a post-quantum KEM.
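The chaining of stages can be sketched with standard-library primitives. The choices here are assumptions: scrypt stands in for the unnamed memory-hard function, HMAC-SHA256 is used as the combiner, and both the white-box output and the KEM shared secret are placeholder byte strings, since Python's standard library includes no lattice-based KEM.

```python
import hashlib
import hmac

def derive_root_key(whitebox_output: bytes, salt: bytes, kem_shared_secret: bytes) -> bytes:
    # Stage 1: memory-hard stretching (scrypt: about 4 MiB with these parameters).
    stretched = hashlib.scrypt(whitebox_output, salt=salt, n=2**12, r=8, p=1, dklen=32)
    # Stage 2: augment with the post-quantum KEM shared secret after stretching.
    return hmac.new(kem_shared_secret, stretched, hashlib.sha256).digest()

# Placeholder inputs. A real pipeline would obtain these from the white-box
# tables and from a KEM decapsulation, respectively.
whitebox_output = b"\x11" * 32
salt = b"per-build-salt"
kem_secret = b"\x22" * 32

root_key = derive_root_key(whitebox_output, salt, kem_secret)
```

Because the KEM secret keys the final step, recovering the stretched value alone is not enough to reproduce the root key.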