Operation Piezoelectric Effects — LLVM Compilation Strategy Memo
Epic: Operation Piezoelectric Effects (commit tag [piezo])
Date: 2026-04-18
Status: Design memo — informs Phase 1 implementation.
Sources: Leijen 2017 (type system foundation), Xie & Leijen 2020 (evidence-passing, three-kind ops, benchmarks), paper notes in EFFECT_SYSTEMS_NOTES.md.
Depends on: committed design in EFFECTS_IMPLEMENTATION_PLAN.md (all 10 decisions closed 2026-04-18).
Phase 0 exit criterion: “Compilation model chosen, with rationale.” This memo is the rationale.
This memo translates the abstract evidence-passing design into concrete LLVM IR shape, calling conventions, and phase sequencing. Where the papers target Haskell (GHC runtime with closures, lazy evaluation, type-class dispatch) or JavaScript (trampoline, variadic detection), Quartz targets LLVM IR directly — we re-derive the mechanics.
1. ABI: evidence as a threaded pointer
1.1 Representation
An evidence context is a singly-linked list of handler nodes, innermost first:
```
// Conceptual — mirrors the actual Quartz-generated LLVM IR
struct EvNode {
    i64 marker;  // unique runtime token for this handler install
    ptr handler; // points to effect-specific Handler struct (vtable)
    ptr tail;    // next node outward, or null for empty context
}
```
A handler struct is a vtable of operation function pointers plus optional return-clause slot:
```
struct ThrowsHandler {
    ptr throw_op;      // (ev: ptr, e: E) -> Never [value/function/operation kind tag]
    ptr return_clause; // optional; (ev: ptr, v: T) -> T' or null
    // No additional fields for this effect; value-carrying effects add slots.
}
```
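As a sketch of how these two structs interact, the following Python model (names `EvNode` and `find_handler` are illustrative, not the Quartz runtime API) walks the innermost-first chain to locate a handler:

```python
# Hypothetical Python model of the EvNode chain and handler lookup.
from dataclasses import dataclass
from typing import Optional

@dataclass
class EvNode:
    marker: int               # unique runtime token for this handler install
    effect: str               # which effect this handler implements
    handler: dict             # op-name -> slot (fn pointer or inline value)
    tail: Optional["EvNode"]  # next node outward, or None for empty context

def find_handler(ev: Optional[EvNode], effect: str) -> EvNode:
    """Walk the chain innermost-first; O(depth of evidence)."""
    while ev is not None:
        if ev.effect == effect:
            return ev
        ev = ev.tail
    raise RuntimeError(f"no handler for effect {effect!r}")

# Innermost-first context [Throws, Io]:
io = EvNode(marker=1, effect="Io", handler={"print": print}, tail=None)
throws = EvNode(marker=2, effect="Throws", handler={"throw": None}, tail=io)

# Evidence is order-preserving: a function requiring Throws in
# context [Throws, Io] just looks up Throws and ignores Io.
assert find_handler(throws, "Throws").marker == 2
assert find_handler(throws, "Io").marker == 1
```

The real chain is three machine words per node; the walk above is the O(depth) lookup referenced throughout this memo.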
Every function with a non-empty effect row takes an implicit __ev: ptr parameter — the first parameter after self (if any). Pure functions take no evidence.
1.2 Calling convention
```
def pure_func(x: Int): Int                  → define i64 @pure_func(i64 %x)
def throws_func(x: Int): Int can Throws<E>  → define i64 @throws_func(ptr %__ev, i64 %x)
def method(self: MyType, x: Int): Int can Io → define i64 @method(ptr %self, ptr %__ev, i64 %x)
def poly<ε>(x: Int): Int can ε              → define i64 @poly(ptr %__ev, i64 %x)  // same shape as throws_func
```
Key invariants:
- The evidence parameter immediately follows `self` (or comes first if there is no `self`) in the LLVM signature.
- Polymorphic effect functions ALWAYS receive evidence, even when instantiated with the empty row. Trade-off: one extra parameter push for polymorphism. Acceptable — Koka accepts this and still hits its perf targets.
- Pure functions (empty row, no `can` clause) receive NO evidence parameter. This preserves direct-call performance for the pure subset.
- Call sites must materialize the correct evidence. For functions inheriting the caller's row, pass `%__ev` verbatim. For functions with a subset row, the caller's `%__ev` is still correct because evidence is order-preserving (a function requiring `Throws` in context `[Throws, Io]` just looks up `Throws` and ignores `Io`).
1.3 FFI boundaries
Effect rows DO NOT cross `extern "C"` boundaries.
- Calling `extern "C" def libc_fn(...)` from Quartz: libc_fn's signature has no `can` clause, so no evidence is ever passed to it. Callers with effects can still call it; evidence simply stops at the boundary.
- Exposing a Quartz function to C: if the function has any non-empty row, the compiler emits a trampoline wrapper that installs default handlers before the call and tears them down after. Triggered by an `#[export_c]` annotation or inferred from `extern "C"` at the Quartz-source level.
- Quartz → Quartz via function-pointer indirection: the pointer type includes the effect row; calling through it threads evidence normally.
2. The three operation kinds
From Xie & Leijen 2020 §3.3: every effect op is classified at declaration time into one of three kinds. Quartz inherits this taxonomy because it determines the compilation path.
2.1 value kind — constant resumption
```
effect Config
  def debug_mode(): Bool = value(false)  # always resumes with false by default
end
```
Compilation:
- Handler struct slot = just the constant value (stored inline where a fn pointer would be).
- `perform debug_mode` at the call site: two loads (evidence → handler; handler → value slot). No call. No continuation capture.
- Cost: O(depth of evidence) for handler lookup, then O(1) for the value read.
2.2 function kind — tail-resumptive
```
effect Log
  def log(msg: String): Void = function(msg -> write_to_stderr(msg))
end
```
The body uses resume(x) at exactly the tail position. Most effect ops are this kind.
Compilation:
- Handler struct slot = function pointer to `(ev, args) -> return_type`.
- `perform log(msg)` at the call site: evidence → handler → op fnptr → call(ev, msg). Normal return; the continuation is implicit (the return address on the native stack).
- No continuation capture. No CPS. No state machine. A direct indirect call with vtable lookup.
- Cost: O(depth of evidence) for handler lookup, then one indirect call.
- This is the fast path — the 2020 paper’s benchmarks show this is competitive with pure code even for deep effect stacks.
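A minimal sketch of this fast path (the `perform` and handler-constructor names are invented for illustration): the op call is an ordinary indirect call that returns normally, so resuming is just falling back to the caller.

```python
# Sketch of the function-kind fast path: perform is an evidence walk
# plus one indirect call; the continuation is the ordinary return address.
def make_log_handler(sink):
    # Handler struct slot = function pointer taking (ev, args).
    return {"log": lambda ev, msg: sink.append(msg)}

def perform(ev_chain, effect, op, *args):
    for effect_name, handler in ev_chain:        # evidence lookup
        if effect_name == effect:
            return handler[op](ev_chain, *args)  # direct indirect call
    raise RuntimeError("unhandled effect")

captured = []
ev = [("Log", make_log_handler(captured))]
perform(ev, "Log", "log", "hello")   # returns normally — no capture, no CPS
assert captured == ["hello"]
```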
2.3 operation kind — general
```
effect Throws<E>
  def throw(e: E): Never = operation(ev, k, e -> /* discards k, aborts */)
end

effect Async
  def await<T>(fut: Future<T>): T = operation(ev, k, fut -> /* stores k in scheduler, yields */)
end
```
Body may discard the continuation (abort), use it once (single-shot resume), or use it multiple times (multi-shot, only if declared multi-shot — deferred for Ndet).
Compilation:
- Handler struct slot = function pointer to `(ev, k, args) -> result`, where `k` is the reified continuation.
- `perform throw(e)` at the call site: evidence → handler → op fnptr → call(ev, continuation, e).
- Continuation reification is needed. We reuse Quartz's existing `$poll` state-machine machinery (the same mechanism that implements `async`/`await` today).
- Cost: a CPS transform for the enclosing function (every call site needs to "know" where to resume); one indirect call for the op itself; heap allocation for captured locals if the continuation outlives the stack frame.
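The handler-side contract can be sketched in a few lines (handler names here are illustrative stand-ins, not the Quartz stdlib): the handler receives a reified continuation `k` and may discard it (abort, like `throw`) or invoke it once (single-shot resume, like a ready `await`).

```python
# Sketch of operation-kind dispatch: the op fn gets (ev, k, args).
def throws_handler(ev, k, e):
    # Discards k: aborts to the handler's result instead of resuming.
    return f"caught: {e}"

def await_like_handler(ev, k, value_ready):
    # Single-shot resume: invoke the continuation with the ready value.
    return k(value_ready)

# k reifies "the rest of the computation after the perform".
k = lambda x: x * 2
assert throws_handler(None, k, "boom") == "caught: boom"
assert await_like_handler(None, k, 21) == 42
```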
2.4 Tail-resumption detection (the >80% perf win)
Detection is a MIR-level syntactic analysis. The op body is classified as:
- `value` if the body has the form `value(expr)`, where `expr` has no effects and doesn't reference `resume`.
- `function` if the body has the form `function(args -> expr)`, where:
  - the function body contains exactly one `resume` call,
  - that `resume` is in tail position (the last expression, or a guarded tail of `if`/`match`),
  - and `resume`'s argument is not itself a `perform` (that would chain operations).
- `operation` otherwise. This is the default fallback.
Users can force operation kind via explicit syntax (`def throw(e) = operation ...`). Auto-detection is for ergonomics; the kind is always visible in hover / `quartz doc`.
Implementation location: self-hosted/middle/typecheck.qz — new pass after effect-row inference, before MIR lowering. Emits a kind tag onto each effect op declaration.
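The classification rules above can be sketched as a toy decision procedure. The dict-shaped "op body" here is an invented stand-in for MIR nodes; only the rule structure mirrors §2.4.

```python
# Toy classifier mirroring the tail-resumption rules over a tiny,
# invented op-body representation (the real pass runs over Quartz MIR).
def classify(body: dict) -> str:
    form = body.get("form")
    if form == "value" and not body.get("uses_resume", False):
        return "value"
    if form == "function":
        if (body.get("resume_calls", 0) == 1
                and body.get("resume_in_tail", False)
                and not body.get("resume_arg_is_perform", False)):
            return "function"
    return "operation"  # default fallback

assert classify({"form": "value"}) == "value"
assert classify({"form": "function", "resume_calls": 1,
                 "resume_in_tail": True}) == "function"
# resume present but not in tail position -> general operation kind
assert classify({"form": "function", "resume_calls": 1,
                 "resume_in_tail": False}) == "operation"
```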
3. Compiling operation-kind ops: CPS transform
For operation-kind effect ops, the caller’s code needs to split at the perform call — code before becomes “push arg, suspend” and code after becomes “continuation, resumable.”
3.1 Quartz already has this machinery
Our existing async compilation does exactly this: functions declared async are lowered to state machines where each await point becomes a suspension point. The state machine struct captures live locals at the suspend; $poll drives the machine forward from one suspend to the next.
Plan: reuse this. A function containing any operation-kind perform becomes a state machine in the same way. The operation handler is invoked with a pointer to the state machine; invoking resume is invoking $poll on the state machine with a value.
This is why Phase 3 (async-as-effect migration) is less work than it looks: async is already operation-kind-flavored; we just reframe it through the effect system rather than as a special-cased keyword.
3.2 When does a function need the state-machine lowering?
A function needs operation-kind treatment if ANY perform in its body (transitively, including through called functions) hits an operation-kind op. Effect-row inference gives us this for free: if a function’s row contains any effect whose ops include an operation-kind, the function’s codegen lowers to a state machine.
Practical consequence: most Quartz functions will be function-kind and compile to direct calls. Only functions that throw, await, yield to unbounded generators, or panic hit the state-machine path. Those are already the “slow” functions in Quartz’s runtime.
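The state-machine reuse can be illustrated with a Python generator, which is itself a compiler-generated state machine: each operation-kind `perform` is a suspension point, and resuming is driving the machine forward with a value — the role `$poll` plays in Quartz. All names here are illustrative.

```python
# Sketch: a function with operation-kind performs, lowered as a generator.
def risky_compute():
    x = yield ("await", "fut1")  # suspend; live locals stay in the frame
    y = yield ("await", "fut2")
    return x + y

def run(machine, results):
    """Minimal handler loop standing in for the scheduler: resume == send."""
    try:
        op = next(machine)                      # run to first suspension
        while True:
            _, fut = op
            op = machine.send(results[fut])     # resume(k, value) == $poll
    except StopIteration as done:
        return done.value

assert run(risky_compute(), {"fut1": 1, "fut2": 2}) == 3
```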
3.3 Avoiding the “everything is a state machine” failure mode
Koka’s 2017 paper highlighted this failure mode: applying CPS to everything tanks performance. Evidence-passing + per-kind dispatch avoids this because function-kind ops don’t trigger CPS. A Quartz function that only uses Log and Clock (both function-kind) compiles to ordinary LLVM code with extra evidence parameter.
4. Handler installation
4.1 Source form
```
result = with catch (e: ParseError) -> default_result do ->
  risky_parse(input)
end
```
Desugars to:
```
# Allocate new evidence node on stack
# Invoke body with extended evidence
# After body returns, unwind evidence (implicit via stack discipline)
```
4.2 LLVM IR
```llvm
; %outer_ev is the caller's evidence (may be null for main)
%marker = call i64 @__qz_fresh_marker()
%handler = alloca %ParseErrorHandler
store ptr @parse_error_catch_op, ptr %handler, ... ; fill in vtable
%new_ev = alloca %EvNode
store i64 %marker, ptr %new_ev, ...
store ptr %handler, ptr %new_ev, ...
store ptr %outer_ev, ptr %new_ev, ...
%result = call ptr @risky_parse(ptr %new_ev, ptr %input)
; (on normal return, %new_ev goes out of scope — stack unwinds)
```
Allocation strategy: stack allocation for handlers is the common case. Only if the handler is captured into a closure or returned from its scope do we heap-promote. Escape analysis in the MIR layer handles this (same mechanism Quartz uses for other stack-allocatable values).
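The stack discipline above maps directly onto scope-based cleanup. A minimal sketch using a Python context manager (the `install` helper and `EV_CHAIN` list are invented stand-ins for the alloca'd `EvNode` chain):

```python
# Sketch of stack-discipline handler installation: the with-block's
# lifetime is the EvNode's lifetime.
from contextlib import contextmanager

EV_CHAIN = []   # innermost handler last; stands in for %__ev

@contextmanager
def install(effect, handler):
    EV_CHAIN.append((effect, handler))   # extend evidence (alloca + stores)
    try:
        yield
    finally:
        EV_CHAIN.pop()                   # scope exit unwinds evidence

with install("Throws", {"throw": lambda e: None}):
    assert EV_CHAIN[-1][0] == "Throws"   # handler visible inside the body
assert EV_CHAIN == []                    # torn down on exit, even on unwind
```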
4.3 resume semantics
For function-kind ops: resume(x) is implicit — the op’s return value IS the resumption. No resume keyword in the source clause body:
```
effect Log
  def log(msg: String): Void = function(msg ->
    write_to_stderr(msg)  # void return; no explicit resume
  )
end
```
For operation-kind ops: resume(x) explicitly invokes the continuation with a value. Compiled as $poll(continuation, x):
```
effect Async
  def await<T>(fut: Future<T>): T = operation(ev, k, fut ->
    if fut.is_ready then
      resume(k, fut.value)     # resume with the ready value
    else
      scheduler_park(fut, k)   # store k; yield up
    end
  )
end
```
4.4 Scoped-resumption enforcement (guard)
When `resume` is invoked:
- Compare the current evidence's marker tail against the marker captured with `k`.
- If they match (the continuation's context is still above the current context): safe, proceed.
- If they mismatch (the continuation was captured under a different evidence stack): runtime abort with `QZ9502: Unscoped resumption — handler torn down before continuation invoked`.
This prevents the Peter Landin hazard (“you can enter a room once, yet leave it twice” — referenced in Xie-Leijen 2020 §2.4) — a resume captured under handler A cannot be invoked under handler B even if A and B catch the same effect.
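The guard itself is one comparison per resume. A sketch, with invented names (`UnscopedResumption` stands in for the QZ9502 abort):

```python
# Sketch of the scoped-resumption guard: a continuation remembers the
# marker it was captured under and refuses to run under a different one.
class UnscopedResumption(RuntimeError):
    pass  # stands in for the QZ9502 runtime abort

def make_continuation(captured_marker, fn):
    def resume(current_marker, value):
        if current_marker != captured_marker:
            raise UnscopedResumption(
                "QZ9502: handler torn down before continuation invoked")
        return fn(value)
    return resume

k = make_continuation(captured_marker=7, fn=lambda v: v + 1)
assert k(7, 1) == 2            # same handler install: safe
try:
    k(8, 1)                    # different evidence stack: abort
    raise AssertionError("guard should have fired")
except UnscopedResumption:
    pass
```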
5. Polymorphic effect functions
`def map<T, U, ε>(items: Vec<T>, f: Fn(T) can ε: U): Vec<U> can ε` inherits f's effect row.
5.1 Single compilation, evidence threaded
We compile ONE version of map that takes evidence as a regular parameter. When called with a concrete effect row (e.g., can Throws<E>), the caller passes its own evidence; map threads it to f’s invocation. No monomorphization per effect row.
Why not monomorphize per row? It would blow up code size combinatorially — N generic functions × M concrete effect rows × K concrete type params = N×M×K compiled copies. Koka's experience: ~20% code growth is acceptable; 100%+ is not.
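A sketch of the single-compilation scheme (`qz_map` and the `"EV"` token are illustrative): evidence is an ordinary parameter, forwarded verbatim to `f`, so one compiled body serves every effect row.

```python
# One compiled copy of map; evidence threaded, never monomorphized per row.
def qz_map(ev, items, f):
    # f has type Fn(T) can ε: U — it receives the same evidence verbatim.
    return [f(ev, x) for x in items]

seen = []
def logged_double(ev, x):
    seen.append((ev, x))   # stand-in for performing an effect via ev
    return x * 2

assert qz_map("EV", [1, 2, 3], logged_double) == [2, 4, 6]
assert seen == [("EV", 1), ("EV", 2), ("EV", 3)]
```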
5.2 When monomorphization DOES happen
- Concrete effect row + generic type: `Vec<Int>.map(f: Fn(Int): Int)` with f pure — no effect row, no evidence, monomorphized as usual for `T=Int, U=Int, ε=∅`.
- Single-use at a concrete row: the compiler MAY specialize if it helps inlining, but it's not required.
- Trait method dispatch: methods are already monomorphized per concrete type; effects add another dimension but evidence is always threaded, so no additional copies needed.
5.3 LLVM-specific optimization
`map` with the evidence-threaded design compiles to calls of the form:
```llvm
call i64 %f(ptr %__ev, i64 %arg)
```
LLVM can inline through function pointers when they’re constant at the call site (which happens for any non-higher-order use). For higher-order uses (the actual reason we have map), the indirect call through %f is unavoidable — but that cost exists today for any function-pointer call.
Net cost estimate: ~5% overhead for polymorphic-effect higher-order functions vs. direct calls. Within noise for most workloads.
6. Interaction with existing Quartz features
6.1 Generics
Extend TcRegistry to store effect-row parameters alongside type parameters. Inference unifies both; monomorphization happens on type params only (effect params use evidence threading).
6.2 Traits and trait methods
Trait methods can declare effect rows:
```
trait Parse
  def parse(s: String): Self can Throws<ParseError>
end
```
Impl blocks must honor the trait's row (or declare a narrower one — a parser that doesn't throw is fine). Trait method dispatch: the same vtable mechanism we have today, plus evidence threading.
Auto-satisfaction + effects: an inherent method with row can Throws<E> auto-satisfies a trait method requiring can Throws<E>. Same row, same discipline.
6.3 Move semantics / borrow checker
Open questions for the Phase 1 design:
- What's the ownership of `x` in `throw(x)`? Default: moved. Handlers receive owned values.
- What's the ownership of a captured continuation? The continuation owns its captured frame; `resume(x)` consumes the continuation (single-shot). Multi-shot would require cloning — deferred with Ndet.
- The evidence pointer is borrowed — never consumed by `perform`. Always `&ev` at the call site.
6.4 Drop / RAII
Handler installation maps naturally to RAII. The `with handler do -> body end` block:
- Allocates handler + evidence node on stack.
- Runs body.
- On scope exit (normal return OR unwinding from a panic): handler’s Drop (if declared) fires.
This gives us “cleanup handlers” for free — effects can observe when their scope ends, not just when their operations are invoked.
6.5 Async (Phase 3 migration)
Current state: `async def`/`await` compile via `$poll` state machines. `go do -> end` spawns; channels/select are explicit.
Phase 3 migration:
- `Async` becomes an effect with ops `await`, `spawn`, `suspend`, `yield`.
- The ops are operation-kind (they reify the continuation).
- The scheduler IS the handler. `with sched_handler do -> program() end` is what `sched_run` becomes.
- `go do -> end` desugars to spawning an effect-aware task.
- Codegen changes are minimal — the existing `$poll` machinery IS the operation-kind lowering. We reframe rather than rewrite.
- User-visible changes: none for ordinary code (no `can Async` needed for main, thanks to the prelude default handler). Advanced users can install custom schedulers via `with my_scheduler do -> ...`.
6.6 Panic
Panic is an effect with one op `panic(msg, trace)` — operation-kind (never resumes). Default handler (installed in the prelude): print + flush + exit(101). Custom handlers catch panics before they reach main — useful for test harnesses, crash reporters, debugger hooks.
Panic UNWINDS through intermediate handlers (their Drop firings run), but is NOT caught by `with catch` unless the catch explicitly targets `Panic`. Practical effect: `with catch (e: ParseError) -> ... do -> code end` does NOT swallow a panic from inside `code`.
7. Phase 1 implementation milestones
7.1 Milestone A — Lexer + parser (~1 quartz-day)
- Add `TOK_CAN`, `TOK_EFFECT`, `TOK_WITH`, `TOK_HANDLE`, `TOK_TRY`, `TOK_REIFY`, `TOK_THROW`, `TOK_RESUME` tokens.
- `ps_parse_effect_decl` — parses `effect Name<T> ... end`.
- Extend `ps_parse_function` to handle the `can Row` suffix.
- `ps_parse_handler_block` — parses `with ... do -> body end`.
- `ps_parse_try` — parses `try expr` (prefix operator).
- `ps_parse_reify` — parses `reify { expr }`.
- Delete `$try` macro handling (replaced by the `try` keyword).
7.2 Milestone B — Type system: rows + inference (~2 quartz-days)
- `TcRegistry.rows` — new parallel Vecs storing row structure (label list + optional tail variable).
- `tc_row_unify` — Rémy-style unification; allows duplicates (Model B).
- `tc_row_substitute`, `tc_row_occurs_check`.
- Extend the fn type representation with an optional effect row.
- OPEN / CLOSE rules in inference (the load-bearing simplification).
- Call-site propagation: callee’s row ⊆ caller’s row.
- Effect-op call adds op’s label to caller’s row.
- Handler installation removes op’s label from body’s row.
7.3 Milestone C — MIR: effect primitives + kind classification (~1 quartz-day)
- New MIR opcodes: `MIR_PERFORM`, `MIR_INSTALL_HANDLER`, `MIR_RESUME`, `MIR_REIFY_BEGIN`, `MIR_REIFY_END`.
- Tail-resumption detection pass (§2.4) — classifies each op declaration as value/function/operation.
- Lower `try expr` to an implicit `perform` of `throw` on None/Err branches.
- Integrate with the existing async lowering — operation-kind ops reuse the `$poll` state-machine lowering.
7.4 Milestone D — Codegen: evidence threading + handler structs (~1.5 quartz-days)
- Emit the evidence parameter on functions with non-empty rows.
- Emit handler struct types (one per `effect Name ... end`).
- `cg_emit_perform` — compile a perform-op call (lookup + indirect call).
- `cg_emit_install_handler` — allocate handler + evidence node, extend the ev chain.
- `cg_emit_resume` — for function-kind, just return; for operation-kind, invoke the captured `$poll`.
- `cg_emit_reify` — install a handler that catches-and-wraps.
- Default handler stack initialization in the main entry point.
7.5 Milestone E — Stdlib pilots (~0.5 + 0.5 quartz-days)
- E1: `std/log` migration. Simple `Log` effect with `info`/`warn`/`error` ops (function-kind). Default handler writes to stderr. Tests: log messages reach stderr; a custom handler can capture them.
- E2: `std/parse` migration. `Throws<ParseError>` effect. `parse_int`, `parse_float`, `parse_bool` ops. Tests: uncaught throws reach main (panic); `with catch` handles them; inference propagates through call chains.
7.6 Milestone F — Error messages (~1 quartz-day, parallel to A-E)
All 6 commitments per quality bar + engagement-level detection. Test: concrete error cases from §5.5 of the plan produce messages matching the spec exactly.
7.7 Milestone G — Fixpoint + smoke tests (~0.5 quartz-day)
- `quake guard` passes after all changes.
- Fixpoint: gen1 == gen2 byte-identical.
- Smoke tests (brainfuck.qz, expr_eval.qz, style_demo.qz) pass unchanged.
- The existing QSpec suite (all 523 files) passes. New specs for the effects mechanism: `effect_basics_spec.qz`, `throws_handler_spec.qz`, `effect_row_inference_spec.qz`.
7.8 Milestone H — Docs (~0.5 quartz-day)
- Fill in `docs/EFFECTS.md` §§ 2, 6, 7, 9, 11.
- 10-15 examples in `examples/effects/`.
- Update the `docs/QUARTZ_REFERENCE.md` syntax section.
- Update CLAUDE.md with effect idioms.
Total estimate: 8 quartz-days across 10-14 sessions. Matches the plan’s 5-7 quartz-days estimate with ~30% buffer for integration discovery.
8. Performance targets + kill criteria
8.1 Targets
- Function-kind effect op: ≤ 3 extra instructions vs. direct call (1 load for evidence, 1 load for handler, 1 indirect call).
- Operation-kind effect op: reuses the existing `$poll` overhead (no new cost).
- Polymorphic effect function: ≤ 5% slower than the same function with a concrete effect row.
- Deep effect stack (5+ handlers): lookup is O(depth) in the worst case, but with the hot-path cache the depth-3 case should land within 2x of depth-1.
- Self-compile with all effects instrumented: within 10% of current 18.3s wall time.
8.2 Kill criteria
From the plan, restated with specifics:
- Self-compile > 2x current wall time → redesign evidence lookup (hot cache, shallow-handler fast path).
- Effect row inference adds > 2 GB peak RSS to typecheck → reconsider row representation.
- Error messages can’t hit the quality bar without reinventing the type system → fall back to Phase 1-lite (Throws only, no full effect system) and revisit in a future major release.
- Polymorphic functions cost > 20% vs. direct — check polymorphic-function code size growth; if it’s worse than Koka’s 20%, investigate.
9. What we’re NOT doing in Phase 1
Deliberately out of scope:
- Multi-shot handlers. Operation-kind ops may `resume` zero or one times; `resume(k); resume(k)` is a runtime abort. Deferred with Ndet (indefinitely).
- Effect-polymorphic trait bounds. `def foo<T: Eq can Throws>(a: T)` — worth having eventually, but it adds complexity. Phase 2 or later.
- First-class handlers as values. Pass a handler as a function parameter? Nice but complex. Later.
- Dynamic effect extension. Runtime-registered effects. No.
- Alloc effect. Phase 2 (needs State + Reader machinery first).
- Compiler dogfooding (Phase 5). Big payoff, but Phase 1 must prove machinery first.
These omissions are explicit — not forgotten. Phase 1 exits with them intact as future work, not silent gaps.
10. Open design questions (to resolve during Phase 1)
- Exact MIR opcode shape for `MIR_PERFORM` — arg order, calling-convention discipline. Resolve when the codegen phase starts.
- Handler-struct layout for effects with generic type params (`Throws<E>` where E varies). LLVM IR has no generics; we probably generate one handler-struct type per concrete E and monomorphize. Confirm during Milestone B.
- The `reify { }` block's desugar — exactly which prelude-defined handler does it install? Is it syntactic sugar for `with catch (e: E) -> Err(e), return v -> Ok(v) do -> body end`, or a distinguished codegen form? Prefer the desugar, for orthogonality.
- Scoped-resumption guard overhead — one extra load + comparison per resume. Is it measurable? Benchmark before committing to always-on.
- Stack vs heap allocation for handler structs when the handler captures closures. Escape analysis should handle; verify during implementation.
11. Conclusion
Evidence-passing (Xie-Leijen 2020 style) maps cleanly to LLVM IR via:
- Evidence as a threaded pointer parameter.
- Three-kind operation taxonomy that puts the common case on the fast path (direct indirect call).
- Reuse of the existing `$poll` state-machine lowering for the operation-kind slow path.
- Stack-discipline handler installation, with escape-analysis heap promotion as a backup.
Bottom line: this is implementable in Phase 1 as planned, with no redesign of existing subsystems. Async ($poll state machines) stays; the borrow checker interacts cleanly; generics extend naturally; FFI boundaries are well-defined; error messages are achievable within our quality bar.
Phase 0 is complete. Phase 1 begins when you want it to.