Result.unwrap() Returns 0 — Root Cause and Fix Plan
Status: Investigation complete. Root cause identified. Fix plan ready.
Affects: 2 tests in spec/qspec/option_narrowing_spec.qz (narrow_result_ok, narrow_result_err_else).
Scope: Not actually a layout bug — it’s a codegen/tail-call correctness bug with a latent escape-analysis gap. See “Root cause” below.
1. Problem
def narrow_result_ok(): Int
  r = Result::Ok(55)
  if r is Ok
    return r.unwrap()
  end
  return -1
end
Expected return: 55. Actual return: 0.
The sibling test using Option works:
def narrow_some_unwrap(): Int
  opt = Option::Some(42)
  if opt is Some
    return opt.unwrap() # or opt!; both work and return 42
  end
  return -1
end
Reproduced locally: 5/7 tests pass, 2 Result tests fail with “Expected 55, got 0” / “Expected 77, got 0”.
2. Investigation findings
2.1 Layouts are identical, not the bug
Both Option<T> and Result<T, E> use the same two-word layout:
[tag: i64 @ offset 0, payload: i64 @ offset 1]
- Option::Some(v) → [0, v], Option::None → [1, 0]
- Result::Ok(v) → [0, v], Result::Err(e) → [1, e]
Confirmed by:
- self-hosted/backend/mir.qz:3297-3301: built-in Option/Result both always have payload (mir_enum_has_payload returns 1).
- self-hosted/backend/mir.qz:3335-3346: variant tag maps: Option Some=0/None=1, Result Ok=0/Err=1. The Quartz hint “tag check has off-by-one” is wrong; they’re the same.
- self-hosted/backend/mir_lower_expr_handlers.qz:318: explicit comment: # Result layout: [tag:i64@0, payload:i64@1] Ok=tag 0, Err=tag 1.
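The shared layout described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the backend's code: the names make_enum, has_tag, OPTION_TAGS and RESULT_TAGS are hypothetical, and a Python list stands in for the two i64 words.

```python
# Illustrative model only; the real layout is produced by the Quartz backend.
TAG_OFFSET = 0      # word index of the tag
PAYLOAD_OFFSET = 1  # word index of the payload

# Tag values as reported in mir.qz (Some/Ok = 0, None/Err = 1).
OPTION_TAGS = {"Some": 0, "None": 1}
RESULT_TAGS = {"Ok": 0, "Err": 1}


def make_enum(tags, variant, payload=0):
    """Build the two-word [tag, payload] cell for a variant."""
    return [tags[variant], payload]


def has_tag(words, tags, variant):
    """Model of a narrowing check such as `if r is Ok`: compare the word at offset 0."""
    return words[TAG_OFFSET] == tags[variant]


if __name__ == "__main__":
    r = make_enum(RESULT_TAGS, "Ok", 55)
    opt = make_enum(OPTION_TAGS, "Some", 42)
    print(r, has_tag(r, RESULT_TAGS, "Ok"))        # [0, 55] True
    print(opt, has_tag(opt, OPTION_TAGS, "Some"))  # [0, 42] True
```

Under this model an Ok cell and a Some cell are indistinguishable word for word, which is the point of section 2.1: nothing in the layout can explain why one unwrap works and the other does not.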
The narrowing LLVM IR (generated from narrow_result_ok) correctly constructs the Result on the stack:
%alloc_1.p = alloca [2 x i64], align 8
...
store i64 0, ptr %sptr_2 ; tag = 0 (Ok) at offset 0
store i64 55, ptr %sgep_4 ; payload = 55 at offset 1
store i64 %v1, ptr %r ; %r = pointer to the alloca
The narrowing tag check (if r is Ok) then loads offset 0, compares to 0, and branches to then1. All good so far.
2.2 Option.unwrap() and Result.unwrap() take different codegen paths
self-hosted/middle/typecheck_expr_handlers.qz rewrites UFCS method calls per-type:
- opt.unwrap() where opt: Option → func renamed to "unwrap" (line 1893).
- r.unwrap() where r: Result → func renamed to "unwrap_ok" (line 1908).
"unwrap" is a handled intrinsic, "unwrap_ok" is not.
- self-hosted/backend/cg_intrinsic_system.qz:441-484: the unwrap intrinsic handler emits inline LLVM IR: inttoptr, load tag, compare to 0, branch to panic or ok, then load payload at offset 1, produce result register. No function call. This is what opt.unwrap() compiles to, and that’s why it works.
- "unwrap_ok" is registered as a UFCS name in self-hosted/backend/intrinsic_registry.qz:628 (as Result$unwrap) but the UFCS category has no real handler; the name unwrap_ok falls through to the user-land library definition in std/prelude.qz:97-102:

  def unwrap_ok<T, E>(res: Result<T, E>): T
    match res
      Result::Ok(v) => v
      Result::Err(e) => panic("called unwrap_ok on Err")
    end
  end

  This is a real Quartz function, monomorphized and emitted as define i64 @unwrap_ok(i64 noundef %p0) in the output IR. The match arm for Ok loads offset 0 for the tag, compares to 0, then loads offset 1 as the payload. Verified in the generated IR (lines ~2986-3055 of the spec output).
So: Option’s unwrap is inlined intrinsic LLVM; Result’s unwrap is a function call to @unwrap_ok.
2.3 The offending tail call
Inspection of the generated IR for narrow_result_ok:
then1:
%v9 = load i64, ptr %r, align 8 ; load pointer to the stack alloca
...
%v11 = tail call i64 @unwrap_ok(i64 %v9) ; <-- THE BUG
ret i64 %v11
Contrast the passing Option test (using opt.unwrap() in a minimal case) — no call at all; the unwrap is fully inlined.
Quartz’s backend emits a tail call whenever the call’s result is the block’s return value. From self-hosted/backend/codegen_instr.qz:407-418:
# Tail call detection: call result is block's return value
var is_tail = 0
if var_idx < 0 and callee_has_narrow_ret == 0
  var blk = state.current_block
  var term_kind = blk.mir_block_get_term_kind()
  if term_kind == mir::TERM_RETURN
    var ret_val = blk.mir_block_get_term_data()
    if ret_val == dest
      is_tail = 1
    end
  end
end
Then codegen_instr.qz:433-436 emits tail call i64 @<name>(...) when is_tail == 1. There is no check for whether any argument is (or could be) a pointer into the caller’s stack.
2.4 Why tail call breaks it
In LLVM, the tail marker is a semantic assertion by the frontend that the call is compatible with “pop caller frame before calling callee” tail-call optimization. Critically, one of the preconditions (LLVM LangRef, ‘call’ instruction, tail marker) is:
The callee does not access allocas or byval arguments from the caller.
Equivalently: if you pass a pointer into the caller’s stack frame to a tail call and the callee dereferences it, you have undefined behavior. The optimizer is free to treat the caller’s stack as dead at the call site, fold the call into a jump, and reuse or overwrite the caller’s stack slots. In the backend, with the frame pointer set and the 2-word alloca living at a fixed caller-frame offset, the outcome we observe is this: unwrap_ok reads %p0 (the integer value of the now-dead caller-frame pointer), converts it back with inttoptr, and loads offset 0. By then the backing memory has already been deallocated or overwritten, so the tag happens to read as 0, but not because it is the Ok tag the caller stored, and the payload at offset 1 is whatever happens to be at that stack address now. The observed result is 0.
Empirical confirmation. Stripping tail call → call in the generated .ll and re-running the test fixes it:
$ sed 's/tail call/call/g' /tmp/rtest.ll > /tmp/rtest_notail.ll
$ llc /tmp/rtest_notail.ll -o /tmp/rtest_notail.s
$ clang /tmp/rtest_notail.s -o /tmp/rtest_notail_bin -lm -lpthread
$ /tmp/rtest_notail_bin
55 # correct
vs. the same IR with tail call present prints 0.
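One caveat with the sed one-liner above: it rewrites any occurrence of the text "tail call", so a musttail call would become the invalid token "mustcall". Whether the Quartz backend ever emits musttail is not established here, so treat this as a precaution. A slightly more careful version of the same experiment, sketched in Python (strip_tail_markers is a hypothetical helper name, not part of any existing tooling):

```python
import re

# Match the bare `tail` marker only; `musttail call` and `notail call`
# are left alone because a letter precedes "tail" in those markers.
_TAIL_CALL = re.compile(r"(?<![A-Za-z])tail call\b")


def strip_tail_markers(ir_text):
    """Return the IR text with `tail call` downgraded to plain `call`."""
    return _TAIL_CALL.sub("call", ir_text)


if __name__ == "__main__":
    sample = (
        "%v11 = tail call i64 @unwrap_ok(i64 %v9)\n"
        "%x = musttail call i64 @f(i64 %a)"
    )
    print(strip_tail_markers(sample))
```

On the sample, the first line loses its tail marker and the musttail line is left untouched. The rewritten text can then be fed to llc and clang exactly as in the commands above.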
2.5 Why Option.unwrap() doesn’t hit this
Because it’s an inline intrinsic — there’s no call at all. The bug is latent there; it would surface the moment any Option method were to go through a library definition like Result’s does (e.g., if you ever inline-desugared opt.unwrap() into a call to a Quartz-level unwrap_option helper, it would break identically).
2.6 Second-order: escape analysis doesn’t catch call-argument escapes
mir_compute_escaped_regs in self-hosted/backend/mir.qz:816-958 is the escape analyzer that decides whether a MIR_ALLOC_STACK needs to be promoted to heap. It tracks:
- TERM_RETURN of a stack-origin register → escape (line 917, 949).
- MIR_STORE of a stack-origin value into a non-stack pointer → escape (line 900-912).
- MIR_STORE_VAR to a global → escape (line 857-860).
- Transitive propagation through nested field stores (line 960+).
It does NOT check MIR_CALL arguments. A pointer to a stack alloca passed as a call argument does not cause the alloca to escape. This is consistent with the current design — the assumption is “callees don’t outlive arguments, so we don’t need to heap-promote” — but combined with the unconditional tail call emission in codegen_instr.qz, it becomes unsound. The tail-call semantics require that stack-alloca pointers cannot reach the callee at all.
3. External research
3.1 LLVM tail call semantics
From the LLVM Language Reference, 'call' Instruction section (all modern LLVM versions, including 17, 20, 23-git): the tail marker permits sibling call optimization (jump, not call), and the frontend attests that the call satisfies rules including:
- Caller and callee have compatible prototypes (or at least compatible return types).
- No variables in the call reference allocas or byvals in the caller.
- (With tail specifically, the optimizer is free but not required to perform the optimization; with musttail, it is mandatory.)
The relevant wording on alloca/byval safety is quoted in numerous LLVM docs and LLVM-dev threads — passing a caller-alloca pointer through a tail call is undefined behavior. The inalloca attribute exists precisely because passing stack arguments to calls needs a different, explicit mechanism with defined stack lifetime (LLVM InAlloca docs — “When the call site is reached, the argument allocation must have been the most recent stack allocation that is still live, or the behavior is undefined.”).
Difference between tail and musttail:
- tail: advisory. Optimizer may perform sibling-call optimization if safe. Frontend is still responsible for asserting it’s safe (i.e., no caller-alloca/byval pointers among args, caller ABI compatible, etc.). If the frontend lies, UB.
- musttail: mandatory. Optimizer must perform sibcall, and if it can’t (ABI mismatch, needed cleanup, etc.) it’s a hard error. Strict rules on calling convention and argument compatibility.
Sources:
- LLVM LangRef, 'call' Instruction: https://llvm.org/docs/LangRef.html (Call Instructions section, tail marker rules).
- LLVM InAlloca design doc: https://llvm.org/docs/InAlloca.html.
- LLVM CodeGenerator.html tail call optimization section: https://llvm.org/docs/CodeGenerator.html#tail-call-optimization.
3.2 Result<T,E> runtime layout in production languages
The hint in the task mentions “Is it just a tagged union with the same shape as Option
- Rust. Result<T, E> is a regular enum with two variants, stored as discriminant + largest variant payload. There’s no special-case. Rust does apply niche optimization (using invalid bit patterns as implicit discriminants), which can make Result<&T, ()> the size of &T, etc. But niche optimization is a size/performance opt, not a layout change that would put payload at a different offset than for Option<T>. Sources: Rust Reference – type layout, Niche optimizations in Rust – 0xAtticus, Niche optimization in Rust (Medium).
- Haskell (GHC). Either a b (Haskell’s Result) is a boxed algebraic data type. Each constructor (Left, Right) is allocated on the heap as a closure with an info-table header (containing the constructor tag) followed by pointer fields for the payload. Same shape as Maybe’s Just. No layout distinction between Either and any other 2-constructor ADT. Sources: GHC heap objects wiki, GHC.Runtime.Heap.Inspect.
- OCaml. Variants with zero-argument constructors are unboxed integers starting at 0; variants with arguments are heap blocks with a 1-byte tag in the header (per-constructor) and the payload as block fields. result (defined in stdlib as type ('a, 'b) result = Ok of 'a | Error of 'b) follows the standard rules: both Ok and Error are boxed, tagged blocks with one payload word. Identical shape to option’s Some. Sources: OCaml docs – memory representation of values, Real World OCaml – runtime memory layout.
- Swift. Result<Success, Failure> is just an enum with two associated-value cases. Laid out with the same discriminant + associated-value rules as any other 2-case enum. Nothing special.
Takeaway for the fix. Don’t touch the layout. The layout is correct and matches what every major language does. The bug is in the tail call emission path, which passes a stack pointer across a call whose ABI assumes no caller-alloca pointers. Any fix that “reshuffles” Result’s layout would be wrong.
4. Root cause hypothesis (with confidence)
Primary cause (very high confidence, empirically confirmed):
narrow_result_ok builds Result::Ok(55) on the caller’s stack as alloca [2 x i64]. The method call r.unwrap() is rewritten by typecheck to unwrap_ok(r). unwrap_ok is not an inline intrinsic — it’s a Quartz library function (std/prelude.qz:97) — so the backend emits a normal LLVM call. Because the call result is the block’s return value, the backend marks the call tail. Passing the pointer %r (which is a caller-alloca address) to a tail-marked LLVM call is undefined behavior per LLVM LangRef; LLVM’s sibling-call optimizer pops the caller frame (or otherwise invalidates the alloca) before unwrap_ok executes, so unwrap_ok reads garbage / overwritten memory and returns 0.
Proof: sed-replacing tail call with call in the generated LLVM IR immediately produces the correct result (55 / 77). No other change required.
Not the cause (things I checked and ruled out):
- Result layout is not different from Option’s. Both are [tag@0, payload@1]. Confirmed in mir.qz and mir_lower_expr_handlers.qz.
- Tag values are not swapped. Ok=0, Err=1, Some=0, None=1. Confirmed.
- Result$unwrap does not read the wrong offset. The @unwrap_ok body correctly loads offset 0 for tag, offset 1 for payload.
- There is no extra field for Err’s type. The layout is 2 words regardless of variant.
Secondary issue (medium confidence, latent, should be filed):
The backend’s tail-call detector in codegen_instr.qz:407-418 does not verify that the call’s arguments are free of stack-alloca pointers before emitting tail. This is a general correctness bug in the backend; it is currently masked for almost every other call site because:
- Most call args are not local allocas but heap-promoted things (Vec, Map, String, etc., all malloc’d).
- Most calls to library helpers that take Option or Result go through intrinsics that are inlined, not real calls.
- The ones that do take stack-alloca pointers are typically not in tail position.
Result.unwrap() is the unlucky intersection: Result is stack-allocated (small @value-sized type), the method rewrites to a real library call, and the call is in tail position (return r.unwrap()).
5. Fix plan
There are three candidate fixes, ordered from “narrowest correct” to “broadest correct.” Choose option C (harder-but-right per Prime Directives 1, 2, 3) as the primary, with A as a cheap belt-and-suspenders.
Option A — Stop emitting unsafe tail call (minimal correctness fix)
Change: In codegen_instr.qz tail-call detection, refuse to emit tail if any of the call’s arguments is a register whose origin is (or is derived from) a MIR_ALLOC_STACK in the current function.
Where:
- File: self-hosted/backend/codegen_instr.qz
- Function: the call-emission routine containing lines ~407-442 (the block # Tail call detection through # Emit call).
- Logic: compute the escaped-origin bitmap via mir_compute_escaped_regs (already available on MirFunc) OR walk the call’s args and check reg_origin for any stack alloca origin. If any arg is stack-origin, force is_tail = 0.
Concretely:
# After computing is_tail per the existing rule, veto it if any arg
# has a current-function stack-alloca origin.
if is_tail == 1
  for i in 0..arg_count
    if state.reg_has_stack_alloca_origin(args[i]) == 1
      is_tail = 0
      break
    end
  end
end
where reg_has_stack_alloca_origin consults the same origin-tracking data already used by mir_compute_escaped_regs (extract it into a reusable mir_reg_origin_table(func): Vec<Int> helper and pass it down to codegen, or precompute once per function).
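To make the origin check concrete, here is a small Python model of the idea, written as a sketch under stated assumptions rather than a port of the Quartz code. The Instr record, the opcode names ("alloc_stack", "alloc_heap", "copy", "field_addr", "ptr_to_int") and the function names are all hypothetical; the real MIR opcodes and the data mir_compute_escaped_regs already tracks may look different.

```python
from dataclasses import dataclass, field

# Hypothetical opcodes whose result inherits the origin of its operands.
DERIVING_OPS = {"copy", "field_addr", "ptr_to_int"}


@dataclass
class Instr:
    op: str
    dest: int
    srcs: list = field(default_factory=list)


def stack_origin_regs(instrs):
    """Registers that are, or are derived from, a stack allocation."""
    stack = set()
    changed = True
    while changed:  # iterate to a fixed point so instruction order does not matter
        changed = False
        for ins in instrs:
            if ins.dest in stack:
                continue
            derived = ins.op in DERIVING_OPS and any(s in stack for s in ins.srcs)
            if ins.op == "alloc_stack" or derived:
                stack.add(ins.dest)
                changed = True
    return stack


def may_emit_tail(call, ret_reg, stack_regs):
    """Existing rule (call result is the return value) plus the proposed veto."""
    if call.dest != ret_reg:
        return False
    return not any(arg in stack_regs for arg in call.srcs)


if __name__ == "__main__":
    # Shape of narrow_result_ok: alloca, a derived register, then the call.
    body = [Instr("alloc_stack", 1), Instr("copy", 9, [1]), Instr("call", 11, [9])]
    print(may_emit_tail(body[2], 11, stack_origin_regs(body)))  # False
```

The model is deliberately conservative: any argument derived from a stack allocation vetoes the marker, even if the callee would never dereference it. That costs a missed optimization in some cases but never produces an unsound tail call.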
Pros: Fixes the immediate bug. Fixes the whole class of future bugs where stack-alloca pointers leak through tail calls, not just Result.
Cons: Does nothing about the Result.unwrap() being slower than Option.unwrap() — it still goes through a function call instead of being inlined. Leaves an asymmetry that will bite us again (Result is a first-class type and should feel as fast as Option).
Estimated effort: 0.5 quartz-days (2 hours).
Option B — Inline unwrap_ok / unwrap_err / unwrap_or_ok as intrinsics
Change: Give Result’s unwrap the same treatment Option’s unwrap gets: register unwrap_ok, unwrap_err, unwrap_or_ok as inline intrinsics in cg_intrinsic_system.qz, emitting inline LLVM IR that checks the tag and loads the payload from offset 1. unwrap_ok and unwrap_err panic on the wrong tag; both read the same offset because the layout is shared, and they differ only in the tag they expect and the panic message. unwrap_or_ok returns the fallback instead of panicking.
Where:
- File: self-hosted/backend/cg_intrinsic_system.qz
  - Current name == "unwrap" handler at line 441. Add parallel handlers:
    - name == "unwrap_ok": tag check == 0 (Ok), panic on Err, load offset 1.
    - name == "unwrap_err": tag check == 1 (Err), panic on Ok, load offset 1.
    - name == "unwrap_or_ok": tag check == 0, branchy load: payload or fallback.
- File: self-hosted/backend/intrinsic_registry.qz: add _r("unwrap_ok", INTRINSIC_CAT_SYSTEM), _r("unwrap_err", INTRINSIC_CAT_SYSTEM), _r("unwrap_or_ok", INTRINSIC_CAT_SYSTEM) alongside the existing _r("unwrap", INTRINSIC_CAT_SYSTEM) at line 536.
- File: self-hosted/middle/typecheck_builtins.qz: register tc_register_builtin(tc, "unwrap_ok", ...), "unwrap_err", "unwrap_or_ok", "unwrap_or" (also missing) alongside the existing unwrap at line 660. Without this, the intrinsic won’t resolve at typecheck.
- File: std/prelude.qz: delete def unwrap_ok, def unwrap_err, def unwrap_or_ok since they’re now covered by intrinsics. Per Prime Directive 7 (no compat layers), delete outright in the same commit.
- File: self-hosted/backend/cg_intrinsic_system.qz: adjust panic message string pool: the existing unwrap handler emits “panic with backtrace + abort”; match that exactly for unwrap_ok/unwrap_err, reusing the same string constants (@.str.1, etc.) OR emit new ones with specific messages (“called unwrap_ok on Err”).
Pros: Matches Option’s performance — no call overhead. Symmetric. Dead-code-eliminates the std/prelude wrappers. Eliminates the tail-call bug for the case that triggered it without depending on Option A. Consistent with the “everything is an intrinsic” pattern the backend already uses for unwrap, expect, unwrap_or (see intrinsic_registry.qz:581-583).
Cons: Doesn’t fix the underlying tail-call-with-alloca-pointer bug — latent elsewhere. Must also do Option A (belt and suspenders).
Estimated effort: 1 quartz-day (4 hours) — including symmetric handlers for all three methods, updating typecheck, deleting the std/prelude wrappers, and verifying fixpoint.
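The behavior the three Option B handlers need to reproduce can be stated as a small Python reference model. It is a sketch of the intended semantics only, not the LLVM IR the handlers would emit. QuartzPanic is a hypothetical stand-in for the panic-with-backtrace-then-abort path, and the "called unwrap_err on Ok" message is an assumption (only the unwrap_ok message appears in std/prelude.qz as quoted earlier).

```python
class QuartzPanic(Exception):
    """Stand-in for the panic-with-backtrace-then-abort path."""


OK_TAG, ERR_TAG = 0, 1  # tag values of the shared [tag, payload] layout


def unwrap_ok(words):
    """Tag must be Ok (0); return the payload word at offset 1."""
    if words[0] != OK_TAG:
        raise QuartzPanic("called unwrap_ok on Err")
    return words[1]


def unwrap_err(words):
    """Tag must be Err (1); the payload is read from the same offset 1."""
    if words[0] != ERR_TAG:
        raise QuartzPanic("called unwrap_err on Ok")  # message is an assumption
    return words[1]


def unwrap_or_ok(words, fallback):
    """No panic path: the payload if Ok, otherwise the fallback."""
    return words[1] if words[0] == OK_TAG else fallback
```

A model like this can double as a checklist when reviewing the emitted IR: each handler should have exactly one tag comparison, at most one panic branch, and one load from offset 1.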
Option C — Both (recommended)
Do A + B in a single commit. A closes the underlying correctness gap (no latent class-C bug reappearing in a year when some unrelated feature adds a new stack-allocated type and a new library call). B gets Result parity with Option, deletes dead library code, and is the directly-visible fix for the failing tests.
Order of operations in the commit:
- Land the intrinsic handlers + registry + typecheck registrations + delete prelude wrappers (B).
- Land the tail-call escape check (A).
- Rebuild and run quake guard to verify fixpoint.
- Run option_narrowing_spec standalone and confirm 7/7.
- Run smoke tests (brainfuck, style_demo, expr_eval) to catch regressions.
- Run a targeted QSpec subset touching Result (force_unwrap_spec, any *_result* specs) to check nothing else breaks.
- Commit.
Estimated effort: 1.5 quartz-days (6 hours) including testing and fixpoint.
Files that will change (final list)
| File | Change | Lines (approx) |
|---|---|---|
| self-hosted/backend/cg_intrinsic_system.qz | Add unwrap_ok, unwrap_err, unwrap_or_ok handlers, modeled on existing unwrap at line 441. | +100 |
| self-hosted/backend/intrinsic_registry.qz | Register the three new names at SYSTEM category (~line 536). | +3 |
| self-hosted/middle/typecheck_builtins.qz | tc_register_builtin for the three new names (~line 660 area). | +3 |
| std/prelude.qz | Delete unwrap_ok, unwrap_err, unwrap_or_ok (lines 97-119). | −23 |
| self-hosted/backend/codegen_instr.qz | Add stack-alloca escape check to tail-call detector (~lines 407-418). | +10 |
| self-hosted/backend/mir.qz | (Optional refactor, if (A) needs it) Export a reusable mir_reg_origin_table helper that codegen_instr can consume. | +15 |
What correct behavior should look like (test expectations)
After the fix:
- spec/qspec/option_narrowing_spec.qz: 7/7 tests pass. narrow_result_ok returns 55. narrow_result_err_else returns 77.
- The generated LLVM IR for narrow_result_ok contains no call @unwrap_ok at all; it’s inlined to a load-tag/branch/load-payload sequence just like Option. Verify with: quartz /tmp/rtest.qz | grep -c unwrap_ok → should be 0 (nonzero only if unwrap_ok is called indirectly elsewhere).
- Fixpoint (gen1 == gen2 byte-identical) holds.
- Smoke tests pass (brainfuck, style_demo, expr_eval).
- As a regression guard, the existing force_unwrap_spec.qz still passes.
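The grep -c check in the list above counts every line that mentions unwrap_ok, including a define line or a string constant. If a stricter check is wanted, a short Python helper can count only call instructions that target the symbol. This is a sketch; count_calls_to is a hypothetical name, and it assumes one instruction per line, as in the IR excerpts quoted earlier.

```python
import re


def count_calls_to(ir_text, symbol):
    """Count lines containing a call (tail or not) to @symbol."""
    # Require "(" right after the name so @unwrap_ok_x does not match @unwrap_ok.
    pattern = re.compile(r"\bcall\b.*@" + re.escape(symbol) + r"\(")
    return sum(1 for line in ir_text.splitlines() if pattern.search(line))


if __name__ == "__main__":
    before_fix = (
        "define i64 @unwrap_ok(i64 noundef %p0) {\n"
        "  %v11 = tail call i64 @unwrap_ok(i64 %v9)\n"
    )
    print(count_calls_to(before_fix, "unwrap_ok"))  # 1
```

After the fix, the same count over the IR for narrow_result_ok is expected to be 0.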
6. Quartz-time estimate
- Option C (full fix): 1.5 quartz-days ≈ 6 hours.
  - Intrinsic handlers + registration + typecheck registration: 2 hours.
  - Delete prelude wrappers, verify nothing else references them: 30 minutes.
  - Tail-call escape check with origin-table plumbing: 1 hour.
  - quake guard + fixpoint + smoke tests + targeted QSpec subset: 1.5 hours.
  - Buffer for unanticipated issues (likely: typecheck arity mismatch, monomorphization interaction): 1 hour.
If only Option A is done: ~2 hours.
7. Risk assessment
Low risk. This is a targeted intrinsic addition plus a small backend correctness fix.
- Binary discipline risk: Must run quake guard before committing. Compiler source changes, so fixpoint must be re-verified. Take a fix-specific backup at self-hosted/bin/backups/quartz-pre-result-unwrap-golden before starting, per Rule 1.
- Fixpoint risk: Inlining unwrap_ok changes the emitted IR for every Result.unwrap() call site in the compiler itself. If the compiler uses Result.unwrap() internally (it does: Result is used for resolver, typecheck, etc.), the gen1 and gen2 IR will differ from the pre-fix baseline but must be internally consistent. A clean fixpoint should still obtain because the new inline code is deterministic and the same inputs produce the same IR. If fixpoint fails, the most likely cause is interaction with @value escape analysis on the poll-callee registry; unlikely for this change but worth monitoring.
- Test regression risk: Low. Result.unwrap() is a hot path but the behavior change is semantic equivalence (both versions should return the same value on valid input; the library version was returning garbage on stack-allocated Results). All call sites that work today continue to work.
- Typecheck / monomorphization interaction: The existing Result$unwrap builtin (line 677 of typecheck_builtins.qz) is registered as TYPE_INT return. The new unwrap_ok intrinsic should match that signature. If typecheck currently infers a polymorphic return from the generic unwrap_ok<T,E> in prelude, deleting that definition might change inference. Must check that the intrinsic-return path correctly propagates the T type to the caller (it should, since the unwrap intrinsic already does this for Option).
- Error messaging risk: Today, unwrapping an Err calls panic("called unwrap_ok on Err") via the library function, which routes through Quartz’s panic path. The intrinsic version should produce an equivalent panic message (new string constant @.str.xxx = "called unwrap_ok on Err" with same qz_print_backtrace + abort sequence as the existing unwrap intrinsic). Trivial to get right; just don’t forget it.
- unwrap_or for Result (unwrap_or_ok) is also broken, with the same root cause. Since we’re fixing unwrap_ok and unwrap_err, do unwrap_or_ok in the same commit. Easy.
Single-failure mode to watch: If quake guard fails fixpoint after the change, the most likely cause is that the compiler’s own Result-heavy modules (resolver, typecheck, middle/*) compile differently in gen1 vs gen2 because of some stale cache or escape-analysis interaction. Recovery: cp self-hosted/bin/backups/quartz-pre-result-unwrap-golden self-hosted/bin/quartz and diagnose from the working binary. Take this backup before touching code.
8. Summary (TL;DR)
The “Result$unwrap layout bug” isn’t a layout bug. Layouts are identical: [tag@0, payload@1] with Ok=0, Err=1, same as Option with Some=0, None=1.
The real bug: Option.unwrap() is an inline intrinsic (cg_intrinsic_system.qz:441), but Result.unwrap() is typecheck-rewritten to unwrap_ok() which is a normal Quartz library function in std/prelude.qz:97. The backend emits it as tail call @unwrap_ok(...). The tail marker asserts to LLVM that no argument aliases a caller alloca — Quartz violates this, because r on the caller side is a pointer to alloca [2 x i64]. LLVM is then free to pop the caller frame before the callee runs; the callee reads garbage; the test returns 0.
Empirical proof: sed 's/tail call/call/g' on the generated IR makes the test pass.
Fix: promote unwrap_ok, unwrap_err, unwrap_or_ok to inline intrinsics (matching Option’s treatment), delete the std/prelude wrappers, and independently harden the backend tail-call detector to never emit tail when any argument has a caller-alloca origin. One commit.
Effort: ~1.5 quartz-days. Risk: low. Unblocks: 2 tests immediately; plus prevents a whole class of future bugs where small stack-allocated types (Result, maybe future Tuples, small records, @value structs) get their pointers passed to tail calls and silently corrupt.