Next session — fix the \" inside #{...} parser hole
Baseline: unikernel-site branch at 709dcfaf, 11 commits
ahead of trunk. Fixpoint 2144, guard stamp valid. Live production
at https://unikernel.mattkelly.io/ (PMM stable across multiple
runs; no leak regression).
Scope: One compiler fix. Estimate: 0.5–1 quartz-hour (i.e. ~15 minutes of focused debug + 15 minutes of guard+fixpoint+smoke). Small enough to start a clean session on and finish in it without context pressure.
The bug in one screen
# This compiles cleanly:
def id(s: String): String = s
def main(): Int
  puts("out = #{id("nested")}") # bare nested string works
  return 0
end

# This fails at parse time:
def main(): Int
  puts("out = #{id(\"nested\")}") # ESCAPED \" breaks the lexer
  return 0
end
# → error[QZ0101]: Expected expression
# --> line:col pointing at the `\` before the first `\"`
The lexer correctly tracks the interpolation block when the inner
string uses unescaped "...", but breaks when the inner string
uses \" escapes. The \ is evidently being consumed as an
outer-string escape, which then treats the following " as the
end of the outer string — the tokenizer’s interpolation-nesting
state is wrong for escaped quotes.
Why it matters
- Workaround was real friction. While refreshing examples/error_handling.qz and examples/collections.qz in the cheat-sheet pass, I had to hoist show_res(parse_int_str(\"42\")) into a local r1 = show_res(parse_int_str("42")) binding, then interpolate #{r1}. It works, but it’s boilerplate that shouldn’t be necessary.
- Cheat sheet says #{} is the canonical interpolator. If the most idiomatic form breaks when strings need to be quoted, we’re teaching a workaround instead of a rule.
- Low fix risk. The surrounding code already handles unescaped nested strings correctly; it’s just the \" case that doesn’t track state right. Likely a 3–5 line fix in the lexer.
Minimal repros
Save these as /tmp/A.qz and /tmp/B.qz:
# /tmp/A.qz — COMPILES
def id(s: String): String = s
def main(): Int
  puts("out = #{id("nested")}")
  return 0
end

# /tmp/B.qz — FAILS
def id(s: String): String = s
def main(): Int
  puts("out = #{id(\"nested\")}")
  return 0
end
Drive them with:
./self-hosted/bin/quartz /tmp/A.qz > /tmp/A.ll 2> /tmp/A.err
./self-hosted/bin/quartz /tmp/B.qz > /tmp/B.ll 2> /tmp/B.err
echo "A: $(wc -c < /tmp/A.ll) bytes, $(grep -c 'error\[QZ' /tmp/A.err) errors"
echo "B: $(wc -c < /tmp/B.ll) bytes, $(grep -c 'error\[QZ' /tmp/B.err) errors"
Expected today: A succeeds, B fails with error[QZ0101]: Expected expression at the column of the first \.
Expected after fix: both succeed; both should produce similar IR (A ≈ B give or take the escape metadata).
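If a pass/fail answer is more convenient than eyeballing byte counts, the same check can be sketched in Python. This is a minimal sketch, not part of the repo: `summarize`, `run_repro`, and `fixed` are hypothetical helper names, and the regex assumes diagnostics always look like `error[QZ<digits>]`, which is inferred only from the QZ0101 message above.

```python
import re
import subprocess

# Assumption: every diagnostic code has the shape QZ<digits>, as in QZ0101.
ERROR_RE = re.compile(r"error\[(QZ\d+)\]")


def summarize(stdout: str, stderr: str) -> dict:
    """Reduce one compiler run to IR size and the list of error codes seen."""
    return {
        "ir_bytes": len(stdout.encode("utf-8")),
        "error_codes": ERROR_RE.findall(stderr),
    }


def run_repro(path: str, compiler: str = "./self-hosted/bin/quartz") -> dict:
    """Compile one repro; IR is expected on stdout, diagnostics on stderr."""
    proc = subprocess.run([compiler, path], capture_output=True, text=True)
    return summarize(proc.stdout, proc.stderr)


def fixed(a: dict, b: dict) -> bool:
    """True when both repros emitted IR and neither reported a QZ error."""
    return all(r["ir_bytes"] > 0 and not r["error_codes"] for r in (a, b))
```

Run from the worktree root, `fixed(run_repro("/tmp/A.qz"), run_repro("/tmp/B.qz"))` should be False today and True once the lexer fix lands.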
Where to look
The lexer lives at self-hosted/frontend/lexer.qz. The relevant
state machine tracks:
1. In an outer string, "..." begins a string and " ends it, with \n, \t, \", \\, etc. as escapes.
2. Inside that outer string, #{ begins an interpolation block, inside which the lexer temporarily pops back to expression-lexing mode and the outer-string escape semantics should NOT apply.
3. The matching } closes the interpolation block and pops back to outer-string mode.
The bug is in step 2: when the interpolation expression contains
\", the \ is being lexed as if we were still in outer-string
mode, so it consumes the following " as an escaped-quote
literal character — which means the NEXT " is mis-interpreted
as the outer string’s terminator.
Cross-reference:
- lexer.qz: grep for \"interp\", interp_depth, interp_stack, or TOK_INTERP_START / TOK_INTERP_END (names vary).
- parser.qz: grep for ps_parse_interp / how the parser consumes the interpolation tokens and recurses into expression-parsing inside them.
Hypothesis: the lexer has a single in_string flag that’s
set for outer strings. Inside #{...}, it should be cleared
(or a “depth > 0” check should gate escape handling). Possibly
a missing if interp_depth == 0 guard around the \-escape
case.
Confidence: medium. The state machine could be subtler than
that — especially if Quartz supports nested interpolation ("#{f("#{g(x)}")}").
Verify the nested case works before assuming a simple fix.
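To make the intended mode switching concrete, here is a toy model in Python. It is a sketch of the desired behaviour only, not a port of lexer.qz: the function names and token names (`STR`, `CODE`, `INTERP_START`, `INTERP_END`) are made up for illustration, and it assumes the fix should make \" inside #{...} delimit a nested string exactly like a bare " does.

```python
def lex_string(src, i, close):
    """Lex a string body starting just past its opening delimiter.

    `close` is the delimiter that ends this string: '"' or '\\"'.
    Returns (tokens, index just past the closing delimiter).
    """
    tokens, text = [], []
    while i < len(src):
        if src.startswith(close, i):
            tokens.append(("STR", "".join(text)))
            return tokens, i + len(close)
        if src.startswith("#{", i):
            # Switch to expression mode until the matching '}'.
            tokens.append(("STR", "".join(text)))
            text.clear()
            tokens.append(("INTERP_START",))
            inner, i = lex_expr(src, i + 2)
            tokens.extend(inner)
            tokens.append(("INTERP_END",))
        elif src[i] == "\\" and i + 1 < len(src):
            # Ordinary string-mode escape: keep both characters as text.
            text.append(src[i:i + 2])
            i += 2
        else:
            text.append(src[i])
            i += 1
    raise SyntaxError("unterminated string")


def lex_expr(src, i):
    """Lex expression text inside #{...}, starting just past '#{'.

    Toy simplification: braces inside the expression are not tracked.
    Returns (tokens, index just past the closing '}').
    """
    tokens, code = [], []

    def flush():
        if code:
            tokens.append(("CODE", "".join(code)))
            code.clear()

    while i < len(src):
        if src.startswith('\\"', i):
            # Assumed fix: in expression mode, \" opens a nested string
            # (closed by the next \") rather than acting as an escape.
            flush()
            inner, i = lex_string(src, i + 2, '\\"')
            tokens.extend(inner)
        elif src[i] == '"':
            flush()
            inner, i = lex_string(src, i + 1, '"')
            tokens.extend(inner)
        elif src[i] == "}":
            flush()
            return tokens, i + 1
        else:
            code.append(src[i])
            i += 1
    raise SyntaxError("unterminated interpolation")


def lex(src):
    """Lex one complete string literal, including its outer quotes."""
    if not src.startswith('"'):
        raise SyntaxError("expected opening quote")
    tokens, end = lex_string(src, 1, '"')
    if end != len(src):
        raise SyntaxError("trailing input after string literal")
    return tokens


# The escaped and unescaped forms yield the same token stream.
assert lex(r'"out = #{id(\"a\")}"') == lex('"out = #{id("a")}"')
```

The point of the sketch is that escape handling is decided by the current mode, which is carried here by the Python call stack rather than by a single in_string flag. Whether the real lexer should use an explicit stack or a depth counter depends on what lexer.qz already does.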
Test coverage to add
Three new cases for spec/qspec/lexer_spec.qz (or a dedicated
interp_escape_spec.qz — whichever matches the existing style):
- Basic escape: "out = #{id(\"a\")}" lexes to the same token stream as the unescaped form.
- Multiple escapes in one interpolation: "#{f(\"a\", \"b\")}" handles multiple \" pairs correctly.
- Escape at boundary: "#{f(\"a\")}end", where the } terminator of the interpolation still resolves correctly after escaped inner strings.
If nested interpolation is legal (it may or may not be — check QUARTZ_REFERENCE.md’s string section), add a 4th test:
- Nested interp with escape: "#{f("#{g(\"a\")}")}" or the syntax the grammar actually supports. This is the torture-test case and may surface a related bug.
After the fix
1. Run ./self-hosted/bin/quake guard; the fixpoint must hold. Current count is 2144 functions.
2. Smoke-test per the Rule-2 ritual:
   ./self-hosted/bin/quartz examples/style_demo.qz | llc -filetype=obj -o /tmp/sd.o
   clang /tmp/sd.o -o /tmp/sd -lm -lpthread && /tmp/sd | head -3
   ./self-hosted/bin/quartz examples/brainfuck.qz | llc -filetype=obj -o /tmp/bf.o
   clang /tmp/bf.o -o /tmp/bf -lm -lpthread && /tmp/bf | head -3
3. Re-verify the A/B repros above. Both should compile; B’s output should print out = nested.
4. Revisit examples/error_handling.qz and examples/collections.qz: the local-binding hoists I introduced to work around this bug can now be folded back into direct #{} expressions. Small tightening pass, maybe 10 lines.
What this doesn’t fix
Adjacent bugs filed in the same session and still open (both in the WASM backend, not the parser):
- docs/bugs/WASM_IMPLICIT_IT_LOCAL_OOB.md: WASM backend rejects .each() { it } / .filter() { it } / .map() { it } on Vec<T>. Separate fix, different file (codegen_wasm.qz), different priority (P1 for the playground; the interp-escape bug is P2 as a quality-of-life thing).
- docs/bugs/WASM_STRING_INTERP_PTR.md: WASM backend prints a pointer for #{string_var} instead of the text. Also codegen_wasm, still open.
Don’t bundle these into the same PR as the interp-escape fix unless the investigation shows they share a root cause (unlikely — the escape bug is pure lexer, the WASM bugs are pure backend).
Branch state for the new session
Branch: unikernel-site
Tip: 709dcfaf [style] Extend cheat-sheet pass: README + examples/ + Piezo alignment
Ahead: 11 commits from trunk (1d90d51b)
Fixpoint: 2144 functions, stamp valid, guard will pass on empty rebuild
Worktree: .claude/worktrees/unikernel-site/
Production: https://unikernel.mattkelly.io/ (PMM flat, 9/9 demos live)
Nothing else in flight. Start from a clean git status and the
repros above.