Quartz v5.25

Overnight Handoff — Binary DSL Phase 2 Track C landed; Track B still open

Baseline: 801ed0c5 on trunk (Phase 2 Track A — computed fields). This session ships: Track C — array fields. Fixpoint: 2091 functions (was 2088 after Track A). Tests: 93 binary-DSL green (86 prior + 7 new in binary_arrays_spec.qz).

Design doc (canonical): docs/design/BINARY_DSL.md — still the locked 12 decisions.

Prior handoffs (read for context if anything’s unclear):


What shipped this session

Track C — Array fields. [T], [T; N], [T; field] inside binary {} blocks, with integer-primitive element types. Struct field presents as Vec<Int> at the Quartz level.

type DnsQuery = binary {
  id:        u16be
  flags:     u16be
  qdcount:   u16be
  questions: [u8; qdcount]        # count taken from the prior field qdcount
}

type UuidSlot = binary {
  uuid: [u8; 16]                  # fixed literal N
}

type Trailer = binary {
  hdr:  u16be
  tail: [u8]                       # rest-of-stream
}

Surface: Parser was already accepting the syntax (stashes specs like "[u8;qdcount]" into the field spec string). This session:

  1. Fixed _tc_bin_field_annotation in self-hosted/middle/typecheck.qz — integer-primitive arrays now annotate as Vec<Int> (was an “Int” placeholder).
  2. Extended _cg_bin_var_spec_class in self-hosted/backend/cg_intrinsic_binary.qz — new class -10 for arrays.
  3. Added three helpers alongside the pstring helpers:
    • _cg_bin_parse_array_info(spec) — returns [variant, elem_bits, is_le, is_signed, count_literal]. Variants: 0 rest-of-stream, 1 fixed literal, 2 field-ref.
    • _cg_bin_array_count_field_name(spec) — extracts the identifier for variant 2.
    • _cg_bin_find_prior_field_slot(prog, layout_id, name, upto_fi) — finds the non-pad struct slot index for UNPACK’s field-ref lookups.
  4. Extended PACK’s Phase 1 size accumulator and Phase 4 tail-emit loop.
  5. Extended UNPACK’s Phase 2 tail-emit loop, including runtime Vec allocation with per-element byte read + shift/or accumulation.
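
For orientation, the return shape of _cg_bin_parse_array_info can be modelled in a few lines of Python. This is a sketch only, not the compiler's code: the function names mirror the helpers above, but the regex, the "i" prefix for signed element types, and the choice of 0 as count_literal for non-literal variants are assumptions.

```python
import re

# Hypothetical model of the array-spec parse; the real helper lives in
# cg_intrinsic_binary.qz and may differ in details.
_PRIM = re.compile(r"^([ui])(\d+)(be|le)?$")  # "i" prefix for signed is assumed

def parse_array_info(spec):
    """Return [variant, elem_bits, is_le, is_signed, count_literal].

    variant: 0 rest-of-stream, 1 fixed literal, 2 field-ref.
    count_literal is 0 unless variant == 1 (an assumption of this sketch).
    """
    body = spec.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError(f"not an array spec: {spec!r}")
    elem, _, count = body[1:-1].partition(";")
    m = _PRIM.match(elem.strip())
    if m is None:
        raise ValueError(f"unsupported element type: {elem!r}")
    is_signed = 1 if m.group(1) == "i" else 0
    elem_bits = int(m.group(2))
    is_le = 1 if m.group(3) == "le" else 0
    count = count.strip()
    if count == "":
        return [0, elem_bits, is_le, is_signed, 0]           # [T]
    if count.isdigit():
        return [1, elem_bits, is_le, is_signed, int(count)]  # [T; N]
    return [2, elem_bits, is_le, is_signed, 0]               # [T; field]

def array_count_field_name(spec):
    """Extract the count identifier from a field-ref spec (variant 2)."""
    return spec.strip()[1:-1].partition(";")[2].strip()
```

Under these assumptions, the stashed spec string "[u8;qdcount]" parses to variant 2 with 8-bit unsigned elements, and array_count_field_name returns "qdcount".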

Spec: spec/qspec/binary_arrays_spec.qz — 7 tests covering each variant, multi-byte endianness (u16be, u32le), and Track A + Track C co-existence (via the vec_size(...) workaround — see TA-F5 below).
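
The per-element byte read with shift/or accumulation, which the endianness tests exercise, can be pictured with the Python model below. It is a sketch under assumptions: read_elements is a hypothetical name, elements are assumed byte-aligned (elem_bits a multiple of 8), and the sign handling is one plausible reading of is_signed rather than a description of the emitted code.

```python
def read_elements(buf, offset, count, elem_bits, is_le, is_signed=False):
    """Decode `count` integers from `buf` starting at `offset`.

    Returns (values, next_offset). Each element is assembled one byte at a
    time with shift/or accumulation.
    """
    width = elem_bits // 8  # assumes byte-aligned element widths
    values = []
    for _ in range(count):
        acc = 0
        for i in range(width):
            byte = buf[offset + i]
            if is_le:
                acc |= byte << (8 * i)   # little-endian: low byte first
            else:
                acc = (acc << 8) | byte  # big-endian: high byte first
        if is_signed and acc >= 1 << (elem_bits - 1):
            acc -= 1 << elem_bits        # reinterpret as two's complement
        values.append(acc)
        offset += width
    return values, offset
```

For example, the bytes 12 34 56 78 read as two u16be elements give 0x1234 and 0x5678, while the bytes 78 56 34 12 read as one u32le element give 0x12345678.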


D17 — Track A’s compute expressions don’t see Vec field types

Filed as TA-F5. Track A lowers compute expressions via a direct mir_lower_expr call on the raw AST (see mir_lower_expr_handlers.qz:1471). The self binding carries the struct type (via mir_ctx_mark_struct_var), so field reads like self.counter * 2 work — those land as integer math on i64. But .size (and other member accesses) on a non-primitive field type don’t resolve correctly, because:

  • self.payload lowers via mir_emit_load_offset(self, N) and produces a bare i64 SSA value — no type annotation rides with it.
  • .size on that SSA value has no signal that it’s a Vec<Int>, so it doesn’t dispatch to the Vec header offset-1 read. Empirically it reads the wrong header slot (tested: self.payload.size returns 8 = Vec’s elem_width).

Workaround (recommended for Track B / protocol consumers): use the free-function form.

type Vlen = binary {
  count:   u16be = vec_size(self.payload)   # works
  # count: u16be = self.payload.size        # BROKEN (TA-F5)
  payload: [u8; count]
}

The 7th test in binary_arrays_spec.qz documents this pattern end-to-end.

Proper fix (not in Track C scope): re-run typecheck on compute ASTs before MIR lowering so member accesses get typed annotations, OR teach MIR lowering to propagate struct-field types through load_offset when the base’s type is known.
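
The second option (propagating field types through load_offset) is sketched below as a toy Python model. Everything in it is hypothetical: Value, FIELD_TYPES, load_field, and member_size are illustration names, and the untyped path raises an error here, whereas the real lowering silently reads the wrong header slot.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Value:
    raw: object                     # stands in for the bare i64 SSA value
    type_tag: Optional[str] = None  # None models today's untyped load result

# Hypothetical per-field annotations for the Vlen block.
FIELD_TYPES = {"count": "Int", "payload": "Vec<Int>"}

def load_field(struct, name, tagged):
    """Model of a field load; `tagged` toggles carrying the field's type."""
    tag = FIELD_TYPES[name] if tagged else None
    return Value(struct[name], tag)

def member_size(value):
    """Resolve `.size` only when the value is known to be a Vec<Int>."""
    if value.type_tag == "Vec<Int>":
        return len(value.raw)
    raise TypeError("cannot resolve .size: value carries no type tag")
```

With the tag carried forward, member_size(load_field(s, "payload", tagged=True)) resolves; without it, the member access has nothing to dispatch on, which is the gap TA-F5 describes.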


Track C restrictions (file if a consumer complains)

  1. Primitives only. Arrays of other binary blocks ([NestedBinaryBlock]) don’t work yet; that is a Phase 2d follow-up. Arrays of floats ([f32; 4]) fall back to the generic “Int” annotation and class -99 (unsupported). File as TC-F1 if a Track B consumer needs arrays of variant blocks (e.g. [TcpOption]).
  2. Encode trusts the user’s Vec size. For [T; 16] with a 17-element Vec, encode writes all 17 elements and decode (on anyone else’s side) will read 16 and silently drop one. Use computed fields with vec_size(self.field) to enforce consistency. TC-F2.
  3. Element type loss on decode. Decoded Vec<Int> carries no narrow-type info — u16be elements are indistinguishable from u32le elements after decode. Callers needing specific ranges should check bounds explicitly. TC-F3.
  4. Field-ref count must be prior. [T; field] where field is a later-positioned field doesn’t resolve (find-prior walks [0..fi)). TC-F4.
  5. Pad-after-array not verified. A fixed scalar after an array tail is a Phase 2 generalization. The existing tail-cursor loop should cope, but there’s no spec coverage, so treat it as untested. TC-F5.
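
Restriction 2 (TC-F2) is the easiest of these to trip over, so here is a small Python model of the failure and of a caller-side guard. The three function names are hypothetical and the model only covers [u8; N]; it illustrates the mismatch, not the compiler's emitted code.

```python
def encode_fixed_u8_array(values):
    """Models today's encoder: writes every element, ignoring the declared N."""
    return bytes(values)

def decode_fixed_u8_array(buf, n):
    """Models a peer decoding [u8; n]: reads exactly n elements."""
    return list(buf[:n])

def checked_encode(values, n):
    """Hypothetical caller-side guard until TC-F2 is addressed."""
    if len(values) != n:
        raise ValueError(f"[u8; {n}] field got {len(values)} elements")
    return encode_fixed_u8_array(values)
```

Encoding a 17-element list against a declared [u8; 16] produces 17 bytes on the wire, the peer reads 16, and the last element is lost with no error on either side. The guard turns that into an explicit failure at encode time.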

Copy-paste handoff prompt (paste into a fresh session)

Read docs/handoff/overnight-binary-dsl-phase-2-track-c-done.md FIRST.
Track A and Track C are done; Track B (discriminated unions) is the
remaining Phase 2 track. TA-F5 (Vec.size in Track A compute exprs) is
a Track A follow-up worth fixing before Track B to avoid writing the
`vec_size(self.field)` workaround throughout Track B test specs.

Starting state (verified at handoff):
- Trunk clean. Guard stamp valid at 2091 functions. Smokes green.
- 14 binary-DSL specs, 93 tests, all green.
- Session backup: self-hosted/bin/backups/quartz-pre-binary-phase2-trackc-golden.
  Before touching the compiler, snapshot a new fix-specific copy:
    cp self-hosted/bin/quartz self-hosted/bin/backups/quartz-pre-binary-phase2-trackX-golden
  (Substitute trackb / trackafix / whatever you're attempting.)

NEVER overwrite a fix-specific backup until the attempted fix is
committed end-to-end with tests and smokes passing. The rolling
quartz-golden managed by `quake guard` gets overwritten on every
successful build — your fix-specific copy is the recovery hatch.

Recommended order for this session:

1. TA-F5 FIX (Track A follow-up, ~0.5 sessions).
   Teach Track A's compute-expr lowering to resolve member access on
   non-primitive fields. Two options:
   (a) Re-run a lightweight typecheck pass on the compute AST before
       mir_lower_expr — populate AST type annotations so `.size` on a
       Vec<Int> resolves to the Vec header read.
   (b) Have `mir_emit_load_offset` consult the owning struct's field
       annotations and carry a type tag forward for subsequent member
       accesses in the same expression.
   Verification: change `binary_arrays_spec.qz` test 7 to use
   `self.payload.size` (UFCS dot form) and confirm it still passes.
   Delete the TA-F5 workaround paragraph from that test's preamble.

2. TRACK B — Discriminated unions inside binary {} (higher impact).
   Surface (proposed, see BINARY_DSL.md):
     type Tcp = binary {
       data_offset: u4
       flags:       u8
       ...
       options: [TcpOption]           # Track C [T] form; block-typed elements need TC-F1 first
     }
     type TcpOption = binary {
       kind: u8
       match kind
         0 => { }                     # END_OF_LIST
         1 => { }                     # NOP
         2 => { mss: u16be }          # MSS
         8 => { tsval: u32be; tsecr: u32be }  # Timestamps
       end
     }
   Semantics:
     - Discriminator is always the FIRST field, primitive integer.
     - Each variant adds additional field(s) after the discriminator.
     - Decode reads discriminator, dispatches to variant layout.
     - Encode: Quartz value is an enum with the discriminator baked in.
   Scope: parser (match inside binary block), typecheck (variant type
   registration), MIR (new opcode OR extend PACK/UNPACK with a variant-
   dispatch indirection), codegen.
   Size estimate: 800-1200 lines.
   Spec: spec/qspec/binary_union_spec.qz — TCP options, PE section
   kinds, ELF section header types.

Workflow per STEP (identical to prior phases):
1. Write QSpec tests FIRST (red phase).
2. Implement the minimum to green.
3. Run `./self-hosted/bin/quake guard` before EVERY commit.
4. Smoke after every guard: brainfuck + expr_eval (~10s each).
5. Commit each STEP as a single coherent commit.

Prime Directives v2 compact:
1. Pick highest-impact, not easiest.
2. Design is locked (BINARY_DSL.md) — implement, don't redesign.
3. Pragmatism = sequencing correctly; shortcut = wrong thing.
4. Work spans sessions; don't compromise because context is ending.
5. Report reality. Partial = say partial.
6. Holes get filled or filed.
7. Delete freely. Pre-launch.
8. Binary discipline: guard mandatory, smokes + backups not optional.
9. Quartz-time = traditional ÷ 4.
10. Corrections = calibration, not conflict.

Stop conditions:
- Track complete with fixpoint stable → write next handoff.
- Blocked on compiler bug → file in Discoveries, commit what works.
- Context limit → stop at next clean commit boundary, write handoff.

Pointers (verified post-Track-C):
- Track A compute-expr lowering: `mir_lower_expr_handlers.qz:1463-1477`
  — the loop that walks fields and calls `mir_lower_expr` on each
  compute AST. That's where TA-F5 lives.
- Array classification: `_cg_bin_var_spec_class` returns -10 for
  arrays; `_cg_bin_parse_array_info` + `_cg_bin_array_count_field_name`
  are the dispatch helpers. Add Track B's variant class next to them.
- Fixed-prefix emit helpers (`_cg_bin_emit_pack_prefix_stores` /
  `_cg_bin_emit_unpack_prefix_reads`) handle straddle + sub-byte +
  byte-aligned. Still the hook for Track B's discriminator byte reads.
- Variable-tail pack emitter: `cg_emit_binary_pack` dispatches to
  `_cg_bin_emit_pack_variable` around line 1190. Track C's array klass
  branches are inline — mirror that pattern for Track B enum branches.
- Variable-tail unpack emitter: `_cg_bin_emit_unpack_variable` around
  line 1180. Track C's klass -10 branch allocates a fresh Vec<Int> and
  loops — the shape for Track B's variant-dispatch body.
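
The Track B decode semantics proposed in the prompt above (read the discriminator, then dispatch to the variant layout) can be modelled in Python as follows. This is a sketch of the proposal as written, including its omission of a length byte. The table layout, decode_option, and the dict-shaped result are assumptions, not the planned Quartz representation.

```python
import struct

# kind -> (variant name, struct format of the fields after the discriminator,
#          field names). Mirrors the proposed TcpOption block only.
VARIANTS = {
    0: ("END_OF_LIST", "", ()),
    1: ("NOP", "", ()),
    2: ("MSS", ">H", ("mss",)),
    8: ("Timestamps", ">II", ("tsval", "tsecr")),
}

def decode_option(buf, offset=0):
    """Read the discriminator byte, then decode that variant's fields.

    Returns (decoded_dict, next_offset).
    """
    kind = buf[offset]
    if kind not in VARIANTS:
        raise ValueError(f"unknown option kind {kind}")
    name, fmt, fields = VARIANTS[kind]
    values = struct.unpack_from(fmt, buf, offset + 1) if fmt else ()
    decoded = {"kind": kind, "variant": name}
    decoded.update(zip(fields, values))
    return decoded, offset + 1 + struct.calcsize(fmt)
```

Returning the next offset alongside the value is what lets a rest-of-stream array of options be decoded by calling decode_option in a loop until the buffer is exhausted.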

Test status after Track C

File                           Tests  Status
binary_parse_spec.qz              14  🟢 green
binary_typecheck_spec.qz          19  🟢 green
binary_mir_spec.qz                10  🟢 green
binary_types_spec.qz               5  🟢 green
binary_methods_spec.qz             3  🟢 green
binary_bitcast_spec.qz             3  🟢 green
binary_with_spec.qz                3  🟢 green
binary_roundtrip_spec.qz           5  🟢 green
binary_varwidth_spec.qz            5  🟢 green
binary_straddle_spec.qz            3  🟢 green
binary_eof_spec.qz                 4  🟢 green
binary_strict_spec.qz              6  🟢 green
binary_computed_spec.qz            6  🟢 green
binary_arrays_spec.qz (new)        7  🟢 green
Total                             93  🟢 green

Smokes (post-guard): examples/brainfuck.qz, examples/expr_eval.qz, examples/style_demo.qz — all pass with the post-guard binary.

Full QSpec suite NOT run from Claude Code (CLAUDE.md protocol). Run ./self-hosted/bin/quake qspec in a terminal to catch cross-spec regressions before declaring Track C “fully done.”


Safety rails (verify before starting Track B or the TA-F5 fix)

  1. Quake guard before every commit. Pre-commit hook enforces it.
  2. Smoke after every guard. brainfuck + expr_eval are enough.
  3. Fix-specific backup at self-hosted/bin/backups/quartz-pre-binary-phase2-trackX-golden (create at top of next session).
  4. Full QSpec NOT in Claude Code. The harness PTY can hang on large runs. Use targeted FILE=... invocations for spec files.
  5. Crash reports first (CLAUDE.md): on silent SIGSEGV check ~/Library/Logs/DiagnosticReports/quartz-*.ips before ASAN/lldb.