Quartz Language Roadmap

Version: v5.25.0-alpha | Status: Self-Hosted Primary | Tests: 3,274 (0 failures, 0 pending)

Goal: Transform Quartz from “a language that compiles itself” into “a language that competes.”

Latest: Phase U (Union/Intersection/Record Types) IN PROGRESS — record types { x: Int } working end-to-end. Row variables (8D) and monomorphized codegen scaffold (8G) complete in both compilers. Remaining: intersection simplification (8F).

Velocity calibration: Phases 1-4 estimated 8-9 weeks, completed in ~5 hours. Estimates below reflect observed AI-assisted throughput. Old human-solo estimates preserved in parentheses.

Principles

We accept only world-class solutions.

Every phase in this roadmap adheres to these non-negotiable principles:

Test-Driven Development: Write tests first. Implementation follows specification.
Fixpoint Validation: After ANY change to bootstrap or self-hosted, run rake quartz:validate to verify the compiler can still compile itself.
No Shortcuts: If a feature can’t be done right, it waits. Technical debt compounds.
Document Changes: Update docs/QUARTZ_REFERENCE.md immediately after language changes. A stale reference causes cascading errors.
Incremental Progress: Commit working states. Never leave the compiler broken.

# The sacred workflow
rake test                    # All tests pass
rake quartz:validate         # Fixpoint verified
git add -A && git commit     # Progress preserved

Priority Stack

Phases are stack-ranked by impact. Each phase enables subsequent phases. Do not skip ahead.

Rank	Phase	Impact	Status
0	Dogfooding Phase 1	Validates features, simplifies codebase	COMPLETE
1	Fixed-Width Integers	CRITICAL - Unblocks networking, FFI, binary protocols	COMPLETE
2	Volatile Access	CRITICAL - Enables MMIO, hardware access	COMPLETE
3	Memory Ordering	CRITICAL - Efficient lock-free algorithms	COMPLETE
4	Exhaustiveness Checking	Prevents silent match bugs	COMPLETE
5	Native Floats	Enables math, graphics, games	COMPLETE
6	Packed Structs	Binary protocol parsing	COMPLETE
7	Conditional Compilation	Platform-specific code	COMPLETE
8	Const Evaluation (archived)	Compile-time computation	COMPLETE (const generics Array<T,N> via Step 4)
9	Inline Assembly Improvements	Full asm control	COMPLETE
10	Bit Manipulation Intrinsics	popcount, clz, ctz, bswap	COMPLETE
11	Dogfooding Phase 2	Final validation	COMPLETE
C	Concurrency Sprint	Select statement, try_recv/try_send, Task	COMPLETE (select, blocking ops, Option try_recv, generic Task)
S	Sized Storage & SIMD	Memory efficiency, vectorization	COMPLETE (S.1-S.9: narrow params+locals+struct fields, packed F32 vecs, auto-vec metadata, F32x4+F64x2+I32x4 SIMD, FMA, min/max/abs, shuffle)
B	Benchmark Optimization Sprint	Competitive runtime performance	COMPLETE (pipeline fix, StringBuilder, LLVM hints, vec unchecked, Polly)
T	Multi-Target Compilation	C backend, WASI, cross-platform stdlib	COMPLETE (~165 C intrinsics, 57+6+4 tests, `@cfg` target threading)
L	Linear Types	Resource safety: move semantics, borrows, Drop	COMPLETE (11 tests, both compilers)
SC	Structured Concurrency	Scope-bound task groups, cooperative cancellation	COMPLETE (9 tests, both compilers)
CI	Custom Iterators	`for x in struct` via `$next` method returning Option	COMPLETE (7 tests, both compilers)
N	Networking & Concurrency Hardening	Unblocks P2P chat, web server, dogfooding vision	COMPLETE
OPT	Bug Fixes & MIR Optimization Sprint	DCE, TCO, regalloc hints, inlining, iterator fix	COMPLETE
GV	Cross-Module Global Variables	Shared mutable state across modules	COMPLETE
GAP	Table Stakes Gap Analysis	35 gaps audited (2 false), 5 true critical, 16 expected, 12 nice-to-have	COMPLETE (audit)
TS	Table Stakes Implementation	21 features approved, 2 skipped (`?` operator, async/await)	COMPLETE (21/21 done)
LS	Literal Syntax & Data Types	Negative indexing, char literals, set comp, ranges, Index trait, binary data	COMPLETE
U	Union / Intersection / Record Types	SimpleSub inference, structural subtyping, record types	ACTIVE

Phase Overview

Phase 0: Dogfooding Phase 1 (COMPLETE)

Duration: ~6 hours (was: 1-2 weeks) Rationale: The self-hosted compiler avoids many features it implements (closures, traits, defer, Result/Option). Before adding new features, we must validate existing ones work in production.

Progress:

Critical Bugs Fixed (Feb 2026):

tc_parse_type was missing “Int” mapping → all function parameters with Int type returned TYPE_UNKNOWN
mir_lower_stmt wasn’t calling mir_ctx_mark_struct_var for local struct variables → field access always used index 0
tc_parse_type didn’t handle Fn(...) annotations → first-class function parameters returned TYPE_UNKNOWN, causing “Undefined function” errors when called
cg_emit_intrinsic had no codegen handlers for HOF intrinsics (each, map, filter, reduce, count, any, all, find, take, drop) → all emitted ; WARNING: Unknown intrinsic and a zero result
Type checker NODE_IDENT handler didn’t check function registry when bare name wasn’t in scope → filter(arr, is_even) failed with “Undefined variable: is_even”
MIR NODE_IDENT handler didn’t try arity-mangled names (is_even$1) when bare name lookup failed → mir_lookup_function missed function refs
cg_emit_intrinsic had no codegen handlers for atomic intrinsics (atomic_new, atomic_load, atomic_store, atomic_add, atomic_sub, atomic_cas) → all emitted ; WARNING: Unknown intrinsic and returned 0. Last failing enumerable spec (each with atomic accumulator) now passes.
Bootstrap wildcard _ check used bound_names[bi][0] != '_' → rejected __v bindings used by $try/$unwrap macro expansions. Changed to strcmp(bound_names[bi], "_") != 0.
Direct vec_push/vec_set type_arg propagation broke self-hosted compiler (uses Vec<Int> as generic container). Scoped propagation to vec_get/vec_pop only.
~x (bitwise NOT) MIR lowering had no handler for op 18 — silently returned x unchanged. Type checker also returned TYPE_UNKNOWN. Added ~x → x XOR -1 decomposition and op == 18 → "~" type checking.
Parser accepted ! as logical NOT alias (TOK_BANG) alongside not (TOK_NOT). Bootstrap only accepts not. No source uses ! as NOT. Removed to reserve ! for never type.
Dead MIR_UNARY codegen handler (kind == 7) was unreachable — all unary ops decompose to binary in MIR lowering. Handler also had a bug: emitted xor i1 instead of xor i64. Removed entirely.
Self-hosted typecheck op_str mapping for binary ops used C bootstrap numbering (5-12) instead of self-hosted parser numbering (7-15 for comparisons, 14/15 for and/or). Comparisons silently returned TYPE_UNKNOWN.
Self-hosted MIR string equality checked op == 6 (C bootstrap OP_EQ) instead of op == 7 (self-hosted OP_EQ). String == fell through to integer compare, comparing pointers instead of content.
Self-hosted MIR regex nomatch emitted mir_emit_binary(ctx, 5, ...) — op 5 maps to bitwise AND in self-hosted, not equality. Changed to op 7 (OP_EQ) so !~ correctly emits icmp eq against zero.

Phase 1: Fixed-Width Integers (COMPLETE)

Duration: ~2.5 hours (was: 3-4 weeks) Rationale: This was the #1 blocker. Six networking tests were pending solely because sockaddr_in requires u8 and u16. Every binary protocol, every FFI call to real C libraries, every hardware register depends on fixed-width types.

Phase 1.1 Complete (Type System Foundation):

Add I8, I16, I32, U8, U16, U32, U64 type constants to bootstrap
LLVM type mapping (quartz_type_to_llvm, quartz_return_type_to_llvm)
Type annotation parsing in tc_parse_type_annotation
Call-site truncation/extension for all widths (i8, i16, i32)
Struct field truncation and store patterns for all widths
I64 aliased to Int/TY_INT (not a separate type kind)

Phase 1.2 Complete (Native Function Narrow Codegen):

MIRFunction carries return_type_name and param_type_names through pipeline
Native functions emit narrow LLVM types (e.g., define i16 @func(i8 %p0))
sext/zext on param entry, trunc on return, call-site conversions

Phase 1.3 Complete (Integer Literal Suffixes):

Hex literals (0xFF), binary literals (0b1010), underscore separators (1_000)
Type suffixes: 42_u8, 0xFF_i32, 0b1010_u16 (case-insensitive)
Token suffix[16] field, AST int_suffix[16] field, parse_int_lexeme() with strtol
30/30 fixed-width integer tests pass
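
The literal forms listed above can be illustrated with a small Python sketch. It is not the bootstrap's parse_int_lexeme(); the function name parse_int_literal and the SUFFIXES table are hypothetical, and the sketch only shows how a lexeme might be split into a value and an optional type suffix.

```python
# Hypothetical sketch: split an integer lexeme into (value, suffix).
# Names are illustrative and not taken from the Quartz sources.
SUFFIXES = ("i8", "i16", "i32", "u8", "u16", "u32", "u64")

def parse_int_literal(lexeme):
    text = lexeme.lower()               # suffixes are case-insensitive
    suffix = None
    for s in SUFFIXES:
        if text.endswith("_" + s):
            suffix = s
            text = text[: -(len(s) + 1)]
            break
    digits = text.replace("_", "")      # underscore separators carry no value
    if digits.startswith("0x"):
        return int(digits[2:], 16), suffix
    if digits.startswith("0b"):
        return int(digits[2:], 2), suffix
    return int(digits, 10), suffix
```

The suffix is stripped before the separators so that the underscore in 42_u8 is not mistaken for a digit separator.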

Phase 1.4 Complete (Arithmetic Operations — Option B):

Type propagation through binary expressions (MIR result_type_name field)
Typechecker: int_suffix → fixed-width type for literals; arithmetic preserves fixed-width types
Unsigned ops: udiv/urem/lshr, unsigned comparison predicates (ult/ugt/ule/uge)
Narrow type wrapping: trunc+sext/zext after each arithmetic operation
43/43 fixed-width integer tests pass (13 new Phase 1.4 tests)
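
The narrow-type wrapping step (trunc followed by sext or zext) can be modelled in a few lines of Python. This is a sketch of the arithmetic only, assuming two's-complement wrapping; the helper names wrap, add_i8 and add_u8 are hypothetical.

```python
def wrap(value, bits, signed):
    """Truncate to `bits`, then sign- or zero-extend back to a Python int."""
    mask = (1 << bits) - 1
    low = value & mask                       # trunc: keep the low `bits` bits
    if signed and low & (1 << (bits - 1)):   # sext: top bit set means negative
        return low - (1 << bits)
    return low                               # zext: value is already non-negative

def add_i8(a, b):
    return wrap(a + b, 8, signed=True)

def add_u8(a, b):
    return wrap(a + b, 8, signed=False)
```

Under this model add_i8(127, 1) wraps to -128 and add_u8(255, 1) wraps to 0, which is the behaviour the per-operation wrapping is meant to guarantee.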

Phase 1.5 Complete (Explicit Conversion Functions):

7 conversion intrinsics: to_i8, to_i16, to_i32, to_u8, to_u16, to_u32, to_u64
Codegen emits trunc+sext/zext for each conversion
56 fixed-width integer tests pass (13 Phase 1.5)

Phase 1.6 Complete (FFI/repr(C) Fixes):

addr_of() materializes proper C-layout structs on stack (alloca/GEP/store)
Fixed type mappings, field_offset calculations, unsigned zext for FFI returns
Suffix matching for module-qualified struct names

Phase 1.7 Complete (Networking Unblocked):

Networking tests unblocked (5 previously pending now pass)
DNS resolution via getaddrinfo — resolve_host() in std/net/tcp.qz
@repr(C) struct Addrinfo defined in std/ffi/socket.qz (macOS layout)
tcp_connect resolves hostnames before falling back to inet_pton
http_get test un-skipped and passing (full networking stack operational)
4 new DNS tests: localhost resolution, invalid hostname, literal IP, hostname connect
0 pending tests remain
getaddrinfo added to EXTERNAL_SYMBOLS — forces clang linking for DNS-dependent tests (lli JIT crashes on macOS getaddrinfo due to mDNSResponder/XPC)
http_get test guards for network access (skips gracefully offline)

Phase 2: Volatile Access (COMPLETE)

Duration: ~45 minutes (was: 1 week) Rationale: Without volatile, the compiler can legally remove “redundant” memory accesses. This breaks MMIO completely.

Add volatile_load<T>(ptr) intrinsic
Add volatile_store<T>(ptr, value) intrinsic
Emit LLVM volatile flag on loads/stores
Bootstrap: typecheck.c, mir.c, mir_codegen.c
Self-hosted: typecheck.qz, mir.qz, codegen.qz
5 integration tests (runtime + IR verification)
Narrow volatile: volatile_store<U8> emits store volatile i8 (not i64)
12 integration tests (runtime + IR verification + narrow types)
Documentation in ref.md

Phase 3: Memory Ordering (COMPLETE)

Duration: ~45 minutes (was: 2 weeks) Rationale: All atomics are currently seq_cst. This is correct but inefficient. Real lock-free code needs relaxed, acquire, release, acq_rel.

Add ordering parameter to atomic intrinsics
Add fence(ordering) intrinsic
Add compiler_fence() for compiler-only barriers
Bootstrap: typecheck.c, mir.c, mir_codegen.c
Self-hosted: typecheck.qz, mir.qz, codegen.qz
29 integration tests (IR verification + runtime correctness + backward compat)
Documentation in ref.md

Phase 4: Exhaustiveness Checking (COMPLETE)

Duration: ~34 minutes (was: 1-2 weeks) | Started: 2026-02-07 20:23 EST | Completed: 2026-02-07 20:57 EST Rationale: There’s a TODO at typecheck.qz:3249. Without exhaustiveness checking, adding an enum variant silently leaves every existing match on that enum incomplete.

Complexity profile: Typechecker-only analysis pass. No new syntax, no new types, no bootstrap changes. Bounded scope: collect variants, check match arms, emit errors.

Collect all enum variants during typecheck
Check match arms cover all variants
Require _ wildcard or explicit coverage
Emit clear error on missing variant (QZ0106)
Handle both ident-based and enum-access-based subject resolution
Fix enum param binding (struct_name not stored for TYPE_ENUM params)
Int/String matches without wildcard emit QZ0106
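
The core of the check described above (collect variants, compare against the arms, accept a wildcard) fits in a short Python sketch. The function name missing_variants is hypothetical, and patterns are modelled as plain strings rather than AST nodes; the real pass would report QZ0106 where this sketch returns a non-empty list.

```python
def missing_variants(enum_variants, arm_patterns):
    """Return the variants no arm covers; an empty list means exhaustive."""
    if "_" in arm_patterns:              # a wildcard arm covers everything
        return []
    covered = set(arm_patterns)
    return [v for v in enum_variants if v not in covered]
```

Adding a variant to the enum makes every match without a wildcard return a non-empty list, which is exactly the case the phase is meant to surface.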

Phase 5: Native Floats (COMPLETE)

Duration: ~3.5 hours (was: 2-3 weeks) Rationale: Currently only CFloat/CDouble exist for FFI. No native float arithmetic. This blocks games, graphics, scientific computing. F32 is also the foundation for future SIMD/vectorization support.

F64 Foundation — COMPLETE:

F64 type constant and type system in both compilers (TYPE_F64 = 36)
Float literal lexing (TOK_FLOAT_LIT) and parsing (NodeFloatLit = 68)
Float arithmetic: fadd, fsub, fmul, fdiv via i64↔double bitcast trick
Float comparisons: fcmp oeq/one/olt/ole/ogt/oge with zext to i64
Float parameter passing and return values (F64 type annotation)
Unary negation for floats (0.0 - x decomposition)
Self-hosted implementation (typecheck, MIR, codegen all updated)
28 integration tests passing (literals, arithmetic, comparisons, functions)
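
The i64↔double bitcast approach can be illustrated in Python with the struct module. This is a sketch of the idea only, assuming floats travel through the pipeline as signed 64-bit integers holding the IEEE 754 bit pattern; the helper names are hypothetical.

```python
import struct

def f64_to_bits(x):
    """Reinterpret a double as a signed 64-bit integer (bitcast double -> i64)."""
    return struct.unpack("<q", struct.pack("<d", x))[0]

def bits_to_f64(bits):
    """Reinterpret a signed 64-bit integer as a double (bitcast i64 -> double)."""
    return struct.unpack("<d", struct.pack("<q", bits))[0]

def fadd_bits(a_bits, b_bits):
    # Cast both operands to double, add, and cast the result back to i64.
    return f64_to_bits(bits_to_f64(a_bits) + bits_to_f64(b_bits))
```

No numeric conversion happens in either cast; only the interpretation of the 64 bits changes, which is why the same storage slot can hold an Int or an F64.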

Phase 5R (Float Conversions) — COMPLETE:

Float-to-int conversion (to_int intrinsic — fptosi)
Int-to-float conversion (to_f64, to_f32 intrinsics — sitofp)
Float-to-float conversion (F64↔F32 — fptrunc/fpext)
NaN and Inf special value tests
Documentation in ref.md
6 new integration tests

Phase 5.F (F32 First-Class Type) — COMPLETE:

Design: Strict distinct types with literal polymorphism. See 05-native-floats.md for full design rationale.

F32 ≠ F64 — no implicit promotion, mixed arithmetic is a type error
Float literals are polymorphic — 5.0 adopts context type (F32 or F64)
Conversion builtins are multi-input — to_int accepts F32 or F64, to_f64 accepts F32 or Int, to_f32 accepts F64 or Int
True 32-bit storage — fptrunc double→float, bitcast float→i32, zext i32→i64

Progress:

Test spec with 22 tests (declarations, functions, arithmetic, comparisons, conversions, precision, type safety)
TY_F32 added to bootstrap type enum (types.h)
Bootstrap type system (t_f32 singleton, type_name, type_from_string, types_match with F32↔F64 permissive matching)
Bootstrap typecheck (F32 annotation parsing, literal coercion in let/comparison/call, binary op propagation, unary negation, polymorphic conversion builtins)
Bootstrap MIR (TYPE_F32 BeType, type_to_betype, resolve_expr_type for NODE_BINARY/NODE_FLOAT_LIT, NODE_IDENT load type, var binding for annotated/inferred floats, call argument F64→F32 coercion)
Bootstrap codegen (F32 constant encoding, F32 binary ops, F32 comparisons, F32 negation, F32-aware to_int/to_f64/to_f32 intrinsics)
Self-hosted typecheck (TYPE_F32 constant, tc_parse_type for F32, F32 binary op checking, F32 unary negation, literal coercion, to_f32 builtin registration)
22/22 F32 tests passing, 2,383 total tests, fixpoint validated

Phase 6: Packed Structs (COMPLETE)

Duration: 30-60 minutes (was: 1 week) Rationale: Binary protocols have precise layouts. TCP headers, USB descriptors, file formats all require #pragma pack(1) equivalent.

Complexity profile: Minimal. Add @packed attribute to parser, emit <{...}> instead of {...} in LLVM IR, update field_offset. Same template as volatile/fence — attribute in, IR flag out.

Add @packed attribute (supports @packed, @repr(C) @packed, @packed @repr(C))
Emit LLVM packed struct type (<{ }> syntax)
Update field_offset<T> to respect packing (skip alignment when packed)
6 integration tests
Both bootstrap and self-hosted compilers updated
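
The effect of packing on field offsets can be shown with a small Python model. It assumes each field's alignment equals its size, which holds for common scalar types on typical ABIs but is an assumption here; the function name field_offsets is hypothetical and this is not the compiler's field_offset<T> implementation.

```python
def field_offsets(field_sizes, packed):
    """Return (offsets, total_size), assuming alignment == size per field."""
    offsets, cursor, max_align = [], 0, 1
    for size in field_sizes:
        if not packed:
            cursor = (cursor + size - 1) // size * size   # round up to alignment
            max_align = max(max_align, size)
        offsets.append(cursor)
        cursor += size
    if not packed:
        # Tail padding so that arrays of the struct stay aligned.
        cursor = (cursor + max_align - 1) // max_align * max_align
    return offsets, cursor
```

For a u8, u32, u16 layout the unpacked form places fields at 0, 4, 8 with a total size of 12, while the packed form places them at 0, 1, 5 with a total size of 7.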

Phase 7: Conditional Compilation (COMPLETE)

Duration: 2-3 hours (was: 2 weeks) Rationale: Platform-specific code is impossible without this. The stdlib needs different implementations for Linux, macOS, Windows.

Complexity profile: Most design-intensive remaining phase. Touches parser, AST, typechecker, and codegen. Syntax decisions (@if(os == "macos") vs @cfg(target_os, "macos")) need upfront design. More moving parts than intrinsic-style phases, but each change point is small.

Decision: Attribute-based @cfg(key: "value") — cleaner than block-level @if/@end.

Phase 8: Const Evaluation (COMPLETE)

Duration: ~6 hours total (was: 3-4 weeks) Rationale: No compile-time computation exists. Can’t even compute array sizes at compile time.

Complexity profile: Hardest remaining phase. Requires a compile-time expression evaluator (interpreter over AST/MIR). Const generics (Array<T, N>) is a genuine type system extension with ripple effects. Scoping to simple const declarations first, then const generics, keeps it tractable.

Phase 8.1-8.2 Complete (Const Declarations & Evaluator):

Phase 8.1+ Complete (@cImport integration):

const c = @cImport(@cInclude("stdio.h")) — compile-time C header import
MIR skips const_eval for @cImport/@cInclude bindings (namespace, not value)
16 @cImport integration tests passing

Phase 8.3 Complete (Const Functions & Self-Hosted Parity):

const def syntax — parser encodes int_val 2 (public) / 3 (private)
Const functions with block bodies — if/while/for/return/locals, nested calls
Const conditionals (if/elsif/else in const function bodies)
Self-hosted const eval parity with C bootstrap — 7 new helper functions:
- mir_const_eval_find_local_idx — reverse search locals by name
- mir_const_eval_update_local — update local var value by name
- mir_const_eval_call — call const fn with depth check, param binding, body eval
- mir_const_eval_full — full expression evaluator (IntLit, BoolLit, Ident, Unary, Binary, Call, If, Block)
- mir_const_eval_while_loop — bounded while (1M iterations)
- mir_const_eval_for_loop — range-based for with scope management
- mir_const_eval_body — statement evaluator (Block, Return, If, While, Let, Assign, ExprStmt, For)
All compilation passes updated (single-module PASS 1.7/2, multi-module PASS 0.7/1)
Const functions skipped during MIR lowering (exist only at compile time)
56 integration tests (44 base + 12 const functions/conditionals)
2,430 tests, 0 failures, fixpoint validated
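
A bounded compile-time loop of the kind described above can be sketched in Python. The tuple-based expression encoding, the ConstEvalError class and the function names are all hypothetical and much simpler than the mir_const_eval_* helpers; only the 1M iteration bound comes from the list above.

```python
MAX_ITERATIONS = 1_000_000   # iteration bound quoted in the list above

class ConstEvalError(Exception):
    pass

def eval_expr(node, env):
    """Evaluate ("int", n), ("var", name) or ("bin", op, lhs, rhs)."""
    kind = node[0]
    if kind == "int":
        return node[1]
    if kind == "var":
        return env[node[1]]
    if kind == "bin":
        _, op, lhs, rhs = node
        a, b = eval_expr(lhs, env), eval_expr(rhs, env)
        return {"+": a + b, "*": a * b, "<": int(a < b)}[op]
    raise ConstEvalError("not a constant expression: " + kind)

def eval_while(cond, body, env, limit=MAX_ITERATIONS):
    """Run `body`, a list of (name, expr) assignments, while `cond` is non-zero."""
    for _ in range(limit):
        if not eval_expr(cond, env):
            return env
        for name, expr in body:
            env[name] = eval_expr(expr, env)
    raise ConstEvalError("loop exceeded iteration bound")
```

The bound turns a non-terminating const loop into a compile error instead of a hung compiler.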

Remaining: None — Phase 8 complete.

Const generics (Array<T, N>) — implemented via Step 4 (10 tests, both compilers)
- G1.1-G1.3: Generic struct infrastructure
- G2: Generic struct semantics
- G3: Parameterized type propagation (G3.0-G3.5 complete)
- Step 4: array_new<T, N>(), array_get, array_set, array_len with const N

Phase 9: Inline Assembly Improvements (COMPLETE)

Duration: 1-2 hours (was: 2 weeks) Rationale: Current @c() is minimal. No output operands, no clobber lists, no register constraints.

Complexity profile: Extending existing syntax, not inventing new infrastructure. LLVM inline asm format is well-documented. Main work is parser extensions and IR emission. @naked is a function attribute — same pattern as @packed.

Add output operand syntax
Add clobber list syntax
Add register constraint syntax
Add @naked function attribute
Extended @asm() syntax with output/input/clobbers sections
14 integration tests (4 @naked, 10 @asm)

Phase 10: Bit Manipulation Intrinsics (COMPLETE)

Duration: 30-45 minutes (was: 1 week) Rationale: These are trivial to add via LLVM intrinsics and enable efficient crypto, compression, and parsing.

Complexity profile: Easiest remaining phase. 6 intrinsics, each maps directly to an LLVM intrinsic (llvm.ctpop, llvm.ctlz, llvm.cttz, llvm.bswap, llvm.fshl, llvm.fshr). Identical template to volatile/fence — register builtin, add to intrinsic list, emit IR call.

Add popcount(x) - count set bits
Add clz(x) - count leading zeros
Add ctz(x) - count trailing zeros
Add bswap(x) - byte swap (endianness)
Add rotl(x, n), rotr(x, n) - bit rotation
16 integration tests
Both bootstrap and self-hosted compilers updated
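
Reference semantics for the six intrinsics can be written in Python for cross-checking test expectations. The sketch assumes 64-bit operands treated as unsigned bit patterns, and assumes clz and ctz return 64 for a zero input; the roadmap does not state the zero-input behaviour, so treat that as an assumption.

```python
MASK64 = (1 << 64) - 1

def popcount(x):
    return bin(x & MASK64).count("1")

def clz(x):
    return 64 - (x & MASK64).bit_length()

def ctz(x):
    x &= MASK64
    # x & -x isolates the lowest set bit.
    return 64 if x == 0 else (x & -x).bit_length() - 1

def bswap(x):
    return int.from_bytes((x & MASK64).to_bytes(8, "little"), "big")

def rotl(x, n):
    x &= MASK64
    n %= 64
    return ((x << n) | (x >> (64 - n))) & MASK64

def rotr(x, n):
    return rotl(x, (64 - n % 64) % 64)
```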

Phase 11: Dogfooding Phase 2 (COMPLETE)

Duration: ~8 hours total (was: 2-3 weeks) Rationale: After all features are added, rewrite the self-hosted compiler to use them. This validates everything works together.

Phase 11.0: Cross-Module Type Resolution (COMPLETE):

TODO Sweep (Batches 1-5, COMPLETE):

Panic stderr in self-hosted codegen (write to fd 2 + abort)
Map comprehension MIR lowering (NODE_MAP_COMP = 47, full loop codegen)
Newtype type safety (distinct type IDs per newtype, both compilers)
TOML parser fixes (do…end crash, infinite loop, module-qualified enums)
Block-scoped defer (Go-style: fires at block exit, not function return, both compilers)

Phase C: Concurrency Sprint (COMPLETE)

Duration: ~2.5 hours total Rationale: Full concurrency feature parity between self-hosted and C bootstrap compilers.

Pass 1 (select statement):

Parser stores send arm payload in extras slot (parser.qz)
Full select MIR lowering — polling loop with try blocks, body blocks, variable binding (mir.qz, ~100 lines)
try_recv inline IR in self-hosted codegen — lock, check count, load from buffer[head], unlock (codegen.qz)
try_send inline IR in self-hosted codegen — lock, check capacity, store at buffer[tail], unlock (codegen.qz)

Pass 2 (blocking ops + Option fix):

6 blocking channel ops ported to self-hosted codegen: channel_new, send, recv, channel_close, channel_len, channel_closed — all emit inline IR, dead @qz_channel_* declarations removed
Parameterized type system: PTYPE_BASE=100000, interned ptypes with tc_make_ptype, tc_base_kind normalization at UFCS dispatch sites (both compilers)
Generic Task<T> type: spawn returns Task<T>, await unwraps to T (TYPE_TASK=47, ptype-based)
try_recv returns Option pointer [tag:i64, value:i64] — tag=0 is Some, tag=1 is None
Three new intrinsics: option_is_some(ptr), option_get(ptr), option_free(ptr) — registered in typecheck, MIR, and codegen of both compilers
Select recv arm branches on option_is_some with free blocks for cleanup (both compilers)
Zero-value bug fixed: sending 0 through select channels now works correctly
Fixpoint maintained: gen2==gen3 byte-identical (310,608 lines)
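
The reason a tagged Option fixes the zero-value bug can be shown with a Python sketch. A deque stands in for the channel buffer and locking is omitted; the tag values follow the list above, while the function bodies are hypothetical stand-ins for the intrinsics of the same name.

```python
from collections import deque

TAG_SOME, TAG_NONE = 0, 1    # tag values as listed above

def try_recv(queue):
    """Return a (tag, value) pair instead of using 0 as an 'empty' sentinel."""
    if queue:
        return (TAG_SOME, queue.popleft())
    return (TAG_NONE, 0)

def option_is_some(opt):
    return opt[0] == TAG_SOME

def option_get(opt):
    return opt[1]
```

Because emptiness is carried by the tag, a received 0 is distinguishable from "nothing available".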

Phase S: Sized Storage & SIMD Vectorization (COMPLETE)

Priority: After Phase 11 | Depends on: Phase 5.F (F32), Phase 1 (fixed-width integers) Rationale: Quartz stores everything as i64 — every Int, every F32, every Bool takes 8 bytes. This wastes memory, kills cache locality, and makes SIMD impossible. Sized storage is the foundation that unlocks memory efficiency, packed arrays, auto-vectorization, and eventually explicit SIMD intrinsics.

Phase S.1: Sized Storage Model (COMPLETE — Steps 5 + Narrow Locals) Narrow type codegen at function boundaries AND local variables. Values stored at natural width.

TYPE_I8..TYPE_U32 in both compilers
MirFunc param_type_names + return_type_name populated from AST
Narrow param/return types in LLVM define line (never main)
sext/zext narrow params to i64 at function entry
trunc i64 to narrow type before ret
trunc args before call, sext/zext return after call
Narrow local variables: MirFunc tracks local_narrow_names/types, codegen emits alloca at natural width, trunc on store, sext/zext on load
26 integration tests passing (16 boundary + 10 narrow locals)
Update struct field storage for non-repr(C) structs — typed GEP, sizeof-via-GEP allocation

Phase S.2: Packed Arrays (COMPLETE — Step 6) F32 vectors with contiguous 4-byte storage.

f32_vec_new(), f32_vec_push(), f32_vec_get(), f32_vec_set(), f32_vec_len()
3-field header [cap:i64, len:i64, data_ptr:i64] with 4-byte stride data
F64↔float conversion at boundaries (fptrunc/fpext + bitcast)
Dynamic growth (realloc with doubled capacity)
Both compilers, integration tests passing
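
The storage scheme above (capacity, length, and a data buffer with 4-byte stride that doubles on growth) can be modelled in Python. The class name F32Vec and the initial capacity of 4 are hypothetical; the sketch only illustrates the narrowing to 32 bits on store and widening on load.

```python
import struct

class F32Vec:
    def __init__(self):
        self.cap = 4                          # hypothetical initial capacity
        self.len = 0
        self.data = bytearray(self.cap * 4)   # 4 bytes per element

    def push(self, x):
        if self.len == self.cap:              # grow by doubling, like realloc
            self.data.extend(bytes(self.cap * 4))
            self.cap *= 2
        struct.pack_into("<f", self.data, self.len * 4, x)   # narrow to 32 bits
        self.len += 1

    def get(self, i):
        if not 0 <= i < self.len:
            raise IndexError(i)
        return struct.unpack_from("<f", self.data, i * 4)[0]  # widen back
```

Values that are not exactly representable in 32 bits, such as 0.1, come back slightly changed, which is the expected cost of halving the storage.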

Phase S.3: LLVM Auto-Vectorization Metadata (COMPLETE) Loop metadata emission enables LLVM’s auto-vectorizer at -O2. Narrow locals (S.1) provide the sized storage LLVM needs.

Loop back-edge detection in TERM_JUMP (for_cond/while_cond labels)
!llvm.loop metadata with llvm.loop.vectorize.enable hints on back-edges
Unique metadata IDs per loop (20000 + i*2 numbering scheme)
cg_emit_loop_metadata() / mir_codegen_emit_loop_metadata() in both compilers
4 integration tests (metadata emission, multiple loops, correctness, narrow+metadata)
Add -O0/-O1/-O2/-O3 CLI flag to compilation pipeline (default -O2)
Benchmark auto-vectorized vs scalar loops — future

Phase S.4: Explicit SIMD F32x4 Intrinsics (COMPLETE — Step 8) 9 intrinsics emitting native <4 x float> LLVM vector operations.

Phase S.5: F64x2 SIMD (COMPLETE) 8 intrinsics emitting native <2 x double> LLVM vector operations.

simd_f64x2_splat, _add, _sub, _mul, _div, _extract, _sum, _from_values
TYPE_SIMD_F64X2 = 45 in both compilers
8 integration tests, IR verified <2 x double>

Phase S.6: I32x4 SIMD (COMPLETE) 12 intrinsics emitting native <4 x i32> LLVM vector operations.

simd_i32x4_splat, _add, _sub, _mul, _and, _or, _xor, _extract, _sum, _from_values, _shl, _shr
TYPE_SIMD_I32X4 = 46 in both compilers
10 integration tests, IR verified <4 x i32>

Phase S.7: SIMD FMA (COMPLETE) 3 intrinsics: simd_f32x4_fma, simd_f64x2_fma, simd_i32x4_fma — fused multiply-add (a*b+c).

Float types use @llvm.fma.v4f32 / @llvm.fma.v2f64; I32x4 uses mul + add
8 LLVM intrinsic declarations (fma, minnum, maxnum, fabs for v4f32 and v2f64)
Both compilers, 3 integration tests

Phase S.8: SIMD Min/Max/Abs (COMPLETE) 9 intrinsics: min/max/abs for each of F32x4, F64x2, I32x4.

Float min/max: @llvm.minnum/@llvm.maxnum; Float abs: @llvm.fabs
I32x4 min/max: icmp slt/sgt + select; I32x4 abs: sub zeroinitializer + select
Both compilers, 9 integration tests
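
The integer lowering (compare plus select for min/max, subtract from zero plus select for abs) can be modelled lane by lane in Python. Lists of four ints stand in for <4 x i32> vectors, and the function names are hypothetical.

```python
def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def i32x4_min(a, b):
    # icmp slt + select, one lane at a time
    return [x if x < y else y for x, y in zip(a, b)]

def i32x4_abs(a):
    # negated = 0 - a; select the negated lane where the input is negative
    negated = [wrap_i32(0 - x) for x in a]
    return [n if x < 0 else x for x, n in zip(a, negated)]
```

In this model the most negative lane value maps to itself under abs, because 0 - (-2^31) wraps back to -2^31 in 32 bits.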

Phase S.9: SIMD Shuffle (COMPLETE) 3 intrinsics: simd_f32x4_shuffle, simd_f64x2_shuffle, simd_i32x4_shuffle.

Lane permutation via extractelement/insertelement sequences (runtime masks)
Both compilers, 3 integration tests

Phase S Future Extensions:

F32x8/AVX SIMD (requires @cfg(target_feature: "avx2")); since completed, see Tier 3 item 17
simd_convert (F32x4↔I32x4 conversion)
simd_gather / simd_scatter (indexed memory access)

Phase B: Benchmark Optimization Sprint (COMPLETE)

Duration: ~3 hours Rationale: Benchmark suite revealed Quartz was 1.3x–172x slower than C across 6 benchmarks. Root cause: broken compilation pipeline (LLVM optimizations never ran on Quartz IR), O(n²) string concatenation, missing optimization hints, and bounds-check overhead in tight loops.

Phase B.1: Pipeline Fix — bin/bench was running llc -filetype=obj then clang -O2 on the object file. -O2 on an object file does nothing. Fix: clang -O2 -x ir tells clang the input is LLVM IR and runs the full optimization pipeline.

Phase B.2: StringBuilder Benchmark. Rewrote the string_concat benchmark from O(n²) repeated str_concat to sb_new/sb_append/sb_to_string, where each append is amortized O(1) and the whole build is O(n).

Phase B.3: LLVM Optimization Hints — Three changes to improve generated IR quality:

target datalayout + target triple (platform-specific, selected via #ifdef/@cfg)
nounwind on all function definitions (user-defined + runtime helpers)
noalias on malloc declaration (enables alias analysis)

Phase B.4: Unchecked Vector Access — New intrinsics vec_get_unchecked/vec_set_unchecked for performance-critical code. Straight-line pointer math without bounds-check branches, OOB blocks, or phi nodes.

Both compilers: typecheck, mir, codegen
4 integration tests (vec_unchecked_spec.rb)
Sieve and matrix benchmarks rewritten to use unchecked access in inner loops

Phase B.5: Polly Loop Optimization — Added -mllvm -polly to bench pipeline with fallback to plain -O2. Polly performs cache-locality optimization, loop tiling, and vectorization on polyhedral loop nests.

Results: 2,813 tests at time of completion. Fixpoint maintained.

Phase O: MIR Optimization (COMPLETE)

Duration: ~4 hours Rationale: Close the performance gap vs C. Single-pass per-block MIR optimizer applied between MIR lowering and LLVM codegen.

Optimizations:

Constant folding — evaluate const op const at compile time (all binary ops including comparisons, bitwise)
Algebraic identity — x+0, x*1, x|0, x<<0, etc. → x
Annihilator — x*0, x&0 → 0; x-x, x^x → 0
CSE (hashcons) — deduplicate pure expressions per block
Copy propagation — forward store→load within a block, clear on side effects
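
Two of the optimizations above can be sketched in Python on a toy instruction form. The encoding (ints for constants, strings for value names) and the function names simplify and cse_block are hypothetical, Python ints do not wrap at 64 bits, and only pure binary ops are modelled, so this illustrates the rules rather than the contents of mir_opt.c.

```python
import operator

FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul,
        "&": operator.and_, "|": operator.or_, "^": operator.xor}

def simplify(op, lhs, rhs):
    """Return an int, a value name, or the unchanged (op, lhs, rhs) tuple."""
    if isinstance(lhs, int) and isinstance(rhs, int) and op in FOLD:
        return FOLD[op](lhs, rhs)                        # constant folding
    if rhs == 0 and op in ("+", "-", "|", "^", "<<"):
        return lhs                                       # identity: x op 0 -> x
    if rhs == 1 and op == "*":
        return lhs                                       # identity: x * 1 -> x
    if rhs == 0 and op in ("*", "&"):
        return 0                                         # annihilator
    if lhs == rhs and op in ("-", "^"):
        return 0                                         # x - x, x ^ x -> 0
    return (op, lhs, rhs)

def cse_block(instrs):
    """instrs: list of (dest, op, lhs, rhs). Returns a dest -> earlier dest map."""
    seen, forward = {}, {}
    for dest, op, lhs, rhs in instrs:
        key = (op, forward.get(lhs, lhs), forward.get(rhs, rhs))
        if key in seen:
            forward[dest] = seen[key]    # reuse the earlier result
        else:
            seen[key] = dest
    return forward
```

Resolving operands through the forward map before hashing lets a second duplicate be found even when it was written in terms of the first duplicate's result.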

Architecture:

3-pass design: (1) per-block analysis + in-place rewrites, (2) resolve all operands through final forward map, (3) NOP forwarded instructions
C bootstrap: mir_opt.c (~435 lines), mir_opt.h, --no-opt flag
Self-hosted: mir_opt.qz (~590 lines), imported in quartz.qz

Bugs fixed during implementation:

MIR_INDEX_STORE: must resolve operand1 (array), operand2 (index), AND args[0] (value)
MIR_EXTENDED_ASM / MIR_INLINE_C: added to side-effecting list that clears copy propagation
While-cond dominance: short-circuit and/or in while conditions creates sub-blocks; must use mir_ctx_get_block() after lowering condition (not hardcoded cond_block)

Results: 2,901 tests (29 new), 0 failures. Fixpoint: gen3==gen4 byte-identical (369,896 lines). llc validates both gen3 and gen4.

Quality Plan: Compiler Issues & Stdlib TODOs (COMPLETE)

Duration: ~4 hours Rationale: Fix all known compiler bugs and stdlib TODOs before continuing with S.1/S.3 sized storage work. Building on a shaky foundation compounds technical debt.

Phase 1 (Issue #3 — Array Literal Codegen): Fixed TYPE_VOID → TYPE_INT in self-hosted mir_emit_store_offset.

Phase 2 (New Intrinsics): Added str_to_f64 and f64_to_str to both compilers. str_to_f64 calls @strtod; f64_to_str calls @sprintf with %g format.

Phase 3 (JSON Float Parsing): Number parser now detects ./e/E and routes to str_to_f64 → JsonValue::Float. Fixed both monolithic std/json.qz and modular std/json/parser.qz.

Phase 4 (errno Access): Replaced hardcoded return 0 in get_errno() with 4-byte little-endian read via __error() + ptr_read_byte.
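
The byte-wise read can be illustrated in Python. The function name read_i32_le and the read_byte callback are hypothetical; read_byte stands in for ptr_read_byte, and the sketch assumes errno is a signed 32-bit value stored little-endian.

```python
def read_i32_le(read_byte, addr):
    """Assemble a signed 32-bit value from four single-byte reads."""
    value = 0
    for i in range(4):
        value |= (read_byte(addr + i) & 0xFF) << (8 * i)   # byte i is bits 8i..8i+7
    return value - (1 << 32) if value & 0x80000000 else value
```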

Phase 5-7 (Inference Issues #4, #5, #6): Added error_count to InferStorage, checked infer_unify return values at all call sites. Changed MIR field access fallback from silent to warning via eputs().

Phase 8 (Stdlib TODOs):

8a: json_stringify_pretty — full implementation with indent tracking and helper functions
8b: TcpListener type — COMPLETE (ListenResult enum, tcp_listener(), tcp_accept(TcpListener), tcp_close_listener())
8c: Buffer-to-string — verified working, documented pattern in tcp.qz comment

Results: 2,536 tests (23 new), 0 failures, fixpoint gen3==gen4 at 324,388 lines, all 7 compiler issues FIXED.

Hash-Based String Match Dispatch (COMPLETE)

Duration: ~2 hours Rationale: String match arms desugar to linear str_eq chains (O(n) comparisons). Hash-based dispatch is O(1).

Strategy: Adaptive — compiler picks strategy based on arm count:

1-4 arms: Linear str_eq chain (branch prediction wins)
5+ arms: Compile-time FNV-1a hash table with runtime hash+compare
FNV-1a hash function in both compilers
Compile-time hash table generation for 5+ string arms
Runtime hash+compare dispatch with collision fallback
Both bootstrap and self-hosted compilers updated
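
The adaptive strategy can be sketched in Python. The constants are the standard 64-bit FNV-1a parameters; whether Quartz uses the 64-bit variant is an assumption, the function names are hypothetical, and a dict built up front stands in for the compile-time hash table.

```python
FNV_OFFSET = 0xCBF29CE484222325   # standard 64-bit FNV-1a offset basis
FNV_PRIME = 0x100000001B3         # standard 64-bit FNV prime

def fnv1a(s):
    h = FNV_OFFSET
    for byte in s.encode("utf-8"):
        h = ((h ^ byte) * FNV_PRIME) & 0xFFFFFFFFFFFFFFFF   # xor, then multiply
    return h

def build_dispatch(arms):
    """arms: list of string patterns. Returns a matcher: str -> arm index or -1."""
    if len(arms) < 5:              # 1-4 arms: linear comparison chain
        return lambda s: next((i for i, a in enumerate(arms) if a == s), -1)
    table = {}
    for i, arm in enumerate(arms):
        table.setdefault(fnv1a(arm), []).append((arm, i))   # bucket per hash
    def match(s):
        for arm, i in table.get(fnv1a(s), []):
            if arm == s:           # confirm with a full compare after the hash hit
                return i
        return -1
    return match
```

The final string compare is what makes the scheme safe: a hash hit alone does not prove the subject equals the arm, so colliding or unknown strings still fall through to -1.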

Stdlib API Unification Phase 1 (COMPLETE)

Duration: ~20 minutes Rationale: ~296 flat-namespace builtins had duplicate aliases, inconsistent UFCS coverage, and naming divergence between compilers. Clean API before Phase T (C backend) adds more intrinsics.

Design decisions:

to_* = type conversion ONLY (changes the type): .to_s(), .to_i(), .to_f64()
Same-type transforms use descriptive names: .downcase(), .upcase(), .trim(), .reverse()
.size property everywhere (no .len(), .count(), .length)
Unsafe ops always in prelude: as_int, as_type, etc.

Killed:

int_to_str (use str_from_int or n.to_s())
String$len / Vec$len (use .size property)
.length property (use .size)
Set$contains / Set$remove (use .has() / .delete())
StringBuilder$push (use .append())
str_to_lower / str_to_upper (renamed to str_downcase / str_upcase)

Added 18 UFCS methods (both compilers):

String: .to_f(), .hash()
Vec: .free(), .get_unchecked(), .set_unchecked()
HashMap: .get(), .set(), .free()
Set: .has(), .delete()
StringBuilder: .append(), .append_int(), .append_char(), .free()
Int: .to_f64(), .to_f32()
F64: .to_i(), .to_s()

Results: 2,760 tests, 0 failures. Fixpoint: gen3==gen4 (353,132 lines).

Success Metrics

Minimal Systems Programming Readiness (ACHIEVED)

Full Systems Programming Readiness

Remaining work stack-ranked by impact. Each item unlocks downstream capabilities.

Tier 1: Polish & Production Quality (COMPLETE)

Rank	Item	Status	Rationale
1	Apply `newtype` to TcpListener/TcpStream	DONE	`ListenResult` enum, type-safe accept/close
2	`-O0`/`-O1`/`-O2`/`-O3` CLI flag	DONE	Optimization level control (default -O2)
3	Generic `Task<T>` for spawn/await	DONE	TYPE_TASK=47, ptype-based spawn/await
4	Conditional struct fields (`@cfg` on fields)	DONE	Parser skips non-matching fields
5	Networking test IR caching	DONE	MD5-keyed IR cache in spec_helper.rb

Tier 2: SIMD Extensions (COMPLETE)

Rank	Item	Status	Rationale
6	`simd_fma` (fused multiply-add)	DONE	F32x4, F64x2, I32x4 variants
7	`simd_min`/`simd_max`/`simd_abs`	DONE	9 intrinsics across all types
8	`simd_shuffle` / lane permutation	DONE	3 intrinsics via extract/insert
9	`@cfg(target_feature)`	DONE	neon, SSE/SSE2/AVX/FMA detection

Tier 2.5: Benchmark Sprint (COMPLETE)

Rank	Item	Status	Rationale
10	Bench pipeline: `clang -O2 -x ir`	DONE	Previous pipeline never optimized Quartz IR
11	StringBuilder benchmark rewrite	DONE	O(n²) → O(n) string concatenation
12	LLVM hints: datalayout, nounwind, noalias	DONE	Enables platform-specific + alias optimizations
13	`vec_get/set_unchecked` intrinsics	DONE	Eliminates bounds checks in hot loops
14	Polly loop optimizer in bench pipeline	DONE	Polyhedral cache/tiling optimization

Tier 3: Deeper Systems Capabilities

Rank	Item	Status	Rationale
15	Narrow struct fields (non-repr(C))	DONE	Typed GEP, sizeof-via-GEP allocation
16	Cross-compilation `--target` flag	DONE	5 targets: aarch64/x86_64 macOS+Linux, wasm32-wasi
17	F32x8/AVX SIMD	DONE	14 intrinsics, 256-bit 8-lane float vectors
18	Linear types (`linear struct`)	DONE	Move semantics, borrows, Drop trait, 11 tests
19	Structured concurrency (`task_group`)	DONE	Scope-bound tasks, barrier, cancellation, 9 tests
20	Custom iterators (`for x in struct`)	DONE	`$next` method returning Option, 7 tests
21	Short-circuit `and`/`or`	DONE	MIR-level branching, RHS skipped when determined
22	Module system fix (imported siblings)	DONE	impl/extend cross-references across modules
23	C bootstrap vec_push type checking	DONE	Proper generics: eliminated Vec abuse, enabled type checking, 8 tests

Tier 4: Validation Projects

Rank	Item	Status	Rationale
24	Slab allocator in Quartz	DONE	10 tests: alloc, free, LIFO reuse, growth, linked list
25	Lock-free ring buffer	DONE	12 tests: SPSC, MPMC, CAS contention, stress 2x500
26	Bare metal QEMU target	DONE	Freestanding aarch64/x86_64, UART output, QEMU test harness
27	Cross-platform stdlib	DONE	`@cfg`-gated socket constants + struct fields, 4 tests

Tier 5: Networking & Concurrency Hardening (Phase N)

Rank	Item	Status	Rationale
28	Networking stdlib (`std/net`)	DONE	Hardened tcp_read_all/tcp_write_all, cross-platform socket constants
29	recv with timeout	DONE	`recv_timeout(ch, timeout_ms)` via pthread_cond_timedwait
30	Compound field assignment (`self.x += 1`)	DONE	Field + index LHS in compound assignment, both compilers
31	String formatting (`format("{}!", x)`)	DONE	`format()` intrinsic with `{}` placeholders
32	Non-blocking I/O (epoll/kqueue)	DONE	EventLoop stdlib, kqueue (macOS) + epoll (Linux)
33	Thread pool runtime	DONE	Verified current model sufficient; pool deferred to Phase V
34	Supervision / error recovery	REMOVED	Not a core language feature; see P2P gossip chat docs for application-level approach

Tier 6: Launch Readiness (Phase W)

Rank	Item	Status	Rationale
35	API unification sprint	DONE	One canonical name per operation; ~37 builtins synced from C bootstrap to self-hosted (see W.1)
36	Examples gallery (12-15 programs)	TODO	Every downstream artifact (website, blog, playground) needs example code
37	Auto-generated API reference (`rake docs`)	DONE	`tools/doc.qz` extracts `##` comments, `tools/doc_check.rb` enforces coverage
38	Website skeleton (static site)	TODO	Landing page, getting started, reference, API docs, examples
39	Literate source site	TODO	Browsable annotated compiler source with cross-linked definitions
40	VS Code extension (syntax highlighting + format)	TODO	TextMate grammar, format-on-save, basic diagnostics
41	`quartz` CLI unification	MOSTLY DONE	`build`, `run`, `check`, `fmt`, `lint` done; `doc` and `init` remain
42	Launch blog post	TODO	“I Built a Self-Hosting Language in 47 Days”: technical narrative from git log

Tier 7: Language Evolution

Rank	Item	Status	Rationale
43	Refinement types (Z3 integration)	TODO	Eliminate bounds/null/zero errors at compile time
44	Mutable borrows (`&mut T`)	DONE	Exclusive mutable references for linear types, both compilers, 10 tests
45	E-graph MIR optimizer	DONE	Acyclic e-graph with hashcons CSE, replaces mir_opt as default, both compilers
45b	Variadic extern ABI (self-hosted)	DONE	Correct arm64 variadic call syntax, 4 tests
45c	Safe strength reduction	DONE	`x * 2^n → x << n` via new-instruction insertion, 6 tests, both compilers
45d	Dead code elimination	DONE	Iterative unused instruction removal, 5 tests, both optimizers
45e	Tail call optimization	DONE	`tail call` for self-recursive returns, 6 tests, both compilers
45f	Register allocation hints	DONE	`noundef` on all function params, 4 tests, both compilers
45g	Function inlining	DONE	Tiny function inline expansion (<=8 instrs), 6 tests, both optimizers
45h	Cross-module global variables	DONE	`var x = 0` at module level, NODE_GLOBAL_VAR, 5 tests
46	GPU compute (`@gpu` annotation)	TODO	NVPTX/AMDGPU via LLVM backends
47	LLM directives (`@ai` functions)	TODO	Language-integrated AI (needs design session)
48	Union / intersection / record types	ACTIVE	SimpleSub subtype inference (Phases 0-7 DONE, both compilers). Phase 8: record types `{ x: Int }` working end-to-end; row variables (8D) DONE; monomorphized codegen scaffold (8G) DONE. Remaining: intersection simplification (8F).
49	Wildcard import (`import * from module`)	DONE	Parser supports `from mod import *` and `import * from mod`; resolver filters selective imports (`from mod import a, b`); 9 integration tests

Tier 8: Dogfooding Vision (Phase V)

Rank	Item	Status	Rationale
49	P2P gossip chat in Quartz	DONE	425 lines, 6 tests, gossip relay + dedup verified
50	Web server in Quartz	DONE	`std/net/http_server.qz` — 751 lines, kqueue event loop, routing, middleware, UFCS import working
50b	Cross-module import bug fixes	DONE	Extern fn prefixing fix, NODE_INDEX type inference, `mir_extract_generic_element_type`, let-binding struct type fallback, prelude `do..end` refactor
51	Web framework on Quartz server	TODO	Dogfooding stdlib and ergonomics
52	Marketing site in Quartz	TODO	End-to-end language validation
53	Canvas-based WASM frontend	TODO	Browser deployment, radical rendering approach

Next Wave: Language Evolution

These features represent the next major evolution of Quartz. They transform the language from “a systems language that compiles itself” into “a systems language that competes.”

Detailed design notes and research links in funideas.md. Far-future moonshots in moonshots.md.

Phase T: Multi-Target Compilation (COMPLETE)

Duration: ~6 hours
Rationale: Before this phase, Quartz emitted only LLVM IR. Adding C and WASM backends unlocks portability, browser deployment, and the dogfooding vision.

Item	Priority	Status	Notes
C backend (emit C from MIR)	HIGH	DONE	`--backend c` flag, ~165 intrinsics, 57 tests
WASI target (IR gating)	HIGH	DONE	`--target wasm32-wasi`, `_start` entry, pthread/regex gated, 6 tests
Cross-platform stdlib	HIGH	DONE	`@cfg(os:)` socket constants, struct field gating, 4 tests
`@cfg` respects `--target`	HIGH	DONE	Target string threaded to parser, both compilers

Phase N: Networking & Concurrency Hardening (COMPLETE)

Priority: HIGH. Prerequisite for Phase V (Dogfooding Vision) and the P2P gossip chat validation project.
Rationale: Quartz had the concurrency primitives (spawn, channels, select, mutex, atomics, task_group) but lacked the ergonomic networking layer and production-grade concurrency features needed for real networked applications. These gaps were identified during the P2P gossip chat design (see docs/projects/p2p-gossip-chat.md).

Step	Item	Priority	Status	Notes
N.1	Networking stdlib (`std/net` module)	HIGH	DONE	Hardened `std/net/tcp.qz` with `tcp_read_all`, `tcp_write_all`, error codes; `std/ffi/socket.qz` with cross-platform constants. 14 networking tests.
N.2	recv with timeout	HIGH	DONE	`recv_timeout(ch, timeout_ms)` via `pthread_cond_timedwait` + `clock_gettime`. Both compilers, 4 tests, fixpoint verified.
N.3	Non-blocking I/O	MEDIUM	DONE	`std/ffi/event.qz` (kqueue/epoll FFI) + `std/net/event_loop.qz` (cross-platform EventLoop). 8 tests passing. Fixed variadic extern ABI on arm64. Fixed module struct init canonical naming for vec_push type checking.
N.4	Thread pool runtime	MEDIUM	DONE	Verified existing `task_group` + per-task pthreads sufficient for current scale. Thread pool deferred to Phase V if needed.
N.5	Compound field assignment	MEDIUM	DONE	`self.x += 1`, `arr[i] += 1` now parse and lower correctly. Parser desugars to read-modify-write. Both compilers, 14 tests.
N.7	String formatting	LOW	DONE	`format("Hello, {}!", name)` intrinsic with `{}` placeholder substitution. Both compilers, 9 tests.
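
The `{}` placeholder substitution described for N.7 can be sketched as follows. This is an illustrative Python model only: the function name `format_placeholders` is hypothetical, and raising an error on a placeholder/argument count mismatch is an assumption, since the roadmap does not say how the intrinsic handles that case.

```python
def format_placeholders(template: str, *args) -> str:
    """Replace each "{}" in template with the next argument, left to right."""
    parts = template.split("{}")
    # Assumed behavior: a count mismatch is an error (not stated in the roadmap).
    if len(parts) - 1 != len(args):
        raise ValueError("placeholder count does not match argument count")
    out = [parts[0]]
    for arg, tail in zip(args, parts[1:]):
        out.append(str(arg))  # convert the argument to its string form
        out.append(tail)      # then copy the literal text that follows it
    return "".join(out)

GREETING = format_placeholders("Hello, {}!", "world")
```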

Validation project: P2P Gossip Chat (projects/gossip-chat.qz) — 425 lines, exercises spawn, mutex, atomics, raw sockets, gossip relay with message dedup. Three-node message relay verified. 6 integration tests.

Phase OPT: Bug Fixes & MIR Optimization Sprint (COMPLETE)

Duration: ~4 hours
Rationale: Six targeted improvements to MIR optimization and codegen quality, plus bug fixes for custom iterators and @cfg arity mangling.

Item	Status	Notes
Custom iterator break fix	DONE	Null out `__copt` after `option_free` in body to prevent double-free
@cfg arity-mangling validation	DONE (4 tests)	Verified correct behavior; added `cfg_arity_spec.rb`
Dead code elimination (DCE)	DONE (5 tests)	`eg_dce()` in egraph_opt.qz, `dce_function()` in mir_opt.c; iterative unused instruction removal
Tail call optimization (TCO)	DONE (6 tests)	`tail call` emitted when call dest matches TERM_RETURN value
Register allocation hints	DONE (4 tests)	`noundef` attribute on all function parameters in both compilers
Function inlining	DONE (6 tests)	Single-block functions (<=8 instrs, all clonable kinds) inlined at call sites
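
The "iterative unused instruction removal" in the DCE row can be illustrated with a small fixpoint loop. This is a hedged Python sketch, not `eg_dce()` or `dce_function()`: the `Instr` shape and the `SIDE_EFFECTS` set are assumptions chosen to keep the example self-contained.

```python
from typing import NamedTuple, Optional, Tuple

class Instr(NamedTuple):
    dest: Optional[int]      # value id defined by this instruction, or None
    op: str
    args: Tuple[int, ...]    # value ids read by this instruction

SIDE_EFFECTS = {"call", "store", "ret"}  # assumed set of ops that must be kept

def eliminate_dead_code(instrs):
    """Repeatedly drop pure instructions whose result is never read."""
    live = list(instrs)
    while True:
        used = {a for ins in live for a in ins.args}
        kept = [ins for ins in live
                if ins.op in SIDE_EFFECTS or ins.dest in used]
        if len(kept) == len(live):  # nothing removed: fixpoint reached
            return kept
        live = kept

PROG = [
    Instr(1, "const", ()),
    Instr(2, "const", ()),
    Instr(3, "add", (1, 2)),   # only read by instruction 4
    Instr(4, "mul", (3, 3)),   # never read, so dead
    Instr(5, "add", (1, 1)),
    Instr(None, "ret", (5,)),
]
RESULT = eliminate_dead_code(PROG)
```

Iteration matters here: removing instruction 4 makes 3 dead, and removing 3 makes 2 dead, so a single pass would leave garbage behind.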

Phase GV: Cross-Module Global Variables (COMPLETE)

Duration: ~3 hours
Rationale: Module-level var x = 0 creates shared mutable state accessible across modules. Required for stateful libraries and configuration.

NODE_GLOBAL_VAR (kind=49) in parser and AST
Resolver: symbol collection, name prefixing, init rewriting, resolve_imports (7 changes)
ast_program_has_global_var() to prevent duplicate merging
MIR: is_global_var() checks globals[]; is_global flag on STORE_VAR/LOAD_VAR
Optimizer Pass 3 fix: skip side-effecting instructions when NOP’ing forwarded dest_ids
5 tests in global_var_spec.rb (shared state, initial value, multiple vars, reset/read, IR check)

Phase LS: Literal Syntax & Data Types (COMPLETE)

Date: February 15, 2026 | Status: COMPLETE (11 of 11 items DONE)
Rationale: Comprehensive audit of literal syntax across Python, Ruby, Rust, Go, Swift, Kotlin, and Elixir revealed gaps in Quartz’s expressiveness. Every feature below was approved during the audit review.

Rank	ID	Item	Effort	Status
0	LS.0	Set literals `{a, b, c}`	Medium	DONE (16 tests, fixpoint verified)
1	LS.1	Negative indexing `a[-1]`	Small	DONE (6 tests, parser/HIR desugar)
2	LS.2	Char literal `'a'`	Small	DONE (9 tests: char_literal + char_predicates)
3	LS.3	Set comprehension `{x for x in ...}`	Small	DONE (4 tests, parser desugar to `set_new` block)
4	LS.4	Range as first-class object	Medium	DONE (first-class Range struct, `.contains()`, `.each()`, `.to_array()`)
5	LS.5	Sliceable trait `v[start..end]`	Medium	DONE (Vec slicing via trait dispatch)
6	LS.6	Custom Index trait (`obj[key]`)	Medium	DONE (`Type$get`/`Type$set` dispatch, both compilers)
7	LS.7	Binary data type (`Bytes`)	Large	DONE (Bytes type + ByteReader cursor, LS.7a)
7b	LS.7b	Byte literal syntax `b"..."`	Small	DONE (lexer + codegen, raw byte strings)
8	LS.8	Multi-line strings `"""..."""`	Medium	DONE (triple-quote syntax, interpolation support)
9	LS.9	String interpolation `"#{expr}"`	Medium	DONE (lexer+parser desugaring to `str_concat`/`to_str`, type-aware runtime, both compilers)

Design notes:

LS.1: Desugar a[-n] to a[a.size - n] in parser. Trivial.
LS.2: Use single-quote 'a' → resolves to integer char code. Analogous to :sym but for characters.
LS.3: Extension of existing list comp ([expr for ...]) and map comp ({k: v for ...}). Desugar to set_new + set_add loop.
LS.9: Parser detects #{ in string literals and desugars to str_concat(part, to_str(expr)) chain. Self-hosted to_str uses runtime type discrimination (heap pointer heuristic) for String identity vs Int sprintf.
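
The LS.1 rewrite above (a[-n] becomes a[a.size - n]) is small enough to show as a source-to-source transform. The following Python sketch is illustrative only: `desugar_index` is a hypothetical helper operating on strings, and restricting the rewrite to literal negative integers is an assumption.

```python
def desugar_index(base: str, index: str) -> str:
    """Rewrite a literal negative index a[-n] as a[a.size - n]; leave others alone."""
    text = index.strip()
    if text.startswith("-") and text[1:].isdigit():
        return f"{base}[{base}.size - {text[1:]}]"
    return f"{base}[{text}]"

LAST = desugar_index("a", "-1")
```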

Phase W: Launch Readiness

Priority: HIGH. This is the next phase of work.
Rationale: Quartz has world-class internals but no public-facing polish. 743 commits, 278K lines of churn, 2,986 tests, byte-identical fixpoint: none of this matters if nobody can find, try, or understand the language. Phase W transforms Quartz from “impressive project” to “language people adopt.”

W.1: API Unification Sprint (COMPLETE)
Synced ~37 missing builtins from C bootstrap to self-hosted typecheck + MIR + codegen. Both compilers now expose identical API surfaces.

Added 13 concurrency builtins (channel_new, send/recv, mutex, recv_timeout, try_send/try_recv)
Added 10 FFI/pointer builtins (string_to_cstr, cstr_to_string, qz_malloc/qz_free, ptr_read/write_byte)
Added 7 narrow type conversions (to_i8 through to_u64) with codegen
Added 6 misc builtins (print, eprint, unreachable, sb_len, str_join, regex_capture_named)
Added atomic_compare_exchange alongside atomic_cas (returns the old value, not a bool)
Unified close_channel → channel_close (consistent channel_* naming)
Fixed duplicate cstr_to_string registration in C bootstrap typecheck.c
20 integration tests in api_unification_spec.rb
Updated docs/INTRINSICS.md with complete builtin tables

W.2: Examples Gallery (Day 2-3)
Every downstream artifact needs example code. Write 12-15 examples, each <50 lines.

W.3: Auto-Generated API Reference (Day 3-4)
Wire tools/doc.qz into the build pipeline:

tools/doc.qz rewritten — scans 29 stdlib modules, extracts ## doc comments
Generate structured Markdown per module (one page per file) — docs/api/*.md
Include function signatures, @param, @returns, @since tags
Index page with module-level summaries and symbol counts
Generate JSON index for potential static site integration
Consider extending tools/doc.qz to emit HTML directly

Parser gap fixes for 100% coverage: Three stdlib modules failed initially (toml/manifest.qz, ffi/event.qz, net/event_loop.qz). Fixed 5 parser issues across both compilers:

Bootstrap parser.c: match arm body parsing (statement vs expression forms), () unit literal, arm boundary detection
Self-hosted parser.qz: bitwise OR | operator (was using |> token), () unit literal, match arm body via ps_parse_stmt, @cfg EOF handling

W.4: Website Skeleton (Day 4-5)
Static site; it does not need to be fancy. Content is king.

Landing page: tagline, 3 code examples, key stats (743 commits, 47 days, self-hosting)
Getting Started: install LLVM, clone repo, hello world, compile to binary
Language Reference: QUARTZ_REFERENCE.md rendered as HTML
API Reference: auto-generated from doc comments (W.3)
Examples Gallery: the 12-15 examples from W.2 with syntax highlighting
Playground: “Coming soon” placeholder (or WASM if time permits)

W.5: Literate Source Site (Day 5-7)
Browsable annotated compiler source. The compiler explains itself.

Write tools/literate.qz (or Ruby equivalent) that reads .qz source files
Render ##! module headers as page introductions
Render ## function docs as rich prose between syntax-highlighted code blocks
Cross-link identifiers — click a function call, jump to its definition
Generate nav sidebar from module directory structure
Deploy as source.quartz-lang.org or similar

W.6: VS Code Extension (Day 6-7)
Minimum viable editor support:

TextMate grammar (.tmLanguage.json) for Quartz syntax highlighting
Wire quartz --format as format-on-save provider
Wire compiler errors as basic diagnostic output
Publish to VS Code marketplace
README with installation instructions and screenshots

W.7: CLI Unification (MOSTLY COMPLETE)
One binary, one interface. Hide the llc | clang pipeline:

quartz build program.qz — compile to LLVM IR
quartz run program.qz — compile and execute
quartz check program.qz — typecheck only
quartz fmt program.qz — format source
quartz lint program.qz — lint for style issues (bonus)
quartz doc — generate stdlib API documentation (29 modules, 664 symbols)
quartz init my_project — scaffold project with quartz.toml

W.8: Launch Blog Post (Day 8-9)
The technical narrative that gets posted to HN/Reddit/lobste.rs:

“I Built a Self-Hosting Language in 47 Days” — engineering story, not marketing
Structure: Day 1 micro-C bootstrap → Day 6 fixpoint → Day 21 type inference → four dogfooding cycles → 278K churn → 85K alive
Include the bugs: signed FNV-1a, closure capture, field_idx=0
Include the numbers: 743 commits, 2,986 tests, 353K-line fixpoint
Address what AI-assisted programming actually looks like in practice
Link to playground, examples, literate source

Phase V: The Dogfooding Vision

Rationale: Prove the language by building our entire web presence in it.

Step	Description	Status	Depends On
1	Web server written in Quartz	DONE	`std/net/http_server.qz` — 751 lines, kqueue event loop
2	Web framework on Quartz server	TODO	Step 1
3	Marketing site in Quartz	TODO	Step 2, WASM target
4	Canvas-based WASM rendering	TODO	Step 3 — radical approach: render as canvas app, not HTML/CSS

Phase R: Refinement Types

Rationale: Eliminate entire classes of runtime errors at compile time. This will be a thorough, research-driven, first-class implementation.

Step	Description	Status	Notes
R.0	Deep research: Flux, LiquidHaskell, Thrust, Kleppmann	TODO	Study all existing implementations
R.1	Design integration with existential type model	TODO	How refinements compose with i64-everywhere
R.2	SMT solver integration (Z3)	TODO	Acceptable dependency
R.3	Core refinement type checker	TODO	Predicate-decorated types with inference
R.4	Gradual adoption: runtime checks → static proofs	TODO	Refinements start as asserts, compiler elides proven ones
R.5	AI-assisted spec generation (stretch)	TODO	LLMs generating refinement annotations

Phase G: GPU Compute

Rationale: First-class GPU support with ergonomic annotation syntax. Build on existing SIMD work (Phases S.4-S.9).

Step	Description	Status	Notes
G.0	Enhanced SIMD hints (build on S.3-S.9)	TODO	Foundation already laid
G.1	`@gpu` annotation + LLVM NVPTX backend	TODO	Basic GPU kernel emission
G.2	Host-side kernel launch codegen	TODO	Automatic memory transfer
G.3	Multi-vendor (AMD via AMDGPU backend)	TODO	Portability
G.4	Kernel fusion / advanced optimization	TODO	Stretch — may need MLIR

Phase O: Compiler Optimization

Rationale: Close the performance gap vs C. Make Quartz-generated code competitive.

Item	Priority	Status	Notes
Benchmark pipeline fix (`clang -O2 -x ir`)	HIGH	DONE	Phase B.1 — was the #1 perf issue
LLVM optimization hints (datalayout, nounwind, noalias)	HIGH	DONE	Phase B.3 — both compilers
`vec_get/set_unchecked` intrinsics	HIGH	DONE	Phase B.4 — eliminates bounds checks in hot loops
Polyhedral loop optimization (Polly)	MEDIUM	DONE (bench)	Phase B.5 — `-mllvm -polly` in bench pipeline; could be added to compiler’s `-O2` default
MIR optimizer (const fold, CSE, copy prop)	HIGH	DONE	Both compilers; 29 tests; fixpoint verified
E-graph MIR optimizer	HIGH	DONE	Acyclic e-graph with hashcons CSE; replaces mir_opt as default in self-hosted
Strength reduction (`x * 2^n → x << n`)	MEDIUM	DONE	Safe new-instruction insertion in both optimizers; 6 tests
Dead code elimination	HIGH	DONE	Iterative unused instruction removal; 5 tests
Tail call optimization	HIGH	DONE	`tail call` when call dest matches return value; 6 tests
Register allocation hints	MEDIUM	DONE	`noundef` on all function params; 4 tests
Function inlining	HIGH	DONE	Single-block, <=8 instrs, all clonable kinds; 6 tests
LLM-driven optimization (experimental)	LOW	TODO	ML models for inlining/pass ordering decisions
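
The strength reduction row above (`x * 2^n → x << n`) hinges on recognizing power-of-two constants. A minimal Python sketch of that check follows; the tuple-based instruction encoding and the name `reduce_mul` are assumptions for illustration, and the function returns a new instruction rather than mutating one, loosely mirroring the "new-instruction insertion" wording.

```python
def reduce_mul(lhs: str, const: int):
    """Return ("shl", lhs, n) when const == 2**n with n >= 1, else ("mul", lhs, const)."""
    # A power of two has exactly one bit set, so const & (const - 1) is zero.
    if const > 1 and const & (const - 1) == 0:
        return ("shl", lhs, const.bit_length() - 1)
    return ("mul", lhs, const)

REDUCED = reduce_mul("x", 8)
```

Multiplication by 1 and by non-powers of two is left untouched, since no single shift is equivalent.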

Phase A: AI Integration

Rationale: Language-integrated AI could be world-changing. Needs serious design work.

Item	Priority	Status	Notes
LLM directive design session	HIGH	TODO	Resolve semantics before implementing
`@ai("prompt")` function annotations	MEDIUM	TODO	Compiler generates API call + type validation
Constrained decoding (LMQL-style)	LOW	TODO	Token masking during generation

Pending Design Discussions

These need dedicated design sessions before committing to implementation:

Topic	Key Questions
Union / intersection / record types	RESOLVED. SimpleSub for Phases 0-7 (union types, trait-bound intersections, COMPLETE). Phase 8 adds row polymorphism + structural subtyping for value-level record types (ACTIVE — parser, type resolution, field access DONE; row variables (8D) DONE; monomorphized codegen scaffold (8G) DONE; intersection simplification (8F) remaining).
Existential type model analysis	Is i64-everywhere a speed ceiling? Where do we pay? Is anyone else doing this?
Quantitative Type Theory	How much of Idris 2’s QTT can we adopt? 0/1/unrestricted usage tracking?
Generational references	Does Vale’s approach fit our arena model? Worth the 10.84% overhead?
Austral linear types	Can we add a 600-LOC linearity checker for resource safety?
Language-integrated queries	Does compile-time query generation fit Quartz’s identity?

Phase U: Union / Intersection / Record Types (ACTIVE)

Duration: ~8 hours so far (started 2026-02-16) | Status: Phase 8 in progress
Rationale: Quartz needs first-class union types, intersection types, and anonymous structural record types to compete with TypeScript, OCaml, and PureScript. This phase brings SimpleSub subtype inference and structural typing.

Phases 0-7: SimpleSub Foundation (COMPLETE)
Both bootstrap and self-hosted compilers updated with union/intersection type infrastructure:

Phase 0: TY_UNION/TY_INTERSECTION type constants (bootstrap TY_UNION=49, self-hosted TYPE_UNION=50)
Phase 1: Parser extensions — | and & type operators in type annotations
Phase 2: Subtype relations and type join for union types
Phase 3: MIR_UNION_WRAP/MIR_UNION_TAG/MIR_UNION_UNWRAP operations
Phase 4: LLVM IR emission for union wrap/tag/unwrap
Phase 5: Union exhaustiveness checking in match expressions
Phase 6: Union linearity propagation (move semantics)
Phase 7: JsonPrimitive union type alias in stdlib

Phase 8: Row Polymorphism & Structural Subtyping (IN PROGRESS)
Record types { x: Int, y: String }: anonymous structural types.

Sub-phase	Description	Status	Commits
8A-foundation	`TY_RECORD=51` (bootstrap), `TYPE_RECORD=52` (self-hosted)	DONE	`e48a334`, `22e5c2e`
8A	Parser record type syntax in both compilers	DONE	`ad3ce05`, `926d81d`
8B	`type_record()` constructor + `tc_parse_type_annotation` resolution	DONE	`e638772`
8C/8E	Type checker + MIR field access for `TY_RECORD`	DONE	`27b886c`
8H	Integration tests (3/3 pass: syntax, field access, width subtyping)	DONE	`9bd01c1`
8D	Row variables — `InferStorage` + record-aware unification	DONE	`78b1970`, `184bebd`, `24b318c`
8F	Intersection simplification — record `&` record → merged record	TODO	—
8G	Monomorphized codegen — GEP offset specialization scaffold	DONE	`4475125`

Key design decisions:

Record type syntax: { field: Type } in type positions (no conflict with block syntax)
Row variables are implicit only — programmers never write them; inferred from field access on untyped params
Monomorphization: record field access compiled to concrete GEP offsets per call site (Zig/Rust style, zero-cost, no vtables)
Width subtyping: struct with extra fields satisfies narrower record type (e.g., RGB { r, g, b } matches { r: Int })
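
The width subtyping rule in the last bullet can be stated as a simple field-containment check. This Python sketch is a model of the rule, not the type checker: types are plain strings, `satisfies` is a hypothetical name, and the field types given to `RGB` are assumed to be Int.

```python
def satisfies(struct_fields: dict, record_type: dict) -> bool:
    """True when every field required by record_type exists with the same type."""
    return all(struct_fields.get(name) == ty for name, ty in record_type.items())

# Assumed field types for the RGB example in the text.
RGB = {"r": "Int", "g": "Int", "b": "Int"}
MATCHES = satisfies(RGB, {"r": "Int"})
```

Extra fields on the struct side are ignored; a missing field or a mismatched field type fails the check.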

Known Issues

Issues discovered during development that need resolution. These are bugs or shortcomings that affect correctness, not feature requests.

BUG (FIXED): Self-hosted fixpoint verification hits trace trap

Severity: Medium (fixpoint could not be verified via rake quartz:validate)
Discovered: Pre-existing (observed 2026-02-16 during union type work, but present before)
Fixed: 2026-02-16
Location: C bootstrap MIR optimizer (mir_opt.c) + self-hosted type checker (typecheck.qz)

Root cause: Two issues.

Five type errors in typecheck.qz (wrong arity on tc_type_name calls, unnecessary as_string() on Vec<String> elements) prevented clean compilation.
Buffer overflow in optimize_function (mir_opt.c): int child_count[DT_MAX_BLOCKS] is 1024 entries, but memset(child_count, 0, sizeof(int) * n) used n = dt.block_count without bounds check. Large functions from typecheck.qz generate >1024 basic blocks, triggering __chk_fail_overflow.

Fix: Fixed type errors in typecheck.qz. Added early guard in optimize_function to skip domtree-based optimization for functions exceeding DT_MAX_BLOCKS. Fixpoint restored: 1,140 functions, full test suite passes.

BUG (FIXED): Gen2 fixpoint broken — keywords tokenized as identifiers

Severity: Medium (gen2 binary was unusable)
Discovered: 2026-02-10 (pre-existing, not caused by defer work)
Fixed: 2026-02-10 (commit 8f159bb)
Location: Gen2 binary (self-hosted compiled by gen1)

Root cause: FNV-1a offset basis literal (14695981039346656037) overflowed INT64_MAX in Quartz, producing a wrong hash constant. The lexer’s keyword recognition uses hash-based string dispatch (46 arms) — incorrect hash table constants caused all keywords to fall through to identifier matching.

Fix: Used signed FNV-1a offset basis that fits in i64. Gen2 fixpoint restored: gen2==gen3 byte-identical.
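
The overflow and its fix can be checked numerically. The sketch below emulates signed 64-bit arithmetic in Python (whose integers are unbounded) to show that the two's-complement reinterpretation of the offset basis yields the same hash bits as the unsigned algorithm. Function names are hypothetical; the constants are the standard 64-bit FNV-1a offset basis and prime.

```python
MASK64 = (1 << 64) - 1
INT64_MAX = (1 << 63) - 1
FNV_OFFSET_UNSIGNED = 14695981039346656037  # exceeds INT64_MAX
FNV_PRIME = 1099511628211

def to_signed64(x: int) -> int:
    """Reinterpret the low 64 bits of x as a two's-complement signed value."""
    x &= MASK64
    return x - (1 << 64) if x > INT64_MAX else x

FNV_OFFSET_SIGNED = to_signed64(FNV_OFFSET_UNSIGNED)  # same bits, negative value

def fnv1a_unsigned(data: bytes) -> int:
    h = FNV_OFFSET_UNSIGNED
    for b in data:
        h = ((h ^ b) * FNV_PRIME) & MASK64
    return h

def fnv1a_signed(data: bytes) -> int:
    """FNV-1a using only values representable in a signed 64-bit integer."""
    h = FNV_OFFSET_SIGNED
    for b in data:
        h = to_signed64((h ^ b) * FNV_PRIME)  # wrap like i64 multiplication
    return h
```

Because wrapping multiplication and XOR act on bits regardless of signedness, starting from the signed constant produces hashes that are bit-identical to the unsigned version.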

LIMITATION (FIXED): try_recv cannot distinguish empty channel from value 0

Severity: Low (only affected select arms where 0 is a valid channel value)
Discovered: Concurrency sprint planning (2026-02-10)
Fixed: 2026-02-10

Root cause: try_recv had a single i64 return value used for both the received value and the “nothing available” sentinel.

Fix: try_recv now returns a heap-allocated Option struct [tag:i64, value:i64] — tag=0 is Some (value present), tag=1 is None (channel empty). Three new intrinsics (option_is_some, option_get, option_free) extract/cleanup the result. Select recv arms branch on option_is_some instead of recv_val != 0, with free blocks on both success and failure paths. Both compilers updated.
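
The tagged result described above can be modeled as a two-slot pair. This Python sketch is illustrative: it uses a `deque` as a stand-in for a channel and omits `option_free`, since Python manages memory itself. Only the tag convention (0 = Some, 1 = None) and the intrinsic names come from the text.

```python
from collections import deque

TAG_SOME, TAG_NONE = 0, 1  # tag values as described above

def try_recv(queue: deque):
    """Non-blocking receive returning a [tag, value] pair instead of a bare value."""
    if queue:
        return [TAG_SOME, queue.popleft()]
    return [TAG_NONE, 0]

def option_is_some(opt) -> bool:
    return opt[0] == TAG_SOME

def option_get(opt) -> int:
    return opt[1]

# A channel holding the value 0, which the old sentinel scheme could not express.
CHANNEL = deque([0])
FIRST = try_recv(CHANNEL)   # Some(0)
SECOND = try_recv(CHANNEL)  # None: the channel is now empty
```

Both results carry the value slot 0, and only the tag tells them apart, which is exactly the distinction the single-i64 return could not make.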

BUG (FIXED): Self-hosted MIR identifier resolution in single-file programs

Severity: High (blocked self-hosted match codegen tests)
Discovered: Phase 4 (2026-02-07)
Fixed: 2026-02-09
Location: self-hosted/backend/mir.qz (mir_lower_expr, NODE_IDENT handler ~line 2925)

Symptoms:

Match subject variables emitted as global function references (ptrtoint i64 (i64)* @c to i64) instead of local variable loads (load i64, i64* %c)
Function parameters treated as function references when names collide
For-in loop variables treated as function references when names collide

Root cause: mir_lookup_function checked all_funcs before local variable bindings. In single-file programs, all_funcs stores bare names (e.g., double), so any identifier matching a function name was emitted as MIR_FUNC_REF instead of MIR_LOAD_VAR. Multi-file compilation was unaffected because module-prefixed names (e.g., mir$func) never collide with locals.

Fix: Added mir_ctx_lookup_var(ctx, name) >= 0 check before mir_lookup_function in the NODE_IDENT handler. Local variables now correctly shadow function names. 5 new tests in self_hosted_ident_spec.rb.
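
The lookup-order fix above amounts to checking local bindings before the function table. A minimal Python model, with a hypothetical `resolve_ident` helper and sets standing in for the compiler's symbol tables:

```python
def resolve_ident(name, local_vars, all_funcs):
    """Classify an identifier, checking local bindings before function names."""
    if name in local_vars:
        return ("MIR_LOAD_VAR", name)   # locals shadow same-named functions
    if name in all_funcs:
        return ("MIR_FUNC_REF", name)
    raise NameError(f"undefined identifier: {name}")

# Hypothetical single-file program: a function `c` and a local variable `c`.
LOCALS = {"c"}
FUNCS = {"c", "double"}
SHADOWED = resolve_ident("c", LOCALS, FUNCS)
```

Swapping the two `if` checks reproduces the original bug: `c` would resolve to a function reference even though a local of that name is in scope.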

BUG (FIXED): C bootstrap parser cannot parse generic struct init

Severity: Medium (6 generic struct tests failing)
Discovered: Pre-existing
Fixed: 2026-02-09
Location: quartz-bootstrap/src/parser.c (typed-call parsing, line ~2404)

Symptoms: All 6 tests in generic_semantics_spec.rb returned exit_code: nil because Pair<Int, String> { ... } failed to parse — the parser only handled IDENT<Types>(...) (function call), not IDENT<Types> { ... } (struct init).

Fix: Added TOK_LBRACE branch after type arg parsing in the C bootstrap parser. Reuses the same field-parsing pattern as non-generic struct init. Also initialized type_args field in ast_struct_init(). No MIR/codegen changes needed — generic structs are structurally identical to non-generic structs at runtime.

BUG (FIXED): Cross-module type resolution in self-hosted compiler

Severity: High (imported modules with struct/enum types failed to typecheck)
Discovered: Phase 11 (2026-02-08)
Fixed: Phase 11.0 (2026-02-08)
Location: self-hosted/resolver.qz (resolve_collect_funcs) + self-hosted/quartz.qz (compile)

Symptoms:

error[TYPE]: Undefined variable: s Did you mean 's'? when a function in an imported module uses a struct parameter
error[TYPE]: Unknown struct: Unknown for struct field access in imported modules

Root cause: resolve_collect_funcs only collected functions, global vars, const decls, static asserts, and impl methods from imported modules. Struct/enum/type-alias/newtype/trait definitions were not collected, so when tc_function typechecked imported function bodies, tc_parse_type("S") returned TYPE_UNKNOWN for any struct defined in the imported module.

Fix: Extended resolve_collect_funcs to collect all definition types (tags 4-9). Added Phase 4.0 registration in compile() that registers imported types before tc_program() in the correct phase order (structs/enums/aliases/newtypes → traits → impl blocks).

DESIGN DIVERGENCE: @cfg same-module arity-mangling

Severity: None (dormant); both compilers are internally consistent
Discovered: Phase N (2026-02)

Description: The C bootstrap always arity-mangles function names (func$N where N is the parameter count). The self-hosted compiler uses bare names from the resolver. This means the two compilers produce different IR for the same source, but each is internally consistent — all call sites match the naming used by definition sites within the same compiler.

Impact: When a module has @cfg-gated duplicate definitions (same name, different bodies selected by target), same-module calls in the C bootstrap emit arity-mangled refs while the self-hosted emits unmangled refs. Both compile correctly because each compiler’s resolver and codegen agree on the naming convention.

Decision: Not a bug. Fixing would require adding arity mangling to the self-hosted compiler (high risk, touches every function) or removing it from the C bootstrap (high risk, could break cross-module name collisions). Neither is justified since both compilers work correctly.
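
The naming difference between the two compilers can be illustrated as follows. This is a hedged Python sketch: the helper names and the `connect` example are hypothetical, and only the `func$N` convention comes from the description above.

```python
def bootstrap_symbol(name: str, params: list) -> str:
    """C-bootstrap-style symbol: always suffixed with the parameter count."""
    return f"{name}${len(params)}"

def self_hosted_symbol(name: str, params: list) -> str:
    """Self-hosted-style symbol: the bare name from the resolver."""
    return name

# Hypothetical function with two parameters.
PARAMS = ["host", "port"]
BOOTSTRAP_NAME = bootstrap_symbol("connect", PARAMS)
SELF_HOSTED_NAME = self_hosted_symbol("connect", PARAMS)
```

Each scheme is applied at both definition and call sites within its own compiler, which is why the differing output names never meet.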

RESOLVED: vec_push/vec_set type checking in C bootstrap

Severity: Low (correctness issue, not a crash)
Discovered: Phase 0 (2026-02), re-investigated Feb 2026
Fixed: Feb 2026 (Proper Generics phase)

Description: The C bootstrap did not propagate the Vec element type for vec_push/vec_set calls. vec_push(my_string_vec, 42) silently compiled when it should have been a type error.

Resolution: Eliminated ~75 improperly-typed Vec<Int> fields and ~400 as_int/as_string casts across 12 self-hosted compiler files. Enabled vec_push/vec_set type checking in both compilers. Heterogeneous containers (struct_registry, droppable_stack) remain as bare Vec which accepts any type. 8 tests in vec_type_check_spec.rb.

Follow-up fix: Module struct init types now use the canonical def->name (e.g., event_loop$ReadyEvent) instead of the AST name (ReadyEvent). This fixed vec_push type mismatches in cross-module code (event_loop_spec.rb: 7 of 8 tests were failing).

Language Feature Gaps (Identified by Quality Plan)

Distinct Type Aliases / Newtype for FFI Wrappers (RESOLVED): Newtype works cross-module. TcpListener uses struct wrapper with ListenResult enum. tcp_listen() returns ListenResult, tcp_accept() takes TcpListener, tcp_close_listener() for cleanup.

Improved Error Diagnostics with Source Locations (RESOLVED): Inference engine now emits diagnostics with file:line:col context. Error codes QZ0200-0299 for inference errors. Both compilers updated.

Commands Reference

# Full test suite
rake test

# Build self-hosted compiler
rake build

# Fixpoint validation (ALWAYS RUN AFTER COMPILER CHANGES)
rake quartz:validate

# Rebuild C bootstrap
make -C ../quartz-bootstrap

# View pending tests
bundle exec rspec --tag pending

# Debug compiler output
./self-hosted/bin/quartz --dump-ast file.qz
./self-hosted/bin/quartz --dump-mir file.qz
./self-hosted/bin/quartz --dump-ir file.qz

Phase TS: Table Stakes Implementation

Date: February 14, 2026 | Status: COMPLETE (all 21 features, 3 sprints)
Methodology: Interactive interview (each feature discussed with trade-offs, comparisons, and design considerations). 21 approved, 2 skipped.
Details: See TABLE_STAKES_AUDIT.md for full interview transcript and design decisions.
Results: 74 new tests across sprints 2-3. Fixpoint verified: gen3==gen4, 1,113 functions, 490,830 lines.

Key Decisions

SKIPPED: the ? operator (to preserve ? in identifiers for Ruby-style predicates) and async/await (the 4 existing paradigms are enough)
Visibility: Default public, priv keyword (Ruby-style)
@x sigil: Shorthand for self.x in extend blocks
Operator overloading: Constrained to fixed operator set via extend blocks
User macros: Design-first (research Rust/Elixir/Zig/Yuescript before implementation)

Stack-Ranked Implementation Order

Rank	ID	Item	Effort	Status
1	TS.1	`loop` keyword	Trivial	DONE
2	TS.2	`usize` type alias	Trivial	DONE
3	TS.3	Raw strings `r"..."`	Small	DONE
4	TS.4	`vec_sort` / `vec_sort_by`	Small	DONE
5	TS.5	Path manipulation	Small	DONE
6	TS.6	Filesystem ops	Small	DONE
7	TS.7	Thread-local storage	Small	DONE
8	TS.8	Multi-line strings `"""..."""`	Medium	DONE
9	TS.9	Buffered I/O	Medium	DONE
10	TS.10	Argparse stdlib	Small	DONE
11	TS.11	UTF-8 awareness	Medium	DONE
12	TS.12	Visibility (`priv`)	Medium	DONE
13	TS.13	Integer overflow detection	Medium	DONE
14	TS.14	`@x` sigil (implicit self)	Small-Medium	DONE
15	TS.15	Stack traces on panic	Medium-Large	DONE
16	TS.16	Re-exports (`pub import`)	Medium	DONE
17	TS.17	Tuples	Medium	DONE
18	TS.18	Operator overloading	Medium	DONE
19	TS.19	Slices `Slice<T>`	Large	DONE
20	TS.20	User macros — design	Design	DONE
21	TS.21	User macros — implementation	Large	DONE

Deferred to Phase W or later

Gap	Notes
LSP server	Phase W
Package manager	Phase W
`inline` hint	Low priority — LLVM + MIR inliners sufficient
Dynamic dispatch	Evaluate after operator overloading
Separate compilation	Phase W
`@derive`	After user-defined macros
