Φ-nodes, SSI, and MLIR Block Arguments

Oct 2, 2025

1. Φ-nodes (SSA basics)

  • SSA (Static Single Assignment): each variable is assigned exactly once.
  • Problem: at control-flow merges, multiple definitions of the same source variable can reach the same point.
  • Solution: a φ-node selects a value based on which predecessor control arrived from.
    Example in LLVM IR (a complete sketch appears at the end of this list):
    merge:
      %x = phi i32 [ 1, %then ], [ 2, %else ]

  • Limitation: φ handles merges only, not branch-specific refinements.
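  • A complete, minimal sketch of the example above (the surrounding control flow is invented for illustration):
    define i32 @select(i1 %cond) {
    entry:
      br i1 %cond, label %then, label %else
    then:
      br label %merge
    else:
      br label %merge
    merge:
      ; %x is 1 if control arrived from %then, 2 if from %else
      %x = phi i32 [ 1, %then ], [ 2, %else ]
      ret i32 %x
    }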

2. MLIR Block Arguments

  • MLIR treats basic blocks like functions:
    • Blocks can declare arguments.
    • Terminators (e.g. cf.br, cf.cond_br) pass values to their successors.
  • Block arguments replace φ-nodes.
    Example (a fuller sketch with the branching terminator appears at the end of this section):
    ^merge(%x: i32):
      %y = arith.addi %x, %c3 : i32

  • Advantages (from MLIR Rationale):
    1. No “all φ at block top” special case.
    2. Function args = block args (unified model).
    3. No “atomic φ-bundle” problems (lost-copy issues).
    4. Scales better for many predecessors.
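  • A fuller sketch, assuming the cf and arith dialects (function and value names invented for illustration):
    func.func @select(%cond: i1) -> i32 {
      %c1 = arith.constant 1 : i32
      %c2 = arith.constant 2 : i32
      %c3 = arith.constant 3 : i32
      // Each edge passes its own operands to the successor. Both edges
      // may even target the same block with different arguments, which
      // an LLVM phi cannot express (it keys values on the predecessor).
      cf.cond_br %cond, ^merge(%c1 : i32), ^merge(%c2 : i32)
    ^merge(%x: i32):
      %y = arith.addi %x, %c3 : i32
      return %y : i32
    }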

3. SSI (Static Single Information)

  • Extension of SSA: adds σ-nodes at control-flow splits (the dual of φ-nodes at joins).
  • Purpose: propagate branch-sensitive facts.
    • Example: after if (x == 0), a σ-node splits x into x0 (then branch, where x is known to be 0) and x_not0 (else branch, where x is known to be nonzero).
  • Placement (a formal sketch follows this list):
    • φ-nodes are placed using the dominance frontier DF(·).
    • σ-nodes are placed using the reverse dominance frontier RDF(·).
  • Benefits: precise predicated and backward dataflow analyses.
  • Sources: Ananian (1999/2001), Singer (2002).
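  • Formal sketch, following Cytron et al. for DF (RDF is its dual on the reversed CFG):
    \[ \mathrm{DF}(n) = \{\, m \mid n \text{ dominates a predecessor of } m \,\wedge\, n \text{ does not strictly dominate } m \,\} \]
    φ-nodes for a variable go at the iterated dominance frontier \(\mathrm{DF}^{+}\) of its definition sites; σ-nodes go dually at the iterated \(\mathrm{RDF}^{+}\), where RDF is DF computed with postdominance on the reversed CFG.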

4. Why MLIR Doesn’t Need φ

  • Block arguments are a strict superset of φ-nodes (Chris Lattner).
  • Block args handle both:
    • φ-like merges at joins.
    • σ-like refinements at branches, by passing renamed arguments (see the sketch after this list).
  • This makes MLIR’s design functional SSA: branches apply values to blocks like function calls.
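  • A sketch of σ-like refinement via block arguments (names invented for illustration): each edge passes %x under a fresh name, giving analyses a branch-local value to refine.
    func.func @refine(%x: i32) -> i32 {
      %c0 = arith.constant 0 : i32
      %is_zero = arith.cmpi eq, %x, %c0 : i32
      // %x_zero is known to be 0 and %x_nonzero is known to be nonzero --
      // exactly the role of SSI's sigma-node, with no extra construct.
      cf.cond_br %is_zero, ^then(%x : i32), ^else(%x : i32)
    ^then(%x_zero: i32):
      return %x_zero : i32
    ^else(%x_nonzero: i32):
      %c1 = arith.constant 1 : i32
      %y = arith.subi %x_nonzero, %c1 : i32
      return %y : i32
    }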

5. Counterarguments & Responses

  • Criticism (HN): “Block args are just syntax for φ.”
  • Response: with σ-like behavior (branch-local refinements), block args are strictly more general than φ.
    • Example: the refinement of x under x == 0 vs x != 0 is explicit in the successor block arguments, as in the sketch in section 4.

6. Performance / Cost Considerations

  • Out-of-SSA translation (SSA destruction): eliminating φ-nodes requires resolving parallel copies, which raises the lost-copy and swap problems (see the sketch at the end of this list).
  • MLIR advantage: explicit argument passing avoids “atomic φ-bundles.”
  • SSI (per Singer): more nodes than plain SSA, but pruned forms stay efficient; optimistic construction algorithms exist.
  • Lowering MLIR → LLVM IR: block args are reconstructed as φ-nodes for backends.
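  • The swap problem, sketched in LLVM IR (the loop is invented for illustration): the two φ-nodes in the header form one parallel copy, so naive sequential copies on the back edge would clobber a value.
    define i32 @swap(i32 %a0, i32 %b0, i32 %n) {
    entry:
      br label %loop
    loop:
      ; %a and %b swap each iteration; lowering the phi pair as
      ; "a = b; b = a" loses a value, so out-of-SSA needs a
      ; temporary or cycle-aware parallel-copy resolution.
      %a = phi i32 [ %a0, %entry ], [ %b, %loop ]
      %b = phi i32 [ %b0, %entry ], [ %a, %loop ]
      %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
      %i.next = add i32 %i, 1
      %done = icmp eq i32 %i.next, %n
      br i1 %done, label %exit, label %loop
    exit:
      ret i32 %a
    }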

7. Comparisons to Other IR Designs

  • CPS (Continuation-Passing Style): blocks-as-functions, more explicit than MLIR.
  • Region-based IRs (Sea-of-Nodes): merges are implicit via dataflow edges.
  • Predicate/Gated SSA (GSA, PSSA): encode conditions explicitly; similar motivation to SSI.
  • MLIR position: middle ground — explicit, compositional, generalizes φ and σ.

8. What to Add for Completeness

  1. CGO slides / DevMtg talks (Lattner): design tradeoffs and heuristics.
  2. MLIR Rationale & LangRef: official explanation + examples.
  3. Formal definitions: φ placement (DF), σ placement (RDF).
  4. Counterexamples: branch-specific renamings that φ cannot encode.
  5. Implementation cost: Singer’s node counts, SSA destruction papers (Boissinot 2009, Pereira 2009).
  6. Lowering reality: LLVM and SPIR-V still need φ; MLIR reconstructs them.
  7. Other IRs: compare to CPS, region/dataflow IRs.

9. Reading List

  • SSA core: Cytron et al. (1991), Efficiently Computing Static Single Assignment Form and the Control Dependence Graph.
  • SSI: Ananian (1999/2001), The Static Single Information Form; Singer (2002), Efficiently Computing SSI.
  • Out-of-SSA: Boissinot et al. (2009), Revisiting Out-of-SSA Translation for Correctness, Code Quality and Efficiency; Pereira & Palsberg (2009), SSA Elimination after Register Allocation.
  • MLIR docs:
    • LangRef → Blocks
    • Rationale → Block Arguments vs PHI nodes
    • LLVM dialect → PHI Nodes and Block Args
    • SPIR-V dialect → Block args for Phi
  • Design context: Nikita Popov, Design Issues in LLVM IR.
  • Talks: Lattner/Shpeisman CGO & LLVM Dev Mtg slides.

Key Takeaway:
MLIR’s block arguments subsume φ-nodes and σ-nodes in a clean, function-like model. They make SSA transformations easier, interoperate with existing IRs via lowering, and position MLIR as a generalization of SSA variants such as SSI, GSA, and PSSA.
