The Parser

Building the syntax tree

The parser takes tokens from the lexer and figures out how they relate to each other. It builds a tree structure called an Abstract Syntax Tree (AST) that represents the structure of your program.

Why Trees?

Code has hierarchy. In 1 + 2 * 3, the multiplication happens before addition. A flat list of tokens doesn't capture this — we need a tree.

Binary (+) ├── left: 1 └── right: Binary (*) ├── left: 2 └── right: 3

The tree shows: first multiply 2*3, then add 1

The tree structure naturally encodes operator precedence. When we evaluate, we start from the leaves and work up: 2 * 3 = 6, then 1 + 6 = 7.

AST Node Types

The AST has different node types for different language constructs:

Expression Nodes

Things that produce values:

Literal42, "hello", true Identifierx, playerScore Binary → left + right Unary-x, not condition Callcircle(100, 50, 30) Array → [1, 2, 3] Index → arr[0] Member → point.x

Statement Nodes

Things that do actions:

Letlet x = 10 Functionfn draw() { ... } Ifif condition { ... } else { ... } Whilewhile condition { ... } Forfor i in 0..10 { ... } Returnreturn value Block → { stmt1; stmt2; ... }

Parsing a Function

Let's see how this code becomes an AST:

Input
fn greet(name) {
  return "Hello " + name
}
Function ├── name: "greet" ├── params: ["name"] └── body: Block └── Return └── Binary (+) ├── left: Literal "Hello " └── right: Identifier name

How Parsing Works

Bloom uses recursive descent parsing. The parser has a method for each grammar rule, and these methods call each other recursively.

Here's the conceptual flow for parsing an expression like 1 + 2 * 3:

expression() Entry point
term() handles + -
factor() handles * /
primary() numbers, ids, (expr)

Each level handles operators with the same precedence. Lower precedence operators (+ -) are parsed at higher levels. Higher precedence (* /) is parsed at lower levels. Primaries are the "atoms" at the bottom of recursion.

Operator Precedence

Bloom follows standard mathematical precedence (highest binds tightest):

Level
Operators
Example
1 (lowest)
or
a or b
2
and
a and b
3
== != < > <= >=
a == b
4
+ -
a + b
5
* / %
a * b
6 (highest)
- not (unary)
-x, not y

Parsing a For Loop

Let's trace through parsing this for loop:

Input
for i in 0..5 {
  circle(i * 20, 50, 10)
}
for
See FOR keyword → enter parseForStatement()
i
Parse identifier → loop variable is "i"
in
Expect IN keyword → consume it
0..5
Parse expression → see ".." → create Range node (0 to 5)
{ ... }
Parse block → recursively parse statements inside

Resulting AST:

For ├── variable: "i" ├── iterable: Range(0, 5) └── body: Block └── Call ├── callee: circle └── args: [ Binary(i * 20), Literal(50), Literal(10) ]

Error Recovery

When the parser finds an error, it tries to recover and keep going. This lets Bloom show you multiple errors at once instead of stopping at the first one.

Input with errors
let x =      // Missing value
let y = 10   // This still parses fine

The parser reports the first error, then synchronizes by skipping to the next statement boundary (usually a newline or semicolon).

Helpful Error Messages

Bloom's parser tries to guess what you meant:

Example error fr i in 0..5 {

Error: Unexpected token 'fr'. Did you mean 'for'?

This is done by checking if unknown identifiers are similar to keywords (using edit distance).

In the Source Code

The parser implementation lives in src/lang/parser.ts. Key methods: