Moonbit 与 llvm 共舞 上篇 - 实现语法前端

引言
编程语言设计与编译器实现历来被视为计算机科学领域中最具挑战性的课题之一。传统的编译器教学路径往往要求学生首先掌握复杂的理论基础:
- 自动机理论:有限状态自动机与正则表达式
- 类型理论:λ演算与类型系统的数学基础
- 计算机体系结构:从汇编语言到机器码的底层实现
然而,Moonbit作为一门专为现代开发环境设计的函数式编程语言,为我们提供了一个全新的视角。它不仅具备严谨的类型系统和卓越的内存安全保障,更重要的是,其丰富的语法特性和为AI时代量身定制的工具链,使得Moonbit成为学习和实现编译器的理想选择。
系列概述 本系列文章将通过构建一个名为TinyMoonbit的小型编程语言编译器,深入探讨现代编译器实现的核心概念和最佳实践。
- 上篇:聚焦语言前端的实现,包括词法分析、语法解析和类型检查,最终生成带有完整类型标记的抽象语法树
- 下篇:深入代码生成阶段,利用Moonbit官方的
llvm.mbt绑定库,将语法树转换为LLVM中间表示,并最终生成RISC-V汇编代码
TinyMoonbit 语言设计
TinyMoonbit是一种系统级编程语言,其抽象层次与C语言相当。虽然在语法设计上大量借鉴了Moonbit的特性,但TinyMoonbit实际并非Moonbit语言的子集,而是一个为测试llvm.mbt功能完备性兼具教学作用的简化版本。
注:由于篇幅限制,本系列文章所提到的TinyMoonbit实现比真正的TinyMoonbit更加简单,完整版本请参考 TinyMoonbitLLVM。
核心特性
TinyMoonbit提供了现代系统编程所需的基础功能:
- ✅ 底层内存操作:直接的指针操作和内存管理
- ✅ 控制流结构:条件分支、循环和函数调用
- ✅ 类型安全:静态类型检查和明确的类型声明
- ❌ 简化设计:为降低实现复杂度,不支持类型推导和闭包等高级特性
语法示例
让我们通过一个经典的斐波那契数列实现来展示TinyMoonbit的语法:
extern fn fn print_int(x : Int) -> Unit
print_int(Int
x : Int
Int) -> Unit
Unit;
// 递归实现斐波那契数列
fn fn fib(n : Int) -> Int
fib(Int
n : Int
Int) -> Int
Int {
if Int
n fn Compare::op_le(x : Int, y : Int) -> Bool
<= 1 {
return Int
n;
}
return fn fib(n : Int) -> Int
fib(Int
n fn Sub::sub(self : Int, other : Int) -> Int
Performs subtraction between two 32-bit integers, following standard two's
complement arithmetic rules. When the result overflows or underflows, it
wraps around within the 32-bit integer range.
Parameters:
self : The minuend (the number being subtracted from).
other : The subtrahend (the number to subtract).
Returns the difference between self and other.
Example:
test {
let a = 42
let b = 10
inspect(a - b, content="32")
let max = 2147483647 // Int maximum value
inspect(max - -1, content="-2147483648") // Overflow case
}
- 1) fn Add::add(self : Int, other : Int) -> Int
Adds two 32-bit signed integers. Performs two's complement arithmetic, which
means the operation will wrap around if the result exceeds the range of a
32-bit integer.
Parameters:
self : The first integer operand.
other : The second integer operand.
Returns a new integer that is the sum of the two operands. If the
mathematical sum exceeds the range of a 32-bit integer (-2,147,483,648 to
2,147,483,647), the result wraps around according to two's complement rules.
Example:
test {
inspect(42 + 1, content="43")
inspect(2147483647 + 1, content="-2147483648") // Overflow wraps around to minimum value
}
+ fn fib(n : Int) -> Int
fib(Int
n fn Sub::sub(self : Int, other : Int) -> Int
Performs subtraction between two 32-bit integers, following standard two's
complement arithmetic rules. When the result overflows or underflows, it
wraps around within the 32-bit integer range.
Parameters:
self : The minuend (the number being subtracted from).
other : The subtrahend (the number to subtract).
Returns the difference between self and other.
Example:
test {
let a = 42
let b = 10
inspect(a - b, content="32")
let max = 2147483647 // Int maximum value
inspect(max - -1, content="-2147483648") // Overflow case
}
- 2);
}
fn main() -> Unit {
fn print_int(x : Int) -> Unit
print_int(fn fib(n : Int) -> Int
fib(10));
}
编译目标
经过完整的编译流程后,上述代码将生成如下的LLVM中间表示:
; ModuleID = 'tinymoonbit'
source_filename = "tinymoonbit"
define i32 @fib(i32 %0) {
entry:
%1 = alloca i32, align 4
store i32 %0, ptr %1, align 4
%2 = load i32, ptr %1, align 4
%3 = icmp sle i32 %2, 1
br i1 %3, label %4, label %6
4: ; preds = %entry
%5 = load i32, ptr %1, align 4
ret i32 %5
6: ; preds = %4, %entry
%7 = load i32, ptr %1, align 4
%8 = sub i32 %7, 1
%9 = call i32 @fib(i32 %8)
%10 = load i32, ptr %1, align 4
%11 = sub i32 %10, 2
%12 = call i32 @fib(i32 %11)
%13 = add i32 %9, %12
ret i32 %13
}
define void @main() {
entry:
%0 = call i32 @fib(i32 10)
call void @print_int(i32 %0)
}
declare void @print_int(i32 %0)
第二章:词法分析
词法分析(Lexical Analysis)构成了编译过程的第一道关卡,其核心使命是将连续的字符流转换为具有语义意义的词法单元(Tokens)序列。这个看似简单的转换过程,实际上是整个编译器流水线的基石。
从字符到符号:Token的设计与实现
考虑以下代码片段:
let Int
x : Int
Int = 5;
经过词法分析器处理后,将产生如下的Token序列:
(Keyword "let") → (Identifier "x") → (Symbol ":") →
(Type "Int") → (Operator "=") → (IntLiteral 5) → (Symbol ";")
这个转换过程需要处理多种复杂情况:
- 空白符过滤:跳过空格、制表符和换行符
- 关键字识别:区分保留字与用户定义标识符
- 数值解析:正确识别整数、浮点数的边界
- 运算符处理:区分单字符和多字符运算符
Token类型系统设计
基于TinyMoonbit的语法规范,我们将所有可能的符号分类为以下Token类型:
pub enum Token {
(Bool) -> Token
Bool(Bool
Bool) // 布尔值:true, false
(Int) -> Token
Int(Int
Int) // 整数:1, 2, 3, ...
(Double) -> Token
Double(Double
Double) // 浮点数:1.0, 2.5, 3.14, ...
(String) -> Token
Keyword(String
String) // 保留字:let, if, while, fn, return
(String) -> Token
Upper(String
String) // 类型标识符:首字母大写,如 Int, Double, Bool
(String) -> Token
Lower(String
String) // 变量标识符:首字母小写,如 x, y, result
(String) -> Token
Symbol(String
String) // 运算符和标点:+, -, *, :, ;, ->
(Char) -> Token
Bracket(Char
Char) // 括号类:(, ), [, ], {, }
Token
EOF // 文件结束标记
} derive(trait Show {
output(Self, &Logger) -> Unit
to_string(Self) -> String
}
Trait for types that can be converted to String
Show, trait Eq {
equal(Self, Self) -> Bool
op_equal(Self, Self) -> Bool
not_equal(Self, Self) -> Bool
}
Trait for types whose elements can test for equality
Eq)
利用模式匹配
Moonbit的强大模式匹配能力使我们能够以一种前所未有的优雅方式实现词法分析器。与传统的有限状态自动机方法相比,这种基于模式匹配的实现更加直观和易于理解。
核心分析函数
pub fn fn lex(code : String) -> Array[Token]
lex(String
code: String
String) -> type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[enum Token {
Bool(Bool)
Int(Int)
Double(Double)
Keyword(String)
Upper(String)
Lower(String)
Symbol(String)
Bracket(Char)
EOF
} derive(Show, Eq)
Token] {
let Array[Token]
tokens = type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array::fn[T] Array::new(capacity? : Int) -> Array[T]
Creates a new empty array with an optional initial capacity.
Parameters:
capacity : The initial capacity of the array. If 0 (default), creates an
array with minimum capacity. Must be non-negative.
Returns a new empty array of type Array[T] with the specified initial
capacity.
Example:
test {
let arr : Array[Int] = Array::new(capacity=10)
inspect(arr.length(), content="0")
inspect(arr.capacity(), content="10")
let arr : Array[Int] = Array::new()
inspect(arr.length(), content="0")
}
new()
loop String
code[:] {
// 跳过空白字符
StringView
[' ' | '\n' | '\r' | '\t', ..rest] =>
continue StringView
rest
// 处理单行注释
StringView
[.."//", ..rest] =>
continue loop StringView
rest {
StringView
['\n' | '\r', ..rest_str] => break StringView
rest_str
StringView
[_, ..rest_str] => continue StringView
rest_str
StringView
[] as rest_str => break StringView
rest_str
}
// 识别多字符运算符(顺序很重要!)
StringView
[.."->", ..rest] => { Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((String) -> Token
Symbol("->")); continue StringView
rest }
StringView
[.."==", ..rest] => { Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((String) -> Token
Symbol("==")); continue StringView
rest }
StringView
[.."!=", ..rest] => { Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((String) -> Token
Symbol("!=")); continue StringView
rest }
StringView
[.."<=", ..rest] => { Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((String) -> Token
Symbol("<=")); continue StringView
rest }
StringView
[..">=", ..rest] => { Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((String) -> Token
Symbol(">=")); continue StringView
rest }
// 识别单字符运算符和标点符号
[':' | '.' | ',' | ';' | '+' | '-' | '*' |
'/' | '%' | '>' | '<' | '=' as c, ..rest] => {
Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((String) -> Token
Symbol("\{Char
c}"))
continue StringView
rest
}
// 识别括号
StringView
[Char
'(' | ')' | '[' | ']' | '{' | '}' as cStringView
, ..rest] => {
Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push((Char) -> Token
Bracket(Char
c))
continue StringView
rest
}
// 识别标识符和字面量
StringView
['a'..='z', ..] as code => {
let (Token
tok, StringView
rest) = (StringView) -> (Token, StringView)
lex_ident(StringView
code);
Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push(Token
tok)
continue StringView
rest
}
['A'..='Z', ..] => { ... }
['0'..='9', ..] => { ... }
// 到达文件末尾
[] => { Array[Token]
tokens.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push(Token
EOF); break Array[Token]
tokens }
}
}
关键字识别策略
标识符解析需要特别处理关键字的识别:
pub fn fn let_ident(rest : StringView) -> (Token, StringView)
let_ident(StringView
rest: type StringView
@string.View) -> (enum Token {
Bool(Bool)
Int(Int)
Double(Double)
Keyword(String)
Upper(String)
Lower(String)
Symbol(String)
Bracket(Char)
EOF
} derive(Show, Eq)
Token, type StringView
@string.View) {
// 预定义关键字映射表
let Unit
keyword_map = Unit
Map.(Array[(String, Token)]) -> Unit
from_array([
("let", Token::(String) -> Token
Keyword("let")),
("fn", Token::(String) -> Token
Keyword("fn")),
("if", Token::(String) -> Token
Keyword("if")),
("else", Token::(String) -> Token
Keyword("else")),
("while", Token::(String) -> Token
Keyword("while")),
("return", Token::(String) -> Token
Keyword("return")),
("extern", Token::(String) -> Token
Keyword("extern")),
("true", Token::(Bool) -> Token
Bool(true)),
("false", Token::(Bool) -> Token
Bool(false)),
])
let Array[Char]
identifier_chars = type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array::fn[T] Array::new(capacity? : Int) -> Array[T]
Creates a new empty array with an optional initial capacity.
Parameters:
capacity : The initial capacity of the array. If 0 (default), creates an
array with minimum capacity. Must be non-negative.
Returns a new empty array of type Array[T] with the specified initial
capacity.
Example:
test {
let arr : Array[Int] = Array::new(capacity=10)
inspect(arr.length(), content="0")
inspect(arr.capacity(), content="10")
let arr : Array[Int] = Array::new()
inspect(arr.length(), content="0")
}
new()
let StringView
remaining = loop StringView
rest {
StringView
[Char
'a'..='z' | 'A'..='Z' | '0'..='9' | '_' as cStringView
, ..rest_str] => {
Array[Char]
identifier_chars.fn[T] Array::push(self : Array[T], value : T) -> Unit
Adds an element to the end of the array.
If the array is at capacity, it will be reallocated.
Example
test {
let v = []
v.push(3)
}
push(Char
c)
continue StringView
rest_str
}
StringView
_ as rest_str => break StringView
rest_str
}
let String
ident = (ArrayView[Char]) -> String
String::fn String::from_array(chars : ArrayView[Char]) -> String
Convert char array to string.
test {
let s = String::from_array(['H', 'e', 'l', 'l', 'o'])
assert_eq(s, "Hello")
}
Do not convert large data to Array[Char] and build a string with String::from_array.
For efficiency considerations, it's recommended to use Buffer instead.
from_array(Array[Char]
identifier_chars)
let Token
token = Unit
keyword_map.(Unit) -> Unit
get(Unit
identifier).(() -> Token) -> Token
or_else(() => Token::(String) -> Token
Lower(String
ident))
(Token
token, StringView
remaining)
}
💡 Moonbit语法特性深度解析
上述词法分析器的实现充分展示了Moonbit在编译器开发中的几个突出优势:
- 函数式循环构造
loop initial_value {
pattern1 => continue new_value1
pattern2 => continue new_value2
pattern3 => break final_value
}
loop并非传统意义上的循环结构,而是一种函数式循环:
- 接受一个初始参数作为循环状态
- 通过模式匹配处理不同情况
-
continue传递新状态到下一次迭代 -
break终止循环并返回最终值
- 字符串视图与模式匹配
Moonbit的字符串模式匹配功能极大简化了文本处理:
// 匹配单个字符
['a', ..rest] => // 以字符'a'开头
// 匹配字符范围
['a'..='z' as c, ..rest] => // 小写字母,绑定到变量c
// 匹配字符串字面量
[.."hello", ..rest] => // 等价于 ['h','e','l','l','o', ..rest]
// 匹配多个可能的字符
[' ' | '\t' | '\n', ..rest] => // 任 意空白字符
- 模式匹配优先级的重要性
⚠️ 重要提醒:匹配顺序至关重要
在编写模式匹配规则时,必须将更具体的模式放在更一般的模式之前。例如:
// ✅ 正确的顺序 loop code[:] { [.."->", ..rest] => { ... } // 先匹配多字符运算符 ['-' | '>' as c, ..rest] => { ... } // 再匹配单字符 } // ❌ 错误的顺序 - "->"将永远无法被匹配 loop code[:] { ['-' | '>' as c, ..rest] => { ... } [.."->", ..rest] => { ... } // 永远不会执行 }
通过这种基于模式匹配的方法,我们不仅避免了复杂的状态机实现,还获得了更清晰、更容易维护的代码结构。
第三章:语法分析与抽象语法树构建
语法分析(Syntactic Analysis)是编译器的第二个核心阶段,其任务是将词法分析产生的Token序列重新组织为具有层次结构的抽象语法树(Abstract Syntax Tree, AST)。这个过程不仅要验证程序是否符合语言的语法规则,更要为后续的语义分析和代码生成提供结构化的数据表示。
抽象语法树设计:程序的结构化表示
在构建语法分析器之前,我们需要精心设计AST的结构。这个设计决定了如何表示程序的语法结构,以及后续编译阶段如何处理这些结构。
1. 核心类型系统
首先,我们定义TinyMoonbit类型系统在AST中的表示:
pub enum Type {
Type
Unit // 单位类型,表示无返回值
Type
Bool // 布尔类型:true, false
Type
Int // 32位有符号整数
Type
Double // 64位双精度浮点数
} derive(trait Show {
output(Self, &Logger) -> Unit
to_string(Self) -> String
}
Trait for types that can be converted to String
Show, trait Eq {
equal(Self, Self) -> Bool
op_equal(Self, Self) -> Bool
not_equal(Self, Self) -> Bool
}
Trait for types whose elements can test for equality
Eq, trait ToJson {
to_json(Self) -> Json
}
Trait for types that can be converted to Json
ToJson)
pub fn fn parse_type(type_name : String) -> Type
parse_type(String
type_name: String
String) -> enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type {
match String
type_name {
"Unit" => Type::Type
Unit
"Bool" => Type::Type
Bool
"Int" => Type::Type
Int
"Double" => Type::Type
Double
_ => fn[T] abort(string : String, loc~ : SourceLoc = _) -> T
Abort evaluation with a message and source location.
This function never returns.
Function abort.
abort("Unknown type: \{String
type_name}")
}
}
2. 分层的AST节点设计
我们采用分层设计来清晰地表示程序的不同抽象层次:
- 原子表达式(AtomExpr) 代表不可再分解的基本表达式单元:
pub enum AtomExpr {
(Bool) -> AtomExpr
Bool(Bool
Bool) // 布尔字面量
(Int) -> AtomExpr
Int(Int
Int) // 整数字面量
(Double) -> AtomExpr
Double(Double
Double) // 浮点数字面量
(String, ty~ : Type?) -> AtomExpr
Var(String
String, mut Type?
ty~ : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type?) // 变量引用
(Expr, ty~ : Type?) -> AtomExpr
Paren(enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr, mut Type?
ty~ : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type?) // 括号表达式
(String, Array[Expr], ty~ : Type?) -> AtomExpr
Call(String
String, type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr], mut Type?
ty~ : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type?) // 函数调用
} derive(trait Show {
output(Self, &Logger) -> Unit
to_string(Self) -> String
}
Trait for types that can be converted to String
Show, trait Eq {
equal(Self, Self) -> Bool
op_equal(Self, Self) -> Bool
not_equal(Self, Self) -> Bool
}
Trait for types whose elements can test for equality
Eq, trait ToJson {
to_json(Self) -> Json
}
Trait for types that can be converted to Json
ToJson)
- 复合表达式(Expr) 可以包含运算符和多个子表达式的更复杂结构:
pub enum Expr {
(AtomExpr, ty~ : Type?) -> Expr
AtomExpr(enum AtomExpr {
Bool(Bool)
Int(Int)
Double(Double)
Var(String, ty~ : Type?)
Paren(Expr, ty~ : Type?)
Call(String, Array[Expr], ty~ : Type?)
} derive(Show, Eq, ToJson)
AtomExpr, mut Type?
ty~ : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type?) // 原子表达式包装
(String, Expr, ty~ : Type?) -> Expr
Unary(String
String, enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr, mut Type?
ty~ : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type?) // 一元运算:-, !
(String, Expr, Expr, ty~ : Type?) -> Expr
Binary(String
String, enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr, enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr, mut Type?
ty~ : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type?) // 二元运算:+, -, *, /, ==, !=, 等
} derive(trait Show {
output(Self, &Logger) -> Unit
to_string(Self) -> String
}
Trait for types that can be converted to String
Show, trait Eq {
equal(Self, Self) -> Bool
op_equal(Self, Self) -> Bool
not_equal(Self, Self) -> Bool
}
Trait for types whose elements can test for equality
Eq, trait ToJson {
to_json(Self) -> Json
}
Trait for types that can be converted to Json
ToJson)
- 语句(Stmt) 代表程序中的可执行单元:
pub enum Stmt {
(String, Type, Expr) -> Stmt
Let(String
String, enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type, enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr) // 变量声明:let x : Int = 5;
(String, Expr) -> Stmt
Assign(String
String, enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr) // 赋值语句:x = 10;
(Expr, Array[Stmt], Array[Stmt]) -> Stmt
If(enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr, type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[enum Stmt {
Let(String, Type, Expr)
Assign(String, Expr)
If(Expr, Array[Stmt], Array[Stmt])
While(Expr, Array[Stmt])
Return(Expr?)
Expr(Expr)
} derive(Show, Eq, ToJson)
Stmt], type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[enum Stmt {
Let(String, Type, Expr)
Assign(String, Expr)
If(Expr, Array[Stmt], Array[Stmt])
While(Expr, Array[Stmt])
Return(Expr?)
Expr(Expr)
} derive(Show, Eq, ToJson)
Stmt]) // 条件分支:if-else
(Expr, Array[Stmt]) -> Stmt
While(enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr, type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[enum Stmt {
Let(String, Type, Expr)
Assign(String, Expr)
If(Expr, Array[Stmt], Array[Stmt])
While(Expr, Array[Stmt])
Return(Expr?)
Expr(Expr)
} derive(Show, Eq, ToJson)
Stmt]) // 循环语句:while
(Expr?) -> Stmt
Return(enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr?) // 返回语句:return expr;
(Expr) -> Stmt
Expr(enum Expr {
AtomExpr(AtomExpr, ty~ : Type?)
Unary(String, Expr, ty~ : Type?)
Binary(String, Expr, Expr, ty~ : Type?)
} derive(Show, Eq, ToJson)
Expr) // 单表达式语句
} derive(trait Show {
output(Self, &Logger) -> Unit
to_string(Self) -> String
}
Trait for types that can be converted to String
Show, trait Eq {
equal(Self, Self) -> Bool
op_equal(Self, Self) -> Bool
not_equal(Self, Self) -> Bool
}
Trait for types whose elements can test for equality
Eq, trait ToJson {
to_json(Self) -> Json
}
Trait for types that can be converted to Json
ToJson)
- 顶层结构 函数定义和完整程序:
pub struct Function {
String
name : String
String // 函数名
Array[(String, Type)]
params : type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[(String
String, enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type)] // 参数列表:[(参数名, 类型)]
Type
ret_ty : enum Type {
Unit
Bool
Int
Double
} derive(Show, Eq, ToJson)
Type // 返回类型
Array[Stmt]
body : type Array[T]
An Array is a collection of values that supports random access and can
grow in size.
Array[enum Stmt {
Let(String, Type, Expr)
Assign(String, Expr)
If(Expr, Array[Stmt], Array[Stmt])
While(Expr, Array[Stmt])
Return(Expr?)
Expr(Expr)
} derive(Show, Eq, ToJson)
Stmt] // 函数体语句序列
} derive(trait Show {
output(Self, &Logger) -> Unit
to_string(Self) -> String
}
Trait for types that can be converted to String
Show, trait Eq {
equal(Self, Self) -> Bool
op_equal(Self, Self) -> Bool
not_equal(Self, Self) -> Bool
}
Trait for types whose elements can test for equality
Eq, trait ToJson {
to_json(Self) -> Json
}
Trait for types that can be converted to Json
ToJson)
// 程序定义为函数名到函数定义的映射
pub type Program type Map[K, V]
Mutable linked hash map that maintains the order of insertion, not thread safe.
Example
test {
let map = { 3: "three", 8: "eight", 1: "one" }
assert_eq(map.get(2), None)
assert_eq(map.get(3), Some("three"))
map.set(3, "updated")
assert_eq(map.get(3), Some("updated"))
}
Map[String
String, struct Function {
name: String
params: Array[(String, Type)]
ret_ty: Type
body: Array[Stmt]
} derive(Show, Eq, ToJson)
Function]
设计要点:类型标记的可变性
注意到每个表达式节点都包含一个
mut ty~ : Type? 字段。这个设计允许我们在类型检查阶段填充类型信息,而不需要重新构建整个AST。