
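MoonBit Pearls Vol.03: The 0/1 Knapsack Problem

· 13 min read

The 0/1 knapsack problem is a classic dynamic programming (dp) problem in competitive programming. This article walks through five versions of the code: starting from the most naive enumeration, each version improves on the previous one until we arrive at the dp solution.

Problem definition

There are a number of items, each with a weight (weight) and a value (value):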

struct Item {
  weight : Int
  value : Int
}

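Now we are given a list of items, items, and the capacity of the knapsack, capacity. We must choose some of the items so that their total weight does not exceed the capacity and their total value is as large as possible.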

typealias @list.T as List

let items_1 : List[Item] = @list.of([
  { weight: 7, value: 20 },
  { weight: 4, value: 10 },
  { weight: 5, value: 11 },
])

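Take items_1 above as an example and suppose the knapsack capacity is 10. The best choice is the last two items: they take up 4 + 5 = 9 units of capacity and are worth 10 + 11 = 21 in total.

Note that items cannot be split, so greedily picking the item with the best value-to-weight ratio is not a correct strategy. In the example above, picking item 1, which has the best ratio, yields a value of only 20, and the knapsack then has no room for anything else.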
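The failure of the greedy strategy can be checked with a small standalone sketch. The helper greedy_value is hypothetical (it is not part of the article's code) and assumes the items arrive as (weight, value) pairs already sorted by value-to-weight ratio in descending order, which happens to be the order of items_1.

```moonbit
// Hypothetical helper: greedily take items in the given order
// whenever they still fit. `items` holds (weight, value) pairs,
// assumed to be sorted by value/weight ratio in descending order.
fn greedy_value(items : Array[(Int, Int)], capacity : Int) -> Int {
  let mut used = 0
  let mut total = 0
  for item in items {
    let (weight, value) = item
    if used + weight <= capacity {
      used += weight
      total += value
    }
  }
  total
}

test "greedy by ratio misses the optimum" {
  // Same data as items_1; the ratios are 20/7 > 10/4 > 11/5.
  let items = [(7, 20), (4, 10), (5, 11)]
  // Greedy takes item 1 (weight 7), after which nothing else fits.
  assert_eq(greedy_value(items, 10), 20)
  // The optimum, items 2 and 3, is worth 10 + 11 = 21.
}
```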

Problem modelling

We first define some basic objects and operations.

// A combination of items (hereafter simply "combination")
struct Combination {
  items : List[Item]
  total_weight : Int
  total_value : Int
}

// The empty combination
let empty_combination : Combination = {
  items: @list.empty(),
  total_weight: 0,
  total_value: 0,
}

// Add an item to a combination, producing a new combination
fn Combination::add(self : Combination, item : Item) -> Combination {
  {
    items: self.items.add(item),
    total_weight: self.total_weight + item.weight,
    total_value: self.total_value + item.value,
  }
}

// Two combinations are considered equal when their total values are equal
impl Eq for Combination with op_equal(self, other) {
  self.total_value == other.total_value
}

// Comparing two combinations means comparing their total values
impl Compare for Combination with compare(self, other) {
  self.total_value.compare(other.total_value)
}

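With these in place, we can start thinking about how to solve the problem.

1. Naive enumeration

Enumeration is the most naive approach. We simply follow the problem definition step by step to get the answer:

  1. Enumerate all combinations;
  2. Keep only the valid ones, i.e. those that fit in the knapsack;
  3. The answer is the one with the largest total value.

Thanks to two functions provided by the standard library, these three lines of text translate one-to-one into MoonBit code. all_combinations is a function we will implement later; its type is (List[Item]) -> List[Combination].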

fn solve_v1(items : List[Item], capacity : Int) -> Combination {
  all_combinations(items)
  .filter(fn(comb) { comb.total_weight <= capacity })
  .unsafe_maximum()
}

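Note that we use unsafe_maximum here rather than maximum. An empty list has no maximum, so maximum returns None in that case. But we know the problem guarantees that an answer exists (as long as capacity is not negative), so we can use unsafe_maximum instead. Given an empty list it aborts the program; otherwise it returns the largest element of the list.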
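The difference between the two functions can be seen in a minimal sketch on plain integer lists. It relies only on the behaviour described above (maximum returning an optional value) and on the @list functions already used in this article.

```moonbit
test "maximum on empty and non-empty lists" {
  let empty : @list.T[Int] = @list.empty()
  // No largest element exists, so the safe variant reports None.
  assert_eq(empty.maximum(), None)
  // On a non-empty list both variants agree on the largest element.
  let xs = @list.of([3, 1, 2])
  assert_eq(xs.maximum(), Some(3))
  assert_eq(xs.unsafe_maximum(), 3)
}
```

Calling unsafe_maximum on the empty list is deliberately left out of the sketch, since it would abort the test.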

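Next we implement the enumeration. The function all_combinations takes a list of items and returns a list of combinations containing every combination that can be built from those items. If you have no idea where to start, take a look at the definition of the list type. It looks roughly like this: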

enum List[A] {
  Empty
  More(A, tail~ : List[A])
}

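In other words, a list is one of two things:

  1. An empty list, called Empty;
  2. A non-empty list, called More, which holds the first element (A) and the remainder (tail~ : List[A]), which is itself a list.

This suggests a case analysis on whether the item list is empty:

  • If the item list is empty, the only possible combination is the empty combination;
  • Otherwise there must be a first item item1 and a remainder items_tail. In this case we can:
    1. First compute the combinations that do not contain item1. These are exactly the combinations that items_tail can form, so they can be computed recursively.
    2. Then compute the combinations that contain item1. They correspond one-to-one to the combinations without item1; the only difference is that item1 is added.
    3. Concatenate the two; the result is every combination that items can form.

For example, when the item list contains the three elements a, b and c, the answer splits into two parts:

Without a     With a
{ }           { a }
{ b }         { a, b }
{ c }         { a, c }
{ b, c }      { a, b, c }
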
fn all_combinations(items : List[Item]) -> List[Combination] {
  match items {
    Empty => @list.singleton(empty_combination)
    More(item1, tail=items_tail) => {
      let combs_without_item1 = all_combinations(items_tail)
      let combs_with_item1 = combs_without_item1.map(_.add(item1))
      combs_with_item1 + combs_without_item1
    }
  }
}

By using pattern matching (match), we have once again translated the five lines of text above one-to-one into MoonBit code.
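The same recursion can be tried in isolation on plain integers. The function all_subsets below is a hypothetical, stripped-down sketch (not part of the article's code) that keeps only the list structure and drops the weight and value bookkeeping; it shows that each extra element doubles the number of results.

```moonbit
// Hypothetical sketch: every sub-list of `xs`, built with the same
// "without the head / with the head" case analysis as all_combinations.
fn all_subsets(xs : @list.T[Int]) -> @list.T[@list.T[Int]] {
  match xs {
    // The empty list has exactly one subset: the empty one.
    Empty => @list.singleton(@list.empty())
    More(x, tail=rest) => {
      let without_x = all_subsets(rest)
      let with_x = without_x.map(fn(s) { s.add(x) })
      with_x + without_x
    }
  }
}

test "each extra element doubles the number of subsets" {
  assert_eq(all_subsets(@list.of([1, 2, 3])).length(), 8)
  assert_eq(all_subsets(@list.of([1, 2, 3, 4])).length(), 16)
}
```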

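2. Filtering early: enumerating only valid combinations

In the first version, enumerating all combinations and filtering out the ones that fit in the knapsack are two unrelated passes. The enumeration produces many invalid combinations. These combinations already fail to fit in the knapsack, yet items keep being added to them in later steps. It is better to filter them out early, so that no further invalid combinations are built on top of them. Looking at the code, invalid combinations can only arise in the .map(_.add(item1)) step. So we can make an improvement: add item1 only to combinations that still have room for item1.

We turn all_combinations into all_combinations_valid, which returns only the combinations that fit in this knapsack. Enumeration and filtering now alternate.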

fn all_combinations_valid(
  items : List[Item],
  capacity : Int // a new parameter, because filtering needs the knapsack capacity
) -> List[Combination] {
  match items {
    Empty => @list.singleton(empty_combination) // the empty combination is trivially valid
    More(item1, tail=items_tail) => {
      // Assume every combination returned by all_combinations_valid is valid (induction hypothesis)
      let valid_combs_without_item1 = all_combinations_valid(
        items_tail, capacity,
      )
      // Thanks to the filter, every combination in here is valid as well
      let valid_combs_with_item1 = valid_combs_without_item1
        .filter(fn(comb) { comb.total_weight + item1.weight <= capacity })
        .map(_.add(item1))
      // Both parts contain only valid combinations, so their concatenation does too
      valid_combs_with_item1 + valid_combs_without_item1
    }
  }
}

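A case analysis that follows the structure of the code easily proves all_combinations_valid correct: every combination it returns is indeed valid.

Since all combinations returned by all_combinations_valid are valid, solve no longer needs to filter. We delete the filter from solve.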

fn solve_v2(items : List[Item], capacity : Int) -> Combination {
  all_combinations_valid(items, capacity).unsafe_maximum()
}

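3. Keeping the list ascending to stop filtering early

In the previous version, to pick out the combinations that can still take item1 we had to traverse every combination in valid_combs_without_item1.

But observe: if item1 does not fit into some combination, it certainly does not fit into any combination with a larger total weight.

In other words, if valid_combs_without_item1 is sorted by total weight in ascending order, filtering does not have to traverse all of it. As soon as we meet a combination that cannot take item1, we can discard everything after it. This logic is common enough that the standard library provides a function called take_while, which we use in place of filter.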
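Why the ascending order matters can be seen in a small sketch on plain integers that stand in for total weights. The numbers (an item of weight 5, a capacity of 10) and the local name fits are made up for illustration.

```moonbit
test "take_while agrees with filter only on ascending input" {
  // Does a combination of total weight w still have room for an item of weight 5?
  let fits = fn(w : Int) -> Bool { w + 5 <= 10 }
  // Ascending input: the survivors form a prefix, so both functions agree.
  let ascending = @list.of([0, 4, 5, 9])
  assert_eq(ascending.take_while(fits), @list.of([0, 4, 5]))
  assert_eq(ascending.filter(fits), @list.of([0, 4, 5]))
  // Unsorted input: take_while stops at 9 and never sees the later 0.
  let unsorted = @list.of([4, 9, 0])
  assert_eq(unsorted.take_while(fits), @list.of([4]))
  assert_eq(unsorted.filter(fits), @list.of([4, 0]))
}
```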

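We could sort valid_combs_without_item1 with a sorting algorithm, but that would traverse the whole list and defeat the purpose. So we take a different route: make sure the list returned by all_combinations_valid is already ascending. This takes a recursive leap of faith: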

fn all_combinations_valid_ordered(
  items : List[Item],
  capacity : Int
) -> List[Combination] {
  match items {
    Empty => @list.singleton(empty_combination) // a single-element list is trivially ascending
    More(item1, tail=items_tail) => {
      // Assume the list returned by all_combinations_valid_ordered is ascending (induction hypothesis)
      let valid_combs_without_item1 = all_combinations_valid_ordered(
        items_tail, capacity,
      )
      // Then this list is ascending too: taking a prefix of an ascending list and
      // adding the same weight to every element keeps the total weights ascending
      let valid_combs_with_item1 = valid_combs_without_item1
        .take_while(fn(comb) { comb.total_weight + item1.weight <= capacity })
        .map(_.add(item1))
      // Now we only need the merged result to be ascending as well, which closes the induction
      merge_keep_order(valid_combs_with_item1, valid_combs_without_item1)
    }
  }
}

The last task is to write merge_keep_order, which merges two ascending lists into one ascending list:

fn merge_keep_order(
  a : List[Combination],
  b : List[Combination]
) -> List[Combination] {
  match (a, b) {
    (Empty, another) | (another, Empty) => another
    (More(a1, tail=a_tail), More(b1, tail=b_tail)) =>
      // If a1 is lighter than b1, and b is ascending, then
      // a1 is lighter than every combination in b.
      // Since a is ascending too,
      // a1 is no heavier than any combination in a_tail.
      // Hence a1 is the smallest element of a and b combined.
      if a1.total_weight < b1.total_weight {
        // Recursively merge the rest of the answer, then prepend a1
        merge_keep_order(a_tail, b).add(a1)
      } else {
        // Likewise
        merge_keep_order(a, b_tail).add(b1)
      }
  }
}

It may sound repetitive, but I still want to point it out: a case analysis that follows the structure of the code easily proves all_combinations_valid_ordered and merge_keep_order correct, namely that each really does return an ascending list.

It is tempting to take the last element of an ascending list as its maximum and to swap unsafe_maximum for unsafe_last. However, the list is ascending by total weight, whereas the answer is the maximum by total value, so the last element is merely the heaviest valid combination. For example, with items (weight 7, value 20), (weight 4, value 1) and (weight 5, value 1) and capacity 10, the heaviest valid combination weighs 9 but is worth only 2, while the best one is worth 20. solve_v3 therefore keeps unsafe_maximum:

fn solve_v3(items : List[Item], capacity : Int) -> Combination {
  all_combinations_valid_ordered(items, capacity).unsafe_maximum()
}

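Looking back, this version does not seem to have bought us much: merging the lists still traverses them in full. I thought so too at first, but later discovered, somewhat by accident, that merge_keep_order really pays off in the next version.

4. Removing redundant combinations of equal weight to reach the optimal time complexity

So far none of our optimizations has improved the time complexity, but they have paved the way for the next step. Let us now look at the time complexity.

In the worst case (a huge knapsack that fits everything), the combination list (the return value of all_combinations) contains up to $2^{\text{number of items}}$ elements. This makes the whole algorithm exponential as well, because all_combinations is called about $\text{number of items}$ times and each call traverses the combination list.

To lower the time complexity we need to shorten the combination list. This rests on one observation: if two combinations have the same total weight, the one with the higher total value is always at least as good as the other. So we never need to keep both in the list.

If we rule out those redundant combinations, the list holds at most one combination per total weight from 0 to the capacity, so its length never exceeds the capacity plus one (pigeonhole principle). This brings the time complexity of the whole algorithm down to $\mathcal{O}(\text{number of items} \times \text{capacity})$. Looking at the code, the only place that can now introduce redundant combinations into the list is the else branch of merge_keep_order. To prevent that, we only need a small change there: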

fnalias @math.maximum

fn merge_keep_order_and_dedup(
  a : List[Combination],
  b : List[Combination]
) -> List[Combination] {
  match (a, b) {
    (Empty, another) | (another, Empty) => another
    (More(a1, tail=a_tail), More(b1, tail=b_tail)) =>
      if a1.total_weight < b1.total_weight {
        merge_keep_order_and_dedup(a_tail, b).add(a1)
      } else if a1.total_weight > b1.total_weight {
        merge_keep_order_and_dedup(a, b_tail).add(b1)
      } else {
        // a1 and b1 weigh the same, so one of them is redundant;
        // keep the one with the higher total value
        let better = maximum(a1, b1)
        merge_keep_order_and_dedup(a_tail, b_tail).add(better)
      }
  }
}

For all_combinations_valid_ordered_nodup (the longest function name I have ever written) and solve_v4, we just swap in the corresponding parts.

fn all_combinations_valid_ordered_nodup(
  items : List[Item],
  capacity : Int
) -> List[Combination] {
  match items {
    Empty => @list.singleton(empty_combination)
    More(item1, tail=items_tail) => {
      let combs_without_item1 = all_combinations_valid_ordered_nodup(
        items_tail, capacity,
      )
      let combs_with_item1 = combs_without_item1
        .take_while(fn(comb) { comb.total_weight + item1.weight <= capacity })
        .map(_.add(item1))
      merge_keep_order_and_dedup(combs_with_item1, combs_without_item1)
    }
  }
}

fn solve_v4(items : List[Item], capacity : Int) -> Combination {
  // The list is ascending by total weight, not by total value,
  // so the best combination is its maximum rather than its last element
  all_combinations_valid_ordered_nodup(items, capacity).unsafe_maximum()
}

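With that, we have reinvented the dp solution to the 0/1 knapsack problem.

Summary

This article grew out of a sudden idea I had one morning while lying in bed. Versions one through four were written entirely on my phone without any debugging, yet their correctness was easy to reason about. Compared with the style commonly seen in traditional competitive-programming write-ups, the functional style used here brings the following advantages:

  1. No loops; recursion with case analysis instead. To get an element out of a list you must use pattern matching (match), which reminds me to consider the answer for the empty list. That answer carries a clearer meaning than the initial values of a dp array.
  2. Traversal through library functions. The higher-order functions in the standard library (filter, take_while, map, maximum) replace boilerplate loops (for, while) and let the reader see the purpose of a traversal at a glance.
  3. Declarative programming. The first version is a one-to-one translation of the idea. Rather than describing an algorithm, it reads like a description of the problem itself, which guarantees the correctness of the first version. Each later improvement is made without changing the result, and so inherits the correctness of the first version.

Of course, there is no silver bullet. We have to trade off readability against efficiency. The functional style is easy to follow, but it leaves plenty of room for optimization. The next steps would be to replace the lists with arrays, then to use only two rolling arrays throughout, or even a single array. This brings the space complexity down to $\mathcal{O}(\text{capacity})$, but it is beyond the scope of this article. I believe beginners would rather see code that is easy to understand.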
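For readers curious about the single-array direction, here is a minimal sketch of the textbook one-array formulation. It is not the author's code: the name knapsack_1d and the parallel weights and values arrays are made up for illustration, it assumes a non-negative capacity and non-negative weights, and it returns only the best total value rather than the chosen items.

```moonbit
// Hypothetical sketch of the classic one-array 0/1 knapsack.
// Assumes capacity >= 0 and that weights and values have equal length.
fn knapsack_1d(
  weights : Array[Int],
  values : Array[Int],
  capacity : Int
) -> Int {
  // dp[c] = best total value achievable with total weight at most c
  let dp = Array::make(capacity + 1, 0)
  for i in 0..<weights.length() {
    // Walk capacities downwards so that each item is used at most once.
    let mut c = capacity
    while c >= weights[i] {
      let candidate = dp[c - weights[i]] + values[i]
      if candidate > dp[c] {
        dp[c] = candidate
      }
      c -= 1
    }
  }
  dp[capacity]
}

test "one-array dp on items_1" {
  // Same weights and values as items_1, capacity 10.
  assert_eq(knapsack_1d([7, 4, 5], [20, 10, 11], 10), 21)
}
```

Each dp[c] plays a role similar to one entry of the deduplicated combination list, except that it is indexed by a weight limit rather than by an exact total weight.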

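Appendix

Further fine-tuning

Because the order of the items in items does not affect the total value of the result, all_combinations can be turned into a tail-recursive function.

In addition, the list produced by take_while is thrown away right after map, so we can switch to an iterator to avoid building this throwaway list.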

fn all_combinations_loop(
  items : List[Item],
  capacity : Int
) -> List[Combination] {
  loop items, @list.singleton(empty_combination) {
    Empty, combs_so_far => combs_so_far
    More(item1, tail=items_tail), combs_so_far => {
      let combs_without_item1 = combs_so_far
      let combs_with_item1 = combs_without_item1
        .iter()
        .take_while(fn(comb) { comb.total_weight + item1.weight <= capacity })
        .map(_.add(item1))
        |> @list.from_iter
      continue items_tail, merge_keep_order_and_dedup(combs_with_item1, combs_without_item1)
    }
  }
}

fn solve_v5(items : List[Item], capacity : Int) -> Combination {
  // The list is ascending by total weight, not by total value,
  // so the best combination is its maximum rather than its last element
  all_combinations_loop(items, capacity).unsafe_maximum()
}

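Asides

  1. In the first version, all_combinations(items) produces one more Combination than there are More nodes inside them: a true master of linked-list node reuse.
  2. Ascending order could be swapped for descending order, in which case take_while becomes drop_while. After switching to Array, binary_search can find the split index directly.
  3. If you are interested, think about how to extend the approach above to the many other knapsack problems.
  4. The original name of all_combinations_loop: generate_all_ordered_combination_that_fit_in_backpack_list_without_duplicates_using_loop

Tests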

test {
  for solve in [solve_v1, solve_v2, solve_v3, solve_v4, solve_v5] {
    assert_eq(solve(items_1, 10).total_value, 21)
  }
}

MoonBit Pearls Vol.02: Object-Oriented Programming in MoonBit

· 22 min read
刘子悦


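Introduction

In the world of software development, object-oriented programming (OOP) is a topic that cannot be avoided. Languages such as Java and C++ have built countless complex systems on their powerful OOP mechanisms. So how does MoonBit, a modern language built around functional programming, do OOP?

MoonBit is a language with functional programming at its core, and its approach to object-oriented programming differs considerably from that of traditional languages. It abandons traditional inheritance and embraces the design philosophy of "composition over inheritance". At first glance this may feel unfamiliar to programmers used to traditional OOP, but on closer inspection the approach turns out to be unexpectedly elegant and practical.

This article uses an RPG game development example to give you hands-on experience of object-oriented programming in MoonBit. We will examine the three pillars of encapsulation, inheritance and polymorphism one by one, compare them with their C++ counterparts, and finish with some best-practice advice for real-world development.

Encapsulation

Imagine we are developing a classic single-player RPG. In this fantasy world the hero travels around, fights monsters, buys equipment from NPC merchants, and finally rescues the captive princess. To build such a world we first need to model every element in it.

Whether it is a brave hero, a fierce monster, or a humble table or chair, everything in the game world shares some common traits. We can abstract all these objects as Sprite, and every Sprite should have a few basic properties:

  • ID: the unique identifier of the object, much like an ID card number.
  • x and y: the coordinates on the game map.

Classic encapsulation in C++

In the C++ world we are used to encapsulating data with class: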

// A basic Sprite class
class Sprite {
private:
    int id;
    double x;
    double y;

public:
    // Constructor, used to create objects
    Sprite(int id, double x, double y) : id(id), x(x), y(y) {}

    // Public "getter" methods for reading the data
    int getID() const { return id; }
    double getX() const { return x; }
    double getY() const { return y; }

    // "Setter" methods may also be needed to modify the data
    void setX(double newX) { x = newX; }
    void setY(double newY) { y = newY; }
};

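You might ask: "Why so many get methods? Wouldn't it be simpler to make the fields public?" This is where the core idea of encapsulation comes in.

Why encapsulate?

Imagine a colleague modifying the ID directly with sprite.id = enemy_id: the hero would instantly "turn into" one of the enemy's own and could stroll straight to the finish line, which is clearly not the game mechanic we want! Encapsulation puts a safety net around the data: private fields combined with getter methods ensure that outside code can read key data but cannot modify it at will. This makes the code more robust and avoids unexpected side effects.

Elegant encapsulation in MoonBit

In MoonBit, the approach to encapsulation changes in a subtle but important way. Let us start with a simple version: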

// Define Sprite in MoonBit
pub struct Sprite {
  id: Int          // immutable by default: readable from outside but not writable
  mut x: Double    // the mut keyword marks a field as mutable
  mut y: Double
}

// We can define methods for the struct
pub fn Sprite::get_x(self: Self) -> Double {
  self.x
}

pub fn Sprite::get_y(self: Self) -> Double {
  self.y
}

pub fn Sprite::set_x(self: Self, new_x: Double) -> Unit {
  self.x = new_x
}

pub fn Sprite::set_y(self: Self, new_y: Double) -> Unit {
  self.y = new_y
}

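Note two key differences here:

1. Mutability is declared explicitly

In MoonBit, fields are immutable by default. To make a field modifiable, you must mark it explicitly with the mut keyword. In our Sprite, id stays immutable, which matches our design intent exactly: we do not want an object's identity to be tampered with. x and y, on the other hand, are marked mut because a sprite needs to move freely around the world.

2. Simpler access control

Because id is immutable, we do not even need to write a get_id method for it. Outside code can read it directly as sprite.id, while the compiler rejects any attempt to modify it. This is more concise than the C++ "private + getter" pattern and gives the same safety.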
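For comparison, C++ can approximate this read-only-by-default behavior with a public const member. The sketch below is only an illustration: SpriteView and move_by are hypothetical names that do not appear elsewhere in this article.

```cpp
// Hypothetical C++ counterpart of the MoonBit struct above:
// `id` is readable everywhere but can never be reassigned.
struct SpriteView {
    const int id;   // like a MoonBit field without `mut`
    double x;       // like `mut x`
    double y;       // like `mut y`
};

// Moves the sprite; only the mutable fields can be changed.
inline void move_by(SpriteView& s, double dx, double dy) {
    s.x += dx;
    s.y += dy;
    // s.id = 42;  // would not compile: `id` is a const member
}
```

One trade-off in C++ is that a const member also disables copy assignment for the whole struct, which is one reason the getter pattern remains more common there.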

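💡 Practical tip

When designing a data structure, first consider which fields really need to be mutable. MoonBit's immutable-by-default design helps you avoid many bugs caused by accidental state changes.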

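Inheritance

The second pillar of object-oriented programming is inheritance. Our RPG world contains many different kinds of Sprite. To keep the example simple, we define three:

  • Hero: the character controlled by the player
  • Enemy: an opponent to be defeated
  • Merchant: an NPC that sells items

The inheritance hierarchy in C++

In C++, it is natural to use class inheritance to build this hierarchy: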

class Enemy;  // forward declaration, because Hero::attack refers to Enemy

class Hero : public Sprite {
private:
    double hp;
    double damage;
    int money;

public:
    Hero(int id, double x, double y, double hp, double damage, int money)
        : Sprite(id, x, y), hp(hp), damage(damage), money(money) {}

    void attack(Enemy& e) { /* ... */ }
};

class Enemy : public Sprite {
private:
    double hp;
    double damage;

public:
    Enemy(int id, double x, double y, double hp, double damage)
        : Sprite(id, x, y), hp(hp), damage(damage) {}

    void attack(Hero& h) { /* ... */ }
};

class Merchant : public Sprite {
public:
    Merchant(int id, double x, double y) : Sprite(id, x, y) {}
    // Merchant-specific methods...
};

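Object orientation in C++ is built on the "is-a" relationship: a Hero is a Sprite, and an Enemy is a Sprite. This way of thinking is intuitive and easy to understand.

Compositional thinking in MoonBit

Now it is MoonBit's turn, and this requires an important shift in thinking: a MoonBit struct does not support inheritance. Instead, MoonBit uses traits and composition.

This design forces us to rethink the problem. We no longer treat Sprite as a "parent class" to inherit from; we split it into two independent concepts:

  1. SpriteData: a plain data structure that stores the data shared by all Sprites
  2. Sprite: a trait that defines the behavior every Sprite should have

Let's look at the actual code: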

// 1. Define the shared data structure
pub struct SpriteData {
  id: Int
  mut x: Double
  mut y: Double
}

// 2. Define a trait that describes the common behavior
pub trait Sprite {
  getSpriteData(Self) -> SpriteData  // the core method that must be implemented
  getID(Self) -> Int = _             // = _ means the method has a default implementation
  getX(Self) -> Double = _
  getY(Self) -> Double = _
  setX(Self, Double) -> Unit = _
  setY(Self, Double) -> Unit = _
}

// Default implementations for Sprite
// Any type that implements getSpriteData automatically gets the other methods
impl Sprite with getID(self) {
  self.getSpriteData().id
}

impl Sprite with getX(self) {
  self.getSpriteData().x
}

impl Sprite with getY(self) {
  self.getSpriteData().y
}

impl Sprite with setX(self, new_x) {
  self.getSpriteData().x = new_x
}

impl Sprite with setY(self, new_y) {
  self.getSpriteData().y = new_y
}

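Understanding the power of traits

The Sprite trait defines a "contract": any type that claims to be a Sprite must be able to provide its SpriteData. Once that condition is met, methods such as getID, getX, and getY become available automatically. The = _ syntax marks a method as having a default implementation; it is a recent addition to MoonBit's syntax.

With this foundation in place, we can implement the concrete game characters: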

// Define Hero
pub struct Hero {
  sprite_data: SpriteData  // compose SpriteData
  hp: Double
  damage: Int
  money: Int
}

// Implement the Sprite trait; only getSpriteData has to be provided
pub impl Sprite for Hero with getSpriteData(self) {
  self.sprite_data
}

pub fn Hero::attack(self: Self, e: Enemy) -> Unit {
  // attack logic...
}

// Define Enemy
pub struct Enemy {
  sprite_data: SpriteData
  hp: Double
  damage: Int
}

pub impl Sprite for Enemy with getSpriteData(self) {
  self.sprite_data
}

pub fn Enemy::attack(self: Self, h: Hero) -> Unit {
  // attack logic...
}

// Define Merchant
pub struct Merchant {
  sprite_data: SpriteData
}

pub impl Sprite for Merchant with getSpriteData(self) {
  self.sprite_data
}

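Note the shift in thinking here: MoonBit uses a "has-a" relationship rather than the "is-a" relationship of traditional OOP. Hero has a SpriteData and implements the capabilities of Sprite.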
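To make the "has-a" idea concrete for readers coming from C++, here is a minimal sketch of the same composition pattern written without inheritance. It is an analogy, not a translation of the MoonBit semantics: the free function templates getID and setX stand in for the trait's default methods and are assumptions made for this illustration.

```cpp
// Shared data, owned by value (the "has-a" part).
struct SpriteData {
    int id;
    double x;
    double y;
};

struct Hero {
    SpriteData sprite_data;
    double hp;
    int damage;
    int money;
    SpriteData& getSpriteData() { return sprite_data; }
};

struct Merchant {
    SpriteData sprite_data;
    SpriteData& getSpriteData() { return sprite_data; }
};

// "Default methods": usable with any type that exposes getSpriteData().
template <typename S>
int getID(S& s) { return s.getSpriteData().id; }

template <typename S>
void setX(S& s, double new_x) { s.getSpriteData().x = new_x; }
```

Neither Hero nor Merchant derives from anything; each one only promises to hand out its SpriteData, and the shared behavior follows from that single promise.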

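Does MoonBit look more complicated?

At first glance, the MoonBit code seems to need more boilerplate than the C++ code. But that is only on the surface. Our C++ examples deliberately sidestep much of the language's complexity: constructors, destructors, const correctness, template instantiation, and so on. More importantly, MoonBit's design shows significant advantages in large projects, which we discuss in detail later.

Polymorphism

Polymorphism is the third pillar of object-oriented programming: the ability of one interface to produce different behavior when applied to different objects. Let's work through a concrete example. Suppose we need a who_are_you function that identifies the type of the object passed in and answers accordingly.

Polymorphism in C++

Polymorphism in C++ is a fairly involved topic. Broadly speaking, it covers static polymorphism (templates) and dynamic polymorphism (virtual functions, RTTI, and so on). A full discussion is beyond the scope of this article; interested readers can consult the literature. Here we focus on two classic approaches to run-time polymorphism.

Approach 1: virtual functions

The most traditional approach is to declare a virtual function in the base class and let the subclasses override it: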

#include <iostream>
#include <string>

class Sprite {
public:
    virtual ~Sprite() = default;  // virtual destructor
    // A pure virtual function that every subclass must implement
    virtual std::string say_name() const = 0;
};

// Override the function in each subclass
class Hero : public Sprite {
public:
    std::string say_name() const override {
        return "I am a hero!";
    }
    // ...
};

class Enemy : public Sprite {
public:
    std::string say_name() const override {
        return "I am an enemy!";
    }
    // ...
};

class Merchant : public Sprite {
public:
    std::string say_name() const override {
        return "I am a merchant.";
    }
    // ...
};

// Now who_are_you becomes extremely simple!
void who_are_you(const Sprite& s) {
    std::cout << s.say_name() << std::endl;
}

Approach 2: RTTI + dynamic_cast

If we do not want to define a virtual function for every class, we can use C++ run-time type information (RTTI) instead:

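#include <iostream>

class Sprite {
public:
    // dynamic_cast needs a polymorphic class, that is, one with at least one virtual function
    virtual ~Sprite() = default;
};

// Minimal subclasses so that the example is complete
class Hero : public Sprite {};
class Enemy : public Sprite {};
class Merchant : public Sprite {};

// Implementation of who_are_you
void who_are_you(const Sprite& s) {
    if (dynamic_cast<const Hero*>(&s)) {
        std::cout << "I am a hero!" << std::endl;
    } else if (dynamic_cast<const Enemy*>(&s)) {
        std::cout << "I am an enemy!" << std::endl;
    } else if (dynamic_cast<const Merchant*>(&s)) {
        std::cout << "I am a merchant." << std::endl;
    } else {
        std::cout << "I don't know who I am" << std::endl;
    }
}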

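How RTTI works

With RTTI enabled, the C++ compiler emits an implicit type_info structure for every polymorphic class (a class with at least one virtual function), and each object can reach it, typically through its vtable. When dynamic_cast is applied to a pointer, this type information is checked at run time: on a match the cast yields a valid pointer, otherwise it yields nullptr. The mechanism is powerful, but it comes with run-time overhead.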
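The two RTTI operations described above can be observed in isolation. The following minimal sketch uses simplified, empty versions of the article's classes; the helper names is_hero and same_dynamic_type are made up for this illustration.

```cpp
#include <typeinfo>

struct Sprite { virtual ~Sprite() = default; };  // polymorphic base
struct Hero : Sprite {};
struct Enemy : Sprite {};

// True when the dynamic type of `s` is Hero (or a class derived from it).
// A failed pointer dynamic_cast yields nullptr rather than throwing.
inline bool is_hero(const Sprite& s) {
    return dynamic_cast<const Hero*>(&s) != nullptr;
}

// Compares the dynamic types of two sprites through their type_info objects.
inline bool same_dynamic_type(const Sprite& a, const Sprite& b) {
    return typeid(a) == typeid(b);
}
```

Both helpers take a reference to the base class, so the answer can only come from type information looked up at run time.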

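The second approach, however, has several problems in large projects:

  1. It is not type-safe. If you add a new subclass but forget to update who_are_you, the bug is only discovered at run time. In modern software development, we would much rather catch this kind of error at compile time.
  2. Performance suffers. With RTTI, every type check goes through a relatively expensive type-information lookup that is hard to optimize, so it can easily become a performance problem.
  3. The data is opaque. With RTTI, C++ implicitly attaches type information to every polymorphic class, but the author of the code cannot see it. This is a real headache for library authors who want tighter control over their code. In fact, quite a few large projects choose to disable RTTI. The best-known example is LLVM: a C++ compiler project that itself avoids using RTTI.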
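The first problem in the list is easy to reproduce. In the sketch below, Mage is a hypothetical subclass added after who_are_you was written, and the function returns a string instead of printing so that the result can be checked; the code compiles without complaint and only misbehaves at run time.

```cpp
#include <string>

struct Sprite { virtual ~Sprite() = default; };
struct Hero : Sprite {};
struct Enemy : Sprite {};
struct Mage : Sprite {};  // hypothetical subclass added later

// Written before Mage existed and never updated.
inline std::string who_are_you(const Sprite& s) {
    if (dynamic_cast<const Hero*>(&s)) return "I am a hero!";
    if (dynamic_cast<const Enemy*>(&s)) return "I am an enemy!";
    return "I don't know who I am";  // a Mage silently lands here
}
```

Nothing in the type system connects the chain of casts to the set of subclasses, so the compiler has no way to notice the missing branch.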

MoonBit's ADT mechanism

MoonBit solves the polymorphism problem elegantly with algebraic data types (ADTs). We need to add one new construct, SpriteEnum:

pub trait Sprite {
  getSpriteData(Self) -> SpriteData
  asSpriteEnum(Self) -> SpriteEnum  // new: conversion method
}

// MoonBit allows an enum constructor to share its name with a type
pub enum SpriteEnum {
  Hero(Hero)
  Enemy(Enemy)
  Merchant(Merchant)
}

// We still need to implement getSpriteData from Sprite
pub impl Sprite for Hero with getSpriteData(self) {
  self.sprite_data
}

pub impl Sprite for Enemy with getSpriteData(self) {
  self.sprite_data
}

pub impl Sprite for Merchant with getSpriteData(self) {
  self.sprite_data
}

// Implement asSpriteEnum for the three subtypes
// This effectively "boxes" the concrete type into the enum
pub impl Sprite for Hero with asSpriteEnum(self) {
  Hero(self)  // note: Hero here is the enum constructor, not the type
}

pub impl Sprite for Enemy with asSpriteEnum(self) {
  Enemy(self)
}

pub impl Sprite for Merchant with asSpriteEnum(self) {
  Merchant(self)
}

Now we can implement a type-safe who_are_you function:

test "who are you" {
  fn who_are_you(s: &Sprite) -> String {
    // Dispatch on the type with pattern matching
    match s.asSpriteEnum() {
      Hero(_) => "hero"
      Enemy(_) => "enemy"
      Merchant(_) => "merchant"
    }
  }

  let hero = Hero::new();
  let enemy = Enemy::new();
  let merchant = Merchant::new();
  inspect(who_are_you(hero), content="hero")
  inspect(who_are_you(enemy), content="enemy")
  inspect(who_are_you(merchant), content="merchant")
}

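The beauty of this approach is that it is type-safe at compile time. If you add a new Sprite subtype and extend SpriteEnum with it but forget to update who_are_you, the compiler flags the incomplete match right away instead of leaving the problem to be discovered at run time.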
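C++ readers can get a feel for this guarantee from std::variant, which is the closest standard-library analogue to a closed enum of types. The sketch below assumes C++17; SpriteVariant and the overloaded helper are names chosen for this illustration, not part of the article's code.

```cpp
#include <string>
#include <variant>

struct Hero {};
struct Enemy {};
struct Merchant {};

// Closed set of alternatives, playing the role of SpriteEnum.
using SpriteVariant = std::variant<Hero, Enemy, Merchant>;

// Helper that merges several lambdas into one overload set.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

inline std::string who_are_you(const SpriteVariant& s) {
    // If any one of these lambdas is removed, std::visit no longer compiles,
    // because the visitor cannot be called with that alternative.
    return std::visit(overloaded{
        [](const Hero&) { return std::string("hero"); },
        [](const Enemy&) { return std::string("enemy"); },
        [](const Merchant&) { return std::string("merchant"); }
    }, s);
}
```

As with the MoonBit match, adding a fourth alternative to the variant turns every visitor that has not been updated into a compile-time error.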

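Static dispatch vs. dynamic dispatch

You may have noticed the &Sprite in the function signature. In MoonBit this is called a trait object; it supports dynamic dispatch, similar to virtual functions in C++. If you instead write fn[S: Sprite] who_are_you(s: S), you get static dispatch (generics), and the compiler generates specialized code for each concrete type.

The key difference between the two is how they handle heterogeneous collections. Suppose the hero has an AOE skill that must hit an array containing different kinds of enemies. You have to use Array[&Sprite] rather than Array[S], because the latter cannot hold several different concrete types at once.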
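The same distinction exists in C++, where templates give static dispatch and a base-class interface gives dynamic dispatch. The following sketch is illustrative only: roll_call and the two who_are_you_* functions are hypothetical names, and the classes are reduced to the single say_name method.

```cpp
#include <memory>
#include <string>
#include <vector>

struct SayName {
    virtual ~SayName() = default;
    virtual std::string say_name() const = 0;
};
struct Hero final : SayName {
    std::string say_name() const override { return "hero"; }
};
struct Enemy final : SayName {
    std::string say_name() const override { return "enemy"; }
};

// Static dispatch: one instantiation per concrete type S, chosen at compile time.
template <typename S>
std::string who_are_you_static(const S& s) { return s.say_name(); }

// Dynamic dispatch: a single function that works through the base interface.
inline std::string who_are_you_dynamic(const SayName& s) { return s.say_name(); }

// A heterogeneous collection has to store its elements through the interface.
inline std::string roll_call(const std::vector<std::unique_ptr<SayName>>& sprites) {
    std::string out;
    for (const auto& s : sprites) {
        if (!out.empty()) out += ",";
        out += who_are_you_dynamic(*s);
    }
    return out;
}
```

A std::vector<Hero> could hold only heroes, just as Array[S] can hold only one concrete S; mixing heroes and enemies requires the pointer-to-interface form.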

Of course, MoonBit also supports direct method calls in the style of C++ virtual functions:

pub trait SayName {
  say_name(Self) -> String
}

pub impl SayName for Hero with say_name(_) {
  "hero"
}

pub impl SayName for Enemy with say_name(_) {
  "enemy"
}

pub impl SayName for Merchant with say_name(_) {
  "merchant"
}

test "say_name" {
  fn who_are_you(s: &SayName) -> String {
    s.say_name()  // call the trait method directly, like a virtual function
  }

  let hero = Hero::new();
  let enemy = Enemy::new();
  let merchant = Merchant::new();
  inspect(who_are_you(hero), content="hero")
  inspect(who_are_you(enemy), content="enemy")
  inspect(who_are_you(merchant), content="merchant")
}

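Making RTTI explicit

In effect, MoonBit's ADT approach makes the implicit RTTI process of C++ explicit. The developer knows exactly which types exist, and the compiler can check exhaustiveness at compile time.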
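C++ projects that disable RTTI often get a similar effect with a hand-written type tag. The sketch below shows the general idea under simplifying assumptions; SpriteKind and the kind field are hypothetical names, and this is not a description of how any particular project implements it.

```cpp
#include <string>

enum class SpriteKind { Hero, Enemy, Merchant };

struct Sprite {
    explicit Sprite(SpriteKind k) : kind(k) {}
    const SpriteKind kind;  // explicit type tag, visible to the programmer
};
struct Hero : Sprite { Hero() : Sprite(SpriteKind::Hero) {} };
struct Enemy : Sprite { Enemy() : Sprite(SpriteKind::Enemy) {} };
struct Merchant : Sprite { Merchant() : Sprite(SpriteKind::Merchant) {} };

inline std::string who_are_you(const Sprite& s) {
    switch (s.kind) {  // no default branch, so a missing kind can be diagnosed
        case SpriteKind::Hero:     return "hero";
        case SpriteKind::Enemy:    return "enemy";
        case SpriteKind::Merchant: return "merchant";
    }
    return "unknown";  // only reachable if the tag holds an invalid value
}
```

With GCC or Clang and -Wall, leaving out one enumerator in such a switch produces a -Wswitch warning, which is a weaker form of the exhaustiveness check that MoonBit's match performs.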

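Multi-level inheritance: building a richer capability system

As the game grows, we notice that Hero and Enemy both have hp (hit points), damage (attack power), and an attack method. Can we factor out these shared features into a Warrior layer?

Multi-level inheritance in C++

In C++, we can naturally insert a new intermediate layer into the inheritance chain: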

class Warrior : public Sprite {
protected: // protected, so subclasses can access these fields
    double hp;
    double damage;

public:
    Warrior(int id, double x, double y, double hp, double damage)
        : Sprite(id, x, y), hp(hp), damage(damage) {}

    virtual void attack(Sprite& target) = 0; // every warrior can attack

    double getHP() const { return hp; }
    double getDamage() const { return damage; }
};

class Hero final : public Warrior {
private:
    int money;

public:
    Hero(int id, double x, double y, double hp, double damage, int money)
        : Warrior(id, x, y, hp, damage), money(money) {}

    // The pure virtual attack must be overridden, otherwise Hero stays abstract
    void attack(Sprite& target) override { /* ... */ }
};

class Enemy final : public Warrior {
public:
    Enemy(int id, double x, double y, double hp, double damage)
        : Warrior(id, x, y, hp, damage) {}

    void attack(Sprite& target) override { /* ... */ }
};

class Merchant final : public Sprite {
public:
    Merchant(int id, double x, double y) : Sprite(id, x, y) {}
}; // Merchant still inherits directly from Sprite

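This forms a clear inheritance chain: Sprite → Warrior → Hero/Enemy, and Sprite → Merchant.

Compositional multi-level capabilities in MoonBit

In MoonBit, we stick with composition and build a more flexible capability system: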

pub struct WarriorData {
  hp: Double
  damage: Double
}

// The Warrior trait extends Sprite, forming a capability hierarchy
pub trait Warrior : Sprite {
  getWarriorData(Self) -> WarriorData
  asWarriorEnum(Self) -> WarriorEnum
  attack(Self, target: &Warrior) -> Unit = _  // default implementation
}

pub enum WarriorEnum {
  Hero(Hero)
  Enemy(Enemy)
}

// Redefine Hero; it now composes two kinds of data
pub struct Hero {
  sprite_data: SpriteData    // basic sprite data
  warrior_data: WarriorData  // warrior data
  money: Int                 // hero-specific data
}

// Hero needs to implement several traits
pub impl Sprite for Hero with getSpriteData(self) {
  self.sprite_data
}

pub impl Warrior for Hero with getWarriorData(self) {
  self.warrior_data
}

pub impl Warrior for Hero with asWarriorEnum(self) {
  WarriorEnum::Hero(self)  // qualified, since SpriteEnum also has a Hero constructor
}

// Redefine Enemy
pub struct Enemy {
  sprite_data: SpriteData
  warrior_data: WarriorData
}

pub impl Sprite for Enemy with getSpriteData(self) {
  self.sprite_data
}

pub impl Warrior for Enemy with getWarriorData(self) {
  self.warrior_data
}

pub impl Warrior for Enemy with asWarriorEnum(self) {
  WarriorEnum::Enemy(self)
}

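Sometimes we also need to convert a parent trait into a child trait (a downcast). For example, our merchant may react differently to different Sprites: when he meets a Warrior he says "Want to buy something?", and when he meets another merchant he does nothing. In that case we need to convert the parent trait Sprite into the child trait Warrior. The recommended way is to add a tryAsWarrior method to the Sprite trait: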

pub trait Sprite {
  // other methods
  tryAsWarrior(Self) -> &Warrior? = _  // try to convert to a Warrior
}

impl Sprite with tryAsWarrior(self) {
  match self.asSpriteEnum() {
    // The first arm needs `as &Warrior` to tell the compiler that the whole
    // expression returns a &Warrior. Without it, the compiler would infer the
    // type of the whole expression from the first arm as Hero, which leads to
    // a compile error.
    Hero(h) => Some(h as &Warrior)
    Enemy(e) => Some(e)
    _ => None
  }
}

pub fn Merchant::ask(self: Merchant, s: &Sprite) -> String {
  match s.tryAsWarrior() {
    Some(_) => "Want to buy something?"  // speak to warriors
    None => ""                           // stay silent for everything else
  }
}
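The same "ask the object to downcast itself" pattern can be written in C++ without dynamic_cast. The sketch below is a rough analogue built on a virtual method rather than an ADT; the classes are stripped down to the parts that matter here, and the layout is an assumption made for illustration.

```cpp
#include <string>

struct Warrior;  // forward declaration

struct Sprite {
    virtual ~Sprite() = default;
    // Default: this sprite is not a warrior. Warrior overrides it below.
    virtual const Warrior* tryAsWarrior() const { return nullptr; }
};

struct Warrior : Sprite {
    const Warrior* tryAsWarrior() const override { return this; }
};

struct Hero final : Warrior {};

struct Merchant final : Sprite {
    // Speaks to warriors and stays silent for every other sprite.
    std::string ask(const Sprite& s) const {
        return s.tryAsWarrior() ? "Want to buy something?" : "";
    }
};
```

The null pointer plays the role of None and a non-null pointer the role of Some, so the caller still has to handle both outcomes explicitly.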

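The strength of this design lies in its flexibility:

  • Hero and Enemy compose SpriteData and WarriorData and implement both the Sprite and Warrior traits, which gives them every capability they need
  • Merchant only needs to compose SpriteData and implement the Sprite trait
  • If we later introduce a Mage capability, we only need to define MageData and a Mage trait
  • A character can even be both a Warrior and a Mage, a "spellblade", without running into the diamond inheritance problem of C++

The diamond inheritance problem

Suppose we want to create a Profiteer class that is both a merchant and an enemy. In C++, if Profiteer inherits from both Enemy and Merchant, we get diamond inheritance: Profiteer ends up with two copies of the Sprite data (unless virtual inheritance is used). This can lead to baffling bugs in which one copy is modified while the other is read. MoonBit's composition approach avoids the problem at its root.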
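The duplicated base can be demonstrated in a few lines. The sketch below reduces the classes to a single id field, and the helper function name is made up for this illustration; it writes through one inheritance path and reads through the other.

```cpp
struct Sprite {
    int id = 0;
};
struct Enemy : Sprite {};               // non-virtual inheritance
struct Merchant : Sprite {};            // non-virtual inheritance
struct Profiteer : Enemy, Merchant {};  // contains two Sprite subobjects

// Writes the id through the Enemy path and reads it back through the
// Merchant path; the two paths refer to different Sprite subobjects.
inline int id_seen_by_merchant_after_enemy_update(Profiteer& p, int new_id) {
    static_cast<Enemy&>(p).id = new_id;
    return static_cast<Merchant&>(p).id;
}
```

The value read back is still the old one, because the update went to the other copy. C++ can remove the duplication with virtual inheritance, at the cost of additional rules around construction and object layout.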


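Deeper problems with traditional object-oriented programming

At this point you may be thinking: "The MoonBit approach needs more code and looks more complicated!" Indeed, judged by line count, MoonBit seems to need more boilerplate. In real-world software engineering, however, the traditional object-oriented style has a number of deeper problems:

1. Fragile inheritance chains

Problem: any change to a parent class affects all of its subclasses and can trigger chain reactions that are hard to predict.

Imagine your RPG has been out for two years and has hundreds of Sprite subclasses. Now you need to refactor the base Sprite class, and you quickly find that this is impractical. In a traditional inheritance hierarchy the change reaches every subclass, so even a small modification can have a large impact. Some subclasses may change behavior unexpectedly, and you have to review and test all the related code one by one.

MoonBit's solution: with the compositional design, the ADT lists every subtype of Sprite directly, so the scope of a refactoring is known immediately.

2. The diamond inheritance nightmare

Problem: multiple inheritance easily leads to diamond inheritance, which causes duplicated data and ambiguous method calls.

As described earlier, when Profiteer inherits from both Enemy and Merchant, it holds two copies of the Sprite data. This wastes memory and, worse, can cause bugs from inconsistent data.

MoonBit's solution: composition avoids the problem by construction. Profiteer can hold a SpriteData, a WarriorData, and a MerchantData, clearly and without duplication.

3. Hidden run-time errors

Problem: many problems in traditional OOP only surface at run time, which makes debugging harder and raises project risk.

Remember the dynamic_cast example? If you add a new subclass but forget to update the type-checking code, the problem only shows up when the program reaches that branch. In a large project, that can mean the bug is first discovered in production.

MoonBit's solution: ADTs combined with pattern matching provide compile-time type safety. If any case is missing, the compiler reports it.

4. Complexity explosion

Problem: deep inheritance trees become hard to understand and maintain.

After a few years of development, your game may have evolved an inheritance tree like this: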

Sprite
├── Warrior
│   ├── Hero
│   │   ├── Paladin
│   │   ├── Berserker
│   │   └── ...
│   └── Enemy
│       ├── Orc
│       ├── Dragon
│       └── ...
├── Mage
│   ├── Wizard
│   └── Sorcerer
└── NPC
    ├── Merchant
    ├── QuestGiver
    └── ...

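When it is time to refactor, you may spend a great deal of time just understanding this inheritance structure, and any change can have unexpected side effects.

MoonBit's solution: a flat compositional structure makes the system easier to understand. Each capability is an independent trait, and the composition relationships are visible at a glance.

Conclusion

Through this comparison we have seen two very different philosophies of object-oriented programming:

  • Traditional OOP in C++: based on inheritance and the "is-a" relationship; intuitive, but prone to the complexity trap
  • Modern OOP in MoonBit: based on composition and the "has-a" relationship; slightly harder to learn at first, but more elegant in the long run

The MoonBit approach does require more boilerplate, but in exchange you get:

  • Better type safety: more errors are caught at compile time
  • Clearer architecture: composition relationships are easier to understand than inheritance relationships
  • Easier maintenance: the impact of a change is easier to contain
  • Fewer run-time errors: ADTs and pattern matching provide exhaustiveness guarantees

Admittedly, traditional inheritance still has its place in small projects and specific scenarios. But as software systems grow more complex, MoonBit's composition-over-inheritance philosophy shows better adaptability and maintainability.

We hope this article offers useful guidance on your MoonBit journey and helps you take full advantage of MoonBit's design when building complex systems.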


Complete code

pub struct SpriteData {
  id: Int
  mut x: Double
  mut y: Double
}

pub fn SpriteData::new(id: Int, x: Double, y: Double) -> SpriteData {
  SpriteData::{ id, x, y }
}

// 2. Define a trait that describes the common behavior
pub trait Sprite {
  getSpriteData(Self) -> SpriteData
  asSpriteEnum(Self) -> SpriteEnum
  tryAsWarrior(Self) -> &Warrior? = _
  getID(Self) -> Int = _
  getX(Self) -> Double = _
  getY(Self) -> Double = _
  setX(Self, Double) -> Unit = _
  setY(Self, Double) -> Unit = _
}

// Default implementations for Sprite
// Any type that implements getSpriteData automatically gets the other methods
impl Sprite with getID(self) {
  self.getSpriteData().id
}

impl Sprite with getX(self) {
  self.getSpriteData().x
}

impl Sprite with getY(self) {
  self.getSpriteData().y
}

impl Sprite with setX(self, new_x) {
  self.getSpriteData().x = new_x
}

impl Sprite with setY(self, new_y) {
  self.getSpriteData().y = new_y
}

impl Sprite with tryAsWarrior(self) {
  match self.asSpriteEnum() {
    Hero(h) => Some(h as &Warrior)
    Enemy(e) => Some(e)
    _ => None
  }
}

pub enum SpriteEnum {
  Hero(Hero)
  Enemy(Enemy)
  Merchant(Merchant)
}

pub struct WarriorData {
  hp: Double
  damage: Double
}

pub trait Warrior : Sprite { // Warrior extends Sprite
  getWarriorData(Self) -> WarriorData
  asWarriorEnum(Self) -> WarriorEnum
  attack(Self, target: &Warrior) -> Unit = _
}

impl Warrior with attack(self, target) {
  ignore((self, target))
  // ...
}

pub enum WarriorEnum {
  Hero(Hero)
  Enemy(Enemy)
}

// Define Hero
pub struct Hero {
  sprite_data: SpriteData
  warrior_data: WarriorData
  money: Int
}

pub fn Hero::new() -> Hero {
  let sprite_data = SpriteData::new(0, 42, 33)
  let warrior_data = WarriorData::{ hp: 100, damage: 20 }
  Hero::{ sprite_data, warrior_data, money: 1000 }
}

pub impl Sprite for Hero with getSpriteData(self) {
  self.sprite_data
}

pub impl Sprite for Hero with asSpriteEnum(self) {
  Hero(self)
}

pub impl Warrior for Hero with getWarriorData(self) {
  self.warrior_data
}

pub impl Warrior for Hero with asWarriorEnum(self) {
  WarriorEnum::Hero(self)
}

// Define Enemy
pub struct Enemy {
  sprite_data: SpriteData
  warrior_data: WarriorData
}

pub fn Enemy::new() -> Enemy {
  let sprite_data = SpriteData::new(0, 42, 33)
  let warrior_data = WarriorData::{ hp: 100, damage: 5 }
  Enemy::{ sprite_data, warrior_data }
}

pub impl Sprite for Enemy with getSpriteData(self) {
  self.sprite_data
}

pub impl Sprite for Enemy with asSpriteEnum(self) {
  Enemy(self)
}

pub impl Warrior for Enemy with getWarriorData(self) {
  self.warrior_data
}

pub impl Warrior for Enemy with asWarriorEnum(self) {
  WarriorEnum::Enemy(self)
}

// Define Merchant
pub struct Merchant {
  sprite_data: SpriteData
}

pub fn Merchant::new() -> Merchant {
  let sprite_data = SpriteData::new(0, 42, 33)
  Merchant::{ sprite_data }
}

pub impl Sprite for Merchant with getSpriteData(self) {
  self.sprite_data
}

pub impl Sprite for Merchant with asSpriteEnum(self) {
  Merchant(self)
}

pub fn Merchant::ask(self: Merchant, s: &Sprite) -> String {
  ignore(self)
  match s.tryAsWarrior() {
    Some(_) => "Want to buy something?"
    None => ""
  }
}

test "who are you" {
  fn who_are_you(s: &Sprite) -> String {
    match s.asSpriteEnum() {
      Hero(_) => "hero"
      Enemy(_) => "enemy"
      Merchant(_) => "merchant"
    }
  }

  let hero = Hero::new();
  let enemy = Enemy::new();
  let merchant = Merchant::new();
  inspect(who_are_you(hero), content="hero")
  inspect(who_are_you(enemy), content="enemy")
  inspect(who_are_you(merchant), content="merchant")
}

pub trait SayName {
  say_name(Self) -> String
}

pub impl SayName for Hero with say_name(_) {
  "hero"
}

pub impl SayName for Enemy with say_name(_) {
  "enemy"
}

pub impl SayName for Merchant with say_name(_) {
  "merchant"
}

test "say_name" {
  fn who_are_you(s: &SayName) -> String {
    s.say_name()
  }

  let hero = Hero::new();
  let enemy = Enemy::new();
  let merchant = Merchant::new();
(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError

Tests if the string representation of an object matches the expected content. Used primarily in test cases to verify the correctness of Show implementations and program outputs.

Parameters:

  • object : The object to be inspected. Must implement the Show trait.
  • content : The expected string representation of the object. Defaults to an empty string.
  • location : Source code location information for error reporting. Automatically provided by the compiler.
  • arguments_location : Location information for function arguments in source code. Automatically provided by the compiler.

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
(&SayName) -> String
who_are_you
(
Hero
hero
),
String
content
="hero")
(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError

Tests if the string representation of an object matches the expected content. Used primarily in test cases to verify the correctness of Show implementations and program outputs.

Parameters:

  • object : The object to be inspected. Must implement the Show trait.
  • content : The expected string representation of the object. Defaults to an empty string.
  • location : Source code location information for error reporting. Automatically provided by the compiler.
  • arguments_location : Location information for function arguments in source code. Automatically provided by the compiler.

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
(&SayName) -> String
who_are_you
(
Enemy
enemy
),
String
content
="enemy")
(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError

Tests if the string representation of an object matches the expected content. Used primarily in test cases to verify the correctness of Show implementations and program outputs.

Parameters:

  • object : The object to be inspected. Must implement the Show trait.
  • content : The expected string representation of the object. Defaults to an empty string.
  • location : Source code location information for error reporting. Automatically provided by the compiler.
  • arguments_location : Location information for function arguments in source code. Automatically provided by the compiler.

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
(&SayName) -> String
who_are_you
(
Merchant
merchant
),
String
content
="merchant")
} test "merchant ask" { let
Hero
hero
=
struct Hero {
  sprite_data: SpriteData
  hp: Double
  damage: Int
  money: Int
}
Hero
::
() -> Hero
new
();
let
Enemy
enemy
=
struct Enemy {
  sprite_data: SpriteData
  hp: Double
  damage: Int
}
Enemy
::
() -> Enemy
new
();
let
Merchant
merchant
=
struct Merchant {
  sprite_data: SpriteData
}
Merchant
::
() -> Merchant
new
();
(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError

Tests if the string representation of an object matches the expected content. Used primarily in test cases to verify the correctness of Show implementations and program outputs.

Parameters:

  • object : The object to be inspected. Must implement the Show trait.
  • content : The expected string representation of the object. Defaults to an empty string.
  • location : Source code location information for error reporting. Automatically provided by the compiler.
  • arguments_location : Location information for function arguments in source code. Automatically provided by the compiler.

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
Merchant
merchant
.
(self : Merchant, s : &Sprite) -> String
ask
(
Hero
hero
),
String
content
="what to buy something?")
(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError

Tests if the string representation of an object matches the expected content. Used primarily in test cases to verify the correctness of Show implementations and program outputs.

Parameters:

  • object : The object to be inspected. Must implement the Show trait.
  • content : The expected string representation of the object. Defaults to an empty string.
  • location : Source code location information for error reporting. Automatically provided by the compiler.
  • arguments_location : Location information for function arguments in source code. Automatically provided by the compiler.

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
Merchant
merchant
.
(self : Merchant, s : &Sprite) -> String
ask
(
Enemy
enemy
),
String
content
="what to buy something?")
(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError

Tests if the string representation of an object matches the expected content. Used primarily in test cases to verify the correctness of Show implementations and program outputs.

Parameters:

  • object : The object to be inspected. Must implement the Show trait.
  • content : The expected string representation of the object. Defaults to an empty string.
  • location : Source code location information for error reporting. Automatically provided by the compiler.
  • arguments_location : Location information for function arguments in source code. Automatically provided by the compiler.

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
Merchant
merchant
.
(self : Merchant, s : &Sprite) -> String
ask
(
Merchant
merchant
),
String
content
="")
}

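MoonBit Pearls Vol.01: Writing a Pratt Parser in MoonBit

· 9 min read
myfreess

Last week the MoonBit community launched MoonBit Pearls, a call for high-quality documentation and examples. After careful selection, this week we are publishing the first accepted article of the "MoonBit Pearls" column. The column is meant as a long-term knowledge base that keeps collecting high-quality documents. We look forward to more developers contributing and enriching the MoonBit community ecosystem together.

The following is the body of the first submission, in which the author uses a complete example to show how to write a Pratt parser in MoonBit:

In compilation, syntax analysis (parsing) is a key step. The parser's main job is to turn a token stream into an abstract syntax tree (AST).

This article introduces one parsing algorithm, Pratt parsing, also known as top-down operator precedence parsing, and shows how to implement it in MoonBit.

Why use a Pratt parser

Almost every programmer is familiar with infix expressions. Even committed Lisp or Forth programmers know that most of the world writes arithmetic like this: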

24 * (x + 4)

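For compiler (or interpreter) writers, infix expressions like this are somewhat harder to parse than the prefix expressions of Lisp or the postfix expressions of Forth. A naive hand-written recursive descent parser, for instance, needs several mutually recursive functions and must eliminate left recursion from the expression grammar, and such code becomes unwieldy as operators are added. Parser generators are not a fully satisfying option here either. Take the BNF for simple addition and multiplication expressions as an example: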

Expr ::=
    Factor
    | Expr '+' Factor
Factor ::=
    Atom
    | Factor '*' Atom
Atom ::=
    'number'
    | '(' Expr ')'

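This is not very intuitive, and you may well have to revisit the formal languages course you took at university.

Some languages, such as Haskell, support user-defined infix operators, which is hard to handle simply with a parser generator.

A Pratt parser handles infix expressions well, and it is also easy to extend with new operators (this can be done without modifying the source code!). The well-known compiler book Crafting Interpreters recommends using it together with a recursive descent parser, and the rust-analyzer project uses it as well.

Binding power

In a Pratt parser, the concept that describes associativity and precedence is called binding power. Each infix operator has a pair of integer binding powers, one on its left and one on its right, as shown below: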

expr:   A     +     B     *     C
power:     3     3     5     5

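The name matches what it does: the larger the number, the stronger the operator's claim on an adjacent operand (in this example A, B, and C are the operands).

The example above shows operators with different precedence. The associativity of a single operator is expressed by giving it one smaller and one larger binding power.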

expr:   A     +     B     +     C
power:     1     2     1     2

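In this example, when parsing reaches B, the binding power on its left (2) is larger than the one on its right (1), so the expression becomes: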

expr:   (A + B)     +     C
power:           1     2
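
As a minimal sketch of this idea, binding powers can be stored as a (left, right) pair per operator. The function name `infix_binding_power`, the values for `*` and `/`, and the `(0, 0)` fallback are assumptions made for illustration; only the pair (1, 2) for `+` comes from the example above.

```moonbit
// Hypothetical helper, not taken from the article: returns the (left, right)
// binding power of an infix operator. A right power larger than the left
// power makes the operator left-associative.
fn infix_binding_power(op : Char) -> (Int, Int) {
  match op {
    '+' | '-' => (1, 2)
    '*' | '/' => (3, 4)
    _ => (0, 0) // assumption: (0, 0) marks "not an infix operator"
  }
}

test {
  let (plus_left, plus_right) = infix_binding_power('+')
  let (times_left, _) = infix_binding_power('*')
  // In A + B + C, B has plus_right on its left and plus_left on its right,
  // so the left operator wins and the result is (A + B) + C.
  inspect(plus_right > plus_left, content="true")
  // In A + B * C, B has plus_right on its left and times_left on its right,
  // so the multiplication wins and the result is A + (B * C).
  inspect(times_left > plus_right, content="true")
}
```

Keeping the two numbers of a pair distinct is what encodes associativity; swapping them would make the operator right-associative.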

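Next, let's see how a Pratt parser uses this concept when it actually runs.

Overview and preparation

The main skeleton of a Pratt parser looks roughly like this: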

fn parse(self : Tokens, min_bp : Int) -> SExpr ! ParseError {
    ...
    while true {
       parse(...)
    }
    ...
}

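As the skeleton shows, it alternates between recursion and a loop. These correspond to two patterns:

  • The leftmost expression is always innermost, i.e. "1 + 2 + 3" = "(1 + 2) + 3"; a loop alone is enough to parse this
  • The rightmost expression is always innermost, i.e. "1 + 2 + 3" = "1 + (2 + 3)"; recursion alone can parse this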
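
The two patterns can be sketched without a real parser. The helpers `nest_left` and `nest_right` below are hypothetical and assume a non-empty list of operands joined by a single `+` operator; they only illustrate which nesting a loop or a recursion naturally produces.

```moonbit
// Loop: each step wraps everything seen so far, so the leftmost pair
// ends up innermost. Assumes `operands` is non-empty.
fn nest_left(operands : Array[String]) -> String {
  let mut acc = operands[0]
  for i = 1; i < operands.length(); i = i + 1 {
    acc = "(+ " + acc + " " + operands[i] + ")"
  }
  acc
}

// Recursion: the rest of the input is handled first, so the rightmost pair
// ends up innermost. Assumes `start` is a valid index into `operands`.
fn nest_right(operands : Array[String], start : Int) -> String {
  if start == operands.length() - 1 {
    operands[start]
  } else {
    "(+ " + operands[start] + " " + nest_right(operands, start + 1) + ")"
  }
}

test {
  inspect(nest_left(["1", "2", "3"]), content="(+ (+ 1 2) 3)")
  inspect(nest_right(["1", "2", "3"], 0), content="(+ 1 (+ 2 3))")
}
```

A Pratt parser mixes both: it loops while the next operator binds tightly enough and recurses to parse each right-hand side.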

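min_bp is a parameter that represents the binding power of an operator further to the left that has not been fully parsed yet.

Our goal is to read a token stream and output a prefix expression that needs no precedence rules: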

enum SExpr {
  Atom(String)
  Cons(Char, Array[SExpr])
}

impl Show for SExpr with output(self, logger) {
  match self {
    Atom(s) => logger.write_string(s)
    Cons(op, args) => {
      logger.write_char('(')
      logger.write_char(op)
      for i = 0; i < args.length(); i = i + 1 {
        logger.write_char(' ')
        logger.write_string(args[i].to_string())
      }
      logger.write_char(')')
    }
  }
}

test {
  inspect(Cons('+', [Atom("3"), Atom("4")]), content="(+ 3 4)")
}

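Because many kinds of errors can occur along the way, the return type of parseExpr is SExpr ! ParseError.

Before writing the parser, though, we still need to split the input string into a simple token stream.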

enum Token {
  LParen
  RParen
  Operand(String)
  Operator(Char)
  Eof
} derive(Show, Eq)

struct Tokens {
  mut position : Int
  tokens : Array[Token]
}

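This token stream needs two methods: peek() and pop().

peek() returns the first token in the stream without changing any state. In other words, it has no side effects and merely peeks at what will be processed next. For an empty token stream it returns Eof.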

fn peek(self : Tokens) -> Token {
  if self.position < self.tokens.length() {
    self.tokens.unsafe_get(self.position)
  } else {
    Eof
  }
}

pop() does what peek() does and additionally consumes one token.

fn pop(self : Tokens) -> Token {
  if self.position < self.tokens.length() {
    let pos = self.position
    self.position += 1
    self.tokens.unsafe_get(pos)
  } else {
    Eof
  }
}
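
The difference between the two methods is easiest to see on a tiny input. The sketch below uses a simplified stand-in rather than the Tokens type above: `Cursor`, its `items` field, and the `"<eof>"` marker are assumptions chosen to keep the example self-contained.

```moonbit
// Simplified stand-in for the article's Tokens type: a cursor over strings,
// with "<eof>" as an assumed end-of-input marker.
struct Cursor {
  mut position : Int
  items : Array[String]
}

// Look at the current item without moving the cursor.
fn Cursor::peek(self : Cursor) -> String {
  if self.position < self.items.length() {
    self.items[self.position]
  } else {
    "<eof>"
  }
}

// Return the current item and advance, unless the input is exhausted.
fn Cursor::pop(self : Cursor) -> String {
  let item = self.peek()
  if self.position < self.items.length() {
    self.position += 1
  }
  item
}

test {
  let cursor = Cursor::{ position: 0, items: ["1", "+"] }
  inspect(cursor.peek(), content="1") // peeking twice returns the same item
  inspect(cursor.peek(), content="1")
  inspect(cursor.pop(), content="1") // pop returns it and advances
  inspect(cursor.pop(), content="+")
  inspect(cursor.pop(), content="<eof>") // exhausted input keeps yielding the marker
  inspect(cursor.peek(), content="<eof>")
}
```

Returning an end marker instead of failing lets the parsing loop treat end of input as just another token to match on.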

The tokenize function turns a string into a token stream.

fn isDigit(this : Char) -> Bool {
  this is '0'..='9'
}

fn isAlpha(this : Char) -> Bool {
  this is 'A'..='Z' || this is 'a'..='z'
}

fn isWhiteSpace(this : Char) -> Bool {
  this == ' ' || this == '\t' || this == '\n'
}

fn isOperator(this : Char) -> Bool {
  let operators = "+-*/"
  operators.contains_char(this)
}

type! LexError Int

fn tokenize(source : String) -> Tokens!LexError {
  let tokens = []
  let source = source.to_array()
  let buf = StringBuilder::new(size_hint=100)
  let mut i = 0
  while i < source.length() {
    let ch = source.unsafe_get(i)
    i += 1
    if ch == '(' {
      tokens.push(LParen)
    } else if ch == ')' {
      tokens.push(RParen)
    } else if isOperator(ch) {
      tokens.push(Operator(ch))
    } else if isAlpha(ch) {
      buf.write_char(ch)
      while i < source.length() &&
        (isAlpha(source[i]) || isDigit(source[i]) || source[i] == '_') {
        buf.write_char(source[i])
        i += 1
      }
      tokens.push(Operand(buf.to_string()))
      buf.reset()
    } else if isDigit(ch) {
      buf.write_char(ch)
      while i < source.length() && isDigit(source[i]) {
        buf.write_char(source[i])
        i += 1
      }
      tokens.push(Operand(buf.to_string()))
      buf.reset()
    } else if isWhiteSpace(ch) {
      continue
    } else {
      raise LexError(i)
    }
  } else {
    return Tokens::{ position: 0, tokens }
  }
}

test {
  inspect(tokenize("(((((47)))))").tokens, content=
    #|[LParen, LParen, LParen, LParen, LParen, Operand("47"), RParen, RParen, RParen, RParen, RParen]
  )

Throws an InspectError if the actual string representation of the object does not match the expected content. The error message includes detailed information about the mismatch, including source location and both expected and actual values.

Example:

  inspect(42, content="42")
  inspect("hello", content="hello")
  inspect([1, 2, 3], content="[1, 2, 3]")
inspect
(
(source : String) -> Tokens raise LexError
tokenize
("13 + 6 + 5 * 3").
Array[Token]
tokens
,
String
content
=
#|[Operand("13"), Operator('+'), Operand("6"), Operator('+'), Operand("5"), Operator('*'), Operand("3")] ) }

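Finally, we need a function that computes an operator's binding power, which a simple match can implement. In practice, to make it easier to add new operators, some kind of key-value container should be used instead.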

fn infix_binding_power(op : Char) -> (Int, Int)? {
  match op {
    '+' => Some((1, 2))
    '-' => Some((1, 2))
    '/' => Some((3, 4))
    '*' => Some((3, 4))
    _ => None
  }
}
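
The key-value container mentioned above could look roughly like the sketch below. It is only an illustration: the names binding_power_table and infix_binding_power_from_table are hypothetical and do not appear in the original post, and the sketch assumes the standard library's Map::of and Map::get.

```moonbit
// Hypothetical table-driven variant of infix_binding_power.
// Each entry maps an operator to its (left, right) binding power.
let binding_power_table : Map[Char, (Int, Int)] = Map::of([
  ('+', (1, 2)),
  ('-', (1, 2)),
  ('*', (3, 4)),
  ('/', (3, 4)),
])

// Look the operator up in the table; unknown operators yield None,
// just like the `_ => None` branch of the match-based version.
fn infix_binding_power_from_table(op : Char) -> (Int, Int)? {
  binding_power_table.get(op)
}

test {
  inspect(infix_binding_power_from_table('*'), content="Some((3, 4))")
  inspect(infix_binding_power_from_table('^'), content="None")
}
```

With this layout, supporting a new operator means adding one entry to the table rather than editing the function body.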

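Parser implementation

First, take the first token and assign it to the variable lhs (short for left-hand side, i.e. the left operand).

  • If it is an operand, store it.
  • If it is a left parenthesis, recursively parse the first expression, then consume the matching right parenthesis.
  • Any other result means that parsing went wrong, so raise an error.

Next, we peek at the first operator:

  • If the result is Eof, that does not count as a failure: a single operand can be treated as a complete expression, so break out of the loop.
  • If the result is an operator, carry on as normal.
  • If the result is a right parenthesis, break out of the loop.
  • Any other result raises a ParseError.

Now we need to decide which operator lhs belongs to. This is where the min_bp parameter comes in: it represents the binding power of the nearest operator on the left whose parsing is not yet finished, and its initial value is 0 (no operator on the left is competing for the first operand). Before that, though, we first check whether the token is a parenthesis: if it is, we are parsing an expression inside parentheses and should also break out of the loop and finish. This is one of the reasons for using the peek method, since at this point we cannot yet tell whether the operator should be consumed here.

After computing the binding power of the current operator op, first compare its left binding power l_bp with min_bp:

  • If l_bp is less than min_bp, break immediately, which returns lhs to the operator further up that is still waiting for its right operand.
  • Otherwise, consume the current operator with pop and recursively call parseExpr to get the right operand, this time passing the current operator's right binding power r_bp as min_bp. Once parsing succeeds, assign the result to lhs and continue the loop.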

type! ParseError (Int, Token) derive(Show)

fn parseExpr(self : Tokens, min_bp~ : Int = 0) -> SExpr!ParseError {
  let mut lhs = match self.pop() {
    LParen => {
      let expr = self.parseExpr()
      if self.peek() is RParen {
        ignore(self.pop())
        expr
      } else {
        raise ParseError((self.position, self.peek()))
      }
    }
    Operand(s) => Atom(s)
    t => raise ParseError((self.position - 1, t))
  }
  while true {
    let op = match self.peek() {
      Eof | RParen => break
      Operator(op) => op
      t => raise ParseError((self.position, t))
    }
    guard infix_binding_power(op) is Some((l_bp, r_bp)) else {
      raise ParseError((self.position, Operator(op)))
    }
    if l_bp < min_bp {
      break
    }
    ignore(self.pop())
    let rhs = self.parseExpr(min_bp=r_bp)
    lhs = Cons(op, [lhs, rhs])
    continue
  }
  return lhs
}

fn parse(s : String) -> SExpr!Error {
  tokenize(s).parseExpr()
}

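We now have an extensible parser for four-function arithmetic expressions. More examples can be added to the test block below to verify its correctness.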

test {
    inspect(parse("13 + 6 + 5 * 3"), content="(+ (+ 13 6) (* 5 3))")
    inspect(parse("3 * 3 + 5 * 5"), content="(+ (* 3 3) (* 5 5))")
    inspect(parse("(3 + 4) * 3 * (17 * 5)"), content="(* (* (+ 3 4) 3) (* 17 5))")
    inspect(parse("(((47)))"), content="47")
}
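
As one possible set of additional cases, the sketch below checks left associativity and parenthesized right operands. It is not standalone: it assumes the parse function and the Show output of SExpr behave exactly as in the tests of this post.

```moonbit
// Extra cases, assuming `parse` and the SExpr formatting from this post.
test {
  // '-' has binding power (1, 2), so a chain groups to the left.
  inspect(parse("1 - 2 - 3"), content="(- (- 1 2) 3)")
  // '/' has binding power (3, 4) and is left-associative as well.
  inspect(parse("8 / 4 / 2"), content="(/ (/ 8 4) 2)")
  // Parentheses override the higher binding power of '*'.
  inspect(parse("2 * (3 + 4)"), content="(* 2 (+ 3 4))")
}
```

In each pair the left binding power is smaller than the right one, which is why the recursive call for the right operand stops at the next operator of the same precedence and the chain folds to the left.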

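However, a Pratt parser can do more than this: it can also parse prefix operators (such as bitwise NOT !n), the array indexing operator arr[i], and even the ternary operator c ? e1 : e2. For a more detailed treatment, see Simple but Powerful Pratt Parsing; the author of that blog post implemented an industrial-grade Pratt parser in rust-analyzer, the well-known program analysis tool.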
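
As a rough idea of how prefix operators might fit into the parser from this post, the sketch below adds a binding-power function for them. The name prefix_binding_power and the value 5 are assumptions made for illustration (5 is chosen only because it is larger than the right binding power 4 of '*' and '/'), and the change to parseExpr is shown as a comment rather than as tested code.

```moonbit
// Hypothetical helper, not part of the original post.
// A prefix operator only has a right binding power, because nothing on its
// left can compete for an operand.
fn prefix_binding_power(op : Char) -> Int? {
  match op {
    '-' => Some(5)
    _ => None
  }
}

// Assumed extra branch in the first `match self.pop()` of parseExpr:
//
//   Operator(op) => {
//     guard prefix_binding_power(op) is Some(r_bp) else {
//       raise ParseError((self.position - 1, Operator(op)))
//     }
//     Cons(op, [self.parseExpr(min_bp=r_bp)])
//   }

test {
  inspect(prefix_binding_power('-'), content="Some(5)")
  inspect(prefix_binding_power('*'), content="None")
}
```

Under these assumptions, an input such as "-2 * 3" would group as (* (- 2) 3): the recursive call made for the prefix operator uses min_bp 5, so it stops before '*', whose left binding power is 3. Supporting !n would additionally require the tokenizer to emit '!' as an Operator.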