Mini-adapton: 用 MoonBit 实现增量计算

介绍
让我们先用一个类似 excel 的例子感受一下增量计算长什么样子. 首先, 定义一个这样的依赖图:

在这个图中, t1 的值通过 n1 + n2 计算得到, t2 的值通过 t1 + n3 计算得到.
当我们想得到 t2 的值时, 该图定义的计算将被执行: 首先通过 n1 + n2 算出 t1, 再通过 t1 + n3 算出 t2. 这个过程和非增量计算是相同的.
但当我们开始改变n1, n2 或 n3 的值时, 事情就不一样了. 比如说我们想将 n1 和 n2 的值互换, 再得到 t2 的值. 在非增量计算中, t1 和 t2 都将被重新计算一遍, 但实际上 t2 是不需要被重新计算的, 因为它依赖的两个值 t1 和 n3 都没有改变 (将 n1 和 n2 的值互换不会改变 t1 的值).
下面的代码实现了我们刚刚举的例子. 我们使用 Cell::new 来定义 n1, n2 和 n3 这些不需要计算的东西, 使用 Thunk::new 来定义 t1 和 t2 这样需要计算的东西.
test {
// a counter to record the times of t2's computation
let mut Int
cnt = 0
// start define the graph
let Cell[Int]
n1 = struct Cell[A] {
mut is_dirty: Bool
mut value: A
mut is_changed: Bool
incoming_edges: Array[&Node]
}
Cell::fn[A : Eq] Cell::new(value : A) -> Cell[A]
new(1)
let Cell[Int]
n2 = struct Cell[A] {
mut is_dirty: Bool
mut value: A
mut is_changed: Bool
incoming_edges: Array[&Node]
}
Cell::fn[A : Eq] Cell::new(value : A) -> Cell[A]
new(2)
let Cell[Int]
n3 = struct Cell[A] {
mut is_dirty: Bool
mut value: A
mut is_changed: Bool
incoming_edges: Array[&Node]
}
Cell::fn[A : Eq] Cell::new(value : A) -> Cell[A]
new(3)
let Thunk[Int]
t1 = struct Thunk[A] {
mut is_dirty: Bool
mut value: A?
mut is_changed: Bool
thunk: () -> A
incoming_edges: Array[&Node]
outgoing_edges: Array[&Node]
}
Thunk::fn[A : Eq] Thunk::new(thunk : () -> A) -> Thunk[A]
new(fn() {
Cell[Int]
n1.fn[A] Cell::get(self : Cell[A]) -> A
get() fn Add::add(self : Int, other : Int) -> Int
Adds two 32-bit signed integers. Performs two's complement arithmetic, which
means the operation will wrap around if the result exceeds the range of a
32-bit integer.
Parameters:
self : The first integer operand.
other : The second integer operand.
Returns a new integer that is the sum of the two operands. If the
mathematical sum exceeds the range of a 32-bit integer (-2,147,483,648 to
2,147,483,647), the result wraps around according to two's complement rules.
Example:
test {
inspect(42 + 1, content="43")
inspect(2147483647 + 1, content="-2147483648") // Overflow wraps around to minimum value
}
+ Cell[Int]
n2.fn[A] Cell::get(self : Cell[A]) -> A
get()
})
let Thunk[Int]
t2 = struct Thunk[A] {
mut is_dirty: Bool
mut value: A?
mut is_changed: Bool
thunk: () -> A
incoming_edges: Array[&Node]
outgoing_edges: Array[&Node]
}
Thunk::fn[A : Eq] Thunk::new(thunk : () -> A) -> Thunk[A]
new(fn() {
Int
cnt fn Add::add(self : Int, other : Int) -> Int
Adds two 32-bit signed integers. Performs two's complement arithmetic, which
means the operation will wrap around if the result exceeds the range of a
32-bit integer.
Parameters:
self : The first integer operand.
other : The second integer operand.
Returns a new integer that is the sum of the two operands. If the
mathematical sum exceeds the range of a 32-bit integer (-2,147,483,648 to
2,147,483,647), the result wraps around according to two's complement rules.
Example:
test {
inspect(42 + 1, content="43")
inspect(2147483647 + 1, content="-2147483648") // Overflow wraps around to minimum value
}
+= 1
Thunk[Int]
t1.fn[A : Eq] Thunk::get(self : Thunk[A]) -> A
get() fn Add::add(self : Int, other : Int) -> Int
Adds two 32-bit signed integers. Performs two's complement arithmetic, which
means the operation will wrap around if the result exceeds the range of a
32-bit integer.
Parameters:
self : The first integer operand.
other : The second integer operand.
Returns a new integer that is the sum of the two operands. If the
mathematical sum exceeds the range of a 32-bit integer (-2,147,483,648 to
2,147,483,647), the result wraps around according to two's complement rules.
Example:
test {
inspect(42 + 1, content="43")
inspect(2147483647 + 1, content="-2147483648") // Overflow wraps around to minimum value
}
+ Cell[Int]
n3.fn[A] Cell::get(self : Cell[A]) -> A
get()
})
// get the value of t2
fn inspect(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError
Tests if the string representation of an object matches the expected content.
Used primarily in test cases to verify the correctness of Show
implementations and program outputs.
Parameters:
object : The object to be inspected. Must implement the Show trait.
content : The expected string representation of the object. Defaults to
an empty string.
location : Source code location information for error reporting.
Automatically provided by the compiler.
arguments_location : Location information for function arguments in
source code. Automatically provided by the compiler.
Throws an InspectError if the actual string representation of the object
does not match the expected content. The error message includes detailed
information about the mismatch, including source location and both expected
and actual values.
Example:
test {
inspect(42, content="42")
inspect("hello", content="hello")
inspect([1, 2, 3], content="[1, 2, 3]")
}
inspect(Thunk[Int]
t2.fn[A : Eq] Thunk::get(self : Thunk[A]) -> A
get(), String
content="6")
fn inspect(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError
Tests if the string representation of an object matches the expected content.
Used primarily in test cases to verify the correctness of Show
implementations and program outputs.
Parameters:
object : The object to be inspected. Must implement the Show trait.
content : The expected string representation of the object. Defaults to
an empty string.
location : Source code location information for error reporting.
Automatically provided by the compiler.
arguments_location : Location information for function arguments in
source code. Automatically provided by the compiler.
Throws an InspectError if the actual string representation of the object
does not match the expected content. The error message includes detailed
information about the mismatch, including source location and both expected
and actual values.
Example:
test {
inspect(42, content="42")
inspect("hello", content="hello")
inspect([1, 2, 3], content="[1, 2, 3]")
}
inspect(Int
cnt, String
content="1")
// swap value of n1 and n2
Cell[Int]
n1.fn[A : Eq] Cell::set(self : Cell[A], new_value : A) -> Unit
set(2)
Cell[Int]
n2.fn[A : Eq] Cell::set(self : Cell[A], new_value : A) -> Unit
set(1)
fn inspect(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError
Tests if the string representation of an object matches the expected content.
Used primarily in test cases to verify the correctness of Show
implementations and program outputs.
Parameters:
object : The object to be inspected. Must implement the Show trait.
content : The expected string representation of the object. Defaults to
an empty string.
location : Source code location information for error reporting.
Automatically provided by the compiler.
arguments_location : Location information for function arguments in
source code. Automatically provided by the compiler.
Throws an InspectError if the actual string representation of the object
does not match the expected content. The error message includes detailed
information about the mismatch, including source location and both expected
and actual values.
Example:
test {
inspect(42, content="42")
inspect("hello", content="hello")
inspect([1, 2, 3], content="[1, 2, 3]")
}
inspect(Thunk[Int]
t2.fn[A : Eq] Thunk::get(self : Thunk[A]) -> A
get(), String
content="6")
// t2 does not recompute
fn inspect(obj : &Show, content~ : String, loc~ : SourceLoc = _, args_loc~ : ArgsLoc = _) -> Unit raise InspectError
Tests if the string representation of an object matches the expected content.
Used primarily in test cases to verify the correctness of Show
implementations and program outputs.
Parameters:
object : The object to be inspected. Must implement the Show trait.
content : The expected string representation of the object. Defaults to
an empty string.
location : Source code location information for error reporting.
Automatically provided by the compiler.
arguments_location : Location information for function arguments in
source code. Automatically provided by the compiler.
Throws an InspectError if the actual string representation of the object
does not match the expected content. The error message includes detailed
information about the mismatch, including source location and both expected
and actual values.
Example:
test {
inspect(42, content="42")
inspect("hello", content="hello")
inspect([1, 2, 3], content="[1, 2, 3]")
}
inspect(Int
cnt, String
content="1")
}
在这篇文章中, 我们将介绍如何在 MoonBit 中实现一个增量计算库. 这个库的 API 就是我们上面例子中出现的那些:
Cell::new
Cell::get
Cell::set
Thunk::new
Thunk::get
问题分析和解法
要实现这个库, 我们主要有三个问题需要解决:
如何在运行时构建依赖图
作为一个使用 MoonBit 实现的库, 没有简单方法让我们可以静态地构建依赖图, 因为 MoonBit 目前还不支持任何元编程的机制. 因此我们需要动态地把依赖图构建出来. 事实上, 我们关心的只是哪些 thunk 或 cell 被另一个 thunk 依赖了, 所以一个不错的构建依赖图的时机就是在用户调用 Thunk::get 的时候. 比如在上面的例子中:
let n1 = Cell::new(1)
let n2 = Cell::new(2)
let n3 = Cell::new(3)
let t1 = Thunk::new(fn() { n1.get() + n2.get() })
let t2 = Thunk::new(fn() { t1.get() + n3.get() })
t2.get()
当用户调用 t2.get() 时, 我们在运行时会知道 t1.get() 和 n3.get() 在其中也被调用了. 因此 t1 和 n3 是 t2 的依赖, 并且我们可以构建一个这样的图:

同样的过程也会在 t1.get() 被调用时发生.
所以计划是这样的:
- 我们定义一个栈来记录我们当前在获得哪个 thunk 的值. 在这里使用栈的原因是, 我们事实上是在尝试记录每个
get的调用栈. - 当我们调用
get时, 将其标记为栈顶 thunk 的依赖, 如果它是一个 thunk, 再把它压栈. - 当一个 thunk 的
get结束时, 将它出栈.
让我们看看上面那个例子在这个算法下的过程是什么样子的:
-
当我们调用
t2.get时, 将t2压栈.
-
当我们在
t2.get中调用t1.get时, 将t1记为t2的依赖, 并将t1压栈.
-
当我们在
t1.get中调用n1.get时, 将 n1 记为t1的依赖
-
相同的过程发生在
n2身上.
-
当
t1.get结束时, 将t1出栈.
-
当我们调用
n3.get时, 将n3记为t2的依赖.
除了这些从父依赖到子依赖的边之外, 我们最好也记录一个从子依赖到父依赖的边, 方便后面我们在这个图上反向便利.
在接下来的代码中, 我们将使用 outgoing_edges 指代从父依赖到子依赖的边, 使用 incoming_edges 指代中子依赖到父依赖的边.
如何标记过时的节点
当我们调用 Cell::set 时, 该节点本身和所有依赖它的节点都应该被标记为过时的. 这将在后面作为判断一个 thunk 是否需要重新计算的标准之一. 这基本上是一个从图的叶子节点向后遍历的过程. 我们可以用这样的伪 MoonBit 代码表示这个算法:
fn dirty(node: Node) -> Unit {
for n in node.incoming_edges {
n.set_dirty(true)
dirty(node)
}
}