OT 算法的 transform 源码走读

协同编辑基础

Posted by Nicodechal on 2020-08-10

Transform 逻辑介绍

ot.js 的逻辑

将用户的一次操作认为是一个光标从左到右完整遍历整个字符串并在过程中做出删除和插入操作的行为。

When an operation is applied to an input string, you can think of this as

if an imaginary cursor runs over the entire string and skips over some
parts, deletes some parts and inserts characters at some positions.

These actions (skip/delete/insert) are stored as an array in the “ops” property.

这里要注意每次操作都认为是完整遍历了整个字符串,基于下面三种操作:

  1. retain,用一个正数 n 表示,表示光标向后移动 n 步
  2. delete,用一个负数 -n 表示,表示删除目前光标位置后的 n 个字符
  3. insert,用一个字符串 str 表示,表示向目前光标位置前插入 str

举例说明:

1
2
3
4
5
6
7
8
9
10
11
|apple -> appe // | 代表光标位置

光标先向后移 3 位, app|le // retain(3) = 3
删除当前光标后的 1 个字符, app|e // delete(1) = -1
光标向后移动 1 位结束字符串遍历, appe| // retain(1) = 1

|apple -> apppple // | 代表光标位置

光标先向后移 3 位, app|le // retain(3) = 3
当前光标前插入 'pp', apppp|le // insert('pp') = 'pp'
光标向后移动 2 位结束字符串遍历, apppple| // retain(1) = 1

transform 源码走读

Transform 返回两个操作 A’ 和 B’, 实现 A 操作后应用 B’ 得到的结果和 B 操作后应用 A’ 操作相同。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
// Transform takes two operations A and B that happened concurrently and
// produces two operations A' and B' (in an array) such that
// `apply(apply(S, A), B') = apply(apply(S, B), A')`. This function is the
// heart of OT.
TextOperation.transform = function (operation1, operation2) {
// 转换基于一个已经同步的状态,因此必须保证正在操作的数据相同
if (operation1.baseLength !== operation2.baseLength) {
throw new Error("Both operations have to have the same base length");
}

var operation1prime = new TextOperation();
var operation2prime = new TextOperation();
var ops1 = operation1.ops, ops2 = operation2.ops;
var i1 = 0, i2 = 0;
var op1 = ops1[i1++], op2 = ops2[i2++];
while (true) {
// At every iteration of the loop, the imaginary cursor that both
// operation1 and operation2 have that operates on the input string must
// have the same position in the input string.
// 每一轮对当前两人的操作(使用 i1, i2 表示操作数组中的当前操作下标)进行处理
// 确保光标在输入中位置相同

if (typeof op1 === 'undefined' && typeof op2 === 'undefined') {
// end condition: both ops1 and ops2 have been processed
// 终止条件
break;
}

// next two cases: one or both ops are insert ops
// => insert the string in the corresponding prime operation, skip it in
// the other one. If both op1 and op2 are insert ops, prefer op1.
// 下面处理其中一个是插入操作的情况
// 如果 A 的当前操作时插入, 不管 B 是什么操作, 先跳过.
if (isInsert(op1)) {
// A' 也进行插入操作
operation1prime.insert(op1);
// B' 由于 A 已经把数据插入了,直接后移至插入结束位置
operation2prime.retain(op1.length);
// 这里只有 A 完成了一个操作
op1 = ops1[i1++];
continue;
}
if (isInsert(op2)) {
operation1prime.retain(op2.length);
operation2prime.insert(op2);
op2 = ops2[i2++];
continue;
}

if (typeof op1 === 'undefined') {
throw new Error("Cannot compose operations: first operation is too short.");
}
if (typeof op2 === 'undefined') {
throw new Error("Cannot compose operations: first operation is too long.");
}

var minl;
// 如果两个操作都是后移,移动两个操作中比较小的一个,剩下的下一轮处理。
// 比如 A 后移 5 位, B 后移 8 格, A',B' 都后移 min(5, 8) 格,
// B 剩下的 retain(3) 下一轮和 A 的下一个操作一起处理
if (isRetain(op1) && isRetain(op2)) {
// Simple case: retain/retain
if (op1 > op2) {
minl = op2;
op1 = op1 - op2;
op2 = ops2[i2++];
} else if (op1 === op2) {
minl = op2;
op1 = ops1[i1++];
op2 = ops2[i2++];
} else {
minl = op1;
op2 = op2 - op1;
op1 = ops1[i1++];
}
operation1prime.retain(minl);
operation2prime.retain(minl);
} else if (isDelete(op1) && isDelete(op2)) {
// 如果都是删除,先删除 min(abs(op1), abs(op2)), 剩下的留到下一轮和下一个操作一起处理
// Both operations delete the same string at the same position. We don't
// need to produce any operations, we just skip over the delete ops and
// handle the case that one operation deletes more than the other.
if (-op1 > -op2) {
op1 = op1 - op2;
op2 = ops2[i2++];
} else if (op1 === op2) {
op1 = ops1[i1++];
op2 = ops2[i2++];
} else {
op2 = op2 - op1;
op1 = ops1[i1++];
}
// next two cases: delete/retain and retain/delete
} else if (isDelete(op1) && isRetain(op2)) {
// 如果有一个删除和一个后移,删除的多的话先删除 op2 个字符,然后 op2 直接跳过
// (因为这部分刚好被删了)
if (-op1 > op2) {
minl = op2;
op1 = op1 + op2;
op2 = ops2[i2++];
} else if (-op1 === op2) {
minl = op2;
op1 = ops1[i1++];
op2 = ops2[i2++];
} else {
// 如果删除的少,就删除 op1 个字符,然后 op2 去掉 abs(op1) 步,
// 剩下的下一轮进行处理
minl = -op1;
op2 = op2 + op1;
op1 = ops1[i1++];
}
operation1prime['delete'](minl);
} else if (isRetain(op1) && isDelete(op2)) {
if (op1 > -op2) {
minl = -op2;
op1 = op1 + op2;
op2 = ops2[i2++];
} else if (op1 === -op2) {
minl = op1;
op1 = ops1[i1++];
op2 = ops2[i2++];
} else {
minl = op1;
op2 = op2 + op1;
op1 = ops1[i1++];
}
operation2prime['delete'](minl);
} else {
throw new Error("The two operations aren't compatible");
}
}

return [operation1prime, operation2prime];
};

下面直接举例说明:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
源字符串:apple

A编辑的结果:appple -> [3, 'p', 2]
B编辑的结果:aplle -> [2, -1, 1, 'l', 1]

下面计算 A' 和 B', 使得
apply(apply(apple, A), B') === apply(apply(apple, B), A')

// Aretain Bretain
A : [3, 'p', 2]
B : [2, -1, 1, 'l', 1]
A': []
B': []
// Aretain Bdelete
A : [1, 'p', 2]
B : [-1, 1, 'l', 1]
A': [2]
B': [2]
// Ainsert
A : ['p', 2]
B : [1, 'l', 1]
A': [2]
B': [2, -1]
// Aretain Bretain
A : [2]
B : [1, 'l', 1]
A': [2, 'p']
B': [2, -1, 1]
// Binsert
A : [1]
B : ['l', 1]
A': [2, 'p', 1]
B': [2, -1, 2]
// Aretain Bretain
A : [1]
B : [1]
A': [2, 'p', 2]
B': [2, -1, 2, 'l']
// end
A : []
B : []
A': [2, 'p', 3]
B': [2, -1, 2, 'l', 1]

最终得到 A' = [2, 'p', 3], B' = [2, -1, 2, 'l', 1]

对 appple 应用 B' -> applle
对 aplle 应用 A' -> applle