Sharding: Load Balancing Algorithms
Abstract
Objectives
Load balance
Few cross-shard transactions
Low assignment costs
For instance, under the constraint of making each shard's loads as equal as possible, minimize cross-shard transactions like ε-constraint method (cf. scalarizing) Methods
Load balancing is NP-hard problem, so there are no polynomial-time algorithms to find exact solutions.
The same is true for the simple 2-shard load balancing problem.
Approximation algorithms and heuristics are suitable to solve NP-hard problems like this.
Approximation algorithms
E.g., greedy algorithms
Performance guarantee
More theoretical than heuristic approaches
Difficult to get good solutions
Heuristics
E.g., simulated annealing algorithms
Like 2-opt in the traveling salesman problem
Heuristics seems to be the best methods to solve this problem.
Load balancing is NP-hard
Decision problem
$ m shards
$ n accounts
$ c_i is the load of $ i-th account
Is there an account assignment with the maximum shard load at most $ L?
This decision problem is NP-complete because the subset sum problem (NP-complete) is reduced to this problem by polynomial-time reduction.
Therefore, the load balancing problem is NP-hard.
Hub-and-spoke sharding (e.g., ETH2.0)
Using a load-balancing manager contract, competition-based load balancing is held.
Every do load balancing, the randomly chosen $ x % accounts are load balanced.
It seems that it is better for the contract deployer to decide whether the contract is in the hub.
Hub contracts seems to have different gas cost system.
Basic formulation to do load balancing
$ \begin{aligned} & { \mathrm{minimize} } && \max_k \left\{\sum_{i,j,s(i)= k} c_{i,j} + \sum_{i,j,s(i)\neq k \land s(j)=k} c_{i,j} \right\} \\ & \mathrm{subject\;to} && \sum_{i,j,s(i)=k}c_{i,j} \leq L,\quad k=1,\ldots ,N\end{aligned}
$ a_i: the i-th account
$ c_{i,j}: total gas cost of transactions from $ a_i to $ a_j
$ s(i): shard id of the shard to which the i-th account belongs
$ L: block (collation) gas limit
Suppose that every shard has the same block gas limit.
Issue: This formula is suitable to do load balancing, but not necessarily suitable to do improve UX, because it is not considered to reduce cross shard transactions.
Basic formulation to minimize the total gas cost of cross shard transactions
$ \begin{aligned} & { \mathrm{minimize} } && \sum_{i,j,s(i)\neq s(j)} c_{i,j} \\ & \mathrm{subject\;to} && \sum_{i,j,s(i)=k}c_{i,j} \leq L,\quad k=1,\ldots ,N\end{aligned}
Issue: Some shards always have blocks that reach block gas limit, so the following improvements are required.
Set an upper limit
$ \begin{aligned} & { \mathrm{minimize} } && \sum_{i,j,s(i)\neq s(j)} c_{i,j} \\ & \mathrm{subject\;to} && \sum_{i,j,s(i)=k}c_{i,j} \leq pL,\quad k=1,\ldots ,N\end{aligned}
Using $ p\le 1, blocks don't reach block gas limit $ L.
Set a deviation condition
$ \begin{aligned} & { \mathrm{minimize} } && \sum_{i,j,s(i)\neq s(j)} c_{i,j} \\ & \mathrm{subject\;to} && \sum_{i,j,s(i)=k}c_{i,j} \leq L,\quad k=1,\ldots ,N \\ &&& \sum_{i,j,s(i)=q}c_{i,j} \leq p\sum_{i,j,s(i)=r} c_{i,j} \end{aligned}
It is better than using variance because the formula becomes a quadratic equation.
The maximum shard loads is less than p times the minimum shard loads.
Seperate computation costs and storage costs
Using a deviation condition of computations costs and storage consts
シャードの個数とコントラクトの個数を同じ値$ Nとして単純化
$ N個の頂点を持つ二分木を考える
任意の頂点ペア$ (i,j)に対して
ペナルティ$ w_{i,j}
距離$ d_{i,j}
頂点$ iから頂点$ jへ行くパスに頂点$ kを含むなら$ c_{i,j,k}=1そうでないなら$ 0
$ Lはシャードが耐えられる最大の負荷
$ \begin{aligned} & { \mathrm{minimize} } && \sum_{i,j} d_{i,j}w_{i,j} \\ & \mathrm{subject\;to} && \sum_{i,j}c_{i,j,k}d_{i,j}w_{i,j} \leq L,\quad k=1,\ldots ,N\end{aligned}
愚直にやると当然まずくて、全ての二分木のパターン数を考えたら組合せ爆発
シャードの個数$ Sコントラクトの個数$ C
$ S個の頂点を持つ二分木を考える。
各コントラクトのストレージgasコストを$ s_iとする
コントラクト$ iがシャード$ jに含まれるなら$ a_{i,j}=1そうでないなら$ 0
$ Dはシャードが耐えられる最大のストレージgasコスト
$ \begin{aligned} & { \mathrm{minimize} } && \sum_{i,j} d_{i,j}w_{i,j} \\ & \mathrm{subject\;to} && \sum_{i,j}c_{i,j,k}d_{i,j}w_{i,j} \leq L,\quad k=1,\ldots ,N \\ &&& \sum_{i} a_{i,j}s_i \leq D,\quad j=1,\ldots ,N \end{aligned}