RSA Accumulators for Plasma Cash history reduction

ref. https://ethresear.ch/t/rsa-accumulators-for-plasma-cash-history-reduction/3739

m0t0k1ch1.icon RSA accumulator の概要や Plasma Cash への導入方法を把握する

---.icon

One of the challenges for all Plasma flavors, Plasma Cash too, is the amount of history data that users need to keep around, and send to recipients if they want to send coins. For example, suppose a Plasma chain has one block committed to chain per minute, with$ \approx 2^{16}transactions per minute (~= 1000 transactions per second). We assume each user has their coins split across$ \approx 10fragments. To have the full information needed to win an exit game, each fragment requires a Merkle branch of length$ \approx 16per block, so that’s 16 hashes * 32 bytes * 500000 minutes * 10 fragments$ \approx 2.5GB for one year. In a transfer or atomic swap or similar operation, this is data that must be transferred.

Plasma Cash 然り、Plasma における挑戦の 1 つは、ユーザーが持ち回す履歴データの量である。このデータは、コインを送りたい相手に対しても送る必要がある。例えば、Plasma チェーンが毎分$ 2^{16}程度のトランザクションを含んだブロックをコミットすると仮定する（毎秒$ 1000トランザクション程度）。各ユーザーは$ 10程度の fragment に分割されたコインを保有していると仮定する。exit game で勝利するのに必要な情報を全て保持するためには、1 ブロックごとに各 fragment について 16 ハッシュ分程度の Merkle branch が必要になる。これは 1 年で

$ 16 ~ \mathrm{hashes} * 32 ~ \mathrm{bytes} * 500000 ~ \mathrm{minutes} * 10 ~ \mathrm{fragements} \approx 2.5 ~ \mathrm{GB}

となる。コインの transfer や atomic swap、もしくは類似のオペレーションにおいて、これだけのデータを送信しなければならない。

We can increase the efficiency here with RSA accumulators. An RSA accumulator is a data structure providing similar functionality to a Merkle tree: a large amount of data can be compressed into a single fixed-size “root”, such that given the data and this root, one can construct a “witness” to prove that any specific value from the data is part of the data represented by that root. This is done as follows. We choose a large value$ Nwhose factorization is not fully known, and a generator$ g(eg.$ g = 3) which represents the empty accumulator. To add a value$ vto an accumulator$ A, we set the new accumulator$ A' = A^v. To prove that a value$ vis part of an accumulator$ A, we can use a proof of knowledge of exponent scheme (see below), proving that we know a cofactor$ xsuch that$ (g^v)^x = A(note that$ xmay be very large; it is linear in the number of elements, hence why we need a proof scheme rather than providing$ xdirectly).

RSA accumulator を利用することで効率を上げることできる。RSA accumulator は、Merkle tree と似た機能を提供するデータ構造である。大量のデータを一定サイズの "root" 1 つに圧縮することができる。このようなデータと "root" が与えられたとき、そのデータに含まれる特定の値が、root によって表されるデータの一部であることを証明する "witness" を組み立てることができる。これは次のように行われる。因数分解の結果が分からない十分に大きな値$ Nと、空の accumulator を表す generator$ g（例えば$ g = 3）を選択する。値$ vを accumulator$ Aに加えるときは、新しい accumulator$ A' = A^vを設定する。値$ vが accumulator$ Aの一部であることの証明には proof of knowledge of exponent scheme（下記）を使うことができる。これによって、$ (g^v)^x = Aとなるような因数$ xを知っていることを証明することができる（$ xは非常に大きい、すなわち、因数に数に対して線形に増加するため、$ xを直接提供しないで済むような証明スキームが必要である）。

For example, suppose that we want to add the values$ 3, 5, 11into the accumulator. Then,$ A = g^{165}. To prove that$ 3is part of$ A(we’ll use the notation "part of$ \lbrack g \ldots A \rbrack" from now on for reasons that should become clear soon), we use a proof of knowledge scheme to prove that$ (g^3)^x = Afor some known cofactor$ x. In this case,$ x = 55, but in cases involving many thousands of values$ xcould get very large, which is why the proof of knowledge schemes are necessary.

例えば、$ 3, 5, 11を accumulator に加えるとすると、$ A = g^{165}となる。$ 3が$ Aの一部であること（以降、"part of$ \lbrack g \ldots A \rbrack" という表現を用いる）を証明するためには、既知の因数$ xについて$ (g^3)^x = A が成り立つことを証明する proof of knowledge スキームを使用することができる。今回は$ x = 55であるが、数千の値が含まれるような場合、$ xはとても大きな値になる。よって、proof of knowledge スキームが必要となる。

m0t0k1ch1.icon $ 3 \cdot 5 \cdot 11 = 165

m0t0k1ch1.icon$ xの値が大きくなるとデータサイズも大きくなるからダメってことよね？

One example of a proof of knowledge scheme, described by Wesolowski (see also this talk by Benedikt Bünz) is as follows. Let$ h = g^vand$ z = h^x. Select$ B = hash(h, z) \mod{N}. Compute$ b = h^{floor(\frac{x}{B})}, and$ r = x \mod{B}. A proof is simply$ (b, z), which we can verify by checking$ b^B * h^r = z(proof of completeness:$ b^B * h^r = h^{B * floor(\frac{x}{B}) + x \mod{B}} = h^x). Note that this proof is constant-sized.

proof of knowledge スキームの一例として、Wesolowski によって記述されたものがある（Benedikt Bünz による解説も参考になる）。これは次のようなものである。まず、$ h = g^v、$ z = h^xとし、$ B = hash(h, z) \mod{N}を求める。$ b = h^{floor(\frac{x}{B})}と$ r = x \mod Bを計算する。このとき、proof は単純に$ (b, z)となる。proof は$ b^B * h^r = z（丁寧に書くと、$ b^B * h^r = h^{B * floor(\frac{x}{B}) + x \mod B} = h^x）を確認することで検証できる。この proof のサイズは一定である。

m0t0k1ch1.icon 小難しそうに見えるけど、$ B * floor(\frac{x}{B}) + (x \mod{B}) = x は、$ x \div B = floor(\frac{x}{B}) + (x \mod{B})を考えれば小学生でも分かる話

m0t0k1ch1.icon verifier は$ xを知らないまま検証できないといけないので、proof は$ (b, z, r)では？$ rは$ xを知らないと計算できない

m0t0k1ch1.icon Plasma で考えると、各ブロックごとの$ zを root chain が保持することになると思うので、実際にコイン保有者が提出するのは$ (b, r)かな？

m0t0k1ch1.icon$ Bは$ hash(h, z) \mod{N}でなくてもよいと思うけど、なぜこうなってるんだろう？なんとなく、使い回せるのはよろしくなさそうだけど、クリティカルな問題が発生するイメージが湧いていない

m0t0k1ch1.icon ちなみに、これもゼロ知識証明と呼んでいいのかな？verifier は、prover が$ xを知っていることを$ xを知らないまま検証できるわけなので

To prove that a value is$ vnot part of$ \lbrack g \ldots A \rbrack, we need only prove that we know$ rsuch that$ 0 < r < vwhere$ A * g^ris a known power of$ g^vusing the algorithm above. For example, suppose we want to prove that$ 7is not part of$ \lbrack g \ldots A \rbrackin the example above.$ 7is not a factor of$ 165, so there is no integer$ xsuch that$ (g^7)^x = g^{165} = A.But$ g^7is a known root of$ A * g^3 = g^{168}, so we can provide the value$ r = 3(satisfying the desired$ 0 < r < 7) and cofactor$ x = 24, and we see$ (g^7)^{24} = A * g^3 = g^{168}.

$ vが part of$ \lbrack g \ldots A \rbrackでないことを証明するには、上記のアルゴリズムを用いて$ 0 < r < vとなるような$ rを知っていることを証明するだけでよい。ここで、$ A * g^rは$ g^vの倍数である。例えば、上記の例において$ 7が part of$ \lbrack g \ldots A \rbrackではないことを示したいとする。$ 7は$ 165の因数ではないため、$ (g^7)^x = g^{165} = Aとなるような整数$ xは存在しない。しかし、$ A * g^3 = g^{168}は$ g^7の倍数であるため、$ r = 3（$ 0 < r < 7を満たす）と$ x = 24が与えられれば、$ (g^7)^{24} = A * g^3 = g^{168}を確認できる。

m0t0k1ch1.icon $ vが part of $ \lbrack g \ldots A \rbrackでない場合、$ Aは$ vの倍数ではないため、$ A * g^rが$ g^vの倍数となるような$ 0 < r < vを満たす$ rが存在する

m0t0k1ch1.icon よって、$ (g^v)^x = A * g^rとなるような$ rと$ xを提出することで、$ vが part of$ \lbrack g \ldots A \rbrackでないことが証明できる

m0t0k1ch1.icon 例えば$ v = 7であれば$ r = 3が存在するので、$ r = 3と$ x = \frac{165 + 3}{7} = 24を提出すればよい

Now, here is how we can use this technique. We require Plasma Cash roots to come with both a Merkle tree of transactions and an RSA accumulator of coin indices that are modified in that block. Using a transaction in an exit game requires both the Merkle proof and the RSA accumulator proof of membership. This means that a RSA proof of non-membership is sufficient to satisfy a user that there is no data in that block that could be used as a transaction in an exit game. However, the RSA accumulator that we use is cumulative; that is, in the first block, the generator that we use can be 3, but in every subsequent block, the generator used is the accumulator output of the previous block. This allows us to batch proofs of non-membership: to prove that a given coin index was not touched in blocks$ n \ldots n + k, we make a proof of non-membership based on the post-state of the accumulator after block$ n + kusing the pre-state of block$ nas the generator. If a user receives this proof of non-membership, they can be convinced that no proof of membership can be generated for any of the blocks in the range$ n \ldots n + k.

このテクニックを以下のように利用することができる。まず、あるブロックに関して、それに含まれるトランザクションの Merkle tree と、そのブロックで何らかの変更があったコインの識別子に関する RSA accumulator の両方が Plasma Cash の root となる必要がある。exit game においてトランザクションを使用する際、proof of membership に関しては、Merkle proof と RSA proof の両方が必要となる。裏を返せば、exit game においてトランザクションとして使用することができるデータがそのブロックの中に存在しないということをユーザーに確信させる場合は RSA proof of non-membership で十分、という意味である。また、ここで用いる RSA accumulator は cumulative である。すなわち、最初のブロックでは、generator は 3 などでよいが、全ての後続ブロックは 1 つ前のブロックの accumulator 出力を generator として使用する。これによって、複数ブロックに対する proof of non-membership が可能となる。あるコインの識別子に関して、ブロック$ n \ldots n + kの間に変更がないことを証明するためには、ブロック$ nの pre-state を generator とし、ブロック$ n + kの内容を反映させた accumulator の post-state に基づいて proof of non-membership をつくればよい。ユーザーはこの proof of non-membership を受け取ったら、$ n \ldots n + kの間にあるどのブロックについても proof of membership が生成できないことを確認できる。

m0t0k1ch1.icon あるブロックに関して、pre-state は「そのブロックの内容を反映する前の accumulator」で、post-state は「そのブロックの内容を反映した後の accumulator」ということだと思う

m0t0k1ch1.icon proof of membership については、Merkle proof と RSA proof の両方が必要って書いてあるけど、これはトランザクションの中身が関係する場合ってことかな？RSA proof はトランザクションの中身については何も保証しないはずなので

m0t0k1ch1.icon proof of non-membership については、accumulator を cumulative にすることで複数ブロックに対する RSA proof が生成可能になるため、個々のブロックに対する proof を持ち回さなくても証明が可能になる

This means that the history proof size for a coin goes down from one Merkle branch per Plasma block to two RSA accumulator proofs per transaction of that coin: one proof of membership for when the transaction takes place, and one proof of non-membership for the range where it does not. If each coin gets transacted on average once per day, and an RSA proof of non-membership is ~1 kB, then this means$ \approx 1Kb * 365 days * 10 fragments$ \approx 3.6MB for one year.

これによって、1 コインに対する history proof のサイズは「Plasma ブロック 1 つあたり Merkle branch 1 つ」から「コインに関するトランザクション 1 つあたり RSA accumulator proof 2 つ（1 つはトランザクションが発生したときの proof of membership、もう 1 つはトランザクションが発生していない間の proof of non-membership）」になる。各コインが 1 日に 1 回取引される場合、RSA proof of non-membership は 1 kB 以下になるため、1 年あたりで考えると、

$ 1 ~ \mathrm{kB} * 365 ~ \mathrm{days} * 10 ~ \mathrm{fragments} \approx 3.6 ~ \mathrm{MB}

となる。

m0t0k1ch1.icon 1 つ前の段落には Merkle proof も必要って書いてあるけど、Merkle proof の役割は history proof じゃなくなったからここでは言及しないってことかな？

If there are many coins, then we can optimize further. Proofs of non-membership can be batched: for example, if you prove that$ A * g^{50}is a power of$ g^{143}, then you know that$ \log_g{(A)} = -50 \mod{143}, which implies it is not a multiple of 11 or 13. This works well up to a few hundred indices, but if we want to have very fine denominations it may be possible to batch much more, see Log(coins)-sized proofs of inclusion and exclusion for RSA accumulators.

コインが大量に存在する場合、さらに最適化することができる。なぜなら、proof of non-membership をまとめることができるから。例えば、$ A * g^{50}が$ g^{143}の倍数であることを証明する場合、$ \log_g{(A)} = -50 \mod{143}であることは自明である。つまり、$ Aは$ 11もしくは$ 13の倍数ではない。これは、識別子の数が数百であるうちはうまく機能する。より細かい単位で管理したい場合であっても、Log(coins)-sized proofs of inclusion and exclusion for RSA accumulators を利用することで、はるかに多くの proof of non-membership をまとめることが可能かもしれない。

m0t0k1ch1.icon proof of non-membership は「$ Aは$ vの倍数ではない」ことが示すことが目的で、上述されていたのは$ vが 1 つのパターンだったが、ここでは「$ Aは$ 11と$ 13の倍数ではない」という$ vが 2 つ以上のパターンにも拡張できるということを示している

---.icon

m0t0k1ch1.icon memo

ref. https://ethresear.ch/t/rsa-accumulators-for-plasma-cash-history-reduction/3739/3

Vitalik の提案の中では明示されてはいないが、accumulator に追加する値は素数である必要がある。すなわち、EVM が、デポジットされたコインに対して、素数を識別子として割り当てる必要がある。

識別子が素数でない場合、fake proof を生成することが可能となってしまう。例えば、$ 14の proof of inclusion は$ 2と$ 7の proof of inclusion でもあるため、識別子$ 14に対応するコインが存在していた場合、その保有者は識別子$ 2と$ 7に対応する（他人の）コインの proof of inclusion を生成できてしまう。