Bonsai - Reaqtor

Bonsai

To support the transport of query intent between clients and services, we had a choice between inventing a new language for event processing and building a general-purpose mechanism to represent expression trees (similar to ASTs).

Back in the SQL organization, we had built an XML-based expression serializer that suffered from various limitations. First and foremost, it was tied to the expression tree APIs in the .NET Framework, so it wasn’t a natural fit for other languages such as JavaScript or C++. Second, it inherited many traits of nominal static typing from its .NET roots, making it less applicable to data processing, where structural typing is a better default.

To address these shortcomings, Bart De Smet started the Bonsai project, initially to provide a language-agnostic and transport-friendly format for transferring IQueryable<T> queries to the graph database. Because Bonsai trees can represent arbitrary expressions, the format was readily applicable to other query domains, including reactive event processing. As such, it became the obvious choice for expression serialization when the IRP project started.

To illustrate the project’s name, which refers to tiny well-groomed trees, Bart went to a local garden center’s nursery in Bellevue and bought a little bonsai tree to put on his desk. During one of Bart’s summer vacations, a colleague took care of the tree. Even though the original tree died, many of its children are now groomed by Bart’s colleague. This almost symbolically reflects the technological trajectory of Bonsai, which has been augmented and put to use in other services.

The rationale for sticking with a general-purpose representation of expression trees rather than inventing a custom language is simple. Creating a new language is unproductive: it requires a huge investment in the design of a semantically sound language, massive amounts of solid tooling ranging from compilers to editors, and a steep learning curve for users. Such external domain-specific languages (DSLs) are often misguided and short-lived, and they tend to end up as opaque strings in general-purpose code (think of SQL statements in string literals, or string-based expression languages in XML attributes, JSON or YAML documents, etc.) where no tooling can reach. By focusing on internal DSLs instead, we can piggyback on the host language of the user’s choice.

Bonsai trees are optionally statically typed representations of expression trees. The default encoding is JSON (a binary variant was built some years later), using arrays to represent the nodes of the tree. Each such array consists of a first element holding a discriminator tag, followed by elements specific to the node type, such as children and optional type info.

For example, the representation of a constant value 42 looks like this:

code:json
 ["::", 42]

Here, the :: discriminator denotes a constant, and the value is stored in the second element slot. How the constant’s value is serialized is not specified by Bonsai; it could be embedded JSON, a base64-encoded representation of a binary format, a string literal holding XML, etc. It’s up to the parties participating in an exchange of Bonsai trees to agree on the serialization format of constant values.

Bonsai trees are optionally statically typed. In the example above, the value 42 may be treated as any number by e.g. a JavaScript or Python client. To add more static type info, Bonsai supports an optional third slot for this node type. For example:

code:json
 ["::", 42, 0]

Here, 0 is an index into a type table held in the so-called (optional) “reflection context” that’s sent alongside the tree. If we zoom out a bit, the full Bonsai tree looks like this:

code:json
 {
   "Context":
   {
     "Types":
     [
       ["::", "System.Int64"]
     ]
   },
   "Expression": ["::", 42, 0]
 }

A primitive type is (coincidentally) represented using the :: discriminator as well, followed by a name for the type. Again, Bonsai doesn’t specify the syntax of type names; it’s up to the parties exchanging a Bonsai tree to agree on them. In this case, the use of System.Int64 reveals a .NET background, though serializers and deserializers are free to map this to any (semantically equivalent) type in their language or runtime.
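As a rough illustration of how a consumer might resolve the optional type slot, the Python sketch below parses a tree with a reflection context and maps the agreed-upon type name to a local type. The `TYPE_NAME_TO_PYTHON` table and the `read_constant` helper are hypothetical names introduced for this example, and the sketch assumes the `Types` table is a JSON array of type nodes.

```python
import json

# Hypothetical agreement between the exchanging parties: type name -> local type.
TYPE_NAME_TO_PYTHON = {"System.Int64": int, "System.Double": float}

document = json.loads("""
{
  "Context": { "Types": [ ["::", "System.Int64"] ] },
  "Expression": ["::", 42, 0]
}
""")


def read_constant(node, types):
    """Return (value, local_type) for a constant node; local_type is None if untyped."""
    tag, value, *rest = node
    if tag != "::":
        raise ValueError(f"not a constant node: {tag!r}")
    if not rest:
        return value, None  # no third slot: the constant is untyped
    type_tag, type_name = types[rest[0]]  # the third slot indexes the type table
    if type_tag != "::":
        raise ValueError(f"only primitive types are handled here: {type_tag!r}")
    return value, TYPE_NAME_TO_PYTHON[type_name]


types = document["Context"]["Types"]
value, local_type = read_constant(document["Expression"], types)
print(value, local_type.__name__)
```

A consumer in another runtime would swap in its own table, which is the freedom the format leaves to the exchanging parties.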

Various concrete uses of Bonsai that exchange expressions across .NET and C/C++ runtimes have opted for “neutral” type names that don’t exist syntactically in either environment but can be mapped to a known type in both. This is very similar to the use of Mapping attributes to mark properties as well-known according to some agreed-upon schema repository or ontology. For example, a 64-bit signed integer could be represented as type://primitive/integer/signed?bits=64 or just signed_int64. At the end of the day, this is coloring the bikeshed.

A slightly more complex example of a Bonsai tree (omitting the reflection context) is shown below:

code:json
 [
   "+",
   ["$", "x"],
   ["::", 1]
 ]

This tree represents x + 1 and shows other discriminators, such as + for addition and $ to refer to variables or parameters. Many more discriminators exist, e.g. for member lookup, function invocation, and various arithmetic operators.
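To make the role of the discriminators concrete, here is a minimal Python sketch of an interpreter for just the three tags used so far. The `evaluate` function and the `env` dictionary of variable bindings are illustrative names, not part of Bonsai itself.

```python
def evaluate(node, env):
    """Evaluate a Bonsai-style node given variable bindings in env."""
    tag = node[0]
    if tag == "::":   # constant: the value sits in the second slot
        return node[1]
    if tag == "$":    # variable or parameter reference, looked up by name
        return env[node[1]]
    if tag == "+":    # addition of two child nodes
        return evaluate(node[1], env) + evaluate(node[2], env)
    raise NotImplementedError(f"unsupported discriminator: {tag!r}")


tree = ["+", ["$", "x"], ["::", 1]]
result = evaluate(tree, {"x": 41})
print(result)
```

Dispatching on the first array element is all a consumer needs to walk the tree, which keeps parsers in other languages small.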

Note that the use of discriminators or tags is analogous to S-expressions in LISP.

Bonsai is inspired by the expression tree APIs in .NET, which are used by a variety of front-end languages including C#, Visual Basic, F#, IronPython, and IronRuby. These APIs are rather general-purpose and applicable to a wide range of languages. There’s roughly a 1:1 correspondence between Bonsai discriminators and node types in the System.Linq.Expressions APIs in the .NET Framework. Starting with .NET 4.0, these APIs include support for statement constructs such as blocks and loops. A later revision of Bonsai added support for these as well, so it would be more accurate to describe Bonsai trees as a general-purpose representation of statement trees.

Bonsai enables the creation of language projections: whatever means a host language offers to represent code as data in the most natural and user-friendly way can be mapped onto a normalized expression representation for code exchange. For example, in C# one would use the transparent conversion of lambda expressions to expression trees to capture user intent:

code:C#
 // int Calculate(Expression<Func<int, int, int>> expression, ...)
 calculatorService.Calculate((x, y) => x * 2 + 1 - y, ...)

In languages such as F#, explicit quotations could be used:

code:F#
 // or <@@ ... @@> for untyped quotations
 calc(<@ fun (x, y) -> x * 2 + 1 - y @>, ...)

In languages such as JavaScript, no direct quotation support exists, but users have become accustomed to eval-like APIs that take a piece of code as a string and parse it at runtime (e.g. using esprima):

code:JavaScript
 calc("function (x, y) { return x * 2 + 1 - y; }", ...)
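The same string-based projection can be sketched in Python, whose standard `ast` module plays the role esprima plays for JavaScript. This is only an illustration under assumptions: the `quote` and `to_bonsai` helpers are hypothetical, the `-` and `*` discriminators are assumed by analogy with `+`, and the lambda node is laid out as a `=>` tag followed by the parameter list and the body.

```python
import ast
import json

# "+" appears in the text; "-" and "*" are assumed to follow the same pattern.
BINARY_TAGS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*"}


def to_bonsai(node):
    """Convert a small subset of Python expression syntax to Bonsai-style arrays."""
    if isinstance(node, ast.Expression):
        return to_bonsai(node.body)
    if isinstance(node, ast.Lambda):
        params = [["$", arg.arg] for arg in node.args.args]
        return ["=>", params, to_bonsai(node.body)]  # assumed lambda layout
    if isinstance(node, ast.BinOp):
        tag = BINARY_TAGS.get(type(node.op))
        if tag is None:
            raise NotImplementedError(type(node.op).__name__)
        return [tag, to_bonsai(node.left), to_bonsai(node.right)]
    if isinstance(node, ast.Name):
        return ["$", node.id]
    if isinstance(node, ast.Constant):
        return ["::", node.value]
    raise NotImplementedError(type(node).__name__)


def quote(source):
    """Parse a Python expression given as a string and normalize it."""
    return to_bonsai(ast.parse(source, mode="eval"))


tree = quote("lambda x, y: x * 2 + 1 - y")
print(json.dumps(tree))
```

The source string is parsed rather than executed, so the caller captures intent as data instead of running user code.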

Finally, combining Bonsai with the IRP normal form representation of reactive event processing query expressions results in “queries” that look like this:

code:json
 [
   "()",                              // function invocation
   ["$", "rx://operators/filter"],    // identifier of "Where"
   [
     ["$", "bing://streams/weather"], // identifier of a weather stream
     [
       "=>",                          // lambda
       [
         ["$", "w"]                   // single parameter "w"
       ],
       [
         ">",                         // greater than
         [
           ".",                       // member lookup
           ["$", "w"],                // on parameter "w"
           "schema:/weather/temp"     // for temperature
         ],
         [
           "::",                      // constant
           25                         // with value 25
         ]
       ]
     ]
   ]
 ]

This expression represents the normalized equivalent of a query that could look as follows in C#:

code:C#
 ctx.GetObservable<WeatherInfo>("bing://streams/weather")
    .Where(w => w.Temperature > 25);

or, using a specialized context object with mappings for discovered streams:

code:C#
 ctx.Weather // annotated with KnownResource("bing://streams/weather")
    .Where(w => w.Temperature > 25);

The type system of Bonsai (cf. the optional reflection context) supports primitive types, arrays, generic types, and structural types. In the example above, the Temperature field was normalized to an identifier through the use of the Data Model library:

code:C#
 class WeatherInfo
 {
     [Mapping("schema:/weather/temp")]
     public double Temperature { get; set; }
 }
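The idea of normalizing a member to an identifier can be sketched outside .NET as well. The Python example below uses dataclass field metadata as a stand-in for the Mapping attribute; the `"mapping"` metadata key and the `member_lookup` helper are assumptions made for this illustration and are not part of the Data Model library.

```python
from dataclasses import dataclass, field, fields


@dataclass
class WeatherInfo:
    # Stand-in for [Mapping("schema:/weather/temp")] on the C# property.
    temperature: float = field(metadata={"mapping": "schema:/weather/temp"})


def member_lookup(record_type, parameter, member):
    """Build a "." node, replacing the local member name by its mapped identifier."""
    for f in fields(record_type):
        if f.name == member:
            identifier = f.metadata.get("mapping", f.name)
            return [".", ["$", parameter], identifier]
    raise AttributeError(f"{record_type.__name__} has no member {member!r}")


node = member_lookup(WeatherInfo, "w", "temperature")
print(node)
```

The resulting node no longer mentions the local name `temperature`, so a consumer with a differently named member can still bind to it.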

When type information is omitted from Bonsai trees, various techniques can be used to evaluate the resulting expression. One approach is to be fully dynamic and perform late binding. Another is type inference: for example, discovering the type of the events produced by the weather stream lets that type flow to the input of the Where operator, which in turn makes it possible to discover the type of the temperature field.

When type information is present, the parameters in the expression contain a reference to type info, for example:

code:json
 ["$", "bing://streams/weather", 3]

with the Types table in the reflection context looking roughly like this:

code:json
 {
   "Types":
   [
     /*0*/ ["::", "System.Double"],         // primitive double
     /*1*/ [
             "{;}",                         // record type
             [
               ["schema:/weather/temp", 0]  // "double temp"
             ]
           ],
     /*2*/ ["::", "IObservable`1"],         // IObservable<T>
     /*3*/ ["<>", 2, [1]]                   // IObservable<{...}>
   ]
 }

Here, the first element represents the primitive type for a double-precision floating-point number, the second element represents a record type with a temperature field of type double (using an index-based reference), the third element represents the open generic type IObservable<T>, and the fourth element represents a closed generic instantiation of the open generic type at index 2 using the type argument at index 1. This is the structurally typed equivalent of IObservable<WeatherInfo>.
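The index-based references can be followed mechanically. The Python sketch below renders an entry of such a type table as a readable string. The exact array layout is an assumption made for this example: records are written as a `{;}` tag followed by a list of name and type-index pairs, and closed generics as a `<>` tag followed by the index of the open generic and a list of argument indices. The `describe` helper is hypothetical.

```python
# Assumed layout of the four entries described above.
TYPES = [
    ["::", "System.Double"],                 # 0: primitive double
    ["{;}", [["schema:/weather/temp", 0]]],  # 1: record with one double field
    ["::", "IObservable`1"],                 # 2: open generic type
    ["<>", 2, [1]],                          # 3: closed generic over entry 1
]


def describe(index, types):
    """Render the type at the given index, following index-based references."""
    entry = types[index]
    tag = entry[0]
    if tag == "::":
        return entry[1]
    if tag == "{;}":
        members = ", ".join(f"{name}: {describe(t, types)}" for name, t in entry[1])
        return "{" + members + "}"
    if tag == "<>":
        arguments = ", ".join(describe(a, types) for a in entry[2])
        return f"{describe(entry[1], types)}<{arguments}>"
    raise NotImplementedError(f"unsupported type discriminator: {tag!r}")


description = describe(3, TYPES)
print(description)
```

Because entries refer to each other by index, a type that is used many times in a tree is stored only once in the table.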

In the current IRP normal form used by Reaqtor, reactive entities such as observables, observers, and subscriptions are represented using the IAsyncReactiveQ* family of types. In the example above, rather than using IObservable<T>, the tree would use IAsyncReactiveQbservable<T> instead. This is another example of the domain-agnostic approach of Bonsai, where IRP is layered on top and makes decisions about the normal form of its language encoding.

In later iterations of IRP, efforts have been made to further normalize references to nominal types (such as IObservable<T>) using contract types. This effectively introduces a mapping of type names to identifiers, similar to the mapping of members and methods to identifiers using attributes such as KnownResource and Mapping. In this representation, the normal form of the example above refers to a type identified as CObservable<T> instead, giving producers and consumers the freedom to map this identifier onto the most natural type available to them.

Finally, Bonsai trees in .NET come with an object model that’s often referred to as “slim expression trees”, built around a type called ExpressionSlim. This type and its type hierarchy are roughly equivalent to the System.Linq.Expressions.Expression type family, with one main difference: rather than referring to .NET reflection objects, slim expressions come with an object model that’s capable of representing optional static typing using structural types. This type system is represented using APIs such as TypeSlim and MemberInfoSlim, which are variants of the System.Reflection APIs; the latter represent mandatory static typing using nominal types.

Conversions between the “fat” world of .NET expression trees with .NET reflection and the “slim” space of Bonsai trees with optional typing are provided by the library:

Going from “fat” to “slim” involves normalization of expressions and types, e.g. mapping nominal Data Model types to their equivalent structural type representation.

Going from “slim” to “fat” involves binding steps (optionally using type inference or late binding) and mapping of structural types to nominal types with a compatible shape.

Examples of these transformations are the back-end of a client library and the front-end of a service library, respectively. Normalization steps carried out by a client library are effectively “unbind” steps that can leverage metadata annotations to decouple the user-friendly representation of intent from the underlying normal form. Binding steps carried out by a service library can leverage catalogs or registries of definitions to map identifiers onto implementations, also allowing for the discovery of type information. In this minimalistic world view, an IRP-compatible execution engine is merely an expression binder, compiler, and evaluator.
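A toy version of this “binder and evaluator” view can be written in a few lines of Python. The sketch below late-binds the identifiers of the weather query against a registry and evaluates it dynamically. It rests on several assumptions: `REGISTRY` and `evaluate` are hypothetical names, a plain list stands in for an observable stream, events are dictionaries keyed by schema identifiers, and the sample readings are made up.

```python
# Hypothetical registry mapping identifiers onto implementations and data.
REGISTRY = {
    "rx://operators/filter": lambda source, predicate: [e for e in source if predicate(e)],
    "bing://streams/weather": [
        {"schema:/weather/temp": 18.0},  # made-up sample events
        {"schema:/weather/temp": 31.5},
    ],
}


def evaluate(node, env):
    """Late-bound evaluation of an untyped Bonsai-style tree."""
    tag = node[0]
    if tag == "::":   # constant
        return node[1]
    if tag == "$":    # parameter if bound locally, otherwise a registry lookup
        name = node[1]
        return env[name] if name in env else REGISTRY[name]
    if tag == ".":    # member lookup by normalized identifier
        return evaluate(node[1], env)[node[2]]
    if tag == ">":
        return evaluate(node[1], env) > evaluate(node[2], env)
    if tag == "=>":   # lambda: parameter list, then body
        names = [parameter[1] for parameter in node[1]]
        body = node[2]
        return lambda *args: evaluate(body, {**env, **dict(zip(names, args))})
    if tag == "()":   # invocation: function node, then argument list
        function = evaluate(node[1], env)
        return function(*[evaluate(argument, env) for argument in node[2]])
    raise NotImplementedError(f"unsupported discriminator: {tag!r}")


QUERY = ["()", ["$", "rx://operators/filter"], [
    ["$", "bing://streams/weather"],
    ["=>", [["$", "w"]], [">", [".", ["$", "w"], "schema:/weather/temp"], ["::", 25]]],
]]

hot = evaluate(QUERY, {})
print(hot)
```

A real engine would bind to observable operators and compile the tree rather than interpret it, but the division of labor between registry, binder, and evaluator is the same.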

Note that this layering approach enables a variety of client libraries and a variety of service implementations with a single normal form in the middle. In the context of IRP and Reaqtor, this has enabled e.g. .NET clients that submit reactive computations to .NET-based cloud services as well as to C++-based implementations of reactive engines running on devices.