2024-07-21 データ指向アプリケーションデザイン第6章パーティショニング

内容整理

パーティショニング (Partitioning) とは

シャーディング (Sharding) ともいう

データベースをいくつかの Partition に分割して保持すること

Partitioning と Replication は併用可能

パフォーマンスを表す語彙

skew: partition の負荷が偏ること

hot spot: 偏っている partiotn

分割について

レコードをどう分割するか？

主キーの範囲で分割

取得が用意

hot spot が発生する可能性がある

主キーのハッシュで分割

均等な分割がある程度保証

連続した値の取得効率が悪い

（セカンダリ）インデックスをどう分割するか？

インデックスとセカンダリインデックスは横軸と縦軸の関係なので、インデックスを基準に Partitioningをすればセカンダリインデックスは分割されてしまう

対策: セカンダリインデックスをレコードとは独立して分割する

再配置（リバランシング）について

Partitioning の再配置時にどうやってレコードの移動を最小にするか？

ハッシュや余剰を使うとノード数 N が増加したときに、全体的にレコード単位で再配置が発生してしまう

Partitiojn をノード数よりも多く分割し、Partition 単位で再配置を行えば最小限の移動で済む

余談: メモリの再配置とかも似た話があるよね

動的な解決

分割されたレコードをどのように取得するか

リクエストのルーティング

パラレルクエリ

疑問

実際の RDBMS（今回は PostgreSQL）ではどのように Parttioning を設定するのか

Partitioning によってどれぐらいパフォーマンスが変化するか

Partitioning による制約は？

index, row level security

どうやって複数の Partition をまたいで SQL を投げる？

クラウドサービスではどう使う？

実際の RDBMS（今回は PostgreSQL）ではどのように Parttioning を設定するのか

https://www.postgresql.org/docs/9.4/ddl-partitioning.html

The benefits will normally be worthwhile only when a table would otherwise be very large. The exact point at which a table will benefit from partitioning depends on the application, although a rule of thumb is that the size of the table should exceed the physical memory of the database server.

テーブルの物理メモリのサイズを超えたら Partition を検討するといいかも

partition の作成

code:sql

example=# CREATE TABLE books (

id SERIAL PRIMARY KEY,

title VARCHAR(255) NOT NULL,

created_at TIMESTAMP NOT NULL

);

CREATE TABLE

example=# \dt

List of relations

Schema | Name | Type | Owner

--------+-------+-------+-------

public | books | table | admin

(1 row)

example=# CREATE TABLE books_a_to_h () INHERITS (books);

CREATE TABLE

example=# CREATE TABLE books_i_to_p () INHERITS (books);

CREATE TABLE

example=# CREATE TABLE books_q_to_z () INHERITS (books);

CREATE TABLE

example=# \dt

List of relations

Schema | Name | Type | Owner

--------+--------------+-------+-------

public | books | table | admin

public | books_a_to_h | table | admin

public | books_i_to_p | table | admin

public | books_q_to_z | table | admin

(4 rows)

INHERITS 初めて聞いた → PostgreSQL INHERITS

このマスターテーブルでちょっと遊んでみる

まずはテーブル構造の確認

たしかに同じだ

code:sql

example=# \d books

Table "public.books"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+-----------------------------------

id | integer | | not null | nextval('books_id_seq'::regclass)

title | character varying(255) | | not null |

created_at | timestamp without time zone | | not null |

Indexes:

"books_pkey" PRIMARY KEY, btree (id)

Number of child tables: 3 (Use \d+ to list them.)

\d books_a_to_h

Table "public.books_a_to_h"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+-----------------------------------

id | integer | | not null | nextval('books_id_seq'::regclass)

title | character varying(255) | | not null |

created_at | timestamp without time zone | | not null |

Inherits: books

次に親テーブルのスキーマ変更

親テーブルのスキーマ変更は子テーブルに変更を及ぼす模様

code:sql

example=# ALTER TABLE books ADD price INT;

ALTER TABLE

example=# \d books

Table "public.books"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+-----------------------------------

id | integer | | not null | nextval('books_id_seq'::regclass)

title | character varying(255) | | not null |

created_at | timestamp without time zone | | not null |

price | integer | | |

Indexes:

"books_pkey" PRIMARY KEY, btree (id)

Number of child tables: 3 (Use \d+ to list them.)

example=# \d books_a_to_h

Table "public.books_a_to_h"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+-----------------------------------

id | integer | | not null | nextval('books_id_seq'::regclass)

title | character varying(255) | | not null |

created_at | timestamp without time zone | | not null |

price | integer | | |

Inherits: books

では子テーブルをスキーマ変更したら？

子テーブルにのみスキーマ変更が適用されるみたい…

code:sql

example=# ALTER TABLE books_a_to_h ADD cost INT;

ALTER TABLE

example=# \d books

Table "public.books"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+-----------------------------------

id | integer | | not null | nextval('books_id_seq'::regclass)

title | character varying(255) | | not null |

created_at | timestamp without time zone | | not null |

price | integer | | |

Indexes:

"books_pkey" PRIMARY KEY, btree (id)

Number of child tables: 3 (Use \d+ to list them.)

example=# \d books_a_to_h

Table "public.books_a_to_h"

Column | Type | Collation | Nullable | Default

------------+-----------------------------+-----------+----------+-----------------------------------

id | integer | | not null | nextval('books_id_seq'::regclass)

title | character varying(255) | | not null |

created_at | timestamp without time zone | | not null |

price | integer | | |

cost | integer | | |

Inherits: books

親テーブルへ直接データ追加をしたらどうなる？

親に入る

親子という関係ではなく、あくまで複数のテーブルに近い

code:sql

example=# INSERT INTO books (title, price, created_at) VALUES ('aaa', 100, '2024-07-21T10:00');

INSERT 0 1

example=# SELECT * FROM books;

id | title | created_at | price

----+-------+---------------------+-------

1 | aaa | 2024-07-21 10:00:00 | 100

(1 row)

Partition の制約をつけていく

code:sql

DROP TABLE books_a_to_h;

DROP TABLE books_i_to_p;

DROP TABLE books_q_to_z;

CREATE TABLE books_a_to_h (CHECK ('a' <= title AND title <= 'h')) INHERITS (books);

example=# INSERT INTO books_a_to_h (title, price, created_at) VALUES ('aaa', 100, '2024-07-21T10:00');

INSERT 0 1

example=# INSERT INTO books_a_to_h (title, price, created_at) VALUES ('bbb', 100, '2024-07-21T10:00');

INSERT 0 1

example=# \d+ books_a_to_h

example=# INSERT INTO books_a_to_h (title, price, created_at) VALUES ('zzz', 100, '2024-07-21T10:00');

ERROR: new row for relation "books_a_to_h" violates check constraint "books_a_to_h_title_check"

DETAIL: Failing row contains (5, zzz, 2024-07-21 10:00:00, 100).

トリガーで自動振り分けをすればもっと自動で入る模様？