AWS article_Architecture

One-sentence summary

Describes how the benchmarking jobs were run.

Original text

code:txt

The architecture developed for this project provides completely unattended automation for deployment, data collection and environment tear down. This automation made it possible to run a large number of experiments in parallel to tune parameters, compare different basecallers and assess the performance impact of methylation calling.

Figure 3 – Key components of the benchmarking architecture are the AWS Batch and Amazon FSx for Lustre services. Other services utilized for automated deployment are the AWS Cloud Development Kit (CDK) and Amazon EC2 Image Builder. Benchmarking jobs were created with Python. Results were written to a DynamoDB table and evaluated using Amazon SageMaker Notebooks.

Customers may reuse all or part of the architecture as guidance for deploying genomics workloads in the AWS Cloud. The core services utilized are AWS Batch and Amazon FSx for Lustre. AWS Batch is a service to run batch computing workloads on the AWS Cloud. We used separate AWS Batch job queues and compute environments for each instance type to isolate and orchestrate benchmarking experiments. For example, for the g5.2xlarge instance, we created a job queue named “g5-2xlarge-queue” and a compute environment named “g5-2xlarge-ce”. Amazon FSx for Lustre provides a fully managed, high-performance Lustre file system. You use Lustre for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, and financial modeling. To get high performance for reads and writes, we used Amazon FSx for Lustre backed by Amazon S3 buckets.

We used Amazon Elastic Container Registry (Amazon ECR) to host container images. We also used EC2 Image Builder to simplify the build, test, and deployment of container images. This resulted in improved iteration times, as the container image definitions changed frequently at the beginning of the project.

We used AWS Cloud Development Kit (CDK) to orchestrate the infrastructure deployment. CDK is an infrastructure as code solution that makes it possible to create and tear down infrastructure rapidly. As a result, we are able to rebuild the environment rapidly after design changes. Furthermore, CDK reduced cost as we can tear down the environment when not in use between test phases.

The CliveOME 5mC dataset is automatically downloaded as part of the environment build by an EC2 instance (the “downloader”) and placed in the S3 bucket that backs the Amazon FSx for Lustre file system. Once the download is completed, the instance is terminated by CDK to avoid the cost of an idle EC2 instance.

https://scrapbox.io/files/65c5a84e135f6c0023891f56.png
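
The per-instance-type naming quoted above ("g5-2xlarge-queue", "g5-2xlarge-ce") can be sketched in Python as follows. This is a minimal sketch, not the authors' code: the helper names `batch_names` and `submit_benchmark`, the job definition name, and the `BASECALLER` environment variable are hypothetical, since the article does not describe the job parameters. boto3 is imported lazily so that the naming helper runs without AWS credentials.

```python
from typing import NamedTuple


class BatchNames(NamedTuple):
    """Names of the AWS Batch resources dedicated to one instance type."""
    job_queue: str
    compute_environment: str


def batch_names(instance_type: str) -> BatchNames:
    """Derive the queue and compute-environment names for an instance type.

    Follows the convention quoted in the article:
    g5.2xlarge -> g5-2xlarge-queue, g5-2xlarge-ce.
    """
    slug = instance_type.strip().lower().replace(".", "-")
    if not slug:
        raise ValueError("instance_type must not be empty")
    return BatchNames(f"{slug}-queue", f"{slug}-ce")


def submit_benchmark(instance_type: str, job_definition: str, basecaller: str) -> str:
    """Submit one benchmarking job to the queue for `instance_type`.

    Assumption: `job_definition` and the BASECALLER variable are
    hypothetical placeholders, not taken from the article.
    """
    import boto3  # lazy import; needs AWS credentials when called

    names = batch_names(instance_type)
    batch = boto3.client("batch")
    response = batch.submit_job(
        jobName=f"bench-{names.job_queue}",
        jobQueue=names.job_queue,
        jobDefinition=job_definition,
        containerOverrides={
            "environment": [{"name": "BASECALLER", "value": basecaller}]
        },
    )
    return response["jobId"]
```

Keeping one queue and one compute environment per instance type means a job can only land on the hardware it is meant to benchmark, which is the isolation the article describes.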

Translation

code:txt

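The architecture developed for this project fully automates deployment, data collection, and environment teardown. This automation made it possible to run many experiments in parallel in order to tune parameters, compare different basecallers, and evaluate the performance impact of methylation calling.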

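Figure 3 - The main components of the benchmarking architecture are the AWS Batch and Amazon FSx for Lustre services. The other services used for automated deployment are the AWS Cloud Development Kit (CDK) and Amazon EC2 Image Builder. The benchmarking jobs were written in Python. Results were written to a DynamoDB table and evaluated with Amazon SageMaker Notebooks.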

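Customers can reuse all or part of the architecture as guidance for deploying genomics workloads in the AWS Cloud. The core services used are AWS Batch and Amazon FSx for Lustre. AWS Batch is a service for running batch computing workloads on the AWS Cloud. To isolate and orchestrate the benchmarking experiments, we used a separate AWS Batch job queue and compute environment for each instance type. For example, for the g5.2xlarge instance we created a job queue named "g5-2xlarge-queue" and a compute environment named "g5-2xlarge-ce". Amazon FSx for Lustre provides a fully managed, high-performance Lustre file system. Lustre is used for workloads where speed matters, such as machine learning, high performance computing (HPC), video processing, and financial modeling. To get high read and write performance, we used Amazon FSx for Lustre backed by Amazon S3 buckets (the S3 buckets are the file system's backing store, not a backup of it).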

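We hosted the container images in Amazon Elastic Container Registry (Amazon ECR). We also used EC2 Image Builder to simplify building, testing, and deploying the container images. This shortened iteration times, because the container image definitions changed frequently at the start of the project.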

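We used the AWS Cloud Development Kit (CDK) to orchestrate the infrastructure deployment. CDK is an infrastructure-as-code solution that makes it possible to create and tear down infrastructure quickly. As a result, we could rebuild the environment quickly after design changes. CDK also reduced cost, because the environment could be torn down while it was not in use between test phases.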

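The CliveOME 5mC dataset is downloaded automatically as part of the environment build by an EC2 instance (the "downloader") and placed in the S3 bucket that backs the Amazon FSx for Lustre file system. Once the download completes, CDK terminates the instance to avoid the cost of an idle EC2 instance.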
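
The article says results were written to a DynamoDB table but does not give the schema. The following is a minimal sketch of how one run's record might be assembled and stored, under stated assumptions: the attribute names, the metric fields, and the helpers `build_result_item` and `write_result` are all hypothetical. The float-to-`Decimal` conversion is needed because the boto3 DynamoDB resource API does not accept Python `float` values.

```python
from decimal import Decimal


def build_result_item(job_id: str, instance_type: str, basecaller: str,
                      samples_per_second: float, runtime_seconds: float) -> dict:
    """Assemble one benchmark record (hypothetical schema).

    Numeric metrics are converted through str() to Decimal, because the
    boto3 DynamoDB resource API rejects float values.
    """
    return {
        "job_id": job_id,
        "instance_type": instance_type,
        "basecaller": basecaller,
        "samples_per_second": Decimal(str(samples_per_second)),
        "runtime_seconds": Decimal(str(runtime_seconds)),
    }


def write_result(table_name: str, item: dict) -> None:
    """Store one record; `table_name` is an assumed, pre-existing table."""
    import boto3  # lazy import; needs AWS credentials when called

    table = boto3.resource("dynamodb").Table(table_name)
    table.put_item(Item=item)
```

Separating record construction from the write keeps the schema testable offline; only `write_result` touches AWS.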