Deployment - yuyan

Deployment

Software architecture

Monitoring and logging

Push a model to Replicate

https://replicate.com/docs/guides/push-a-model

modal

https://qiita.com/knagi/items/d8148d731d2d2ed020d1

Hugging Face Endpoint

https://ui.endpoints.huggingface.co/

Fireworks AI

netlify

vercel

google cloud

aws

azure

streamlit

gradio

github

Using Kamal 2 to explain how even people unfamiliar with infrastructure can deploy Next.js to a 296-yen VPS

https://zenn.dev/naofumik/articles/8849c2e8feecc0

Where is LLM inference run?

https://bentoml.com/llm/llm-inference-basics/cpu-vs-gpu-vs-tpu

Serverless vs. Self-hosted LLM inference

https://bentoml.com/llm/llm-inference-basics/serverless-vs-self-hosted-llm-inference

Talos Linux

https://github.com/siderolabs/talos