Deployment
Virtualization
Kubernetes
Docker
Software architecture
Cloud
Git and GitHub
CI/CD
Monitoring and logging
Push a model to Replicate
https://replicate.com/docs/guides/push-a-model
Modal
https://qiita.com/knagi/items/d8148d731d2d2ed020d1
Hugging Face Inference Endpoints
https://ui.endpoints.huggingface.co/
Fireworks AI
Netlify
Vercel
Google Cloud
AWS
Azure
Streamlit
Gradio
GitHub
Using Kamal 2 to deploy Next.js to a 296-yen VPS, explained so that even people unfamiliar with infrastructure can follow
https://zenn.dev/naofumik/articles/8849c2e8feecc0
Where is LLM inference run?
https://bentoml.com/llm/llm-inference-basics/cpu-vs-gpu-vs-tpu
Serverless vs. Self-hosted LLM inference
https://bentoml.com/llm/llm-inference-basics/serverless-vs-self-hosted-llm-inference