Speech Foundation Models - yuyan

Speech Foundation Models

Speech Engineering and Acoustic Engineering

Introduction to Speech Recognition

Speech 2 Speech

text2speech speech2text

Foundation Models

Speech Recognition Using LLMs and Speech Foundation Models

https://speakerdeck.com/spiralai/llmtoyin-sheng-ji-pan-moderuwoyong-itayin-sheng-ren-shi

Amphion: An Open-Source Audio, Music, and Speech Generation Toolkit

https://github.com/open-mmlab/Amphion

Pushing the frontiers of audio generation

https://deepmind.google/discover/blog/pushing-the-frontiers-of-audio-generation/

#13: Talking About Recent TTS, from API Services to Building Speech Models

https://listen.style/p/aiengineeringnow/nivntyfu

SenseVoice

https://github.com/FunAudioLLM/SenseVoice

Foundational Speech Technology: Enterprise-grade APIs for Speech-to-Text and Voice AI Agents

https://www.speechmatics.com/