Whisper2024-03-25

3h44録音をWhisper APIに渡してみた

python whisper.py 0.37s user 0.18s system 0% cpu 10:00.79 total

後半はノイズだけだったので実質2時間だった

https://gyazo.com/93f3a6c6c1ab03e26ee499dee3a285a7

1時間の勉強会音声で試す

python whisper.py 0.34s user 0.14s system 0% cpu 3:27.50 total

https://gyazo.com/0237823730c2d31c4c1eda0f821e2430

さっきの例と違ってこちらは1時間ほぼずっと話している

1時間の音声で、処理時間が3分半、費用が40セント

得られた文字起こしをClaudeにまとめさせた

コードは何も難しくない

code:python

audio_file = open(audio_path, "rb")

transcription = client.audio.transcriptions.create(

model="whisper-1", file=audio_file

)

print(transcription.text)

with open(f"whisper_out/{indir}_{audio}.txt", "w") as f:

f.write(transcription.text)

一番時間がかかったのはffmpegのインストール、gccとか入れ始める

$ brew install ffmpeg

なお音声ファイルの分割のためなので音声ファイルが25MB以下なら必要ない

上記の1時間喋りまくりの音声が15MBだから、1時間で刻んでくれる録音アプリを使ってるなら必要ない