asr

Automatic Speech Recognition

This filter uses PocketSphinx for speech recognition. To enable compilation of this filter, you need to configure FFmpeg with --enable-pocketsphinx.

It accepts the following options:

'rate': Set the sampling rate of the input audio. Default is 16000. This needs to match the speech models; otherwise, recognition results will be poor.
'hmm': Set directory containing acoustic model files.
'dict': Set pronunciation dictionary.
'lm': Set language model file.
'lmctl': Set language model set.
'lmname': Set which language model to use.
'logfn': Set output for log messages.
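
The options above are combined into a single filter description in the usual key=value form, separated by colons. The following is a minimal sketch of assembling such a description and an ffmpeg command line from Python. The helper names (build_asr_filter, build_command) and all model paths are hypothetical, and the sketch assumes the paths contain no characters that would need escaping in a filtergraph (such as ':' or ',').

```python
import shlex


def build_asr_filter(hmm, dict_path, lm, rate=16000):
    """Return an asr filter description built from the documented options.

    Assumes the paths contain no filtergraph special characters.
    """
    options = [("rate", rate), ("hmm", hmm), ("dict", dict_path), ("lm", lm)]
    return "asr=" + ":".join(f"{key}={value}" for key, value in options)


def build_command(input_path, asr_filter):
    """Return an ffmpeg argument list that runs the filter and discards the audio."""
    return ["ffmpeg", "-i", input_path, "-af", asr_filter, "-f", "null", "-"]


if __name__ == "__main__":
    # Hypothetical model locations; substitute the PocketSphinx models you use.
    asr = build_asr_filter(
        hmm="model/en-us",
        dict_path="model/cmudict-en-us.dict",
        lm="model/en-us.lm.bin",
    )
    # Print the command instead of running it, so the sketch works without ffmpeg.
    print(shlex.join(build_command("speech.wav", asr)))
```

Keeping 'rate' as an explicit argument makes it harder to forget that it has to agree with the rate the speech models were trained for.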

The filter exports recognized speech in the frame metadata key lavfi.asr.text.
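
One way to read this metadata is to append the ametadata filter in print mode after asr and collect what it writes. The sketch below assumes that approach: it builds the filter chain and parses the printed text, expecting a "frame:... pts:... pts_time:..." line followed by "key=value" lines. The function names and the sample text are hypothetical and only illustrate the parsing; the sample is not real recognizer output.

```python
KEY = "lavfi.asr.text"


def with_metadata_print(asr_filter):
    """Append ametadata in print mode, writing the chosen key to stdout (file=-)."""
    return f"{asr_filter},ametadata=mode=print:key={KEY}:file=-"


def parse_asr_text(output):
    """Return (pts_time, text) pairs for every frame with non-empty recognized text."""
    results = []
    pts_time = None
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("frame:"):
            # Header line of a frame; remember its timestamp.
            for field in line.split():
                if field.startswith("pts_time:"):
                    pts_time = field[len("pts_time:"):]
        elif line.startswith(KEY + "="):
            text = line[len(KEY) + 1:].strip()
            if text:
                results.append((pts_time, text))
    return results


if __name__ == "__main__":
    # Invented sample in the assumed print format, for illustration only.
    sample = (
        "frame:0    pts:0       pts_time:0\n"
        "lavfi.asr.text=hello world\n"
        "frame:1    pts:16000   pts_time:1\n"
        "lavfi.asr.text=\n"
    )
    print(with_metadata_print("asr=rate=16000"))
    print(parse_asr_text(sample))
```

Frames whose text is empty are skipped, so the result only contains the points at which the recognizer actually produced a hypothesis.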