Runtime Speech Recognizer

Description

Important note: The Marketplace version of the plugin may fail to stage the language model asset. The cause is an engine problem: for modules located in the engine folder, the engine may not locate the asset specified in DirectoriesToAlwaysCook (Additional Asset Directories to Cook in the editor). As a result, the plugin may be unusable in a packaged build. This Marketplace-specific issue is currently under investigation. If you encounter it, please consider downloading the plugin directly from GitHub.

Please be aware that speech recognition produces errors in engine version 4.27, particularly in streaming mode. Using engine version 5.0 or later is therefore highly recommended.

Note: The images with plugin examples were made in conjunction with RuntimeAudioImporter, although you may use your own audio input implementation to supply the audio processed by RuntimeSpeechRecognizer.

GitHub, Documentation

Discord and Telegram support chats.

Runtime Speech Recognizer is an open-source plugin that enables real-time, offline speech recognition. It is based on OpenAI’s Whisper technology, specifically the whisper.cpp library, and supports multiple language models, pre-selected in the plugin’s settings.