Directus Speech-to-Text Interface
A Directus interface extension that enables voice input for text fields using OpenAI’s Whisper API for speech recognition.
Features
- Voice input directly into Directus text fields
- Multi-language support (German, English, Spanish, French, Italian, Portuguese, Russian, Japanese, Chinese)
- Automatic language detection
- Seamless integration with existing text fields
- Real-time transcription with OpenAI Whisper
- Support for both single-line input and multi-line textarea fields
- Configurable append mode and text separators
Screenshots
Configuration Options
Configure the Speech-to-Text interface with OpenAI API key, language selection, and text separator options.
Interface in Action
The interface shows both single-line input and textarea fields with integrated microphone buttons for voice input.
Installation
From npm (Recommended)
npm install directus-extension-speech-to-text
Manual Installation
- Clone and build the extension:
git clone https://github.com/flagbit/directus-extension-speech-to-text.git
cd directus-extension-speech-to-text
npm install
npm run build
- Link extension to Directus:
npm run link
Configuration
Configure the interface with the following options:
- OpenAI API Key (Required): Your OpenAI API key for speech recognition
- Language: Choose auto-detection or a specific language
- Placeholder: Customizable placeholder text
- Append Mode: Toggle between replacing existing text and appending to it
- Text Separator: Configure how new text is separated (auto, space, newline, none)
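To make the interplay of Append Mode and Text Separator concrete, here is a minimal TypeScript sketch. The option names (`appendMode`, `separator`), the function `mergeTranscript`, and the rule used for `auto` are assumptions for illustration; the extension's actual keys and behavior may differ.

```typescript
// Hypothetical option shape; the extension's real option keys may differ.
type Separator = "auto" | "space" | "newline" | "none";

interface AppendOptions {
  appendMode: boolean;
  separator: Separator;
}

// Combine the existing field value with a newly transcribed text.
function mergeTranscript(existing: string, transcript: string, opts: AppendOptions): string {
  const incoming = transcript.trim();
  // Replace mode, or an empty field: the transcript becomes the whole value.
  if (!opts.appendMode || existing.length === 0) return incoming;
  // Nothing was transcribed: keep the field unchanged.
  if (incoming.length === 0) return existing;

  switch (opts.separator) {
    case "space":
      return existing + " " + incoming;
    case "newline":
      return existing + "\n" + incoming;
    case "none":
      return existing + incoming;
    case "auto":
    default:
      // Assumed "auto" rule: add a space unless the text already ends in whitespace.
      return /\s$/.test(existing) ? existing + incoming : existing + " " + incoming;
  }
}
```

Under these assumptions, `mergeTranscript("Hello", "world", { appendMode: true, separator: "auto" })` yields `"Hello world"`, while the same call with `appendMode: false` yields `"world"`.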
Usage
- Add the Speech-to-Text interface to a String or Text field in your Directus collection
- Enter your OpenAI API key in the interface options
- Select your preferred language (optional - defaults to auto-detection)
- Configure append mode and text separator as needed
- Use the microphone button to start/stop voice recording
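The API key and language chosen in the steps above presumably end up in a request to OpenAI's transcription endpoint. The sketch below shows how such a request could be assembled. The endpoint URL and the `whisper-1` model name come from OpenAI's public API; the function name `buildTranscriptionRequest`, the file name `recording.wav`, and the `"auto"` option value are assumptions, not the extension's actual code.

```typescript
// Endpoint and model name are from OpenAI's public audio API;
// everything else in this sketch is illustrative.
const TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions";

interface TranscriptionRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: FormData };
}

function buildTranscriptionRequest(audio: Blob, apiKey: string, language: string): TranscriptionRequest {
  const body = new FormData();
  body.append("file", audio, "recording.wav"); // hypothetical file name
  body.append("model", "whisper-1");
  // Leaving out "language" lets the API detect it; "auto" is a hypothetical option value.
  if (language !== "auto") body.append("language", language);

  return {
    url: TRANSCRIPTION_URL,
    // No Content-Type header: fetch sets the multipart boundary itself.
    init: { method: "POST", headers: { Authorization: `Bearer ${apiKey}` }, body },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`; with the default response format, the transcript is in the `text` field of the JSON response. Language values are ISO 639-1 codes such as `de` or `en`.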
Technical Details
- Audio Format: Recorded as WebM with the Opus codec, then converted to WAV for compatibility
- Sample Rate: 16 kHz for optimal Whisper API performance
- API: OpenAI Whisper API v1
- Framework: Vue 3 + Directus Extensions SDK
- File Size Limit: 25 MB per audio recording (the OpenAI API upload limit)
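The WAV conversion and the size limit listed above can be sketched as follows. This is a minimal example that assumes the recording has already been decoded to mono float samples at 16 kHz (in a browser, for instance, via `AudioContext.decodeAudioData`). The names `encodeWav` and `fitsUploadLimit` are hypothetical, and 16-bit PCM is an assumed sample format.

```typescript
const SAMPLE_RATE = 16000;                 // 16 kHz, as listed above
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024; // 25 MB limit, assuming binary megabytes

// Encode mono float samples in [-1, 1] as a 16-bit PCM WAV file.
function encodeWav(samples: Float32Array, sampleRate: number = SAMPLE_RATE): ArrayBuffer {
  const dataBytes = samples.length * 2;
  const buffer = new ArrayBuffer(44 + dataBytes);
  const view = new DataView(buffer);
  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) view.setUint8(offset + i, text.charCodeAt(i));
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataBytes, true);  // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true);             // fmt chunk size for PCM
  view.setUint16(20, 1, true);              // audio format 1 = PCM
  view.setUint16(22, 1, true);              // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * 2, true); // byte rate = rate * 1 channel * 2 bytes
  view.setUint16(32, 2, true);              // block align = 1 channel * 2 bytes
  view.setUint16(34, 16, true);             // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataBytes, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp out-of-range samples before scaling to the 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i] ?? 0));
    view.setInt16(44 + i * 2, Math.round(clamped * 32767), true);
  }
  return buffer;
}

// Check the encoded file against the upload limit before sending it.
function fitsUploadLimit(wav: ArrayBuffer): boolean {
  return wav.byteLength <= MAX_UPLOAD_BYTES;
}
```

At 16 kHz, 16-bit mono, the file grows by 32,000 bytes per second, so under these assumptions the 25 MB limit corresponds to roughly 13 minutes of audio.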
Development
# Development mode with file watching
npm run dev
# Build for production
npm run build
# Link to Directus for testing
npm run link
Requirements
- Directus 10.10.0+
- Valid OpenAI API key
- HTTPS connection for microphone access (in production)
- Modern browser with MediaRecorder API support
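The browser-side requirements above can be checked at runtime before the microphone button is enabled. The following is a rough sketch, not the extension's actual code: `RecordingEnv` and `missingRecordingRequirements` are hypothetical names, and the environment is passed in as a parameter so the check can also be exercised outside a browser.

```typescript
// Minimal view of the browser globals this check needs.
// In a page, the global `window` object can be passed in.
interface RecordingEnv {
  isSecureContext?: boolean;
  MediaRecorder?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

// Returns the unmet requirements; an empty list means recording should be possible.
function missingRecordingRequirements(env: RecordingEnv): string[] {
  const missing: string[] = [];
  // Browsers only expose microphone access in secure contexts (HTTPS or localhost).
  if (!env.isSecureContext) missing.push("secure context (HTTPS or localhost)");
  if (typeof env.MediaRecorder !== "function") missing.push("MediaRecorder API");
  if (typeof env.navigator?.mediaDevices?.getUserMedia !== "function") missing.push("getUserMedia");
  return missing;
}
```

A caller could disable the microphone button and show the returned list whenever it is not empty.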
Browser Compatibility
- Chrome/Chromium 47+
- Firefox 29+
- Safari 14+
- Edge 79+
License
MIT
Author
Jörg Weller (joerg.weller@flagbit.de)