Gladia

Gladia implementation for @micdrop/server.

This package provides a real-time speech-to-text (STT) implementation built on Gladia's streaming API.

Installation

npm install @micdrop/gladia

Usage

import { GladiaSTT } from '@micdrop/gladia'
import { MicdropServer } from '@micdrop/server'

const stt = new GladiaSTT({
  apiKey: process.env.GLADIA_API_KEY || '',
})

// Use with MicdropServer
new MicdropServer(socket, {
  stt,
  // ... other options
})

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| apiKey | string | Required | Your Gladia API key |
| settings | DeepPartial<GladiaLiveSessionPayload> | Optional | Advanced configuration for the Gladia live session |
| connectionTimeout | number | 5000 | Timeout in milliseconds for the WebSocket connection |
| transcriptionTimeout | number | 4000 | Timeout in milliseconds to wait for a transcription |
| retryDelay | number | 1000 | Delay in milliseconds between reconnection attempts |
| maxRetry | number | 3 | Maximum number of reconnection attempts before failing |
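
The usage example falls back to an empty string when GLADIA_API_KEY is unset, so a missing key would likely only surface once a connection is attempted. A minimal sketch of a fail-fast options builder follows. The GladiaOptionsSketch type and the buildGladiaOptions helper are hypothetical names introduced for illustration and are not exported by @micdrop/gladia; only the option names and default values are taken from the options table.

```typescript
// Hypothetical local type mirroring the documented options
// (an assumption for this sketch, not a type exported by @micdrop/gladia).
interface GladiaOptionsSketch {
  apiKey: string
  connectionTimeout: number
  transcriptionTimeout: number
  retryDelay: number
  maxRetry: number
}

// Default values copied from the options table.
const DEFAULTS = {
  connectionTimeout: 5000,
  transcriptionTimeout: 4000,
  retryDelay: 1000,
  maxRetry: 3,
}

// Build an options object, throwing early when the API key is missing.
// `env` is passed in explicitly so the helper has no dependency on Node globals.
function buildGladiaOptions(
  env: Record<string, string | undefined>,
  overrides: Partial<Omit<GladiaOptionsSketch, 'apiKey'>> = {}
): GladiaOptionsSketch {
  const apiKey = env['GLADIA_API_KEY']
  if (!apiKey) {
    throw new Error('GLADIA_API_KEY is not set')
  }
  return { apiKey, ...DEFAULTS, ...overrides }
}
```

The result could then be passed to the constructor, for example new GladiaSTT(buildGladiaOptions(process.env, { maxRetry: 5 })).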

Configuration Settings

The settings option allows you to customize various aspects of the transcription:

Language Configuration

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    language_config: {
      languages: ['en', 'fr', 'es'], // Specify target languages
      code_switching: true, // Enable automatic language switching
    },
  },
})

Pre-processing Options

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    pre_processing: {
      audio_enhancer: true, // Enhance audio quality
      speech_threshold: 0.7, // Adjust speech detection sensitivity (0.0-1.0)
    },
  },
})

Real-time Processing Features

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    realtime_processing: {
      words_accurate_timestamps: true, // Get word-level timestamps
      translation: true, // Enable translation
      translation_config: {
        target_languages: ['fr', 'es'], // Translate to French and Spanish
        model: 'enhanced', // Use enhanced translation model
      },
      named_entity_recognition: true, // Extract entities (names, places, etc.)
      sentiment_analysis: true, // Analyze sentiment
      custom_vocabulary: true, // Use custom vocabulary
      custom_vocabulary_config: {
        vocabulary: ['Micdrop', { value: 'API', pronunciations: ['A-P-I'] }],
        default_intensity: 1,
      },
    },
  },
})
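
The three groups above (language_config, pre_processing, realtime_processing) are all keys of the same settings object, so they can be combined in a single GladiaSTT instance. When the fragments are kept in separate modules, a small deep merge is one way to assemble them. The sketch below is an illustration only: PlainObject, isPlainObject, and mergeSettings are hypothetical helpers, not part of @micdrop/gladia, and how the package merges settings with its own defaults internally is not covered here.

```typescript
// Generic plain-object shape used only for this sketch.
type PlainObject = { [key: string]: unknown }

function isPlainObject(value: unknown): value is PlainObject {
  return typeof value === 'object' && value !== null && !Array.isArray(value)
}

// Recursively merge `source` into a shallow copy of `target`.
// Nested objects are merged key by key; arrays and primitives from
// `source` replace whatever `target` held under the same key.
function mergeSettings(target: PlainObject, source: PlainObject): PlainObject {
  const result: PlainObject = { ...target }
  for (const key of Object.keys(source)) {
    const existing = result[key]
    const incoming = source[key]
    result[key] =
      isPlainObject(existing) && isPlainObject(incoming)
        ? mergeSettings(existing, incoming)
        : incoming
  }
  return result
}

// Settings fragments using keys shown in the examples above.
const fragments: PlainObject[] = [
  { language_config: { languages: ['en', 'fr'], code_switching: true } },
  { pre_processing: { audio_enhancer: true } },
  { realtime_processing: { words_accurate_timestamps: true } },
  { realtime_processing: { sentiment_analysis: true } },
]

// Both realtime_processing fragments end up under one key.
const mergedSettings = fragments.reduce<PlainObject>(
  (acc, fragment) => mergeSettings(acc, fragment),
  {}
)
```

The merged object can then be supplied as the settings option. Because arrays are replaced rather than concatenated, a later fragment that sets languages overrides an earlier one instead of extending it.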

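Supported Languages

Gladia supports 90+ languages with automatic detection. Some of the most commonly used languages include:

| Code | Language | Code | Language | Code | Language |
| --- | --- | --- | --- | --- | --- |
| en | English | es | Spanish | fr | French |
| de | German | it | Italian | pt | Portuguese |
| ru | Russian | ja | Japanese | ko | Korean |
| zh | Chinese | ar | Arabic | hi | Hindi |
| nl | Dutch | pl | Polish | tr | Turkish |
| sv | Swedish | da | Danish | no | Norwegian |
| fi | Finnish | cs | Czech | hu | Hungarian |
| el | Greek | he | Hebrew | th | Thai |
| vi | Vietnamese | id | Indonesian | ms | Malay |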

See the types file for the complete list of supported language codes.
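
As a rough sketch of how the codes above might feed language_config, the helper below normalizes a list of codes, removes duplicates, and reports any code that is not in the short table on this page so it can be checked against the types file. COMMON_LANGUAGES and buildLanguageConfig are hypothetical names, and turning on code_switching only when more than one language is given is an assumption of this example, not documented behavior.

```typescript
// Subset of language codes listed in the table on this page.
const COMMON_LANGUAGES: Record<string, string> = {
  en: 'English', es: 'Spanish', fr: 'French',
  de: 'German', it: 'Italian', pt: 'Portuguese',
  ru: 'Russian', ja: 'Japanese', ko: 'Korean',
  zh: 'Chinese', ar: 'Arabic', hi: 'Hindi',
  nl: 'Dutch', pl: 'Polish', tr: 'Turkish',
  sv: 'Swedish', da: 'Danish', no: 'Norwegian',
  fi: 'Finnish', cs: 'Czech', hu: 'Hungarian',
  el: 'Greek', he: 'Hebrew', th: 'Thai',
  vi: 'Vietnamese', id: 'Indonesian', ms: 'Malay',
}

// Hypothetical helper: returns a language_config fragment plus the codes
// that are not in COMMON_LANGUAGES (they may still be valid Gladia codes).
function buildLanguageConfig(codes: string[]) {
  const languages = codes
    .map((code) => code.trim().toLowerCase())
    .filter((code, index, all) => all.indexOf(code) === index)
  const unlisted = languages.filter(
    (code) => !Object.prototype.hasOwnProperty.call(COMMON_LANGUAGES, code)
  )
  return {
    // Assumption: code switching is only useful with several languages.
    language_config: { languages, code_switching: languages.length > 1 },
    unlisted,
  }
}
```

The language_config part of the result can be placed under settings; the unlisted array is only a hint to double-check a code, not a rejection.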

Advanced Features

Custom Vocabulary

Improve transcription accuracy for domain-specific terms:

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    realtime_processing: {
      custom_vocabulary: true,
      custom_vocabulary_config: {
        vocabulary: [
          'Micdrop',
          'WebSocket',
          {
            value: 'OAuth',
            pronunciations: ['O-Auth', 'oh-auth'],
            intensity: 2,
            language: 'en',
          },
        ],
        default_intensity: 1,
      },
    },
  },
})
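
The vocabulary array mixes plain strings with objects that carry pronunciations. When terms come from a glossary, a small converter can produce that mixed shape. This is a sketch under assumptions: VocabularyEntrySketch is a hypothetical local type modelled on the entries shown above, and buildVocabulary is a helper invented for this example rather than anything exported by @micdrop/gladia.

```typescript
// Hypothetical local type modelled on the vocabulary entries shown above.
type VocabularyEntrySketch =
  | string
  | { value: string; pronunciations?: string[]; intensity?: number; language?: string }

// Convert a term -> pronunciations map into a vocabulary array.
// Terms without pronunciations become plain strings; the others become objects.
function buildVocabulary(terms: Record<string, string[]>): VocabularyEntrySketch[] {
  return Object.keys(terms).map((value) => {
    const pronunciations = terms[value] ?? []
    return pronunciations.length > 0 ? { value, pronunciations } : value
  })
}

const vocabulary = buildVocabulary({
  Micdrop: [],
  WebSocket: [],
  OAuth: ['O-Auth', 'oh-auth'],
})
```

The resulting array can be used as the vocabulary value of custom_vocabulary_config; per-term intensity and language would still need to be added by hand where required.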

Translation

Real-time translation during transcription:

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    realtime_processing: {
      translation: true,
      translation_config: {
        target_languages: ['fr', 'es', 'de'],
        model: 'enhanced', // 'base' or 'enhanced'
        match_original_utterances: true,
        context_adaptation: true,
        context: 'Technical software development discussion',
      },
    },
  },
})