Gladia

Gladia implementation for @micdrop/server.

This package provides a high-quality, real-time speech-to-text implementation built on Gladia's streaming API.

Installation

npm install @micdrop/gladia

Usage

import { GladiaSTT } from '@micdrop/gladia'
import { MicdropServer } from '@micdrop/server'

const stt = new GladiaSTT({
  apiKey: process.env.GLADIA_API_KEY || '',
})

// Use with MicdropServer
new MicdropServer(socket, {
  stt,
  // ... other options
})
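
The usage example falls back to an empty string when `GLADIA_API_KEY` is unset, which likely defers the failure until the first connection attempt. As a minimal sketch, a small helper can fail fast at startup instead. `requireEnv` is a hypothetical name for illustration and is not part of `@micdrop/gladia`.

```typescript
// Hypothetical helper, not part of @micdrop/gladia:
// returns the trimmed value of an environment variable, or throws if it is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string
): string {
  const value = (env[name] ?? '').trim()
  if (value === '') {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}

// Usage, assuming GladiaSTT is imported as in the usage example:
// const stt = new GladiaSTT({ apiKey: requireEnv(process.env, 'GLADIA_API_KEY') })
```

Passing the environment object in explicitly keeps the helper easy to test without touching the real process environment.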

Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| apiKey | string | Required | Your Gladia API key |
| settings | DeepPartial<GladiaLiveSessionPayload> | Optional | Advanced configuration for Gladia live session |

Configuration Settings

The settings option allows you to customize various aspects of the transcription:

Language Configuration

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    language_config: {
      languages: ['en', 'fr', 'es'], // Specify target languages
      code_switching: true, // Enable automatic language switching
    },
  },
})

Pre-processing Options

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    pre_processing: {
      audio_enhancer: true, // Enhance audio quality
      speech_threshold: 0.7, // Adjust speech detection sensitivity (0.0-1.0)
    },
  },
})
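
Because `speech_threshold` is documented as a value between 0.0 and 1.0, it can help to reject out-of-range values before they reach the API, for example when the threshold comes from user configuration. The following is a minimal sketch; `buildPreProcessing` is a hypothetical helper, not something the package exports.

```typescript
// Hypothetical helper, not part of @micdrop/gladia:
// builds a pre_processing block and rejects thresholds outside 0.0-1.0 (including NaN).
function buildPreProcessing(speechThreshold: number, audioEnhancer = false) {
  if (!(speechThreshold >= 0 && speechThreshold <= 1)) {
    throw new RangeError(
      `speech_threshold must be between 0.0 and 1.0, got ${speechThreshold}`
    )
  }
  return { audio_enhancer: audioEnhancer, speech_threshold: speechThreshold }
}

// Usage, assuming GladiaSTT is imported as in the usage example:
// const stt = new GladiaSTT({
//   apiKey: 'your-api-key',
//   settings: { pre_processing: buildPreProcessing(0.7, true) },
// })
```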

Real-time Processing Features

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    realtime_processing: {
      words_accurate_timestamps: true, // Get word-level timestamps
      translation: true, // Enable translation
      translation_config: {
        target_languages: ['fr', 'es'], // Translate to French and Spanish
        model: 'enhanced', // Use enhanced translation model
      },
      named_entity_recognition: true, // Extract entities (names, places, etc.)
      sentiment_analysis: true, // Analyze sentiment
      custom_vocabulary: true, // Use custom vocabulary
      custom_vocabulary_config: {
        vocabulary: ['Micdrop', { value: 'API', pronunciations: ['A-P-I'] }],
        default_intensity: 1,
      },
    },
  },
})

Supported Languages

Gladia supports 90+ languages with automatic detection. Some of the most commonly used languages include:

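| Code | Language | Code | Language | Code | Language |
| --- | --- | --- | --- | --- | --- |
| en | English | es | Spanish | fr | French |
| de | German | it | Italian | pt | Portuguese |
| ru | Russian | ja | Japanese | ko | Korean |
| zh | Chinese | ar | Arabic | hi | Hindi |
| nl | Dutch | pl | Polish | tr | Turkish |
| sv | Swedish | da | Danish | no | Norwegian |
| fi | Finnish | cs | Czech | hu | Hungarian |
| el | Greek | he | Hebrew | th | Thai |
| vi | Vietnamese | id | Indonesian | ms | Malay |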

See the types file for the complete list of supported language codes.
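
When language codes come from user input, a quick check against the commonly used codes can catch typos early. The sketch below hard-codes only the subset listed on this page, so a code it reports as unknown may still be supported by Gladia; the types file remains the authoritative list. `COMMON_LANGUAGES` and `partitionLanguageCodes` are hypothetical names, not package exports.

```typescript
// Subset copied from the commonly used languages listed on this page.
// The package's types file holds the complete list.
const COMMON_LANGUAGES: Record<string, string> = {
  en: 'English', es: 'Spanish', fr: 'French',
  de: 'German', it: 'Italian', pt: 'Portuguese',
  ru: 'Russian', ja: 'Japanese', ko: 'Korean',
  zh: 'Chinese', ar: 'Arabic', hi: 'Hindi',
  nl: 'Dutch', pl: 'Polish', tr: 'Turkish',
  sv: 'Swedish', da: 'Danish', no: 'Norwegian',
  fi: 'Finnish', cs: 'Czech', hu: 'Hungarian',
  el: 'Greek', he: 'Hebrew', th: 'Thai',
  vi: 'Vietnamese', id: 'Indonesian', ms: 'Malay',
}

// Normalizes each code (trim + lowercase) and splits the list into codes
// found in the subset above and codes that are not.
function partitionLanguageCodes(codes: string[]): { known: string[]; unknown: string[] } {
  const known: string[] = []
  const unknown: string[] = []
  for (const code of codes) {
    const normalized = code.trim().toLowerCase()
    if (Object.prototype.hasOwnProperty.call(COMMON_LANGUAGES, normalized)) {
      known.push(normalized)
    } else {
      unknown.push(normalized)
    }
  }
  return { known, unknown }
}

// Usage: warn about anything outside the subset before building language_config.
// const { known, unknown } = partitionLanguageCodes(['EN', 'fr', 'xx'])
```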

Advanced Features

Custom Vocabulary

Improve transcription accuracy for domain-specific terms:

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    realtime_processing: {
      custom_vocabulary: true,
      custom_vocabulary_config: {
        vocabulary: [
          'Micdrop',
          'WebSocket',
          {
            value: 'OAuth',
            pronunciations: ['O-Auth', 'oh-auth'],
            intensity: 2,
            language: 'en',
          },
        ],
        default_intensity: 1,
      },
    },
  },
})
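
If the vocabulary is assembled from several sources (product names, glossary files, user input), it may contain blanks and duplicates. The sketch below cleans a list and wraps it in the same `realtime_processing` fields used in the custom vocabulary example. The `VocabularyEntry` shape is inferred from the examples on this page rather than taken from the package's types, and `buildCustomVocabulary` is a hypothetical helper.

```typescript
// Entry shape inferred from the examples on this page; the real type lives in the package.
type VocabularyEntry =
  | string
  | { value: string; pronunciations?: string[]; intensity?: number; language?: string }

// Hypothetical helper, not part of @micdrop/gladia:
// drops blank terms and case-insensitive duplicates (the first occurrence wins),
// then returns the custom vocabulary fields for realtime_processing.
function buildCustomVocabulary(entries: VocabularyEntry[], defaultIntensity = 1) {
  const seen = new Set<string>()
  const vocabulary: VocabularyEntry[] = []
  for (const entry of entries) {
    const value = (typeof entry === 'string' ? entry : entry.value).trim()
    const key = value.toLowerCase()
    if (value === '' || seen.has(key)) continue
    seen.add(key)
    vocabulary.push(entry)
  }
  return {
    custom_vocabulary: true,
    custom_vocabulary_config: { vocabulary, default_intensity: defaultIntensity },
  }
}

// Usage, assuming GladiaSTT is imported as in the usage example:
// const stt = new GladiaSTT({
//   apiKey: 'your-api-key',
//   settings: { realtime_processing: buildCustomVocabulary(['Micdrop', 'WebSocket']) },
// })
```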

Translation

Real-time translation during transcription:

const stt = new GladiaSTT({
  apiKey: 'your-api-key',
  settings: {
    realtime_processing: {
      translation: true,
      translation_config: {
        target_languages: ['fr', 'es', 'de'],
        model: 'enhanced', // 'base' or 'enhanced'
        match_original_utterances: true,
        context_adaptation: true,
        context: 'Technical software development discussion',
      },
    },
  },
})

Error Handling

The package includes automatic reconnection and error handling:

  • Connection Loss: Automatically reconnects with exponential backoff
  • API Errors: Proper error propagation and logging
  • Audio Issues: Graceful handling of audio stream interruptions
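
The package reconnects on its own, so no extra code is needed for this. For readers unfamiliar with exponential backoff, the general technique can be sketched as follows. This illustrates the idea only and is not the package's actual implementation; `backoffDelayMs`, `retryWithBackoff`, and the 500 ms base and 30 s cap are assumptions chosen for the example.

```typescript
// Delay before retry number `attempt` (0-based): doubles each time, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

// Calls `connect` until it succeeds or maxAttempts is reached,
// sleeping for an exponentially growing delay between failed attempts.
async function retryWithBackoff<T>(
  connect: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect()
    } catch (error) {
      lastError = error
      // No point sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt))
      }
    }
  }
  throw lastError
}
```

With the defaults above, the waits between attempts are 500 ms, 1 s, 2 s, and 4 s. Production implementations often add random jitter so that many clients do not reconnect at the same moment.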