Agent (LLM)
The Agent class is the core abstraction for AI conversation handling in Micdrop. It provides a standardized interface for integrating various AI providers into real-time voice conversations.
Available Implementations
- OpenaiAgent from @micdrop/openai
- MistralAgent from @micdrop/mistral
- MockAgent for testing
Overview
The Agent class is an abstract base class that extends EventEmitter (from eventemitter3) and manages:
- Conversation history with role-based messages (system, user, assistant)
- Streaming response generation with both stream and promise interfaces
- Event emission for conversation state changes
- Cancellation and cleanup mechanisms
- Integration with logging systems
export abstract class Agent<
Options extends AgentOptions = AgentOptions,
> extends EventEmitter<AgentEvents> {
public logger?: Logger
public conversation: MicdropConversation
constructor(protected options: Options)
// Generate a streaming response with both stream and promise interfaces
answer(): Readable
// Generate answer implementation (to be implemented by subclasses)
protected abstract generateAnswer(stream: PassThrough): Promise<void>
// Cancel the current answer generation process
abstract cancel(): void
}
Architecture
Core Interfaces
export interface AgentOptions {
systemPrompt: string
// Enable auto-ending the call when the user asks to end it
// You can provide a custom prompt to use instead of the default one by passing a string
autoEndCall?: boolean | string
// Enable detection of an incomplete sentence, and skip the answer (assistant waits)
// You can provide a custom prompt to use instead of the default one by passing a string
autoSemanticTurn?: boolean | string
// Ignore the last user message when it's meaningless
// You can provide a custom prompt to use instead of the default one by passing a string
autoIgnoreUserNoise?: boolean | string
// Extract a value from the answer
// Value must be at the end of the answer, in JSON or between tags
extract?: ExtractJsonOptions | ExtractTagOptions
// Function called before any answer is generated
// Return true to skip generation
onBeforeAnswer?: (
this: Agent,
stream: Writable
) => void | boolean | Promise<boolean>
}
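As a sketch, a typical options value combining these behaviors might look like the following (only systemPrompt is required; the prompt strings and callback are illustrative):

```typescript
// Sketch of an AgentOptions value; only systemPrompt is required
const options = {
  systemPrompt: 'You are a friendly phone assistant.',
  // true enables the built-in prompt; a string replaces it
  autoEndCall: true,
  autoSemanticTurn: true,
  // A custom prompt used instead of the default noise-detection one
  autoIgnoreUserNoise: 'Ignore filler words and background noise.',
  // Return true here to skip generation for a turn
  onBeforeAnswer: () => false,
}
```

This object is then passed to the constructor of any Agent implementation.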
export interface AgentEvents {
Message: [MicdropConversationMessage]
CancelLastUserMessage: []
SkipAnswer: []
EndCall: []
ToolCall: [MicdropToolCall]
}
export interface MicdropConversationMessage {
role: 'system' | 'user' | 'assistant'
content: string
metadata?: any
}
export interface MicdropToolCall {
name: string
parameters: any
output: any
}
export interface ExtractOptions {
callback?: (value: string) => void
saveInMetadata?: boolean
}
export interface ExtractJsonOptions extends ExtractOptions {
json: true
callback?: (value: any) => void
}
export interface ExtractTagOptions extends ExtractOptions {
startTag: string
endTag: string
}
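For example, the two extraction modes could be configured like so (tag names and callbacks are illustrative):

```typescript
// Tag-based extraction: capture the value between custom tags
const tagExtract = {
  startTag: '<score>',
  endTag: '</score>',
  saveInMetadata: true, // also store the raw value in message metadata
  callback: (value: string) => console.log('Score:', value),
}

// JSON extraction: parse a trailing JSON object from the answer
const jsonExtract = {
  json: true as const,
  callback: (value: any) => console.log('Extracted:', value),
}
```

Either object can be passed as the extract option in AgentOptions.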
Conversation Management
The Agent maintains a conversation history as an array of messages:
public conversation: MicdropConversation
The conversation is initialized with the system prompt:
constructor(protected options: Options) {
super()
this.conversation = [{ role: 'system', content: options.systemPrompt }]
}
Abstract Methods
generateAnswer(stream: PassThrough): Promise<void>
The answer-generation method that subclasses must implement.
Implementation Requirements:
- Write response chunks to the provided stream
- Handle cancellation via the cancel() method
- Add the complete response to the conversation history with addAssistantMessage() when finished
- Check the cancellation state regularly during generation
cancel(): void
Cancels the current answer generation process. It should:
- Abort any ongoing API requests
- Clean up resources
- Stop the answer stream
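One common way to meet these requirements is to create an AbortController per answer and pass its signal to the provider request. A self-contained sketch (not the library's actual implementation; a timer stands in for the API call):

```typescript
import { PassThrough } from 'stream'

class AbortableAnswer {
  private controller?: AbortController
  answering = false

  async generate(stream: PassThrough): Promise<void> {
    this.answering = true
    this.controller = new AbortController()
    const { signal } = this.controller
    try {
      // A real implementation passes `signal` to its API request,
      // e.g. fetch(url, { signal }). Here a timer stands in for it.
      await new Promise<void>((resolve, reject) => {
        const timer = setTimeout(resolve, 50)
        signal.addEventListener('abort', () => {
          clearTimeout(timer)
          reject(new Error('cancelled'))
        })
      })
      stream.write('full answer')
    } catch {
      // Aborted: write nothing more; the stream simply ends early
    } finally {
      this.answering = false
      stream.end()
    }
  }

  cancel(): void {
    if (!this.answering) return
    this.controller?.abort() // rejects the pending request immediately
  }
}
```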
Public Methods
answer(): Readable
Generates a streaming response based on the current conversation history.
Return Value: A Readable stream that emits text chunks in real time
Behavior:
- Calls the onBeforeAnswer hook if provided
- If the hook returns true, skips answer generation
- Otherwise calls the generateAnswer() implementation
- Handles cancellation and cleanup automatically
addMessage(role: 'user' | 'assistant' | 'system', text: string, metadata?: MicdropAnswerMetadata)
Adds a message with the given role and content to the conversation and emits a Message event.
agent.addMessage('assistant', 'Hello there!')
addUserMessage(text: string, metadata?: MicdropAnswerMetadata)
Adds a user message to the conversation history and emits a Message event.
agent.addUserMessage('Hello, how are you?')
addAssistantMessage(text: string, metadata?: MicdropAnswerMetadata)
Adds an assistant message to the conversation history and emits a Message event.
agent.addAssistantMessage("I'm doing well, thank you!")
addTool<Schema extends z.ZodObject>(tool: Tool<Schema>)
Adds a custom tool to the agent's available tools.
import { z } from 'zod'
const weatherTool = {
name: 'get_weather',
description: 'Get current weather for a location',
parameters: z.object({
location: z.string().describe('The city name'),
}),
execute: async ({ location }) => {
// Fetch weather data
return { temperature: 22, conditions: 'sunny' }
},
}
agent.addTool(weatherTool)
removeTool(name: string)
Removes a tool from the agent's available tools by name.
agent.removeTool('get_weather')
getTool(name: string): Tool | undefined
Retrieves a tool by name from the agent's available tools.
const tool = agent.getTool('get_weather')
if (tool) {
console.log('Weather tool is available')
}
addToolMessage(message: MicdropConversationToolCall | MicdropConversationToolResult)
Adds a tool call or tool result message to the conversation.
// Add tool call
agent.addToolMessage({
role: 'tool_call',
toolCallId: 'call_123',
toolName: 'get_weather',
parameters: JSON.stringify({ location: 'Paris' }),
})
// Add tool result
agent.addToolMessage({
role: 'tool_result',
toolCallId: 'call_123',
toolName: 'get_weather',
output: JSON.stringify({ temperature: 22, conditions: 'sunny' }),
})
extract(message: string): { message: string; metadata?: MicdropAnswerMetadata }
Extracts structured data from a message based on the configured extraction options. Returns the cleaned message and any extracted metadata.
// With JSON extraction
const result = agent.extract('The answer is 42. {"score":42}')
// Returns: { message: 'The answer is 42.', metadata: { extracted: { score: 42 } } }
// With tag extraction
const result = agent.extract('The answer is 42. <score>42</score>')
// Returns: { message: 'The answer is 42.', metadata: { extracted: '42' } }
It is called automatically when the extract option is provided. It is public so that you can use it to extract data from an answer before adding it to the conversation, if you really need to.
destroy()
Cleans up the agent instance:
- Removes all event listeners
- Calls cancel() to stop ongoing operations
- Logs destruction
agent.destroy()
Protected Methods
endCall()
Emits the EndCall event to signal that the conversation should end.
cancelLastUserMessage()
Removes the last user message from the conversation history (if it exists) and emits the CancelLastUserMessage event.
skipAnswer()
Emits the SkipAnswer event to indicate the agent is skipping the current response.
getDefaultTools(): Tool[]
Returns the default tools based on the agent's configuration options (autoEndCall, autoSemanticTurn, autoIgnoreUserNoise).
executeTool(toolCall: MicdropConversationToolCall): Promise<{ output: any; skipAnswer?: boolean }>
Executes a tool call and handles the result. This method:
- Finds the tool by name
- Adds the tool call to conversation history
- Executes the tool with the provided parameters
- Adds the tool result to conversation history
- Emits a ToolCall event if the tool has emitOutput: true
- Returns the output and whether to skip the answer
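The steps above can be sketched in isolation as follows (the Tool and ToolCall shapes here are simplified stand-ins, not the library's exact types):

```typescript
// Illustrative stand-ins for the library's tool types
type ToolCall = { toolCallId: string; toolName: string; parameters: string }
type Tool = {
  name: string
  execute: (params: any) => Promise<any>
  emitOutput?: boolean
}

async function executeToolSketch(
  tools: Tool[],
  call: ToolCall
): Promise<{ output: any; skipAnswer?: boolean }> {
  // 1. Find the tool by name
  const tool = tools.find((t) => t.name === call.toolName)
  if (!tool) throw new Error(`Unknown tool: ${call.toolName}`)
  // 2. A real Agent records the tool call in the conversation here
  // 3. Execute the tool with the parsed parameters
  const output = await tool.execute(JSON.parse(call.parameters))
  // 4. ...then records the tool result in the conversation and emits
  //    a ToolCall event when emitOutput is true
  // 5. Return the output (and optionally whether to skip the answer)
  return { output }
}
```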
getExtractOptions(): ExtractTagOptions | undefined
Returns the extraction options configured for the agent, converting JSON extraction to tag-based extraction.
log(...message: any[])
Logs messages using the attached logger if available.
agent.log('Processing user message:', userInput)
Protected Properties
tools: Tool[]
Array of available tools that the agent can use.
answerCount: number
Counter used for cancellation handling, incremented each time answer() is called.
answering: boolean
Flag indicating whether the agent is currently generating an answer.
options: Options
The configuration options passed to the agent constructor.
Events
The Agent emits the following events:
Message
Emitted when a new message is added to the conversation.
agent.on('Message', (message: MicdropConversationMessage) => {
console.log(`New ${message.role} message: ${message.content}`)
})
CancelLastUserMessage
Emitted when the last user message is cancelled.
agent.on('CancelLastUserMessage', () => {
console.log('Last user message was cancelled')
})
SkipAnswer
Emitted when the agent decides to skip answering.
agent.on('SkipAnswer', () => {
console.log('Agent skipped answering')
})
EndCall
Emitted when the agent determines the call should end.
agent.on('EndCall', () => {
console.log('Call should end')
})
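AgentEvents also declares a ToolCall event, emitted when a tool configured with emitOutput: true produces output. A handler follows the same pattern as the others (sketched here with Node's EventEmitter standing in for an Agent instance, and an inline MicdropToolCall type, so the snippet is self-contained):

```typescript
import { EventEmitter } from 'events'

// Inline stand-in for the MicdropToolCall interface shown earlier
type MicdropToolCall = { name: string; parameters: any; output: any }

// Stand-in for an Agent instance (Agent extends EventEmitter)
const agent = new EventEmitter()

agent.on('ToolCall', (toolCall: MicdropToolCall) => {
  console.log(`Tool ${toolCall.name} returned:`, toolCall.output)
})

// Simulate a tool execution emitting the event
agent.emit('ToolCall', {
  name: 'get_weather',
  parameters: { location: 'Paris' },
  output: { temperature: 22, conditions: 'sunny' },
})
```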
Debug Logging
Enable detailed logging for development:
import { Logger } from '@micdrop/server'
agent.logger = new Logger('CustomAgent')
Custom Agent Implementation
Creating a Custom Agent Implementation
import { Agent, AgentOptions } from '@micdrop/server'
import { PassThrough, Writable } from 'stream'
interface CustomAgentOptions extends AgentOptions {
apiKey: string
model?: string
settings?: any // Provider-specific settings
}
const DEFAULT_MODEL = 'gpt-4'
class CustomAgent extends Agent<CustomAgentOptions> {
constructor(options: CustomAgentOptions) {
super(options)
}
protected async generateAnswer(stream: PassThrough): Promise<void> {
const answerCount = this.answerCount
try {
// Build messages for your AI provider
const messages = this.conversation.map((msg) => ({
role: msg.role,
content: msg.content,
}))
// Make API call to your AI provider
const response = await fetch('https://api.your-provider.com/chat', {
method: 'POST',
headers: {
Authorization: `Bearer ${this.options.apiKey}`,
'Content-Type': 'application/json',
},
body: JSON.stringify({
messages,
model: this.options.model || DEFAULT_MODEL,
stream: true,
...this.options.settings,
}),
})
if (!response.ok) {
throw new Error(`API request failed: ${response.statusText}`)
}
const reader = response.body?.getReader()
let fullResponse = ''
while (reader) {
// Check if this answer was cancelled
if (answerCount !== this.answerCount) return
const { done, value } = await reader.read()
if (done) break
// Parse streaming response (format depends on your provider)
const chunk = new TextDecoder().decode(value)
if (chunk) {
this.log(`Answer chunk: "${chunk}"`)
stream.write(chunk)
fullResponse += chunk
}
}
// Only add to conversation if this answer wasn't cancelled
if (answerCount === this.answerCount) {
this.addAssistantMessage(fullResponse)
}
} catch (error: any) {
console.error('[CustomAgent] Error:', error)
}
}
cancel(): void {
if (!this.answering) return
this.log('Cancel')
// Increment answer count to cancel current generation
this.answerCount++
}
}
Using Custom Agent with MicdropServer
// Create custom agent
const agent = new CustomAgent({
apiKey: process.env.YOUR_API_KEY || '',
systemPrompt: 'You are a helpful assistant.',
model: 'your-model-name',
})
// Add logging
agent.logger = new Logger('CustomAgent')
// Setup event handlers
agent.on('Message', (message) => {
console.log(`${message.role}: ${message.content}`)
})
// Create server with custom agent
new MicdropServer(socket, {
agent,
// ...other options
})
Simple Echo Agent Example
import { Agent, MicdropConversationMessage } from '@micdrop/server'
import { PassThrough } from 'stream'
class EchoAgent extends Agent {
protected async generateAnswer(stream: PassThrough): Promise<void> {
// Get the last user message
const lastUserMessage = this.conversation
.filter((msg) => msg.role === 'user')
.pop() as MicdropConversationMessage
if (lastUserMessage) {
// Simulate a delay before responding
await new Promise((resolve) => setTimeout(resolve, 100))
const response = `You said: "${lastUserMessage.content}"`
// Stream the response
stream.write(response)
// Add to conversation
this.addAssistantMessage(response)
}
}
cancel(): void {
// Nothing to cancel for this simple implementation
}
}
// Usage
const echoAgent = new EchoAgent({
systemPrompt: 'You are an echo agent that repeats what users say.',
})
echoAgent.addUserMessage('Hello!')
const stream = echoAgent.answer()
stream.on('data', (chunk) => {
console.log('Chunk:', chunk.toString())
})
stream.on('end', async () => {
console.log('Stream ended')
echoAgent.destroy()
})