Architecture Reference
Pipeline overview
[Hotkey] → trigger.py ──socket──► daemon (python -m mirach)
                                    │
                                    ├─ AudioRecorder (sounddevice, callback-based)
                                    ├─ WhisperTranscriber (faster-whisper, CUDA/CPU)
                                    ├─ PiperSpeaker (piper-tts, streaming)
                                    ├─ OpenCodeBackend (subprocess: opencode run)
                                    ├─ ConversationLog (markdown per session)
                                    ├─ ConversationHTML (styled HTML viewer)
                                    ├─ ObsidianCache (persistent memory)
                                    ├─ UserScripts (custom voice commands)
                                    ├─ Notifier (notify-send + beeps)
                                    └─ SocketServer (Unix domain socket)
State machine
The assistant operates a three-state FSM:
IDLE ──[toggle]──► RECORDING ──[toggle]──► PROCESSING ──[done]──► IDLE
                       ▲                      │
                       │       [toggle]       │
                       └──────────────────────┘
                       (interrupt + re-record)
| State | Description |
|---|---|
| IDLE | Waiting for hotkey press. No audio capture. |
| RECORDING | Microphone is open; audio frames are collected in a buffer. |
| PROCESSING | Pipeline running in a background thread: stop recording → transcribe → LLM → speak. |
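
The transitions in the diagram can be written down as a lookup table. The following is a minimal sketch and not the code in assistant.py: the names State, TRANSITIONS and next_state are hypothetical, and ignoring events that have no listed transition is an assumption.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECORDING = auto()
    PROCESSING = auto()


# (current state, event) -> next state, following the diagram.
TRANSITIONS = {
    (State.IDLE, "toggle"): State.RECORDING,
    (State.RECORDING, "toggle"): State.PROCESSING,
    (State.PROCESSING, "toggle"): State.RECORDING,  # interrupt + re-record
    (State.PROCESSING, "done"): State.IDLE,
}


def next_state(state: State, event: str) -> State:
    """Return the state after `event`.

    Assumption: an event with no listed transition leaves the state unchanged.
    """
    return TRANSITIONS.get((state, event), state)
```

Keeping the table separate from the side effects (opening the microphone, starting the pipeline thread) makes the interrupt path from PROCESSING back to RECORDING easy to check in isolation.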
Source layout
mirach/
├── __main__.py # Entry point: `python -m mirach`
├── __init__.py # Package metadata, version
├── assistant.py # Orchestrator + FSM + shutdown hooks + user scripts
├── audio.py # Thread-safe microphone capture (sounddevice)
├── stt.py # WhisperTranscriber with warmup + downsampling
├── tts.py # PiperSpeaker with streaming + pre-baked fillers
├── llm.py # OpenCodeBackend with progressive feedback
├── ipc.py # Unix socket server (toggle/ping protocol)
├── notify.py # Desktop notifications + beep WAV generation
├── conversation.py # Markdown conversation files + latest.md symlink
├── conversation_html.py # HTML conversation viewer (dark theme, chat layout)
├── obsidian_cache.py # In-memory cache for Obsidian vault files
├── config.py # All MIRACH_* env vars with defaults
├── i18n.py # Localization strings + filler phrases
└── logging_setup.py # Rotating file logger + stdout for journalctl
trigger.py # Standalone hotkey client (sends "toggle" to socket)
run_daemon.sh # Launcher with CUDA 12 library path setup
pyproject.toml # Package metadata, dependencies, tool config
install.py # Interactive setup wizard
install.sh # Non-interactive shell installer
system_prompt.example.md # Template for personalized system prompt
mirach.service.example # Template for systemd user service
user_scripts/ # User-defined voice-triggered scripts (gitignored content)
skills/ # OpenCode skills (installed to ~/.config/opencode/skills/)
IPC protocol
The Unix socket accepts two messages:
| Message | Response | Description |
|---|---|---|
| toggle | (none) | Triggers the FSM state transition |
| ping | pong | Health check for the daemon |
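
A client for this protocol can be very small. The sketch below is illustrative and is not the contents of trigger.py: the function name send_message is hypothetical, the socket path must be supplied by the caller because this section does not state it, and the wire format (UTF-8 text without a terminator) is an assumption.

```python
import socket
from typing import Optional


def send_message(
    socket_path: str,
    message: str,
    expect_reply: bool,
    timeout: float = 2.0,
) -> Optional[str]:
    """Send one message to the daemon's Unix socket.

    Returns the reply text when `expect_reply` is true, otherwise None.
    Assumption: messages are plain UTF-8 text and replies fit in 64 bytes.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        client.connect(socket_path)
        client.sendall(message.encode("utf-8"))
        if not expect_reply:
            return None  # "toggle" has no response
        return client.recv(64).decode("utf-8").strip()
```

With a running daemon, a health check would be `send_message(path, "ping", expect_reply=True)`, where `path` is the daemon's socket path; a hotkey press would send "toggle" with `expect_reply=False`.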
Session persistence
The OpenCode session ID is persisted to ~/.cache/mirach/session_id. On daemon startup:
- If the file exists, the session ID and its last modification time are loaded
- If the time since the last interaction exceeds SESSION_IDLE_TIMEOUT, a new session is created
- On a new session, the system prompt and Obsidian context are injected into the first query
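
The startup check above can be sketched as follows. This is an illustration under stated assumptions, not the daemon's code: load_or_create_session is a hypothetical name, the timeout value is a placeholder, and the UUID stands in for however OpenCode actually assigns session IDs.

```python
import time
import uuid
from pathlib import Path
from typing import Optional, Tuple

SESSION_FILE = Path.home() / ".cache" / "mirach" / "session_id"
SESSION_IDLE_TIMEOUT = 1800.0  # seconds; placeholder, not the project's default


def load_or_create_session(
    path: Path = SESSION_FILE,
    idle_timeout: float = SESSION_IDLE_TIMEOUT,
    now: Optional[float] = None,
) -> Tuple[str, bool]:
    """Return (session_id, is_new).

    When is_new is true, the caller should inject the system prompt and
    Obsidian context into the first query.
    """
    now = time.time() if now is None else now
    if path.exists():
        session_id = path.read_text().strip()
        # The file's modification time marks the last interaction.
        idle_for = now - path.stat().st_mtime
        if session_id and idle_for <= idle_timeout:
            return session_id, False
    # Missing, empty, or stale file: start a fresh session.
    session_id = uuid.uuid4().hex  # hypothetical stand-in for a real session ID
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(session_id)
    return session_id, True
```

Using the file's modification time as the "last interaction" timestamp means the file has to be rewritten or touched after each query for the idle check to stay accurate.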
User script format
Scripts in user_scripts/ are parsed for metadata comments:
#!/bin/bash
# triggers: phrase one, phrase two
# response: Spoken confirmation.
# description: Optional description.
# Script body runs here
- triggers: comma-separated phrases, matched as case-insensitive substrings
- response: text spoken after script execution
- description: for human reference only, not used by the daemon
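
Parsing this header and matching a transcript against it might look like the sketch below. The function names parse_metadata and matches and the regular expression are hypothetical; only the three metadata keys and the case-insensitive substring rule come from the format described above.

```python
import re
from typing import Dict, List

# Matches lines such as "# triggers: phrase one, phrase two".
META_LINE = re.compile(r"^#\s*(triggers|response|description):\s*(.*)$")


def parse_metadata(script_text: str) -> Dict[str, object]:
    """Extract triggers, response and description from a script's comments."""
    meta: Dict[str, object] = {"triggers": [], "response": "", "description": ""}
    for line in script_text.splitlines():
        match = META_LINE.match(line.strip())
        if not match:
            continue  # shebang, ordinary comments, and the script body
        key, value = match.group(1), match.group(2).strip()
        if key == "triggers":
            # Lower-case once here so matching can stay a plain substring test.
            meta["triggers"] = [t.strip().lower() for t in value.split(",") if t.strip()]
        else:
            meta[key] = value
    return meta


def matches(transcript: str, triggers: List[str]) -> bool:
    """True if any trigger occurs in the transcript, ignoring case."""
    text = transcript.lower()
    return any(trigger in text for trigger in triggers)
```

Because matching is by substring, a short trigger such as "light" would also fire on "lightning", so longer and more specific phrases are safer.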
Beep frequencies
| Beep | Frequency | Duration | Purpose |
|---|---|---|---|
| Start | 1320 Hz | 60 ms | Signals recording has started |
| Process | 660 Hz | 80 ms | Signals transcription has begun |
| Shutdown | 660 Hz → 330 Hz | 120 ms each, 40 ms gap | Signals daemon is stopping |
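
Tones like these can be generated with the standard library alone. The sketch below is not the code in notify.py: the helper names, the 44.1 kHz sample rate, the 16-bit mono format and the volume are all assumptions; only the frequencies and durations come from the table.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # Hz; assumed, the table does not state a sample rate


def tone_frames(frequency_hz: float, duration_ms: int, volume: float = 0.5) -> bytes:
    """Return 16-bit mono PCM frames for a sine tone."""
    n_samples = int(SAMPLE_RATE * duration_ms / 1000)
    frames = bytearray()
    for i in range(n_samples):
        sample = volume * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE)
        frames += struct.pack("<h", int(sample * 32767))
    return bytes(frames)


def silence_frames(duration_ms: int) -> bytes:
    """Return 16-bit mono PCM silence of the given length."""
    return b"\x00\x00" * int(SAMPLE_RATE * duration_ms / 1000)


def wav_bytes(frames: bytes) -> bytes:
    """Wrap raw PCM frames in a WAV container held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(frames)
    return buffer.getvalue()


# Start beep: 1320 Hz for 60 ms.
start_beep = wav_bytes(tone_frames(1320, 60))
# Shutdown beep: 660 Hz then 330 Hz, 120 ms each, with a 40 ms gap.
shutdown_beep = wav_bytes(
    tone_frames(660, 120) + silence_frames(40) + tone_frames(330, 120)
)
```

A production version would probably add a short fade in and fade out to each tone, since an abrupt start or stop of a sine wave produces an audible click.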
Progressive LLM feedback timeline
| Time | Action |
|---|---|
| 0s | Process beep, OpenCode subprocess launched |
| 0-10s | Normal filler phrases every FILLER_DELAY seconds |
| 10s | Desktop notification + filler |
| 30s | "Still working" spoken message |
| 60s | "Complex query" message + desktop notification |
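
One way to drive the fixed milestones in this timeline is to poll the elapsed time while the OpenCode subprocess runs and fire each milestone once, when its threshold is crossed. The sketch below is hypothetical (MILESTONES and due_milestones are illustrative names, not the ones in llm.py) and leaves out the recurring filler phrases governed by FILLER_DELAY.

```python
from typing import List, Tuple

# (seconds since launch, action), taken from the timeline table.
MILESTONES: List[Tuple[float, str]] = [
    (10.0, "desktop notification + filler"),
    (30.0, '"still working" spoken message'),
    (60.0, '"complex query" message + desktop notification'),
]


def due_milestones(previous_elapsed: float, elapsed: float) -> List[str]:
    """Return the actions whose threshold was crossed since the previous poll."""
    return [
        action
        for at, action in MILESTONES
        if previous_elapsed < at <= elapsed
    ]
```

Comparing against the previous poll time, rather than only the current one, guarantees that each milestone fires exactly once even when polls are irregular.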