Mobile App

The DotAgents mobile app puts AI agents in your pocket — chat by voice or text with full hands-free support, on iOS, Android, and the web.


Overview

Built with Expo SDK 54 and React Native, the mobile app provides a portable interface to your DotAgents agents. It connects to your desktop instance's remote server or any OpenAI-compatible API endpoint.

Key Capabilities

  • Voice input with press-and-hold or hands-free VAD mode
  • Text chat with streaming responses
  • Text-to-speech for assistant replies
  • Agent profile management
  • Knowledge note editing
  • Loop (recurring task) scheduling
  • Operator dashboard for health checks, logs, tunnels, integrations, updater, and emergency actions
  • Session history with search
  • QR code connection setup
  • Split chat view for multi-agent conversations

Supported Platforms

| Platform | Status | Notes |
| --- | --- | --- |
| iOS | Full support | Requires development build (not Expo Go) |
| Android | Full support | Requires development build (not Expo Go) |
| Web | Supported | Speech recognition requires Chrome/Edge over HTTPS |

Important: The app uses expo-speech-recognition, a native module not available in Expo Go. You must use a development build for native devices.

Screens

Chat Screen

The primary interface for conversations with your agent:

  • Text input — Type messages and receive streaming responses
  • Voice input — Press-and-hold the mic button for real-time transcription
  • TTS playback — Assistant replies can be spoken aloud via expo-speech
  • Hands-free mode — Toggle VAD (Voice Activity Detection) for fully hands-free interaction
  • Edit mode — Release the mic while in edit state to place the transcript in the input box for review before sending

Settings Screen

Configure your connection and preferences:

| Setting | Description |
| --- | --- |
| API Key | Your API key (Bearer token) |
| Base URL | API endpoint URL |
| Model | Model identifier (e.g., gpt-5.4-mini) |
| Environment | Toggle between Local and Cloud |
| Run API URL | Endpoint for chat completions |
| Manage API URL | Endpoint for agent management |
| Voice Preferences | TTS voice, auto-play, language |

Session List Screen

Browse and search your conversation history:

  • Scrollable list of past sessions
  • Search by content or title
  • Tap to continue a previous conversation
  • Delete sessions you no longer need

Connection Settings Screen

Quick setup via QR code:

  • Scan a QR code from your desktop app to auto-configure connection settings
  • Manually enter connection details as an alternative
  • Test connectivity with a health check
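The QR flow boils down to decoding a scanned payload into connection settings. A minimal sketch of such a parser is below; the payload schema (a JSON object with baseUrl, apiKey, and an optional model) is an assumption for illustration, not the desktop app's documented format.

```typescript
// Hypothetical QR payload parser for connection setup.
// The field names below are assumptions; adjust them to match
// whatever the desktop app actually encodes in its QR code.

export interface ConnectionConfig {
  baseUrl: string;
  apiKey: string;
  model?: string;
}

// Returns null when the scanned string is not a usable payload,
// so the UI can fall back to manual entry.
export function parseConnectionQr(data: string): ConnectionConfig | null {
  try {
    const parsed = JSON.parse(data);
    if (typeof parsed.baseUrl !== "string" || typeof parsed.apiKey !== "string") {
      return null;
    }
    return {
      baseUrl: parsed.baseUrl.replace(/\/+$/, ""), // strip trailing slashes
      apiKey: parsed.apiKey,
      model: typeof parsed.model === "string" ? parsed.model : undefined,
    };
  } catch {
    return null; // not JSON — likely an unrelated QR code
  }
}
```

Returning null instead of throwing keeps the scanner loop simple: unrecognized codes are ignored and the user can keep scanning or switch to manual entry.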

Agent Edit Screen

Create and configure agent profiles directly on mobile:

  • Set name, description, and system prompt
  • Configure guidelines and properties
  • Select available tools

Knowledge Note Edit Screen

Manage durable knowledge notes from your phone:

  • View existing knowledge notes
  • Create new notes
  • Edit note context, summary, body, tags, and references
  • Delete outdated notes

Loop Edit Screen

Schedule recurring tasks:

  • Define the prompt to execute
  • Set the interval (e.g., every 5 minutes, every hour)
  • Choose which agent handles the loop
  • Start, pause, and monitor loops
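A loop schedule ultimately reduces to a repeat interval in milliseconds. The sketch below parses phrases like the examples above ("every 5 minutes", "every hour") into an interval; whether the app accepts free-text intervals or uses a structured picker is an assumption made here for illustration.

```typescript
// Illustrative interval parser for loop schedules.
// Accepted phrases mirror the examples in the docs; the real app
// may represent intervals differently.

const UNIT_MS: Record<string, number> = {
  second: 1_000,
  minute: 60_000,
  hour: 3_600_000,
  day: 86_400_000,
};

// "every 5 minutes" -> 300000; returns null when unrecognized.
export function parseInterval(text: string): number | null {
  const m = /^every\s+(?:(\d+)\s+)?(second|minute|hour|day)s?$/i.exec(text.trim());
  if (!m) return null;
  const count = m[1] ? parseInt(m[1], 10) : 1;
  return count * UNIT_MS[m[2].toLowerCase()];
}
```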

Operations Screen

Trusted operator dashboard backed by the desktop remote server:

  • View runtime status, health checks, recent errors, and audit events
  • Inspect Cloudflare Tunnel, Discord, WhatsApp, push, updater, and MCP status
  • Start or stop tunnel exposure
  • Connect or disconnect Discord and WhatsApp integrations
  • Trigger updater checks, reveal downloads, restart MCP services, restart the remote server, restart the app, or run an agent

Split Chat Screen

Multi-agent conversation view:

  • See responses from multiple agents side by side
  • Compare agent outputs for the same query
  • Switch between agents within a single interface

Voice UX

Press-and-Hold Mode

  1. Press and hold the mic button
  2. See live transcription as you speak
  3. Release to send — the transcript is immediately sent to the agent
  4. Or release in edit mode — the transcript goes into the text box for review

Hands-Free Mode (VAD)

  1. Toggle hands-free from the chat screen header (microphone icon)
  2. The app listens continuously using Voice Activity Detection
  3. When speech is detected, it automatically transcribes
  4. On silence (end of speech segment), it automatically sends
  5. Great for driving, cooking, or any hands-busy scenario
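The auto-send step above hinges on detecting end-of-speech silence. Here is a minimal, testable sketch of that logic; timestamps are passed in explicitly, whereas the real app would drive this from expo-speech-recognition events and a timer. The 1.5 s silence threshold is an illustrative choice, not the app's actual setting.

```typescript
// Sketch of silence-based segmentation for hands-free mode:
// interim results refresh the "last heard speech" timestamp, and a
// periodic poll emits the transcript once silence lasts long enough.

export class SpeechSegmenter {
  private transcript = "";
  private lastSpeechAt: number | null = null;

  constructor(private silenceMs = 1500) {}

  // Called on each interim transcription result.
  onResult(transcript: string, now: number): void {
    this.transcript = transcript;
    this.lastSpeechAt = now;
  }

  // Called periodically; returns the finished segment once silence
  // has lasted long enough, otherwise null.
  poll(now: number): string | null {
    if (this.lastSpeechAt === null || this.transcript === "") return null;
    if (now - this.lastSpeechAt >= this.silenceMs) {
      const segment = this.transcript;
      this.transcript = "";
      this.lastSpeechAt = null;
      return segment; // caller sends this to the agent
    }
    return null;
  }
}
```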

Text-to-Speech

  • Assistant replies can be read aloud via expo-speech
  • Configure voice and auto-play preferences in Settings
  • Works across all platforms (iOS, Android, Web)

Architecture

```
┌─────────────────────────────────┐
│        Mobile App (Expo)        │
│                                 │
│  ┌───────────┐  ┌────────────┐  │
│  │  Screens  │  │ Navigation │  │
│  │  (React   │  │  (React    │  │
│  │  Native)  │  │ Navigation)│  │
│  └─────┬─────┘  └────────────┘  │
│        │                        │
│  ┌─────▼─────────────────────┐  │
│  │   Store (AsyncStorage)    │  │
│  │   config, sessions,       │  │
│  │   profiles, message queue │  │
│  └─────┬─────────────────────┘  │
│        │                        │
│  ┌─────▼─────────────────────┐  │
│  │        API Client         │  │
│  │   (OpenAI-compatible)     │  │
│  │   SSE streaming support   │  │
│  └─────┬─────────────────────┘  │
└────────┼────────────────────────┘
         │
         ▼
   ┌─────────────────────┐
   │   Desktop Remote    │
   │  Server (Fastify)   │
   │         OR          │
   │    Any OpenAI-      │
   │   Compatible API    │
   └─────────────────────┘
```

Key Libraries

| Library | Purpose |
| --- | --- |
| expo-speech-recognition | Native speech recognition |
| expo-speech | Text-to-speech |
| expo-camera | QR code scanning for connection setup |
| @react-native-async-storage/async-storage | Persistent config storage |
| react-native-sse | Server-sent events for streaming |
| @react-navigation/native-stack | Screen navigation |
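On the streaming path, each server-sent event carries a JSON chunk in the OpenAI chat-completions format, and the client appends the text delta it contains. The helper below shows that extraction step in isolation; it assumes the standard choices[0].delta.content chunk shape, which any OpenAI-compatible backend should emit.

```typescript
// Sketch of extracting text deltas from OpenAI-compatible streaming
// events. react-native-sse hands the client each event's data payload
// as a string; this helper turns one payload into the text to append,
// or null for the terminal "[DONE]" sentinel and unparseable chunks.

export function extractDelta(payload: string): string | null {
  if (payload.trim() === "[DONE]") return null;
  try {
    const chunk = JSON.parse(payload);
    // Chat-completions chunks carry text under choices[0].delta.content.
    return chunk?.choices?.[0]?.delta?.content ?? null;
  } catch {
    return null; // skip malformed or keep-alive payloads
  }
}
```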

State Management

| Store | Purpose |
| --- | --- |
| config.ts | Persistent settings (API key, URLs, voice prefs) |
| sessions.ts | Local session history |
| connectionManager.ts | Connection pooling and recovery |
| tunnelConnection.ts | Tunnel status and reconnect state |
| profile.ts | Agent profile state |
| message-queue.ts | Queued messages for offline/slow processing |
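A store like config.ts is essentially a typed read/write wrapper over AsyncStorage. The sketch below illustrates that pattern with the storage interface injected (it mirrors AsyncStorage's getItem/setItem) so an in-memory map can stand in during tests; the field names and storage key are illustrative, not the app's actual schema.

```typescript
// Illustrative persistent config store in the spirit of config.ts.
// KeyValueStorage matches the AsyncStorage subset this sketch needs.

export interface KeyValueStorage {
  getItem(key: string): Promise<string | null>;
  setItem(key: string, value: string): Promise<void>;
}

export interface AppConfig {
  apiKey: string;
  baseUrl: string;
  model: string;
}

const CONFIG_KEY = "dotagents:config"; // hypothetical storage key

export async function saveConfig(storage: KeyValueStorage, config: AppConfig): Promise<void> {
  await storage.setItem(CONFIG_KEY, JSON.stringify(config));
}

export async function loadConfig(storage: KeyValueStorage): Promise<AppConfig | null> {
  const raw = await storage.getItem(CONFIG_KEY);
  return raw ? (JSON.parse(raw) as AppConfig) : null;
}
```

Injecting the storage dependency keeps the store testable off-device and makes the web build (which swaps in localStorage-backed storage) a one-line change.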

Connecting to Desktop

The mobile app connects to your desktop app's remote server:

  1. Start DotAgents desktop and enable Settings > Remote Server
  2. On mobile, go to Connection Settings
  3. Scan the QR code displayed on your desktop, or manually enter the URL and API key
  4. The mobile app now communicates with your desktop's agent engine

This gives the mobile app access to all your desktop's MCP tools, agent profiles, and conversation history.

API Compatibility

The mobile app works with any OpenAI-compatible API endpoint:

  • OpenAI — Direct connection to OpenAI's API
  • Groq — Fast inference with Groq's API
  • Azure OpenAI — Microsoft's OpenAI service
  • Ollama — Local models at http://localhost:11434/v1
  • LM Studio — Local models with OpenAI-compatible API
  • DotAgents Remote Server — Your desktop's agent engine

The API key is sent as Authorization: Bearer <API_KEY>.
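Putting the pieces together, a chat request to any of these backends is just a POST with the Bearer header and an OpenAI-style body. The sketch below assembles such a request; the /chat/completions path and body shape follow the OpenAI chat-completions convention, and whether the DotAgents remote server exposes exactly this route is an assumption.

```typescript
// Sketch of assembling a streaming chat-completions request for an
// OpenAI-compatible endpoint. Pass the result to fetch() or an SSE
// client; nothing here is DotAgents-specific.

export interface ChatRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

export function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  userMessage: string,
): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`, // key sent as a Bearer token
    },
    body: JSON.stringify({
      model,
      stream: true, // request SSE streaming responses
      messages: [{ role: "user", content: userMessage }],
    }),
  };
}
```

For example, pointing baseUrl at http://localhost:11434/v1 targets a local Ollama server with no other changes.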

Getting Started

Prerequisites

  • Node.js 20.19.4+ (Node 24.x recommended via .nvmrc)
  • pnpm 9.x
  • For native builds: Xcode (iOS) or Android Studio (Android)

Install

```bash
git clone https://github.com/aj47/dotagents-mono.git
cd dotagents-mono
nvm use
pnpm install
```

Run

```bash
# Start Metro bundler (choose platform in UI)
pnpm --filter @dotagents/mobile start

# Or run directly
pnpm --filter @dotagents/mobile ios      # iOS
pnpm --filter @dotagents/mobile android  # Android
pnpm --filter @dotagents/mobile web      # Web
```

Development Build

If you see Cannot find native module 'ExpoSpeechRecognition':

```bash
# Build and install the development app
pnpm --filter @dotagents/mobile android
# or
pnpm --filter @dotagents/mobile ios
```

Troubleshooting

| Issue | Solution |
| --- | --- |
| Cannot find native module 'ExpoSpeechRecognition' | Use a development build, not Expo Go |
| Speech recognition not starting | Grant microphone/speech permissions |
| Web speech not working | Use Chrome or Edge over HTTPS |
| Cannot list agents | Verify Manage API URL and API key |
| No assistant response | Check Run API URL and model setting |
| Operations actions fail | Confirm the desktop remote server is reachable and the operator allowlists/API key are correct |

Next Steps