Open Source · MIT License

Your AI assistant,
self-hosted.

Bio-inspired cognition, persistent memory, and multi-channel deployment — validated against 30 research works, running on your own server.

An AI that remembers, reflects, and evolves. Neural pathways that strengthen with every conversation.

Everything you need, nothing you don't

A complete AI assistant framework built on simplicity, privacy, and cost control.

Hybrid Memory Engine · Research-backed

BM25 + semantic search with relationship graphs, memory decay, and automatic fact extraction. Your assistant remembers context across every conversation.

ScallopBot retrieves memories through BM25 keyword matching and semantic embeddings, then re-ranks the top candidates with an LLM call. On the LoCoMo benchmark (1,049 QA items), this achieves an F1 of 0.51 versus OpenClaw’s 0.39, a 31% relative improvement, with temporal questions showing a roughly 4× gain (0.39 vs 0.10).
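As a rough illustration of the retrieval step described above, here is a minimal sketch of blending a keyword score with a semantic score before handing the top candidates to a reranker. The `Candidate` shape, the `hybridTopK` name, and the 0.5 weight are assumptions made for this example and are not taken from the ScallopBot source; the LLM reranking call itself is left out.

```typescript
// All names below are hypothetical; this is not ScallopBot's actual API.
interface Candidate {
  id: string;
  bm25: number; // raw BM25 keyword score (non-negative, unbounded)
  embedding: number[]; // embedding vector of the stored memory
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Blend a max-normalized BM25 score with cosine similarity and keep the
// k best candidates. The weight is an assumed value, not a documented default.
function hybridTopK(
  queryEmbedding: number[],
  candidates: Candidate[],
  k: number,
  keywordWeight = 0.5,
): { id: string; score: number }[] {
  const maxBm25 = Math.max(...candidates.map((c) => c.bm25), 0);
  return candidates
    .map((c) => {
      const keyword = maxBm25 > 0 ? c.bm25 / maxBm25 : 0;
      const semantic = cosine(queryEmbedding, c.embedding);
      return {
        id: c.id,
        score: keywordWeight * keyword + (1 - keywordWeight) * semantic,
      };
    })
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

The shortlist returned here is what a reranker would then score; keeping it small bounds the cost of that extra LLM call.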

[Diagram: query → BM25 (keyword), Semantic (embedding), Graph (relations) → Reranker (LLM-scored) → top-k results]

Cost-Aware Model Routing

7 LLM providers with automatic failover. Each request routes to the cheapest capable model — Groq for speed, Claude for reasoning, GPT-4o for general tasks.
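The routing rule above ("cheapest capable model, with failover") can be sketched in a few lines. This is only an illustration: the `Provider` shape, the task labels, the capability assignments, and the cost figures are assumptions for the example, not ScallopBot's real configuration.

```typescript
// Hypothetical task categories and provider records, for illustration only.
type Task = "speed" | "general" | "reasoning";

interface Provider {
  name: string;
  relativeCost: number; // illustrative cost figure, units unspecified
  handles: Task[]; // tasks this provider is considered capable of
  healthy: boolean; // set to false when the provider is failing
}

// Return the cheapest healthy provider that can handle the task,
// or undefined when nothing is available.
function pickProvider(task: Task, providers: Provider[]): Provider | undefined {
  return providers
    .filter((p) => p.healthy && p.handles.includes(task))
    .sort((a, b) => a.relativeCost - b.relativeCost)[0];
}

const providers: Provider[] = [
  { name: "Groq", relativeCost: 0.01, handles: ["speed"], healthy: true },
  { name: "Moonshot", relativeCost: 0.05, handles: ["speed", "general"], healthy: true },
  { name: "Claude", relativeCost: 0.15, handles: ["speed", "general", "reasoning"], healthy: true },
];
```

Failover falls out of the same rule: once a provider is marked unhealthy, the next-cheapest capable one is selected on the following request.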

[Diagram: request → Router (classify task) → Groq ($0.01, speed), Moonshot ($0.05, balance), or Claude ($0.15, reasoning); the cheapest capable model is chosen]

9 Messaging Channels

Telegram, Discord, WhatsApp, Slack, Signal, Matrix, WebSocket, CLI, and REST API. One process, every platform your team uses.

[Diagram: ScallopBot connected to Telegram, Discord, WhatsApp, Slack, Signal, WebSocket, Matrix, CLI, and REST API]

Local Voice Pipeline

On-device speech-to-text (faster-whisper) and text-to-speech (Kokoro) at zero API cost. Cloud fallbacks when you need them.

[Diagram: audio → STT (whisper) → LLM (reason) → TTS (kokoro) → audio; cloud fallback if local is unavailable]

Skills-Only Architecture

16 bundled skills using the OpenClaw format. Install community skills from ClawHub. No hardcoded tools — everything is modular.

[Diagram: Core Engine (router · memory · scheduling) with web, calc, remind, and img skills; community skills installed from the ClawHub registry]

Bio-Inspired Cognition · Research-backed

Dream cycles consolidate memories overnight. Affect detection, self-reflection, and gap scanning create an assistant that genuinely learns.

A three-tier heartbeat—pulse, breath, deep sleep—drives autonomous cognition between interactions. Nightly dream cycles mirror biological sleep: NREM consolidation followed by REM associative discovery. Affect detection uses AFINN-165 with dual exponential smoothing to track emotional state without biasing reasoning.
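To make the affect-tracking idea concrete, here is a minimal sketch under two stated assumptions: "dual exponential smoothing" is read as two exponential moving averages with different rates (a fast one for the current mood, a slow one for the longer trend), and a four-word stand-in lexicon with illustrative values replaces the full AFINN-165 word list. The class and function names are invented for this example.

```typescript
// Tiny stand-in for a valence lexicon; the real AFINN-165 list is far larger.
// Words and values here are illustrative only.
const lexicon = new Map<string, number>([
  ["great", 3],
  ["good", 3],
  ["bad", -3],
  ["terrible", -3],
]);

// Average valence of the lexicon words found in the text (0 if none match).
function valence(text: string): number {
  const words = text.toLowerCase().match(/[a-z']+/g) ?? [];
  const hits: number[] = [];
  for (const w of words) {
    const v = lexicon.get(w);
    if (v !== undefined) hits.push(v);
  }
  return hits.length === 0 ? 0 : hits.reduce((s, v) => s + v, 0) / hits.length;
}

// Two exponential moving averages over message valence.
// The smoothing factors are assumed values, not documented defaults.
class AffectTracker {
  fast = 0;
  slow = 0;
  private readonly fastAlpha: number;
  private readonly slowAlpha: number;

  constructor(fastAlpha = 0.5, slowAlpha = 0.1) {
    this.fastAlpha = fastAlpha;
    this.slowAlpha = slowAlpha;
  }

  update(text: string): { fast: number; slow: number } {
    const v = valence(text);
    this.fast += this.fastAlpha * (v - this.fast);
    this.slow += this.slowAlpha * (v - this.slow);
    return { fast: this.fast, slow: this.slow };
  }
}
```

The fast average reacts to a single message, while the slow one moves only after a sustained shift, so one sharp remark does not rewrite the longer-term estimate.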

Sumers et al., TMLR 2024; Packer et al., 2023; Shinn et al., NeurIPS 2023; Zhang, S. et al., 2026; Zhang, Q., 2026; Pavlović et al., 2025; Mozikov et al., NeurIPS 2024; Lu & Li, 2025; Chandra et al., 2025
[Diagram: day and night cycle. Pulse: every 5m · health · affect. Breath: every 6h · decay · fusion. Sleep: nightly · dreams · reflection. NREM: consolidate · fuse topics. REM: spreading activation · novel links.]

Web Dashboard

Real-time chat with markdown rendering and streaming. Debug mode shows tool execution and thinking steps. Built-in cost panel with 14-day spending charts.

[Screenshot: ScallopBot Dashboard with Chat, Debug, and Cost tabs. Chat shows a reminder conversation; Debug shows tool: web_search, think: analyzing..., mem: 3 recalled, t: 1.2s; Cost shows $0.06 today.]

Proactive Scheduling · Research-backed

Natural language reminders with timezone awareness. Interval, daily, and weekly schedules. Actionable reminders execute autonomously when triggered.

ScallopBot’s gap scanner actively searches for unresolved questions and approaching deadlines, then diagnoses which gaps deserve attention. Delivery is gated by an asymmetric trust loop—accepted suggestions earn small increments, dismissals subtract more—reflecting how trust builds slowly and breaks quickly.
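The asymmetric trust loop described above can be sketched as a bounded score that rises slowly and falls quickly. The specific step sizes, the threshold, and the function names below are assumptions chosen for illustration; the source states only that dismissals subtract more than acceptances add.

```typescript
// Hypothetical parameters: the only property taken from the text is that
// the penalty for a dismissal is larger than the gain for an acceptance.
interface TrustConfig {
  gain: number; // added when a suggestion is accepted
  penalty: number; // subtracted when a suggestion is dismissed
  threshold: number; // suggestions surface only above this score
}

const trustDefaults: TrustConfig = { gain: 0.05, penalty: 0.15, threshold: 0.5 };

// Apply one piece of user feedback and clamp the score to [0, 1].
function updateTrust(
  score: number,
  accepted: boolean,
  cfg: TrustConfig = trustDefaults,
): number {
  const next = accepted ? score + cfg.gain : score - cfg.penalty;
  return Math.min(1, Math.max(0, next));
}

// Gate delivery: surface the suggestion only when trust exceeds the threshold.
function shouldSurface(score: number, cfg: TrustConfig = trustDefaults): boolean {
  return score > cfg.threshold;
}
```

With these example values, two accepted suggestions followed by one dismissal leave the score below where it started, which is the "builds slowly, breaks quickly" behaviour in miniature.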

Deng et al., ACM TOIS 2025; Pasternak, 2025; Liu et al., CHI 2025; Sun et al., 2025
[Diagram: memories + context → Gap Scanner (unresolved · deadlines) → Diagnose (LLM judges relevance) → Trust Gate (score > threshold): surface if yes, suppress if no]

Reliability Built In

Circuit breakers, graceful degradation, and crash recovery with session persistence. Atomic claim guards prevent duplicate execution across restarts.
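For readers unfamiliar with the pattern, a circuit breaker is a small state machine wrapped around calls to an unreliable dependency. The sketch below is a generic version with closed, open, and half-open states; the class name, default limits, and injectable clock are assumptions for this example and do not describe ScallopBot's own implementation.

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

// Generic circuit breaker sketch. Thresholds are illustrative defaults.
class CircuitBreaker {
  state: BreakerState = "CLOSED";
  private failures = 0;
  private openedAt = 0;
  private readonly maxFailures: number;
  private readonly resetMs: number;
  private readonly now: () => number;

  constructor(
    maxFailures = 3,
    resetMs = 30_000,
    now: () => number = () => Date.now(), // injectable clock for testing
  ) {
    this.maxFailures = maxFailures;
    this.resetMs = resetMs;
    this.now = now;
  }

  // Ask before each call. An OPEN breaker moves to HALF_OPEN once the
  // reset timeout has elapsed, which lets a trial request through.
  canRequest(): boolean {
    if (this.state === "OPEN") {
      if (this.now() - this.openedAt < this.resetMs) return false;
      this.state = "HALF_OPEN";
    }
    return true;
  }

  // Report a successful call: close the breaker and reset the counter.
  onSuccess(): void {
    this.state = "CLOSED";
    this.failures = 0;
  }

  // Report a failed call: open on a failed trial or after too many failures.
  onFailure(): void {
    this.failures++;
    if (this.state === "HALF_OPEN" || this.failures > this.maxFailures) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }
}
```

A caller checks `canRequest()` before contacting a provider and reports the outcome with `onSuccess()` or `onFailure()`. This simple version does not limit the half-open state to a single in-flight trial, which a production breaker would normally do.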

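[Diagram: circuit breaker states. CLOSED (pass requests) → OPEN (block all) when failures > N; OPEN → HALF-OPEN (try one request) after a timeout; HALF-OPEN → CLOSED on success, back to OPEN on failure. After a crash: recover and replay the session.]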

One process, every platform

Connect to 9 messaging channels simultaneously from a single Node.js process.

Telegram · Discord · WhatsApp · Slack · Signal · Matrix · WebSocket · CLI · REST API

7 providers, automatic failover

Every request routes to the cheapest capable model. When a provider goes down, traffic shifts instantly.

Anthropic: Complex
Moonshot: Cost-effective
OpenAI: General
xAI: Real-time
Groq: Ultra-fast
Ollama: Private
OpenRouter: Flexible

Up and running in minutes

One script installs everything on a fresh Ubuntu server. Add a provider key and you're live.

# Clone the repo
git clone https://github.com/tashfeenahmed/scallopbot
cd scallopbot

# One-command server setup (Node 22, PM2, voice deps, Ollama)
bash scripts/server-install.sh

# Configure your provider key
cp .env.example .env
nano .env  # add at least ANTHROPIC_API_KEY

# Build and start
npm run build
node dist/cli.js start

LoCoMo benchmark evaluation

ScallopBot was evaluated on LoCoMo, a standardized long-conversation memory benchmark with 1,049 QA items across 5 conversations and 138 sessions. Both systems use identical models (Moonshot kimi-k2.5) and embeddings (Ollama nomic-embed-text). ScallopBot’s hybrid retrieval with LLM reranking, temporal query detection, and score-gated context achieves F1 0.51 vs OpenClaw’s 0.39, a 31% relative improvement. The system itself comprises 367 TypeScript source files (~63,000 lines of code) with 1,560 tests across 95 test files.

F1 Score: 0.51 (+31% vs OpenClaw on 1,049 QA items)
Temporal Gain: 4× (F1 0.39 vs 0.10 on time-based questions)
Daily cost: $0.06–0.10 (full cognitive pipeline, 7 LLM providers)

LoCoMo Results by Category

Overall F1: ScallopBot 0.51 · OpenClaw 0.39 · +31% relative improvement
Token-level F1 across all 1,049 QA items (5 conversations, 138 sessions)

Temporal Questions: ScallopBot 0.39 · OpenClaw 0.10 · 4× improvement over OpenClaw
Questions requiring time-based reasoning across sessions

Single-hop Questions: ScallopBot 0.23 · OpenClaw 0.12 · +92% relative improvement
Direct factual recall from a single conversation session

Multi-hop Questions: ScallopBot 0.47 · OpenClaw 0.34 · +38% relative improvement
Questions requiring synthesis of facts across multiple sessions

Open-domain Questions: ScallopBot 0.11 · OpenClaw 0.11 · No change
General knowledge questions not tied to specific conversation sessions

Adversarial Questions: OpenClaw 0.96 · ScallopBot 0.93 · OpenClaw leads by 0.03
Unanswerable questions designed to test refusal accuracy

All results come from the standardized benchmark, run with real embeddings (Ollama nomic-embed-text, 768-dim) and a real LLM (Moonshot kimi-k2.5). Temporal gains are driven by date-embedded memories and regex-based temporal query detection. Multi-hop gains come from memory fusion, NREM dream consolidation, and increased retrieval depth. The full cognitive pipeline adds ~$0.02/day to the base conversation cost. The design was validated against 30 research works from 2023–2026 across six domains.
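For readers who want to check the arithmetic behind the scores above, here is a minimal sketch of a token-level F1 and of the relative-improvement figure. It uses the common bag-of-tokens definition with simple lowercase alphanumeric tokenization; the benchmark's own scorer may normalize answers differently, so treat the function names and tokenization as assumptions.

```typescript
// Lowercase alphanumeric tokenization; an assumed, simplified normalization.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Token-level F1: harmonic mean of precision and recall over shared tokens,
// counting each reference token at most once.
function tokenF1(prediction: string, reference: string): number {
  const pred = tokenize(prediction);
  const ref = tokenize(reference);
  if (pred.length === 0 || ref.length === 0) {
    return pred.length === ref.length ? 1 : 0;
  }
  const remaining = new Map<string, number>();
  for (const t of ref) remaining.set(t, (remaining.get(t) ?? 0) + 1);
  let overlap = 0;
  for (const t of pred) {
    const left = remaining.get(t) ?? 0;
    if (left > 0) {
      overlap++;
      remaining.set(t, left - 1);
    }
  }
  if (overlap === 0) return 0;
  const precision = overlap / pred.length;
  const recall = overlap / ref.length;
  return (2 * precision * recall) / (precision + recall);
}

// Relative improvement of one score over a baseline, as a fraction.
function relativeImprovement(score: number, baseline: number): number {
  return (score - baseline) / baseline;
}
```

Applied to the reported numbers, `relativeImprovement(0.51, 0.39)` is about 0.31, `relativeImprovement(0.23, 0.12)` about 0.92, and `relativeImprovement(0.47, 0.34)` about 0.38, matching the percentages in the category results.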

Own your AI assistant

MIT licensed. Self-hosted. No vendor lock-in.

Get Started on GitHub