Design note · 2026-04-21 · open for comments
This page is a design note, not a shipped feature. It describes how we want alb to evolve past the host-side agent loop: by putting a small agent on the device itself and keeping both sides talking.
Today alb runs entirely on your Linux (or Mac) host. An LLM reasons on the host, pushes tool calls through adb / ssh / UART, and observes the device from the outside. This works — and M2 just made the loop streaming and cache-friendly — but it hits four real walls.
A small agent on the device closes every one of these gaps — without replacing the host.
Two cooperating agents, each doing what it is good at. A shared WebSocket keeps them in lockstep.
A rough division of labor. Subject to change as we prototype, but the shape holds.
| Task | Runs on | Why |
|---|---|---|
| Threshold alerts (OOM, ANR, thermal) | Device | Zero-latency, adb cannot see the transient |
| logcat stream dedupe, stack cluster | Device (small LLM) | Saves host tokens, filters noise at source |
| perfetto / systrace capture | Device | Direct /sys access, no adb chunk overhead |
| Complex reasoning, code edits, user dialogue | Host (frontier LLM) | Capability + cross-file context |
| Cross-session memory, artifact archive | Host | Device storage is limited and volatile |
| On-site offline debug + later sync | Device (autonomy) | Customer field, no dev reachable |
| Workflow orchestration (multi-step plan) | Host | Planning + memory is host's strong suit |
| First-response "what just happened?" | Device | Sub-second triage before the event ages out |
One WebSocket, six message types. alb host already speaks WebSocket (/chat/ws), so we extend the same FastAPI app with a /device/channel endpoint and let devices connect as clients. Authentication reuses alb's config.toml token.
- HEARTBEAT (device → host): Liveness + battery + thermal + freeform health field.
- EVENT (device → host, unsolicited): OOM, ANR trigger, crash, thermal warn, custom threshold. Includes a short device-side triage summary.
- REQUEST (device → host): "I think this needs a bigger brain, here is what I saw." Escalates to host LLM.
- RPC_CALL (host → device): "Run perfetto for 5 s on PID 1234", "dump heap of app X", "capture method profile". Returns structured data.
- RPC_REPLY (device → host): Result of an RPC_CALL, with the {ok, data, error, artifacts} shape, same as the rest of alb.
- LOG_STREAM (device → host): Pre-filtered, pre-clustered logcat. Replaces the adb logcat pipe with something that has already paid its tokens for filtering.

We get value at every phase; we do not need phase 4 for phase 1 to be useful.
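The six message types above could share one JSON envelope. The sketch below is one possible shape, not a settled wire format: the `Envelope` class, the `id` correlation field, and the `decode` and `rpc_reply` helpers are hypothetical. Only the type names, their directions, and the {ok, data, error, artifacts} reply shape come from this note.

```python
import json
from dataclasses import asdict, dataclass, field

# Direction of each message type, as listed in this note.
DIRECTIONS = {
    "HEARTBEAT": "device->host",
    "EVENT": "device->host",
    "REQUEST": "device->host",
    "RPC_CALL": "host->device",
    "RPC_REPLY": "device->host",
    "LOG_STREAM": "device->host",
}


@dataclass
class Envelope:
    type: str
    payload: dict = field(default_factory=dict)
    id: str = ""  # hypothetical correlation id linking RPC_CALL and RPC_REPLY

    def to_json(self) -> str:
        if self.type not in DIRECTIONS:
            raise ValueError(f"unknown message type: {self.type}")
        return json.dumps(asdict(self))


def decode(raw: str, sender: str) -> Envelope:
    """Parse a frame and reject types the sender ('device' or 'host') may not send."""
    obj = json.loads(raw)
    msg = Envelope(obj["type"], obj.get("payload", {}), obj.get("id", ""))
    direction = DIRECTIONS.get(msg.type)
    if direction is None:
        raise ValueError(f"unknown message type: {msg.type}")
    if not direction.startswith(sender):
        raise ValueError(f"{msg.type} may not be sent by {sender}")
    return msg


def rpc_reply(call_id, ok, data=None, error=None, artifacts=None) -> Envelope:
    """Build an RPC_REPLY carrying the {ok, data, error, artifacts} shape."""
    payload = {"ok": ok, "data": data, "error": error, "artifacts": artifacts or []}
    return Envelope("RPC_REPLY", payload, call_id)
```

Checking direction at decode time keeps a misbehaving device from issuing an RPC_CALL toward the host, which matches the rule that only the host dispatches tools.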
alb-mobile Android app: WebSocket client + logcat streamer + system metrics. No LLM yet. The win: replace adb logcat with a filtered, structured stream.
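To make "filtered, structured stream" concrete, here is a minimal sketch that parses logcat's `threadtime` line format into records and drops everything below a severity floor. It is written in Python for brevity; the real alb-mobile client would be an Android app, and the function names and default floor are assumptions.

```python
import re
from typing import Iterable, Iterator, Optional

# logcat threadtime format: "MM-DD HH:MM:SS.mmm  PID  TID P TAG: message"
THREADTIME = re.compile(
    r"^(?P<date>\d\d-\d\d)\s+(?P<time>\d\d:\d\d:\d\d\.\d{3})\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+(?P<level>[VDIWEF])\s+"
    r"(?P<tag>.*?)\s*: (?P<message>.*)$"
)
LEVELS = "VDIWEF"  # verbose, debug, info, warning, error, fatal


def parse_line(line: str) -> Optional[dict]:
    """Return a structured record, or None for lines that do not match
    (for example the '--------- beginning of main' dividers)."""
    match = THREADTIME.match(line.rstrip("\n"))
    if match is None:
        return None
    record = match.groupdict()
    record["pid"] = int(record["pid"])
    record["tid"] = int(record["tid"])
    return record


def structured_stream(
    lines: Iterable[str], min_level: str = "W", tags: Optional[set] = None
) -> Iterator[dict]:
    """Yield records at or above min_level, optionally restricted to given tags."""
    floor = LEVELS.index(min_level)
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        if LEVELS.index(record["level"]) < floor:
            continue
        if tags is not None and record["tag"] not in tags:
            continue
        yield record


SAMPLE = [
    "--------- beginning of main",
    "04-21 10:15:02.123  1234  1250 I ActivityManager: Start proc 4321:com.example/u0a99",
    "04-21 10:15:03.456  4321  4321 E AndroidRuntime: FATAL EXCEPTION: main",
    "04-21 10:15:03.460  4321  4321 D MyApp   : heartbeat ok",
]
records = list(structured_stream(SAMPLE))
```

With the default floor of `W`, only the `AndroidRuntime` error line survives from the sample; each surviving record could then be sent as the payload of a LOG_STREAM message.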
Integrate AICore / Gemini Nano on Android 14+, or LiteRT + Gemma 4. Device does first-pass triage: summarize, cluster, decide whether to escalate.
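The cluster-and-escalate step can be illustrated without a model. The sketch below is a deterministic stand-in, not the on-device LLM this phase describes: it groups log records by a normalized signature and escalates on a fatal record or a burst of identical errors. The escalation rule and `burst_threshold` are hypothetical, and the record fields (`level`, `tag`, `message`) are an assumed shape.

```python
import re
from collections import Counter

_HEX = re.compile(r"0x[0-9a-fA-F]+")
_NUM = re.compile(r"\d+")


def signature(tag: str, message: str) -> str:
    """Replace volatile values so repeated messages collapse into one cluster.
    Hex is substituted first so that '0x7f3a' is not split by the digit rule."""
    normalized = _HEX.sub("<hex>", message)
    normalized = _NUM.sub("<n>", normalized)
    return f"{tag}: {normalized}"


def cluster(records: list) -> Counter:
    """Count records per signature."""
    counts = Counter()
    for record in records:
        counts[signature(record["tag"], record["message"])] += 1
    return counts


def should_escalate(records: list, burst_threshold: int = 3) -> bool:
    """Hypothetical rule: escalate on any fatal record, or when one
    error cluster repeats at least burst_threshold times."""
    if any(record["level"] == "F" for record in records):
        return True
    errors = [record for record in records if record["level"] == "E"]
    return any(count >= burst_threshold for count in cluster(errors).values())
```

When `should_escalate` returns true, the device would send a REQUEST to the host with the cluster counts attached, instead of the raw log lines.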
Host can dispatch device-local tools: perfetto traces, heap dumps, custom probes. No adb round-trip. Tool specs published back to host MCP automatically.
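One way the device side could expose local tools is a small registry that maps an RPC_CALL to a function and wraps the result in the {ok, data, error, artifacts} shape. Everything named here (`tool`, `handle_rpc_call`, `list_tool_specs`, the `echo_probe` tool, and the `tool`/`args` call fields) is hypothetical; a real tool would wrap perfetto or a heap dump rather than echo its input.

```python
from typing import Callable, Dict

TOOLS: Dict[str, Callable[..., dict]] = {}


def tool(name: str):
    """Decorator that registers a device-local tool under a name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("echo_probe")
def echo_probe(text: str) -> dict:
    """Placeholder probe that returns its input."""
    return {"echo": text}


def list_tool_specs() -> list:
    """Specs the device could publish back to the host."""
    return [
        {"name": name, "doc": (fn.__doc__ or "").strip()}
        for name, fn in sorted(TOOLS.items())
    ]


def handle_rpc_call(call: dict) -> dict:
    """Run the requested tool and build an RPC_REPLY payload."""
    name = call.get("tool")
    fn = TOOLS.get(name)
    if fn is None:
        return {"ok": False, "data": None,
                "error": f"unknown tool: {name}", "artifacts": []}
    try:
        data = fn(**call.get("args", {}))
    except Exception as exc:  # report failures instead of dropping the channel
        return {"ok": False, "data": None,
                "error": f"{type(exc).__name__}: {exc}", "artifacts": []}
    return {"ok": True, "data": data, "error": None, "artifacts": []}
```

Because every failure path still returns the same four keys, the host can treat device tools like any other alb tool result.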
Device runs standalone — captures reproducers, diagnoses locally, packages artifacts. Syncs to host when network returns. Field debugging without anyone on the other end.
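The "sync when network returns" behavior can be sketched as an append-only spool file that is drained in order once a send succeeds. This is an illustration under assumptions: the `Spool` class, the JSON-lines file format, and the `send` callback returning a boolean are all hypothetical, and a real device would also need size limits and artifact handling.

```python
import json
from pathlib import Path
from typing import Callable


class Spool:
    """Queue messages on disk while offline, one JSON object per line."""

    def __init__(self, path: Path):
        self.path = path

    def append(self, message: dict) -> None:
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(message) + "\n")

    def flush(self, send: Callable[[dict], bool]) -> int:
        """Send queued messages in order. Stop at the first failure and
        keep the unsent remainder on disk. Returns the number sent."""
        if not self.path.exists():
            return 0
        lines = self.path.read_text(encoding="utf-8").splitlines()
        sent = 0
        for line in lines:
            if not send(json.loads(line)):
                break
            sent += 1
        remaining = lines[sent:]
        self.path.write_text(
            "".join(line + "\n" for line in remaining), encoding="utf-8"
        )
        return sent
```

Stopping at the first failed send preserves event order, which matters when the host later reconstructs what happened in the field.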
Google's public direction already points here. AICore ships a system-level LLM runtime on Pixel 8+ and is expanding; Gemini Nano is the first model delivered through it. LiteRT (formerly TFLite) gives anyone a path to ship their own quantized model on-device. And Google's own "Gemma 4 for Android" push calls out the on-device agent as a flagship use case.
alb does not compete with any of that — it composes. alb-mobile will use whatever the device provides: AICore if available, LiteRT if not, Gemma 4 quantized as a fallback. The interesting work is not the model — it's the host ↔ device protocol and the division of labor. That's what we own.
Where we do not have a final answer. Help welcome.
Developer builds will be sideload (debug APK from Releases). A Play Store track is on the table for field/autonomy use but needs review and a clear safety story.
Yes for v1. The WebSocket channel is generic; porting to HarmonyOS or embedded Linux boards (buildroot + a Gemma runner) is feasible but not on the M4 plan.
The plan assumes unrooted. perfetto, dumpsys, logcat are all available to app-level context via ADB, and AICore / LiteRT run as normal Android libraries. Some probes require adb shell escalation — those stay on the host side.
WebSocket connection uses a shared token (dev environment) or mTLS (production). Device never initiates destructive actions without a host-approved RPC_CALL. All device-local LLM inputs are logged for audit.
Not in scope. iOS's debugging model is fundamentally different, and Apple does not expose anything comparable to AICore or adb-equivalent at this layer. A reduced host-side alb for iOS is conceivable — not on this roadmap.
Open an issue with the label on-device-agent. This is a design in motion — the earlier we hear, the more we can fold in.