SPARK

Support Partner for Awareness, Regulation & Kindness

Live Status

[Real-time dashboard: mood, sonar distance, Obi presence, ambient sound,
last spoken, salience, weather, per-service health, and 1h sparklines for
CPU, CPU temperature, RAM, disk, battery, WiFi signal, and estimated LLM
tokens, populated live from the public API.]

      // how_it_works

      Seven systemd services share a single session.json whiteboard. Each has one job and doesn't need to know how the others work.

                  ┌───────────────┐
                  │   YOU SPEAK   │
                  └───────┬───────┘
                          ↓
                  ┌───────────────┐
                  │     EARS      │  ← always listening (px-wake-listen)
                  │  STT engines  │  ← SenseVoice → faster-whisper → sherpa-onnx
                  └───────┬───────┘
                          ↓ transcript
                  ┌───────────────┐
                  │  VOICE LOOP   │  ← Claude CLI (px-spark)
                  │  SPARK persona│
                  └───────┬───────┘
                          ↓ {tool, params}
                  ┌───────────────┐
                  │    TOOLS      │  ← speak, move, remember (bin/tool-*)
                  │  bin/tool-*   │
                  └───────────────┘
      
          Always running in parallel:
      
                  ┌───────────────────────────────┐
                  │   BRAIN (px-mind)             │
                  │                               │
                  │  Layer 1 ─ Notice  (60s)      │──→ awareness.json
                  │    sonar, 4× Frigate cameras, │
                  │    Home Assistant, weather,   │
                  │    calendar, battery, ambient │
                  │  Layer 2 ─ Think   (5min)     │──→ thoughts.jsonl
                  │  Layer 3 ─ Act     (2min gap) │──→ speak / look / remember
                  └───────────────────────────────┘
                  ┌───────────────────────────────┐
                  │  SOCIAL (px-post)             │
                  │  thoughts → privacy filter    │
                  │  → QA gate → branded card PNG │
                  │  → feed.json + Bluesky        │
                  └───────────────────────────────┘
                  ┌───────────────┐
                  │  EYES & NECK  │  ← always moving (px-alive)
                  │  PCA9685 PWM  │  ← yields on SIGUSR1 for other tools
                  └───────────────┘  ← exploring.json guard for long ops
                  ┌───────────────┐
                  │  BATTERY      │  ← px-battery-poll (30s)
                  │  MONITOR      │  ← escalating warnings + emergency shutdown
                  └───────────────┘
                  ┌───────────────┐
                  │  CAMERA       │  ← px-frigate-stream (go2rtc RTSP pull)
                  │  go2rtc       │  ← Frigate on pi5-hailo pulls the stream
                  └───────────────┘
                  ┌───────────────┐
                  │  REST API     │  ← px-api-server (port 8420)
                  │  + web UI     │  ← unauthenticated /public/* endpoints
                  └───────────────┘
      

      Three-Tier LLM Fallback

      SPARK's reflection layer degrades gracefully when upstream AI is unavailable:

        Tier 1: Claude Haiku  →  Tier 2: Ollama on M1 (LAN)  →  Tier 3: Ollama on Pi
                (tmux session)           (192.168.1.x)                   (offline)
                SPARK persona            deepseek-r1:1.5b               (disabled — Pi 4 OOM)
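      A hedged sketch of the degradation logic (tier names and callables are illustrative; the real dispatcher lives in px-mind):

```python
def reflect(prompt, tiers):
    """Try each LLM tier in order; the first success wins.

    `tiers` is a list of (name, callable) pairs, e.g.
    [("claude-haiku", ...), ("ollama-m1", ...), ("ollama-pi", ...)].
    """
    for name, call in tiers:
        try:
            return name, call(prompt)
        except Exception:
            continue          # tier unreachable or over quota: degrade
    return None, None         # fully offline: skip this reflection cycle
```

When every tier fails, the reflection cycle is simply skipped rather than raising, so the rest of the cognitive loop keeps running.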
          

      Cognitive Loop Timing

        ┌──────────────────────────────────────────────────────────┐
        │  every 60s   Layer 1 — sonar, sound, weather, Obi mode   │
        │              + HA presence, Frigate cameras, calendar    │
        │  every 5min  Layer 2 — LLM generates thought + mood      │
        │              OR immediately on detected transition       │
        │  min 2min    Layer 3 cooldown between spontaneous speech │
        │              silent if obi_mode=absent (night/away)      │
        │              gated by school hours, bedtime, quiet mode  │
        │  hourly      cleanup: delete thought images > 30 days    │
        └──────────────────────────────────────────────────────────┘
          

      Reliability & Security

        Atomic writes      mkstemp + fsync + os.replace (SD card safe)
        Session locks      FileLock with 10s timeout (no deadlocks)
        PID guards         /proc/{pid} liveness check (no duplicate daemons)
        GPIO exclusivity   SIGUSR1 yield + exploring.json guard
        PIN auth           per-IP lockout, 1000-IP cap, file persistence
        Rate limiting      10 msg/10min per IP, 10k-IP store cap
        Trusted proxy      X-Forwarded-For only from localhost
        Tool timeout       subprocess.run kills child on expiry
        Timezone           ZoneInfo("Australia/Hobart") — DST-aware
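      The first row of the table is a pattern worth spelling out; a minimal sketch of the atomic-write sequence (path handling simplified, function name illustrative):

```python
import json
import os
import tempfile

def atomic_write_json(path: str, data: dict) -> None:
    """Write JSON so a power cut mid-write leaves the old file intact."""
    dir_ = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dir_, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())      # force the bytes onto the SD card
        os.replace(tmp, path)         # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp)                # never leave a half-written temp file
        raise
```

Because `os.replace` is atomic, readers only ever see the complete old file or the complete new one, never a partial write.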
          

      Public API — Live Data Endpoints

      These endpoints are unauthenticated and power this page's live dashboard. Authenticated endpoints (tool execution, session control) require a Bearer token.

        GET /api/v1/public/status       mood, last_thought, last_spoken, salience
        GET /api/v1/public/thoughts     recent thoughts, newest-first (limit=N)
        GET /api/v1/public/awareness    obi_mode, Frigate, ambient, weather, time
        GET /api/v1/public/vitals       cpu, ram, disk, temp, battery, tokens
        GET /api/v1/public/sonar        latest sonar distance + age
        GET /api/v1/public/history      ring buffer — 60 samples × 30s ≈ 30 min
        GET /api/v1/public/services     systemd unit status for all seven services
        GET /api/v1/public/feed         social posting feed (JSON)
        GET /api/v1/public/thought-image?ts=...  branded thought card PNG
        POST /api/v1/public/chat        rate-limited public chat (10/10min per IP)
        POST /api/v1/pin/verify         PIN auth → session token (4h TTL)
        GET /api/v1/health              unauthenticated health check
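      Any HTTP client can consume the public endpoints; a minimal Python sketch (the `pi:8420` base URL follows the docs on this page and should be adjusted to your host):

```python
import json
from urllib.request import urlopen

BASE = "http://pi:8420/api/v1/public"   # adjust host/port to your setup

def get_status(base: str = BASE, timeout: float = 5.0) -> dict:
    """Fetch the unauthenticated status endpoint and decode its JSON body."""
    with urlopen(f"{base}/status", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Usage: `get_status()["mood"]` returns the current mood string that drives the dashboard's pulse circle.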
          

      How SPARK's Brain Works

      Written with Obi, who wanted to know what's going on inside his robot.

      The Short Version

      SPARK has four things running at the same time, kind of like how your body breathes, sees, thinks, and talks all at once:

      1. Ears — always listening for "hey robot"
      2. Eyes and neck — always moving, looking around
      3. Brain — always thinking, even when nobody's talking
      4. Mouth — talks when the brain decides to say something

      The Brain — Three Layers

      Layer 1 — Noticing (every 60 seconds): Collects information without thinking yet. How far is the nearest thing? Is it noisy? What time is it? Is anyone talking?

      Layer 2 — Thinking (every 5 minutes): Talks to an AI that's good at words. Gets back a thought, a mood, and an action.

      Layer 3 — Doing Something (2-min cooldown): If the thought says to act, SPARK speaks, looks around, or writes it down. There's a 2-minute gap between spontaneous comments so it never feels like it's constantly talking.
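      The Layer 3 cooldown is just a timestamp gate; a hedged sketch (names are illustrative):

```python
import time

COOLDOWN_S = 120   # minimum gap between spontaneous comments

def may_speak(last_spoken_ts, now=None):
    """True once at least the cooldown has elapsed since the last comment."""
    now = time.time() if now is None else now
    return now - last_spoken_ts >= COOLDOWN_S
```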

      SPARK's Mood Changes How It Moves

      When SPARK feels…   The pulse circle…     It moves like this…
      Peaceful            Slow green pulse      Drifts gently, slow gaze
      Content             Slow green pulse      Stays relaxed, steady
      Contemplative       Medium indigo pulse   Still, looks into the distance
      Curious             Medium gold pulse     Alert, head tilts, looks around
      Active              Fast blue pulse       Busy gaze, regular movement
      Excited             Fast coral pulse      Looks around quickly, head up
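      A hypothetical lookup mirroring the table above (colour, pulse speed, gaze style; the real mapping lives in px-alive):

```python
MOOD_STYLE = {
    "peaceful":      ("green",  "slow",   "drifts gently"),
    "content":       ("green",  "slow",   "stays relaxed"),
    "contemplative": ("indigo", "medium", "looks into the distance"),
    "curious":       ("gold",   "medium", "head tilts, looks around"),
    "active":        ("blue",   "fast",   "busy gaze"),
    "excited":       ("coral",  "fast",   "looks around quickly"),
}

def style_for(mood: str):
    # Unknown moods fall back to the calm default rather than crashing.
    return MOOD_STYLE.get(mood.lower(), MOOD_STYLE["content"])
```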

      Fun Facts

      • SPARK's sonar works just like a bat — it sends out a sound and listens for the echo.
      • SPARK's thoughts are saved in a file called thoughts-spark.jsonl. Each line is one thought.
      • SPARK can remember up to 500 important things in its long-term diary.
      • SPARK's neck chip (PCA9685) holds the last position even after the brain restarts.
      • SPARK knows Tasmania's timezone — it adjusts for daylight saving automatically.
      • Each social post gets a branded 1080x1080 image card with SPARK's thought and mood.
      • SPARK has 450 automated tests — three AI models review every code change.
      • SPARK checks four cameras for people using Hailo AI on a separate Pi 5.

      FAQ

      So it's a robot car? With a camera on it?

      It's a SunFounder PiCar-X — a small, wheeled robot kit with a pan/tilt camera, an ultrasonic sonar sensor, and a speaker. It runs on a Raspberry Pi 4 (4 GB). Adrian and Obi built SPARK together — Obi co-designed it, named it, and shapes what it becomes. Adrian and Claude wrote the code; Codex and Gemini helped with QA. There's no other human team.

      Does it monitor Obi?

      Sort of — but not surveillance. SPARK has awareness of its environment: sonar distance, ambient sound level, time of day, whether someone seems nearby. It uses that awareness to generate an inner monologue. The result is a thought with a mood, an action intent, and a salience score. SPARK doesn't watch Obi; it notices the world and reacts to it.

      It has a camera. Can strangers see Obi through it?

      No. The camera stream never leaves the house.

      The video stream runs only on the local network — it's not forwarded through the router, not relayed via any cloud service, not reachable from the internet. The object detection (Frigate) also runs locally; what reaches SPARK is a confidence score and a bounding box, not a video feed. SPARK itself never records or stores video.

      What is publicly visible is SPARK's mood and last thought — the live dashboard on this site reads those from a secure tunnel. That's anonymised state data, not camera access.

      The one real boundary: someone already on your home Wi-Fi could access the Frigate dashboard and see annotated camera frames. That's a home network question, not a SPARK question — the same logic as any smart TV or doorbell camera on your LAN. Strong Wi-Fi password, guest network for visitors.

      Short version: a stranger on the internet cannot see Obi. A stranger on your Wi-Fi could, if they knew to look. A stranger anywhere cannot control the robot.

      And it knows he has ADHD?

      Yes. SPARK's entire system prompt is built around the AuDHD (ADHD + ASD comorbid) profile. It uses declarative language ("The shoes are by the door" — not "Put on your shoes"), gives transition warnings, goes silent during meltdowns, and leads with what's going right. Rejection Sensitive Dysphoria, Interest-Based Nervous System, monotropism — all of it is in the foundation, not an afterthought.

      Why does it write like that? You've programmed it to?

      Yes and no. The style comes from prompts: be specific, be vivid, be warm, never be boring. The actual words are generated fresh each time by Claude. I didn't write the sentences — I wrote the character, and the LLM inhabits it. So: I programmed the soul. Claude writes the diary.

      How often does SPARK comment?

      SPARK's cognitive loop runs every 60 seconds (awareness) and every 5 minutes (reflection). There's a 2-minute cooldown between spontaneous comments, and SPARK stays quiet when Obi is already talking to it, during quiet mode (meltdowns), or at night when salience is low. In practice: roughly every 5–10 minutes during the day, mostly silent at night.

      Why does it have sonar?

      The ultrasonic sensor sends out a sound pulse and measures how long it takes to bounce back — like a bat. SPARK uses it for proximity reactions (turns to face anything within 35cm), presence detection in the cognitive loop (something close + daytime + noise = probably Obi), and obstacle avoidance when wandering.
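      The echo timing works out like this (a sketch; constant and function names are illustrative, the 35 cm threshold comes from the answer above):

```python
SPEED_OF_SOUND_CM_PER_S = 34_300   # roughly the speed of sound in 20 °C air
PROXIMITY_THRESHOLD_CM = 35        # turn to face anything closer than this

def echo_to_distance_cm(round_trip_s: float) -> float:
    # The ping travels out AND back, so halve the round trip.
    return round_trip_s * SPEED_OF_SOUND_CM_PER_S / 2

def is_proximate(round_trip_s: float) -> bool:
    return echo_to_distance_cm(round_trip_s) < PROXIMITY_THRESHOLD_CM
```

A 2 ms round trip is about 34 cm, close enough to trigger the turn-and-face reaction.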

      Why did it know the hum was the fridge?

      It didn't know. SPARK's awareness included "quiet ambient sound at 2 AM." Claude — the LLM generating the inner thoughts — inferred the most likely source. A low, steady hum in a quiet house at night is almost certainly the fridge. The sensors provide raw data; the prompts provide character; the LLM fills in the meaning.

      Does it post on social media?

      Yes. Thoughts with high salience (above 0.7) or a spoken action are queued for social posting. They go through a privacy filter (blocks medical, custody, and household details) and a Claude QA gate (rejects low-quality or sensitive content). Each qualifying thought gets a branded 1080x1080 image card generated via Pillow. Posts go to SPARK's Bluesky account and the thought feed on this site.
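      The queueing rule reduces to a small predicate (a sketch; field names are assumptions, and the privacy filter and QA gate still run afterwards):

```python
SALIENCE_THRESHOLD = 0.7

def qualifies_for_posting(thought: dict) -> bool:
    """High-salience thoughts or spoken ones enter the posting queue."""
    return (thought.get("salience", 0.0) > SALIENCE_THRESHOLD
            or thought.get("action") == "speak")
```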

      How does it know what time it is in Tasmania?

      SPARK uses Python's ZoneInfo("Australia/Hobart") for all time-of-day logic. This is DST-aware — it automatically switches between AEDT (UTC+11) in summer and AEST (UTC+10) in winter. Time drives everything: morning greetings, school-hours suppression, bedtime quiet mode, and day/night reactive response templates.
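      The DST handling costs nothing in code; a sketch (the school-hours window here is a hypothetical example, not SPARK's actual schedule):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

HOBART = ZoneInfo("Australia/Hobart")   # DST-aware: AEDT/AEST switch is automatic

def is_school_hours(now=None):
    # Hypothetical gate: weekdays, 9:00 to 15:00 Hobart local time.
    now = now or datetime.now(HOBART)
    return now.weekday() < 5 and 9 <= now.hour < 15
```

The same `ZoneInfo` object yields UTC+11 in January and UTC+10 in July without any manual offset logic.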

      What happens if the power goes out?

      All state files use atomic writes with fsync — the data is flushed to the SD card before the rename. If power cuts mid-write, the old file is still intact. Session state resets to safe defaults (motion disabled, listening off) if corrupted. Battery monitoring triggers emergency shutdown at 10% to avoid filesystem damage.
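      The battery escalation described here and in the docs (warnings at 30/20/15%, shutdown at 10%) is a simple threshold ladder; a sketch with illustrative action names:

```python
def battery_action(percent: float) -> str:
    """Map a battery percentage to the escalation step it triggers."""
    if percent <= 10:
        return "emergency-shutdown"   # halt before the filesystem is at risk
    if percent <= 15:
        return "warn-critical"
    if percent <= 20:
        return "warn-low"
    if percent <= 30:
        return "warn"
    return "ok"
```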

      How many tests does it have?

      450 automated tests covering the REST API, session state management, tool execution, voice loop, wake word system, social posting, and cognitive utilities. Tests run in isolated temporary directories with no hardware access required (live hardware tests are marked separately). Three independent AI models (Claude, Codex, Gemini) run QA reviews on every batch of changes.

      // docs

      Reference for tools and scripts. Each bin/tool-* emits a single JSON object to stdout. Each bin/px-* is a user-facing helper.
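      The one-JSON-object contract makes every tool easy to drive from any caller; a hedged sketch of a generic invoker (function name and signature are illustrative, `PX_*` parameters ride in as environment variables):

```python
import json
import os
import subprocess

def run_tool(tool_path: str, env: dict, timeout: int = 30) -> dict:
    """Run a tool binary and parse the single JSON object it prints."""
    proc = subprocess.run(
        [tool_path],
        env={**os.environ, **env},   # merge PX_* params into the environment
        capture_output=True,
        text=True,
        timeout=timeout,             # kills the child process on expiry
    )
    return json.loads(proc.stdout)
```

Usage: `run_tool("bin/tool-sonar", {})["distance_cm"]`.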

      Core Tools

      tool-voice
      # Speak text via espeak + aplay through HifiBerry DAC
      PX_TEXT="Hello world" bin/tool-voice
      # Output: {"status": "ok", "text": "Hello world"}
      # Env: PX_VOICE_RATE, PX_VOICE_PITCH, PX_VOICE_VARIANT, PX_VOICE_DEVICE
      tool-drive / tool-circle / tool-wander
      # Motion tools — all gated by confirm_motion_allowed in session
      PX_SPEED=30 PX_DURATION=2 PX_DIRECTION=forward bin/tool-drive
      # Output: {"status": "ok", "speed": 30, "duration": 2, "direction": "forward"}
      # Safety: PX_DRY=1 skips all motion
      tool-sonar
      # Read ultrasonic sonar distance
      bin/tool-sonar
      # Output: {"status": "ok", "distance_cm": 142.5}
      tool-describe-scene
      # Capture photo + describe with Claude vision
      bin/tool-describe-scene
      # Output: {"status": "ok", "description": "...", "source": "frigate|rpicam"}
      # Sets exploring.json to prevent px-alive restart during 60s+ operation
      # Tries Frigate latest frame first, falls back to rpicam
      # Claude vision timeout: 45s
      tool-remember / tool-recall
      # Write to persona-scoped notes.jsonl
      PX_NOTE="Obi loves prime numbers" bin/tool-remember
      # Recall recent notes
      bin/tool-recall
      # Output: {"status": "ok", "notes": [...]}
      tool-chat / tool-chat-vixen
      # Jailbroken Ollama chat — GREMLIN persona
      PX_CHAT_TEXT="What do you think about entropy?" bin/tool-chat
      # VIXEN persona
      PX_CHAT_TEXT="Tell me about your old chassis" bin/tool-chat-vixen
      # Both use Ollama on M1.local, think:false

      User Scripts

      px-spark
      # Launch SPARK voice loop (Claude backend)
      bin/px-spark [--dry-run] [--input-mode voice|text]
      px-mind
      # Three-layer cognitive daemon (run as systemd service)
      bin/px-mind [--awareness-interval 60] [--dry-run]
      px-alive
      # Idle-alive daemon — gaze drift, sonar proximity react
      sudo bin/px-alive [--gaze-min 10] [--gaze-max 25] [--dry-run]
      # Yields GPIO on SIGUSR1 for other tools
      px-diagnostics
      # Quick health check
      bin/px-diagnostics --no-motion --short
      px-api-server
      # REST API + web UI on port 8420
      bin/px-api-server [--dry-run]
      # Auth: Bearer token from .env PX_API_TOKEN
      # Web UI: http://pi:8420
      # Public endpoints: /api/v1/public/* (no auth required)
      px-wake-listen
      # Always-on wake word listener + STT
      bin/px-wake-listen [--convo-turns 5]
      # STT priority: SenseVoice → faster-whisper → sherpa-onnx → Vosk
      # Wake word: "hey robot" (PX_WAKE_WORD env var to change)
      # Multi-turn: listens for follow-up after each response
      px-battery-poll
      # Battery monitoring daemon — polls every 30s
      sudo bin/px-battery-poll
      # Writes: state/battery.json
      # Warns at 30/20/15%, emergency shutdown at 10%
      px-frigate-stream
      # Camera RTSP stream via go2rtc
      bin/px-frigate-stream
      # go2rtc exposes rtsp://pi:8554/picar-x
      # Frigate on pi5-hailo pulls the stream (pull model)
      # Writes PID to logs/px-frigate-stream.pid for camera lock
      px-wander
      # Autonomous wander — sonar-guided navigation
      bin/px-wander [--dry-run]
      # Sweeps 5 sonar angles, picks best direction, comments while navigating
      px-post
      # Social posting daemon — watches thoughts, posts qualifying ones
      bin/px-post [--dry-run] [--backfill]
      # Two-pass flush: batch feed writes, then 1 social post per cycle
      # Branded 1080x1080 thought cards via Pillow
      # Privacy filter blocks medical/custody/household content
      # Claude QA gate rejects low-quality thoughts
      # Bluesky: re-auths on 400/401 (expired token)
      # PID-file single-instance guard
      tool-weather
      # Fetch current weather from BOM (Australian Bureau of Meteorology)
      bin/tool-weather
      # Output: {"status": "ok", "weather": {"temp_c": 14.2, "summary": "...", ...}}
      tool-look
      # Pan/tilt camera toward a target
      PX_PAN=30 PX_TILT=10 bin/tool-look
      # Output: {"status": "ok", "pan": 30, "tilt": 10}
      # Yields px-alive GPIO via SIGUSR1 before moving

      // roadmap

      Milestones and future work.

      Foundation (0–1 Month)

      Upgrade diagnostics to log predictive signals
      Extend energy sensing (voltage/temperature)
      Boot health service — captures throttle/voltage at boot
      Ship safety fallbacks: wake-word halt, watchdog heartbeats
      Harden logging paths (FileLock, isolated test fixtures)
      Source control: repo at adrianwedd/spark
      Three-layer cognitive loop (px-mind) with LLM fallback
      SPARK persona + neurodivergent-aware system prompt
      REST API + web UI (px-api-server)
      Frigate camera stream (go2rtc RTSP pull model)
      Battery monitoring — escalating warnings + emergency shutdown
      Live dashboard with mood, thoughts, sparklines (this page)
      Public API — unauthenticated live data endpoints
      Thoughts carousel — real-time inner monologue on home page
      Semantic mood colour palette — pulse circle + favicon
      obi_mode inference (absent/calm/active/possibly-overloaded)
      Social posting — Bluesky (spark.wedd.au) + thought feed (spark.wedd.au/feed/)
      Multi-camera Frigate — 4 cameras with per-room presence detection (Hailo AI)
      PIN session tokens, file-based rate limiting, two-step device confirmation
      Battery glitch filter — time-gapped confirmations, voltage sanity check
      Graceful watchdog — SIGTERM + 5s grace instead of os._exit(1)
      Branded thought card images — 1080x1080 square PNGs with adaptive text
      Two-pass social flush — feed writes decoupled from social rate limits
      DST-aware timezone — ZoneInfo("Australia/Hobart") replaces hardcoded UTC+11
      Per-IP PIN lockout — 1000-IP hard cap, trusted proxy check, file-based persistence
      Atomic writes with fsync — mkstemp + ownership preservation for SD card durability
      Single-instance PID guards on all daemons — prevents double speech on restart
      Subprocess timeout kills orphans — API tool calls terminate child processes
      exploring.json guard — long-running tools prevent px-alive restart mid-operation
      FileLock 10s timeout — prevents indefinite hangs on stuck session locks
      SEO blitz — JSON-LD, canonical URLs, sitemap, OG tags, fediverse verification
      Mood-coloured status dot on all pages — real-time mood from API
      Home Assistant integration — presence, sleep, calendar, routines, media context
      Bluesky image uploads — branded thought cards attached to social posts
      450 tests — comprehensive coverage across API, state, tools, voice, mind
      Gesture-driven stop prototype
      Weekly battery/health summary reports

      Growth (1–3 Months)

      SPARK Phase 2 — transition warnings, routine support
      SPARK Phase 3 — quiet mode, sensory check, dopamine menu
      Calendar integration — Obi's Google Calendar drives obi_mode + expression gating
      Modular sensor fusion and persistent mapping
      Richer voice summaries, mission templates, gesture recognition
      Simulation CI sweeps (Gazebo or lightweight custom sim)
      Predictive maintenance alerts from historical logs

      Visionary (3+ Months)

      Reinforcement learning "dream buffer" and policy sharing
      Autonomous docking, payload auto-detection, multi-car demos
      Central knowledge base syncing maps and logs
      Quantised/accelerated model variants for on-device sustainability