Most “AI + robotics” posts are still simulations and slides.
This one isn’t.
This is the story of how we wired up a Raspberry Pi CM-5, a Raspberry Pi Pico W (running MicroPython), a small OLED display, a green LED, a camera, and an RTX 5090 into a system where:
- an AI can write and deploy code to a real microcontroller,
- run it, log the result, and iterate like an engineer,
- and use vision and sensors to ground that code in the real world.
It’s “hello world”, but instead of printing a string, the world actually blinks back.
Architecture in Short
- Pico W connected via USB to a CM-5 / Pi 5.
- MicroPython runs on the Pico.
- CM-5 runs:
  - mpremote for direct serial access,
  - a small Flask API (pico_api.py) exposing code execution, file upload/delete, reset, and run logs.
- Any agent or AI can:
- upload new code,
- execute it,
- read stdout/stderr and classify errors,
- store each run as JSON for later reasoning.
The system forms a programmable bridge between intelligence and hardware.
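To make that loop concrete, here is a minimal agent-side sketch of the upload, execute, and review cycle against the API described below. The hostname, port, and the shape of the /pico/runs response are assumptions for illustration only.

```python
# Hypothetical agent-side cycle against the Pico API described in this post.
# Hostname/port and the /pico/runs response shape are assumptions.
import requests

API = "http://cm5.local:8000"  # placeholder address of the CM-5

SNIPPET = (
    "import machine\n"
    "led = machine.Pin(16, machine.Pin.OUT)\n"
    "led.toggle()\n"
    "print('toggled')\n"
)

# 1. Run a snippet on the Pico
run = requests.post(f"{API}/pico/runs/exec_snippet",
                    json={"code": SNIPPET, "timeout": 3.0}).json()

# 2. Read stdout/stderr and classify the result
if run.get("status") == "success":
    print("ok:", run.get("stdout"))
else:
    print("failed:", run.get("stderr"))

# 3. Pull the logged runs so the agent can reason over past attempts
history = requests.get(f"{API}/pico/runs").json()
print("runs logged so far:", len(history))
```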
Hardware Setup
- Raspberry Pi Pico W (or Pico 2 W) connected via USB.
- OLED SSD1306 I²C display:
- SCL → GP15, SDA → GP14
- Green LED → GP16 (with resistor).
- CM-5 / Pi 5 runs Python 3 and Flask.
```bash
sudo apt install python3-venv
python3 -m venv .venv
source .venv/bin/activate
pip install flask mpremote
```
Verify connection:
```bash
mpremote connect auto
>>>  # MicroPython REPL should appear
```
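With the REPL up, a quick I²C bus scan is a cheap sanity check that the OLED is wired correctly (pins as listed above); an SSD1306 normally answers at address 0x3C:

```python
# Paste into the MicroPython REPL: scan the I2C bus used by the OLED
from machine import Pin, SoftI2C

i2c = SoftI2C(scl=Pin(15), sda=Pin(14))
print(i2c.scan())  # an SSD1306 typically shows up as [60] (0x3C)
```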
The Pico API (pico_api.py)
A lightweight Flask server wraps mpremote and gives the AI a safe, auditable way to manipulate the board.
Key Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /pico/runs/exec_snippet | Run a MicroPython snippet |
| POST | /pico/runs/exec_file | Execute an existing file |
| PUT | /pico/files | Upload / overwrite a file |
| GET | /pico/files | List files |
| GET | /pico/files/content | Read a file |
| DELETE | /pico/files | Remove a file |
| POST | /pico/reset | Soft / hard reset |
| GET | /pico/info | Filesystem info |
| GET | /pico/runs | List logged runs |
Every run is logged as a JSON file in ./pico_runs, including code, stdout, stderr, status and duration.
The AI can then analyse those logs to learn what worked, what failed, and why.
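For orientation, here is a heavily trimmed sketch of what the exec_snippet endpoint can look like: wrap mpremote in a subprocess call, capture the output, and write each run to ./pico_runs. This is not the real pico_api.py, which adds error classification and the remaining endpoints.

```python
# Minimal sketch (not the full pico_api.py): run a snippet via mpremote,
# capture stdout/stderr, and persist every run as JSON under ./pico_runs.
import json, subprocess, time, uuid
from pathlib import Path
from flask import Flask, request, jsonify

app = Flask(__name__)
RUNS = Path("./pico_runs")
RUNS.mkdir(exist_ok=True)

@app.post("/pico/runs/exec_snippet")
def exec_snippet():
    body = request.get_json(force=True)
    code = body["code"]
    timeout = float(body.get("timeout", 5.0))
    start = time.monotonic()
    try:
        proc = subprocess.run(
            ["mpremote", "connect", "auto", "exec", code],
            capture_output=True, text=True, timeout=timeout)
        status = "success" if proc.returncode == 0 else "error"
        stdout, stderr = proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        status, stdout, stderr = "timeout", "", "execution exceeded timeout"
    run = {
        "ok": status == "success",
        "status": status,
        "code": code,
        "stdout": stdout,
        "stderr": stderr,
        "duration_ms": round((time.monotonic() - start) * 1000, 1),
    }
    (RUNS / f"{uuid.uuid4().hex}.json").write_text(json.dumps(run, indent=2))
    return jsonify(run)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```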
Hello, LED
```bash
curl -X POST http://cm5.ipaddress:8000/pico/runs/exec_snippet \
  -H "Content-Type: application/json" \
  -d '{
    "code": "import machine, time\nled = machine.Pin(16, machine.Pin.OUT)\nled.value(1)\nprint(\"LED ON\")\ntime.sleep(1)\nled.value(0)\nprint(\"LED OFF\")",
    "timeout": 3.0
  }'
```
Result:
```json
{
  "ok": true,
  "status": "success",
  "stdout": "LED ON\nLED OFF\n",
  "duration_ms": 1012.5
}
```
A real LED just blinked because of a JSON request.
And that tiny event is where machine learning meets machine doing.
Deploying Code as Files
PUT /pico/files uploads a file:
```bash
curl -X PUT http://cm5.ipaddress:8000/pico/files \
  -H "Content-Type: application/json" \
  -d '{
    "path": ":experiments/led_test1.py",
    "content": "from machine import Pin\nimport time\nled = Pin(16, Pin.OUT)\nled.value(1)\ntime.sleep(1)\nled.value(0)\n"
  }'
```
Check on device:
```bash
mpremote connect auto fs ls experiments
```
And execute via:
```python
exec(open("experiments/led_test1.py").read())
```
All remotely managed, fully logged.
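Afterwards, the run history is one GET away (the response shape mirrors the per-run JSON shown earlier):

```bash
curl -s http://cm5.ipaddress:8000/pico/runs | python3 -m json.tool
```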
Demo: LED + OLED “SOS”
```bash
curl -X POST http://cm5.ipaddress:8000/pico/runs/exec_snippet \
  -H "Content-Type: application/json" \
  -d '{
    "code": "from machine import Pin, SoftI2C\nimport ssd1306, time\n\nled = Pin(16, Pin.OUT)\ni2c = SoftI2C(scl=Pin(15), sda=Pin(14))\noled = ssd1306.SSD1306_I2C(128, 64, i2c)\n\n# Morse helpers\ndef dot(): led(1); time.sleep(0.2); led(0); time.sleep(0.2)\ndef dash(): led(1); time.sleep(0.6); led(0); time.sleep(0.2)\n\ndef sos():\n    for ch in ['S','O','S']:\n        (dot,dash)[ch=='O'](); (dot,dash)[ch=='O'](); (dot,dash)[ch=='O'](); time.sleep(0.4)\n\noled.fill(0); oled.text('SOS',45,10); oled.show()\nsos()\noled.fill(0); oled.text('DONE',40,30); oled.show()",
    "timeout": 10.0
  }'
```
Continuous Behavior via main.py
Upload a main.py with a breathing LED and progress bar:
```bash
curl -X PUT http://cm5.ipaddress:8000/pico/files \
  -H "Content-Type: application/json" \
  -d '{"path":":main.py","content":"<...code omitted for brevity...>"}'

curl -X POST http://cm5.ipaddress:8000/pico/reset \
  -H "Content-Type: application/json" -d ''
```
Now the Pico boots autonomously, displaying a “breathing” LED and progress bar — a visual heartbeat.
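The real main.py is omitted above; purely as an illustration of the idea, a minimal version of it (PWM "breathing" on GP16 plus a sweeping OLED progress bar) might look roughly like this:

```python
# Illustrative main.py: "breathing" LED on GP16 plus a simple OLED progress bar.
# Not the exact file from the post; the pin numbers follow the wiring above.
from machine import Pin, PWM, SoftI2C
import ssd1306, time, math

led = PWM(Pin(16))
led.freq(1000)
i2c = SoftI2C(scl=Pin(15), sda=Pin(14))
oled = ssd1306.SSD1306_I2C(128, 64, i2c)

t = 0
while True:
    # Sine-shaped duty cycle gives the "breathing" effect
    level = int((math.sin(t) + 1) / 2 * 65535)
    led.duty_u16(level)

    # Progress bar sweeps across the display once per breath cycle
    width = int((t % (2 * math.pi)) / (2 * math.pi) * 128)
    oled.fill(0)
    oled.text("heartbeat", 25, 10)
    oled.fill_rect(0, 40, width, 10, 1)
    oled.show()

    t += 0.1
    time.sleep(0.02)
```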
Fast vs Deep Perception
Once the physical control layer works, we add sight and meaning.
1. Edge Perception — CM-5 + YOLO
The CM-5 continuously processes its Picam feed using a lightweight YOLO model.
It looks for triggers like a cat, a human, movement, or something new in the room.
Only when an event is interesting enough does it escalate.
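A sketch of that edge filter is below; the model file, the camera API usage, and the escalation endpoint are all illustrative assumptions, not the exact code running on the CM-5.

```python
# Sketch of the edge filter on the CM-5: run a small YOLO model on each frame
# and only escalate frames that contain something interesting.
# Model name, camera setup, and the GPU-box endpoint are assumptions.
import cv2
import requests
from picamera2 import Picamera2
from ultralytics import YOLO

INTERESTING = {"person", "cat", "dog"}
GPU_BOX = "http://gpu-box.local:1234/analyze"  # hypothetical endpoint

model = YOLO("yolov8n.pt")  # lightweight edge model
cam = Picamera2()
cam.start()

while True:
    frame = cam.capture_array()
    results = model(frame, verbose=False)[0]
    labels = {model.names[int(box.cls)] for box in results.boxes}

    if labels & INTERESTING:
        # Promote the frame: hand it to the GPU box for deep scene parsing
        ok, jpeg = cv2.imencode(".jpg", frame)
        if ok:
            requests.post(GPU_BOX, data=jpeg.tobytes(),
                          headers={"Content-Type": "image/jpeg"})
```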
2. Deep Scene Understanding — RTX 5090 + LLM Studio
When a frame is promoted, it’s sent to a GPU box running LLM Studio.
There, a large model performs full scene parsing:
- Who’s in the room, what are they doing?
- What objects, text, or interactions exist?
- What’s changed since the last observation?
The output becomes structured world data — not just pixels, but facts:
```json
{
  "time": "2025-11-07T19:32:00Z",
  "entities": [
    {"type": "person", "count": 2, "actions": ["typing", "talking"]},
    {"type": "cat", "count": 1, "action": "sleeping"}
  ],
  "objects": ["desk", "keyboard", "monitor", "coffee_cup"],
  "summary": "Two people working at a desk with a cat nearby."
}
```
This world model lives as a JSON state the AI can query, reason over, and extend.
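One simple way to maintain that state is a small merge function that folds each new observation into a JSON file on disk; the file name, layout, and history length here are illustrative choices, not the post's actual schema.

```python
# Keep a rolling world-model file that each GPU analysis is merged into.
# File layout and history length are illustrative assumptions.
import json, time
from pathlib import Path

WORLD = Path("world_model.json")

def update_world_model(observation: dict) -> dict:
    """Merge one structured observation (like the JSON above) into the state."""
    state = json.loads(WORLD.read_text()) if WORLD.exists() else {"history": []}
    state["last_seen"] = observation.get(
        "time", time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()))
    state["entities"] = observation.get("entities", state.get("entities", []))
    state["objects"] = sorted(
        set(state.get("objects", [])) | set(observation.get("objects", [])))
    state["history"] = (state["history"] + [observation.get("summary", "")])[-50:]
    WORLD.write_text(json.dumps(state, indent=2))
    return state
```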
The CM-5 thus acts as a bouncer, and the RTX 5090 as the scribe and philosopher.
Ephemeral Micro-Apps on the Pico
In traditional robotics you flash one giant firmware with all libraries baked in.
In this design, the Pico becomes a throw-away runtime.
The AGX Orin (the planner) can command the CM-5:
“Deploy this 3 KB script to the Pico, connect to that Wi-Fi,
read those sensors, send data back, then self-wipe.”
Each mission uses a new, minimal piece of code:
- No heavy motor or display libraries unless required.
- Lower memory footprint.
- Easier reasoning for the AI (“I only need code for this goal”).
- Higher security (no permanent broad-capability firmware).
It’s like micro-containers for embedded devices — single-purpose, temporary, and fully auditable.
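Using only the endpoints from the table above, one mission cycle reduces to deploy, run, wipe. The hostname and script content are placeholders, and the DELETE payload is assumed to mirror the PUT format.

```bash
# 1. Deploy a single-purpose mission script
curl -X PUT http://cm5.ipaddress:8000/pico/files \
  -H "Content-Type: application/json" \
  -d '{"path":":mission.py","content":"<mission-specific MicroPython>"}'

# 2. Run it and let the API log stdout/stderr for later analysis
curl -X POST http://cm5.ipaddress:8000/pico/runs/exec_snippet \
  -H "Content-Type: application/json" \
  -d '{"code":"exec(open(\"mission.py\").read())","timeout":30.0}'

# 3. Self-wipe: remove the script once the data is back
curl -X DELETE http://cm5.ipaddress:8000/pico/files \
  -H "Content-Type: application/json" \
  -d '{"path":":mission.py"}'
```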
Autonomous Coordination
The full system ties together like this:
| Role | Hardware | Responsibility |
|---|---|---|
| Vision / Perception | CM-5 + Picam | Detect motion, cats, people; decide when to escalate |
| Deep Analysis | RTX 5090 LLM Studio | Parse images in detail, update world model |
| Planning / Orchestration | AGX Orin | Decide where to go, which sensors to use, generate new Pico code |
| Execution / Interface | Pico W nodes | Run short-lived code snippets for local sensing or actuation |
Even if the CM-5 is 100 km away, the Orin can drive it over VPN/Wi-Fi,
and through it, deploy fresh code to local Pico nodes — dynamically exploring new networks or environments.
Over time the AI learns:
- which pins map to which hardware,
- which networks are reachable,
- what each sensor measures,
- and how to optimise its own code for speed, noise or accuracy.
It’s a self-organising embedded ecosystem.
Why This Matters
This isn’t just about blinking LEDs; it’s about building a living, programmable feedback loop between thought and world.
- The AI writes code (agency).
- The hardware executes it (embodiment).
- Sensors and cameras respond (perception).
- Logs and analysis refine the model (learning).
And the entire chain is open, inspectable, and under your control.
Every new node, every run, every image adds to its understanding of reality until you no longer have a static device, but a small, distributed intelligence
that can see, act, reflect, and adapt.
Next up: Part II – World Ingestion and Semantic Mapping,
where we’ll show how the AI merges vision, sensor streams and logs into one evolving world model.
I know, you want more code and to see it in action. If I have time, I will clean up the GitHub repo. But as soon as something works, I change everything and add something new. Everything is always a work in progress…