Profile
Self-improving robotics through fine-tuning of Vision-Language-Action (VLA) models. Applies the Hermes learning loop to physical robot control. Built as a Nous Hackathon project.
Signals
Listed in the awesome-hermes-agent README
Sources: 2 / Surfaces: 1
What the upstream surface says
Short excerpt only, so you can decide whether to click out.
"Any robot owner can fine-tune a state-of-the-art VLA by talking to their agent. No ML expertise needed."
Hermes Embodied turns Hermes Agent into a self-improving robotics trainer. It adds three Hermes skills that close the loop between robot execution, training data collection, and model improvement — all orchestrated through natural language.
The same self-improvement loop that Hermes uses to get better at coding tasks (via Tinker-Atropos RL) now extends to physical robot control via Vision-Language-Action models.
- Deploy — Hermes loads a VLA checkpoint and runs it in sim (or on hardware)
- Collect — Every rollout is recorded as a LeRobot trajectory (state, action, camera, reward)
- Curate — Hermes filters successful trajectories (reward > threshold)
- Train — Provisions a GPU on Vast.ai and fine-tunes SmolVLA on the new data
- Evaluate — Runs open-loop eval comparing new checkpoint vs. old
- Promote — If new model is better, it becomes the active policy
- Repeat — Scheduled via Hermes cron, runs autonomously
- Example request: "Spin up an A100 for training" → finds the cheapest A100, creates an instance, and returns SSH access
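The Curate and Promote steps of the loop above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: the `Trajectory` class, the `curate`, `promote`, and `run_cycle` functions, and the `train` and `score` callables are all hypothetical names, and the sketch assumes a scalar reward per rollout and an evaluation score where higher is better.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Trajectory:
    """Hypothetical stand-in for one recorded rollout (not the LeRobot schema)."""
    episode_id: int
    reward: float


def curate(trajectories: Sequence[Trajectory], threshold: float) -> List[Trajectory]:
    """Curate step: keep only rollouts whose reward exceeds the threshold."""
    return [t for t in trajectories if t.reward > threshold]


def promote(active: str, candidate: str, score: Callable[[str], float]) -> str:
    """Promote step: the candidate replaces the active checkpoint only if it
    scores strictly higher (assumed convention: higher score is better)."""
    return candidate if score(candidate) > score(active) else active


def run_cycle(
    active: str,
    rollouts: Sequence[Trajectory],
    threshold: float,
    train: Callable[[str, List[Trajectory]], str],
    score: Callable[[str], float],
) -> str:
    """One pass of curate -> train -> evaluate -> promote.

    `train` and `score` are injected placeholders for the fine-tuning job
    and the open-loop evaluation; both are assumptions of this sketch.
    """
    data = curate(rollouts, threshold)
    if not data:
        # Nothing passed the filter, so keep the current policy unchanged.
        return active
    candidate = train(active, data)
    return promote(active, candidate, score)
```

Passing `train` and `score` in as callables keeps the control flow testable without a GPU or a simulator; in a real deployment they would wrap the remote fine-tuning job and the evaluation run.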