Domain Applications

hermes-embodied

Self-improving robotics via VLA model fine-tuning. Applies the Hermes learning loop to physical robot control. Nous Hackathon project.

Why it matters

Profile


setup: medium · integration: low · interface: cli
Provenance

Signals

Listed in the awesome-hermes-agent README

Sources: 2 / Surfaces: 1

Fast skim

What the upstream surface says

Short excerpt only, so you can decide whether to click out.

"Any robot owner can fine-tune a state-of-the-art VLA by talking to their agent. No ML expertise needed."

Hermes Embodied turns Hermes Agent into a self-improving robotics trainer. It adds three Hermes skills that close the loop between robot execution, training data collection, and model improvement — all orchestrated through natural language.

The same self-improvement loop that Hermes uses to get better at coding tasks (via Tinker-Atropos RL) now extends to physical robot control via Vision-Language-Action models.

Hermes Embodied: Self-Improving Robotics via Hermes Agent · What Is This? · Architecture · The Self-Improvement Loop · Skills · vast-gpu: Cloud GPU Infrastructure · vla-trainer: VLA Fine-Tuning Pipeline · robot-loop: Continuous Improvement
  • Deploy — Hermes loads a VLA checkpoint and runs it in sim (or on hardware)
  • Collect — Every rollout is recorded as a LeRobot trajectory (state, action, camera, reward)
  • Curate — Hermes filters successful trajectories (reward > threshold)
  • Train — Provisions a GPU on Vast.ai and fine-tunes SmolVLA on the new data
  • Evaluate — Runs open-loop eval comparing new checkpoint vs. old
  • Promote — If new model is better, it becomes the active policy
  • Repeat — Scheduled via Hermes cron, runs autonomously
  • vast-gpu example: "Spin up an A100 for training" → finds the cheapest A100, creates an instance, and returns SSH access
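
The Deploy-to-Repeat steps listed above can be read as a single control loop. The following is a minimal Python sketch of one cycle, not the project's actual code: `Trajectory`, `curate`, `improvement_cycle`, and the `collect`, `train`, and `evaluate` callables are hypothetical names introduced here for illustration, and the default threshold of 0.5 is an assumption. Real rollouts, Vast.ai provisioning, and SmolVLA fine-tuning would sit behind those callables.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Trajectory:
    """One recorded rollout. Illustrative fields only, not the LeRobot schema."""
    steps: int
    reward: float


def curate(trajectories: List[Trajectory], reward_threshold: float) -> List[Trajectory]:
    """Curate: keep only trajectories whose reward exceeds the threshold."""
    return [t for t in trajectories if t.reward > reward_threshold]


def improvement_cycle(
    active_policy: str,
    collect: Callable[[str], List[Trajectory]],
    train: Callable[[str, List[Trajectory]], str],
    evaluate: Callable[[str], float],
    reward_threshold: float = 0.5,  # assumed value, not taken from the project
) -> str:
    """Run one cycle and return the policy that should be active afterwards."""
    # Deploy + Collect: run the active policy and record its rollouts.
    rollouts = collect(active_policy)

    # Curate: filter to successful trajectories.
    good = curate(rollouts, reward_threshold)
    if not good:
        # Nothing worth training on, so the active policy is unchanged.
        return active_policy

    # Train: fine-tune a candidate checkpoint on the curated data.
    candidate = train(active_policy, good)

    # Evaluate + Promote: switch only if the candidate scores strictly higher.
    if evaluate(candidate) > evaluate(active_policy):
        return candidate
    return active_policy
```

The Repeat step would amount to calling `improvement_cycle` on a schedule and feeding its return value back in as `active_policy`. Promoting only on a strict improvement keeps the current checkpoint whenever the evaluation is a tie.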