Official Session Summary
Pulled from the live conference page.
Given the same compute budget, does a single frontier model outperform a system of specialized models? Our research says no. We trained three task-specific models for the subtasks, and under the same budget the multi-model system wins: every frontier model we pair it with hits #1 on SWE-Bench Pro, 15% cheaper and 28% faster than running alone, with just WarpGrep. As frontier models saturate tasks, those tasks should move to smaller models with custom inference engines. The expensive model reasons; the cheap models do the mechanical work. This talk covers the CUDA kernels, RL training, and speculative decoding behind that split, and why it's the natural way intelligence organizes under compute constraints.
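The expensive-model/cheap-model split the abstract describes is the core idea behind speculative decoding: a cheap draft model proposes several tokens, and the expensive target model only verifies them. A minimal greedy sketch, with toy `target`/`draft` functions invented purely for illustration (the talk's actual inference engine is not public):

```python
from typing import Callable, List

Token = str
Model = Callable[[List[Token]], Token]  # greedy next-token predictor

def speculative_decode(target: Model, draft: Model,
                       prompt: List[Token], k: int, max_new: int) -> List[Token]:
    """Cheap draft model speculates k tokens; expensive target model
    verifies them (a single batched pass in a real engine). The longest
    agreeing prefix is accepted, plus one corrected target token, so the
    output is identical to pure target-model greedy decoding."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # 1) Draft speculates k tokens ahead of the current context.
        ctx, spec = list(out), []
        for _ in range(k):
            spec.append(draft(ctx))
            ctx.append(spec[-1])
        # 2) Target checks each speculated position; stop at first mismatch.
        n_ok = 0
        for i in range(k):
            if target(out + spec[:i]) == spec[i]:
                n_ok += 1
            else:
                break
        out += spec[:n_ok]
        if len(out) - len(prompt) >= max_new:
            break
        # 3) Target emits one token itself, guaranteeing progress.
        out.append(target(out))
    return out[len(prompt):][:max_new]

# Toy models: the "frontier" target cycles a fixed pattern; the draft
# agrees with it except at every third context length.
pattern = ["the", "quick", "brown", "fox", "jumps"]
def target(ctx: List[Token]) -> Token:
    return pattern[len(ctx) % len(pattern)]
def draft(ctx: List[Token]) -> Token:
    return "???" if len(ctx) % 3 == 0 else pattern[len(ctx) % len(pattern)]

result = speculative_decode(target, draft, ["<s>"], k=3, max_new=8)
# result matches pure target-model greedy decoding exactly.
```

The verification step is why the target model's cost drops: when the draft is usually right, the target does one check per batch of k tokens instead of one full forward pass per token, without changing the output.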
Speaker Background
Quick context on the person or people on stage.
Founder of Morph LLM, exploring specialized model systems and what multi-model architectures can do better than a single generalist.
Why This Slot Matters
A compact framing layer for navigating the conference.
This is one of the more substantive, abstract-backed sessions on the schedule; worth a read when you need enough context to decide whether to stay in the room.