Latest insights on approaches to developing reasoning models.
A collaborative research project exploring whether small models, when trained to produce ranked lists that better reflect real clinical reasoning, can outperform frontier models such as Gemini 2.5 Pro.
In recent years, large language models (LLMs) have evolved beyond text-based chatbots into agents capable of executing tools—functions that let them gather new information, interact with external systems, or even take actions that affect the real world.