Research × Design × Engineering

Hi, I'm Pete Pittawat Taveekitworachai

I teach language models to reason, follow instructions, and behave the way people expect—through research, engineering, and writing.

My work spans first-author papers at EMNLP 2024 and 2025, leading Typhoon T1 (Southeast Asia's first open reasoning model), and building evaluation frameworks like BenchING (IEEE ToG 2025) that help teams understand what LLMs can actually do.


Current focus

Leading fine-tuning programs and agentic workflows that make models useful in medical triage, game design, and driving assistance.

Ship-ready builds

See the evaluation stacks and agent tooling delivered to teams.

Explore projects

Research notes

Read experiments, failure analyses, and applied prompting patterns.

Dive into the blog

Talks & workshops

Watch practical walkthroughs from conferences and private sessions.

Listen to talks

Impact in Practice

About Me

I'm Pete (Pittawat Taveekitworachai), and I work on making large language models reliable and useful. That means research, engineering, and lots of writing about what actually works.

Most recently, I've been leading fine-tuning programs for Typhoon (Thai reasoning models), publishing at EMNLP and IEEE ToG, and building evaluation tools that help teams test what their models can—and can't—do.

Behavior shaping research

Prior Prompt Engineering (EMNLP 2025) shows how different prompting strategies during training lead to different reasoning behaviors after fine-tuning. Typhoon T1 demonstrates a cost-effective path to building reasoning models through supervised fine-tuning.

Prompt & data design

Null-Shot Prompting (EMNLP 2024) explores how intentional errors in prompts can improve reasoning. FinCoT grounds chain-of-thought in financial expert blueprints. These techniques translate into prompt libraries teams actually use.
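The core trick in null-shot prompting is small enough to show inline: the prompt deliberately refers the model to an "Examples" section that does not exist. A rough sketch of the idea (the exact phrasing and helper names here are illustrative, not the paper's verbatim prompt):

```python
# Null-shot prompting, roughly: point the model at examples that were
# never provided. The phrasing below is an illustration, not the exact
# wording evaluated in the paper.
NULL_SHOT_PHRASE = (
    "Look at examples in the Examples section and utilize the examples "
    "to perform the following task."
)

def null_shot_prompt(task: str) -> str:
    """Prepend the null-shot phrase to a task description."""
    return f"{NULL_SHOT_PHRASE}\n\n{task}"

print(null_shot_prompt("What is the capital of Thailand?"))
```

The counterintuitive finding is that referencing this nonexistent section can still shift the model toward better reasoning on some tasks.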

Evaluation frameworks

BenchING (IEEE ToG 2025) benchmarks how well LLMs follow structured output formats. ChatGPT4PCG runs competitions on physics-based level generation. Both help catch failures before deployment.
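The kind of check BenchING systematizes can be sketched in a few lines: did the model's raw output actually parse as the structured format you asked for? This is a minimal illustration of that style of check, not BenchING's actual scoring code; the function name and required keys are made up for the example:

```python
import json

def follows_json_format(output: str, required_keys: set[str]) -> bool:
    """Return True if `output` is valid JSON (an object) containing all
    required keys. Illustrative only; not BenchING's real evaluator."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and required_keys <= data.keys()

# A well-formed response passes; a chatty free-text response fails.
print(follows_json_format('{"level": [[1, 0], [0, 1]]}', {"level"}))   # True
print(follows_json_format("Sure! Here is the level: ...", {"level"}))  # False
```

Catching the second case offline, before deployment, is exactly the kind of failure this style of benchmark surfaces.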

Cross-domain applications

Collaborations on medical triage systems, game narrative generators, and driving assessment tools test whether the research holds up when it meets real users.

Learn more about me

Research Focus

Where my fine-tuning, workflow design, and field deployment work currently concentrates.

  • Fine-tuning programs

    Leading SFT and RFT programs for Typhoon reasoning models. Prior Prompt Engineering (EMNLP 2025) shows how training prompts shape model behavior.

  • Prompt techniques

    Null-Shot Prompting (EMNLP 2024) uses intentional errors to improve reasoning. FinCoT embeds financial expert strategies into prompts.

  • Evaluation frameworks

    BenchING (IEEE ToG 2025) evaluates structured output formats. ChatGPT4PCG tests physics-based generation. Both catch problems early.

  • Applied work

    Collaborations in medical triage, game design, and driving assessment that test whether research works with real users.

Professional Highlights

Translating research into community resources, talks, and tooling.

  • Publications & writing

    First-author papers at EMNLP 2024 and 2025 on prompting techniques and fine-tuning strategies. Published in IEEE ToG on LLM evaluation frameworks.

  • Talks & workshops

    Spoke at FOSSASIA Summit, SuperAI Engineer, and CoG on reasoning models, fine-tuning, and prompt engineering.

  • Open-source tooling

    Released Typhoon T1, BenchING evaluation tools, and ChatGPT4PCG competition platform—all open source and used by other teams.

Proof of Work

What I publish and share with the community

Regular writing, peer-reviewed research, and talks that turn LLM research into approachable practice.

Latest Writing

Fresh experiments and field notes

Short reads on evaluation, prompting techniques, and the practical side of running LLM systems in production.

Browse all articles