Tripplet — AI that works the way you think

Most AI responses happen in one pass: the model reads your message and starts generating tokens immediately. Extended Thinking changes this. When enabled, Taipei 3.1 allocates a reasoning budget — a block of tokens it uses to work through the problem before forming its final answer.

Think of it as the model's scratchpad. It can explore dead ends, reconsider assumptions, and verify intermediate steps. The final response benefits from all of this invisible work.

When to use it

Extended Thinking shines on problems that require multi-step reasoning: complex debugging, mathematical proofs, architectural design decisions, research synthesis, or any question where the "obvious" first answer is probably wrong. If you're asking something where you'd expect a smart human to say "let me think about this for a moment," Extended Thinking is a good fit.

It's not useful for simple factual questions, short summaries, or casual conversation. In those cases, it adds latency without improving quality. The model still works correctly without it — Extended Thinking is a tool you deploy deliberately, not a setting you leave on all the time.

The performance trade-off

Extended Thinking increases time to first token by 2-4x, depending on problem complexity. On a task that normally takes 1.1 seconds to start streaming, Extended Thinking might take roughly 2.2 to 4.4 seconds. The full response time is also longer.
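To make the arithmetic concrete, here is a minimal sketch that applies the 2-4x multiplier to a baseline time to first token. The function name and its defaults are purely illustrative and not part of any Tripplet API; real latency depends on the problem.

```python
from typing import Tuple


def estimate_ttft_range(
    baseline_s: float, low_mult: float = 2.0, high_mult: float = 4.0
) -> Tuple[float, float]:
    """Estimate the (low, high) time to first token, in seconds.

    Hypothetical helper: it simply scales the baseline by the 2-4x
    range quoted in the text. It does not measure anything.
    """
    if baseline_s < 0:
        raise ValueError("baseline_s must be non-negative")
    return baseline_s * low_mult, baseline_s * high_mult


# The 1.1-second baseline used in the example above.
low, high = estimate_ttft_range(1.1)
print(f"Estimated time to first token: {low:.1f}s to {high:.1f}s")
```

If you measure your own baseline, you can plug it in to get a rough sense of whether the added wait is acceptable for a given workflow.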

The quality improvement more than justifies this on hard tasks. In our benchmarks, Extended Thinking improves accuracy on complex reasoning problems by 23% on average. For the hardest tasks in our suite (the ones where base Taipei 3.1 scored under 50%), the improvement was 41%.

How to enable it

Open the model selector in any chat and toggle "Extended Thinking" on. You'll see an amber indicator in the input bar while it's active. Extended Thinking is always active in Max mode, which uses Taipei 3.1 by default.

Control over the reasoning token budget via the API is coming soon. Higher budgets will allow deeper reasoning on the most complex problems, at correspondingly higher latency and cost.

Extended Thinking: When and Why to Use It
