When we launched Tripplet in January, we shipped three models at version 3.0. They were good. But after two months of real-world use — millions of messages, diverse tasks, edge cases we hadn't anticipated — we knew exactly where each model needed to improve. Today, we're releasing the 3.1 generation across all three.
This isn't a rebranding. The improvements are specific, measurable, and meaningful. We'll walk through each model in detail.
Taipei 3.1 — Deeper Reasoning
Taipei is our flagship reasoning model, and the 3.1 upgrade focuses entirely on the quality of extended thinking. We've increased the internal reasoning token budget from 8K to 32K tokens. In practice, this means Taipei can hold more intermediate steps in its working memory, revisit earlier conclusions, and produce more coherent answers to problems that require multi-pass reasoning.
On our internal benchmark suite, which includes complex coding problems, multi-step math proofs, and research synthesis tasks, Taipei 3.1 scores 23% higher than 3.0. For extended thinking specifically, the improvement is 31%. These aren't cherry-picked examples; both figures are median results across 10,000 test cases.
To use Extended Thinking, toggle it on in the model selector. You'll see a brief pause while Taipei works through the problem — this is intentional, and the output quality difference on hard problems is substantial.
Majuli 3.1 — 40% Faster
Majuli is our speed-optimized model, and the 3.1 release makes it meaningfully faster without sacrificing output quality. We achieved this through two changes: speculative decoding and an improved attention mechanism that reduces redundant computation on repetitive phrasing.
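For readers unfamiliar with speculative decoding, here is a minimal sketch of the general idea in its simplest (greedy) form. It is not Majuli's implementation, whose details we haven't described here. The two "models" are hypothetical toy functions: draft_next stands in for a small, fast model and target_next for the large one. The draft proposes a few tokens, the target checks them, and the output always matches what the target alone would have produced.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]

def target_next(context: List[int]) -> int:
    """Hypothetical large model: a deterministic toy rule, not a real network."""
    return (sum(context) * 31 + len(context) * 7) % 50

def draft_next(context: List[int]) -> int:
    """Hypothetical small model: agrees with the target except at every 4th position."""
    guess = target_next(context)
    return (guess + 1) % 50 if len(context) % 4 == 0 else guess

def greedy_decode(next_token: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(next_token(out))
    return out

def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding: draft proposes up to k tokens, target verifies."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # Step 1: the cheap draft model proposes a short run of tokens.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(out))):
            proposal.append(draft(out + proposal))
        # Step 2: the target checks each proposed position. A real system would
        # score all positions in one batched forward pass; this loop is a stand-in.
        for i, tok in enumerate(proposal):
            expected = target(out + proposal[:i])
            if tok != expected:
                # Keep the agreed prefix, then take the target's own token.
                out.extend(proposal[:i])
                out.append(expected)
                break
        else:
            # Every proposed token matched, so accept the whole run.
            out.extend(proposal)
    return out

if __name__ == "__main__":
    prompt = [1, 2, 3]
    print(speculative_decode(target_next, draft_next, prompt, n_new=20))
```

The speedup comes from step 2: when the draft guesses well, several tokens are accepted for roughly the cost of one pass through the large model. Production systems that sample rather than decode greedily typically use a probabilistic acceptance rule instead of the exact-match check shown here.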
Average time to first token dropped from 1.9 seconds to 1.1 seconds. For short responses (under 200 tokens), the full response arrives 40% sooner on average. For longer responses, generation speed matters more than startup time, and it improved as well: 50 tokens per second in 3.1 versus 35 in 3.0.
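The figures above can be combined into a rough end-to-end estimate. The sketch below assumes a deliberately simple model, total time = time to first token + tokens / throughput, and applies the quoted throughput numbers to responses of any length. Both are simplifications of ours, so treat the output as back-of-the-envelope arithmetic rather than measured results.

```python
def response_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Estimated seconds until the full response has arrived (simplified model)."""
    return ttft_s + tokens / tokens_per_s

def seconds_saved(tokens: int) -> float:
    """Estimated time saved by 3.1 over 3.0, using the figures quoted above."""
    old = response_time(1.9, tokens, 35.0)   # Majuli 3.0
    new = response_time(1.1, tokens, 50.0)   # Majuli 3.1
    return old - new

def percent_sooner(tokens: int) -> float:
    """Estimated relative reduction in total response time, in percent."""
    old = response_time(1.9, tokens, 35.0)
    return 100.0 * seconds_saved(tokens) / old

if __name__ == "__main__":
    for n in (50, 200, 1000):
        print(f"{n:>5} tokens: saves {seconds_saved(n):.1f} s "
              f"({percent_sooner(n):.0f}% sooner)")
```

Under this model, the seconds saved grow with response length, because the faster token rate applies to every token generated. Real latency also depends on factors the model ignores, such as network overhead and shorter answers in 3.1, so the estimated percentages will not exactly match the measured averages.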
We also improved Majuli's conciseness. One of the most consistent pieces of feedback we received about 3.0 was that Majuli was verbose, so we trained 3.1 with a stronger brevity signal. Answers are now shorter without losing content.
Suzhou 3.1 — Richer Creative Output
Suzhou is our creative model — the one we recommend for writing, ideation, image prompt generation, and open-ended exploration. The 3.1 improvements here are harder to quantify than reasoning accuracy, but they're equally real.
We fine-tuned Suzhou 3.1 on a curated dataset of high-quality creative outputs, with a particular focus on instruction adherence. When you give Suzhou a specific format (a sonnet, a product brief, a character description), it follows the structure more reliably than 3.0 did. It's also better at decomposing multi-part creative tasks: "write a product landing page with five sections" now reliably produces five sections that are clearly distinct from one another.
Image prompt generation improved as well. Suzhou 3.1 writes prompts with more specific lighting, composition, and style guidance, and those prompts produce higher-quality outputs from our image generation pipeline.
What's next
Version 3.2 is already in development. The primary focus is long-context improvements across all three models, particularly for document analysis and codebase-scale reasoning. We're also working on voice input and a plugin system — both expected later this quarter.
All 3.1 models are live now. If you're an existing user, you're already on 3.1 — we migrated all conversations automatically. No action required.