

WTF is model routing?

Model routing means choosing the right AI model for each part of the task.

Not the biggest model every time. The right model for the job.

You open an AI tool.

You pick the strongest model.

Then you use it for everything.

Summarizing, formatting, fixing tiny text.

It feels safe because the best model is probably the smartest.

But it is also a bad habit.

The smartest model is usually slower, more expensive, and easier to waste on low-value work.

This matters more now because AI pricing is changing fast.

The Claude Fable 5 launch shows the same pattern.

A top model may be great for planning, debugging, and hard decisions.

But if it is expensive, you do not want it summarizing logs, formatting notes, or writing repetitive boilerplate all day.

Use the best model where judgment matters.

Use cheaper models where volume matters.

I am Alex. Welcome to ShortCu8 by Innov8.

Let's dive deep 🐰

Today's Shortcut

Think of AI work in three layers: raw material, execution, and judgment.

Most people send all three layers to one model.

That is where the waste happens.

You paste a long transcript into the best model.

It summarizes, finds ideas, writes, rewrites, and reviews.

The cleaner workflow: cheap model prepares, mid model builds, top model judges.
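
A rough way to see the waste is to price one job both ways. The sketch below is Python with made-up numbers: the per-million-token prices and the token counts per layer are placeholders, not real pricing for any model. Swap in your provider's actual figures.

```python
# Hypothetical prices in dollars per million tokens (placeholders, not real pricing).
PRICE_PER_MILLION = {"cheap": 0.5, "mid": 5.0, "top": 50.0}

# Hypothetical token volume for each layer of one job.
TOKENS_BY_LAYER = {"raw material": 200_000, "execution": 50_000, "judgment": 10_000}

# The routed workflow: each layer goes to the tier that fits it.
ROUTING = {"raw material": "cheap", "execution": "mid", "judgment": "top"}


def cost(tokens: int, tier: str) -> float:
    """Dollar cost of sending `tokens` tokens to a model in `tier`."""
    return tokens / 1_000_000 * PRICE_PER_MILLION[tier]


def one_model_cost(tier: str = "top") -> float:
    """Cost when every layer goes to the same tier."""
    return sum(cost(tokens, tier) for tokens in TOKENS_BY_LAYER.values())


def routed_cost() -> float:
    """Cost when each layer goes to its assigned tier."""
    return sum(cost(tokens, ROUTING[layer]) for layer, tokens in TOKENS_BY_LAYER.items())


if __name__ == "__main__":
    print(f"everything on top model: ${one_model_cost():.2f}")
    print(f"routed by layer:         ${routed_cost():.2f}")
```

With these placeholder numbers the single-model run comes to about $13.00 and the routed run to about $0.85. The exact figures mean nothing. The point is that the bulky raw-material layer dominates the bill when it goes to the top tier.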

The model map

1. Cheap models: use them for volume

Cheap models are for repetitive, messy, or large tasks where the answer is easy to check.

Current examples:

  • DeepSeek V4 Flash

  • Gemini 3.1 Flash-Lite

  • GPT-5.4 mini

  • Claude Haiku 4.5

Give these models transcript cleanup, extraction, duplicate removal, simple tables, classification, and first-pass summaries.

Example prompt:

Extract the useful ideas, repeated points, strong examples, and confusing parts from this transcript. Do not write the final article.

Do not let the cheap model finish the whole work. Let it prepare the material.

2. Mid models: use them to build

Mid models are for normal execution: the direction is clear, but the output still needs skill.

Current examples:

  • Claude Opus 4.8

  • Claude Sonnet 4.6

  • GPT-5.5

  • Gemini 3.5 Flash

  • DeepSeek V4 Pro

Use them for first drafts, normal coding, rewriting sections, UI components, tests, obvious bug fixes, and turning a plan into usable output.

Example prompt:

Use this plan and write the first version. Stay close to the structure. Do not add new sections.

This is where most of the typing should happen.

3. Top models: use them for judgment

Top models are for decisions: taste, planning, hard reasoning, and risk checking.

Current examples:

  • Claude Fable 5

  • GPT-5.5 Pro

  • Claude Mythos 5, if you have approved access

This is where Fable 5 fits. Anthropic lists it as its most capable widely released model, with higher pricing than Opus 4.8, Sonnet 4.6, and Haiku 4.5.

Do not make Fable 5 clean transcripts or write every line. Use it to decide the plan, choose tradeoffs, catch weak logic, and review the final result.

That workflow looks like this:

Fable 5 plans.

Opus, Sonnet, GPT, Gemini, Codex, or another cheaper setup executes.

Fable 5 reviews.

Example prompt:

Review this plan. What is weak? What should be removed? What should a cheaper model execute first? What could go wrong?

This is where the expensive model earns its money: not by formatting bullet points, but by making better decisions.
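
The model map above can be written down as a small lookup table. This is a minimal Python sketch, not a real router: the task labels are taken loosely from the lists in this issue, and the `pick_tier` helper and its fallback to the mid tier are assumptions made for illustration.

```python
# Task-to-tier map based on the task lists in this issue.
# The labels and the fallback rule are illustrative assumptions.
TIER_BY_TASK = {
    "transcript cleanup": "cheap",
    "extraction": "cheap",
    "classification": "cheap",
    "first-pass summary": "cheap",
    "first draft": "mid",
    "normal coding": "mid",
    "tests": "mid",
    "obvious bug fix": "mid",
    "planning": "top",
    "tradeoff decision": "top",
    "final review": "top",
}


def pick_tier(task: str, default: str = "mid") -> str:
    """Return the tier for a task label; unknown tasks fall back to `default`."""
    return TIER_BY_TASK.get(task.strip().lower(), default)


if __name__ == "__main__":
    for task in ["Extraction", "first draft", "final review", "something new"]:
        print(task, "->", pick_tier(task))
```

The table routes by job type, not by price. A task nobody has classified yet goes to the mid tier here, which is one possible default and easy to change.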

A simple workflow

Step 1: give the messy input to a cheap model

Use this for a transcript, PDF, research dump, bug report, meeting notes, or long chat.

Prompt:

Clean this into useful notes. Keep the important examples, facts, questions, and repeated ideas. Do not write the final output.

Step 2: give the cleaned notes to the top model

Prompt:

Create the plan. What should the final output do? What should we avoid? What is the best structure?

Step 3: give the plan to a mid model

Prompt:

Execute this plan. Keep the structure. Write the first complete version.

Step 4: give the result back to the top model

Prompt:

Review this like a strict editor. Find weak logic, missing context, wrong claims, generic lines, and anything that should be cut.

That is model routing in plain English: cheap model prepares, top model plans, mid model builds, top model reviews.
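
The four steps above can be sketched as one small Python function. The prompts are the ones from the steps. Everything else is an assumption: `call_model` is a hypothetical function you would wire to your own provider, and `echo_model` is only a stand-in so the sketch runs without an API key.

```python
from typing import Callable, Dict

# Any function that takes (tier, prompt) and returns the model's text.
# Hypothetical interface: wire it to whatever provider client you use.
ModelCall = Callable[[str, str], str]

CLEAN_PROMPT = (
    "Clean this into useful notes. Keep the important examples, facts, "
    "questions, and repeated ideas. Do not write the final output."
)
PLAN_PROMPT = (
    "Create the plan. What should the final output do? "
    "What should we avoid? What is the best structure?"
)
BUILD_PROMPT = "Execute this plan. Keep the structure. Write the first complete version."
REVIEW_PROMPT = (
    "Review this like a strict editor. Find weak logic, missing context, "
    "wrong claims, generic lines, and anything that should be cut."
)


def run_pipeline(raw_input: str, call_model: ModelCall) -> Dict[str, str]:
    """Run the four steps in order and return every intermediate result."""
    notes = call_model("cheap", f"{CLEAN_PROMPT}\n\n{raw_input}")  # Step 1
    plan = call_model("top", f"{PLAN_PROMPT}\n\n{notes}")          # Step 2
    draft = call_model("mid", f"{BUILD_PROMPT}\n\n{plan}")         # Step 3
    review = call_model("top", f"{REVIEW_PROMPT}\n\n{draft}")      # Step 4
    return {"notes": notes, "plan": plan, "draft": draft, "review": review}


def echo_model(tier: str, prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs without an API key."""
    return f"[{tier} model output]"


if __name__ == "__main__":
    for step, text in run_pipeline("messy meeting notes", echo_model).items():
        print(step, "->", text)
```

Keeping every intermediate result makes it easy to check the cheap model's notes before the expensive steps run.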

Why this is real

A paper called Switchcraft tested model routing for AI agents that call tools. Its router matched or beat the best single model's accuracy while cutting inference cost by 84%.

The paper also found that bigger models are not always better for every small tool-use task.

So do not route only by price.

Route by job.

Now go create something great.

🛠️ Cool Tools of the Week:

  • Ramp Applied AI Solutions: The fintech firm is offering agentic tools for complex financial workflows.

  • DiffusionGemma: A 26-billion-parameter open model from Google with four times faster text generation.

  • Perplexity Computer: Claude Fable 5 is available as an orchestrator model for Perplexity Pro and Max subscribers. 

  • Backplanes Spotlight: This tool reads your Claude Code and Codex sessions and offers session reports to improve your code.

📩 How was today's Shortcu8? 👇️

We read every reply. Just reply to this email and let us know how we can improve!

See you in the next Shortcu8. Bye… 👋

