This week Anthropic shipped Claude Opus 4.8. Earlier this month OpenAI quietly made GPT-5.5 Instant the new default in ChatGPT, replacing GPT-5.3. At I/O, Google announced Gemini 3.5 Flash, Gemini Omni, and a new persistent agent called Spark. If you built workflows that depend on a specific model's behavior, the floor is moving under your feet, and not on a schedule you control.

The "default model" surprise

Most SMBs don't pin a model version. They use whatever's default in the app they pay for. When the provider swaps defaults (and they do, several times a year, sometimes without a loud announcement), your prompts produce slightly different output. Often better. Occasionally weirdly worse. And the change is almost never communicated clearly enough for your team to notice before something downstream breaks.

Where this actually breaks SMBs

Three places, in order of frequency. First: over-tuned prompts. You spent two hours crafting the perfect instruction for a recurring task, and the new model's interpretation is just different enough that the output structure drifts. Second: automation chains. The model's output feeds a script or another tool, and the downstream parsing assumes specific wording the new model no longer uses. Third: client-facing chatbots. Tone drifts on something you reviewed and approved three months ago, and you only notice when a customer points it out.
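For the automation-chain case, one way to reduce breakage is to ask the model for structured output and have the downstream script check the structure rather than match exact wording. The sketch below is a minimal illustration under that assumption; the field names in REQUIRED_KEYS and the function name parse_model_output are hypothetical, not part of any provider's API.

```python
import json

# Hypothetical fields for illustration; use whatever your workflow needs.
REQUIRED_KEYS = {"customer", "amount"}


def parse_model_output(text: str) -> dict:
    """Extract a JSON object from a model reply and check its keys.

    Takes everything from the first "{" to the last "}", so extra
    chatter before or after the JSON does not break the parse.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")

    # json.JSONDecodeError is a subclass of ValueError, so callers
    # only need to catch ValueError.
    data = json.loads(text[start : end + 1])

    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"model output is missing keys: {sorted(missing)}")
    return data
```

A script built this way fails loudly with a ValueError when the output shape changes, instead of silently passing mangled data to the next step.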

How to hedge without becoming an engineering team

Four habits, none of which require hiring a developer. One: re-test your critical workflows once a month by running the same inputs you used when you first built them and comparing the results. Two: write prompts that describe what you want, not prompts that reverse-engineer a specific model's quirks. Three: if you already call the API directly, pin a specific model version in the call rather than a floating alias such as "latest." Four: keep a short "regression suite" of ten or twenty prompts that you re-run whenever a big release lands. That's the entire program.
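If someone on the team is comfortable with a little scripting, the pinning and regression-suite habits could be sketched roughly as follows. This is a minimal sketch, not a finished tool: PINNED_MODEL is a made-up identifier, call_model stands in for whatever function actually sends a prompt to your provider, and the 0.8 similarity threshold is an arbitrary starting point you would tune.

```python
import difflib
from typing import Callable

# Hypothetical pinned identifier; substitute the dated version
# string your provider documents instead of a floating alias.
PINNED_MODEL = "example-model-2025-01-01"


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def run_suite(
    cases: list[dict],
    call_model: Callable[[str, str], str],
    threshold: float = 0.8,
) -> list[dict]:
    """Re-run each saved prompt and return the cases whose output
    drifted below the similarity threshold versus the baseline."""
    drifted = []
    for case in cases:
        # call_model(model_name, prompt) is supplied by the caller.
        output = call_model(PINNED_MODEL, case["prompt"])
        score = similarity(case["baseline"], output)
        if score < threshold:
            drifted.append(
                {"name": case["name"], "score": round(score, 2), "output": output}
            )
    return drifted
```

Each case is a dictionary with a name, the prompt, and the baseline output you approved when you built the workflow. Character-level similarity is a crude signal, so treat a flagged case as a prompt for a human to look, not as proof that something is broken.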

The honest caveat

Fighting the upgrade cycle isn't free. Pinning to an older model means you eventually pay for legacy access, miss real improvements that the new versions deliver, and end up on a deprecated path anyway when the provider sunsets it. The point isn't to freeze in place — it's to know what's changing and why, so you can move on your timing instead of theirs.

Your next step

List the three or four AI-touched workflows you'd notice if they got worse tomorrow. Write down what each one is supposed to produce. That short list is your monthly re-test. Every SMB running AI in production should have one. Almost none of them do.