How to add LLM features to an existing product without breaking it

Most teams do not need an AI product. They need one or two AI features inside the product they already have — and they need them to ship without destabilising everything around them. That constraint changes how you build.

Start with a narrow, boring use case

The best first LLM feature is one where a wrong answer is cheap and a right answer saves real time: drafting a reply, summarising a thread, classifying an incoming ticket. Avoid anything that writes to the database or moves money on the first iteration.

Treat the model as an unreliable third-party API

It will be slow sometimes, wrong sometimes, and down sometimes. Design for that from day one:

Wrap every call with a timeout and a deterministic fallback path.
Validate the output against a schema before it touches your app — never trust free text.
Log every prompt, response and cost so you can debug failures and price the feature later.
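As a minimal sketch of those three rules, assuming a ticket-classification feature: the names classify_ticket, parse_label and ALLOWED_LABELS are hypothetical, and call_model stands in for whatever client function your provider gives you (here it only needs to take a prompt string and return a string). Token cost is not visible through that generic interface, so the sketch logs latency and the raw text; you would add real usage figures from your provider's response.

```python
import json
import logging
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

logger = logging.getLogger("llm")
_pool = ThreadPoolExecutor(max_workers=4)

# Hypothetical schema: the model must reply with {"label": <one of these>}.
ALLOWED_LABELS = {"billing", "bug", "other"}
FALLBACK_LABEL = "other"  # deterministic answer when the model cannot be used


def parse_label(raw: str) -> str:
    """Return the label if the output fits the schema, else raise ValueError."""
    data = json.loads(raw)  # malformed JSON raises a ValueError subclass
    label = data.get("label") if isinstance(data, dict) else None
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"output does not match schema: {raw!r}")
    return label


def classify_ticket(
    text: str, call_model: Callable[[str], str], timeout_s: float = 5.0
) -> dict:
    """Call the model with a timeout; fall back on any timeout, error or bad output."""
    prompt = (
        f"Classify this ticket as one of {sorted(ALLOWED_LABELS)}. "
        f'Reply with JSON like {{"label": "bug"}}.\n\n{text}'
    )
    started = time.monotonic()
    raw = None
    try:
        # result() stops waiting after timeout_s. It does not cancel the worker
        # thread, so also set a request timeout on the client if it offers one.
        raw = _pool.submit(call_model, prompt).result(timeout=timeout_s)
        result = {"label": parse_label(raw), "source": "model"}
    except Exception as exc:  # timeout, provider error, or schema failure
        logger.warning("llm fallback: %s: %s", type(exc).__name__, exc)
        result = {"label": FALLBACK_LABEL, "source": "fallback"}
    latency_ms = int((time.monotonic() - started) * 1000)
    logger.info(
        "llm call prompt=%r response=%r result=%s latency_ms=%d",
        prompt, raw, result, latency_ms,
    )
    return result
```

The caller always gets a dict with a valid label, and the "source" field tells you afterwards how often the fallback fired.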

Ship evals before you ship features

A small set of example inputs with expected outputs turns "it feels better" into a number. Run them on every prompt change. Without evals you are guessing rather than engineering, and regressions will slip through unnoticed.
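An eval harness can start very small. The sketch below is illustrative only: the cases, the run_evals name and the 0.9 threshold are assumptions, and classify is whatever function wraps your feature (text in, label out).

```python
from typing import Callable, Sequence, Tuple

# Hypothetical eval cases: (input text, expected label).
EVAL_CASES = [
    ("I was charged twice this month", "billing"),
    ("The export button crashes the app", "bug"),
    ("Do you have a podcast?", "other"),
]

MIN_PASS_RATE = 0.9  # assumed threshold; pick one that fits your feature


def run_evals(
    classify: Callable[[str], str],
    cases: Sequence[Tuple[str, str]] = EVAL_CASES,
) -> float:
    """Return the fraction of cases where the output equals the expected label."""
    passed = 0
    for text, expected in cases:
        got = classify(text)
        if got == expected:
            passed += 1
        else:
            print(f"FAIL {text!r}: expected {expected!r}, got {got!r}")
    return passed / len(cases)
```

Run it in CI on every prompt change and fail the build when run_evals(...) drops below MIN_PASS_RATE. Exact-match scoring suits classification; drafting or summarising features would need a different scoring function.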

Where we start with clients

We usually spend the first week mapping one high-value workflow, wiring a single guarded LLM call behind a feature flag, and standing up the eval harness. Boring, measurable, reversible — then we expand. If that sounds like the pace you need, get in touch.