| raw | raw/so-shipping-ai-apps-is-hard.md |
|---|---|
| url |
TL;DR: Shipping an AI assistant is not just prompt work. Chris Raroque’s Ellie agent hit the real production problems: token cost, abuse limits, model routing, framework choice, mobile form factor, and how a focused app can beat a general assistant on workflow friction.
Biggest lessons
- Track cost before launch. Chris's own usage cost more than $30/month in AI spend against a $10/month app price. The first fixes were shrinking an 8,000-token system prompt and limiting conversation history to the most recent useful window of messages.
- Abuse prevention is product infrastructure. Message-size caps, per-user daily/monthly limits, a remote kill switch, and per-user token/cost analytics keep one user from turning a feature into an unbounded bill.
- Do not rebuild the obvious plumbing forever. The Vercel AI SDK replaced custom streaming, tool-calling, retry, and conversation-state code with a cleaner and more reliable implementation.
- Plan for multiple models. Different tasks worked better on different models. Chris used a cheap model-routing layer that sends requests to smaller, faster models by default and escalates to larger or specialized models only when needed.
- Form factor changes the product. The assistant became more useful on mobile because the natural behavior was dictating quick commands on the go.
- Focused AI apps can beat general chat. Ellie could skip many confirmation steps because it controlled a narrow workflow; that made it feel faster than using a general assistant with broad safety prompts.
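The cost and abuse lessons above can be combined into a single pre-flight check that runs before any model call. The following is a minimal sketch, not Ellie's actual implementation: `GuardConfig`, `checkRequest`, the reason strings, and the four-characters-per-token estimate are all hypothetical, and the source gives no concrete limit values.

```typescript
// Hypothetical guard settings; the source does not publish real numbers.
interface GuardConfig {
  killSwitchOn: boolean;   // remote flag, e.g. read from a config service
  maxMessageChars: number; // message-size cap
  maxDailyTokens: number;  // per-user daily token budget
  historyWindow: number;   // how many recent messages to send to the model
}

interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

type GuardResult =
  | { allowed: true; history: ChatMessage[] }
  | { allowed: false; reason: string };

// Rough estimate (about 4 characters per token). This is an assumption,
// not a real tokenizer; use the provider's reported usage for billing.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function checkRequest(
  config: GuardConfig,
  tokensUsedToday: number,
  pastMessages: ChatMessage[],
  incoming: string,
): GuardResult {
  // 1. The kill switch wins over everything else.
  if (config.killSwitchOn) {
    return { allowed: false, reason: "kill_switch" };
  }
  // 2. Reject oversized messages before estimating anything.
  if (incoming.length > config.maxMessageChars) {
    return { allowed: false, reason: "message_too_long" };
  }
  // 3. Keep only the most recent messages (slice(-0) would keep all, so guard it).
  const trimmed =
    config.historyWindow > 0 ? pastMessages.slice(-config.historyWindow) : [];
  // 4. Project today's usage including this request and its trimmed history.
  const historyTokens = trimmed.reduce(
    (sum, m) => sum + estimateTokens(m.content),
    0,
  );
  const projected = tokensUsedToday + estimateTokens(incoming) + historyTokens;
  if (projected > config.maxDailyTokens) {
    return { allowed: false, reason: "daily_limit" };
  }
  return { allowed: true, history: trimmed };
}
```

A caller would pass the returned `history` plus the new message to the model only when `allowed` is true, then record the provider-reported token count per user so that the daily total and the per-user cost analytics stay accurate.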
Why it matters
- This is a production-grade caveat for vibe-coding: AI features are easy to demo and much harder to operate.
- It belongs in app-tool-stack because model routing, AI SDKs, PostHog-style analytics, and kill switches are part of the real AI-app stack.
- It also supports app-product-craft: cost, safety, and form factor shape the user experience.
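The model-routing idea mentioned above can be illustrated with a small heuristic router. The source does not describe how Chris's routing layer decides, so this is only one possible approach under stated assumptions: the model identifiers, the keyword list, and the 400-character threshold are all hypothetical placeholders.

```typescript
type Tier = "small" | "large";

// Hypothetical model identifiers; substitute the real ones for your provider.
const MODEL_BY_TIER: Record<Tier, string> = {
  small: "example-small-model",
  large: "example-large-model",
};

// Cheap classifier standing in for whatever the real routing layer does.
// Long inputs or multi-step verbs escalate; everything else stays small.
function classify(message: string): Tier {
  const longInput = message.length > 400;
  const multiStep = /\b(plan|reschedule|reorganize|summarize)\b/i.test(message);
  return longInput || multiStep ? "large" : "small";
}

// Default to the small tier and escalate only when the classifier says so.
function pickModel(message: string): string {
  return MODEL_BY_TIER[classify(message)];
}
```

Keeping the classifier as a plain function makes it easy to swap the heuristic for a small-model call later without touching callers of `pickModel`.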