Source
rawraw/so-shipping-ai-apps-is-hard.md
url

TL;DR: Shipping an AI assistant is not just prompt work. Chris Raroque’s Ellie agent ran into the real production problems: token cost, abuse limits, model routing, framework choice, and mobile form factor. It also showed how a focused app can beat a general assistant on workflow friction.

Biggest lessons

  • Track cost before launch. Chris’s own usage cost more than $30/month in AI spend against a $10/month app price. The first fixes were shrinking an 8,000-token system prompt and sending only a recent window of conversation history instead of the full thread.
  • Abuse prevention is product infrastructure. Message-size caps, per-user daily/monthly limits, a remote kill switch, and per-user token/cost analytics keep one user from turning a feature into an unbounded bill.
  • Do not rebuild the obvious plumbing forever. The Vercel AI SDK replaced custom streaming, tool-calling, retry, and conversation-state code with a cleaner and more reliable implementation.
  • Plan for multiple models. Different tasks worked better on different models. Chris used a cheap model-routing layer that sends most requests to smaller, faster models and escalates to larger or specialized models only when needed.
  • Form factor changes the product. The assistant became more useful on mobile because the natural behavior was dictating quick commands on the go.
  • Focused AI apps can beat general chat. Ellie could skip many confirmation steps because it controlled a narrow workflow; that made it feel faster than using a general assistant with broad safety prompts.
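
The cost and abuse-prevention lessons above can be sketched in a few lines. This is a minimal illustration, not Ellie's actual code: the class name `UsageGuard`, the helper `trim_history`, and every limit and price below are hypothetical values chosen for the example. A real app would persist the counters and read the kill switch from a remote config.

```python
from dataclasses import dataclass, field

# All limits and prices are illustrative assumptions, not Ellie's real values.
MAX_MESSAGE_CHARS = 2_000        # message-size cap
DAILY_MESSAGE_LIMIT = 50         # per-user daily limit
MAX_HISTORY_MESSAGES = 10        # how much conversation history to resend
PRICE_PER_1K_TOKENS_USD = 0.002  # hypothetical blended token price


@dataclass
class UsageGuard:
    """In-memory stand-in for per-user limits, cost tracking, and a kill switch."""
    ai_enabled: bool = True  # would be a remotely controlled flag in production
    messages_today: dict = field(default_factory=dict)
    cost_usd: dict = field(default_factory=dict)

    def check(self, user_id: str, message: str) -> tuple[bool, str]:
        """Decide whether a request may reach the model at all."""
        if not self.ai_enabled:
            return False, "AI features are temporarily disabled"
        if len(message) > MAX_MESSAGE_CHARS:
            return False, "message too long"
        if self.messages_today.get(user_id, 0) >= DAILY_MESSAGE_LIMIT:
            return False, "daily limit reached"
        return True, "ok"

    def record(self, user_id: str, tokens_used: int) -> None:
        """Update per-user message count and estimated cost after a call."""
        self.messages_today[user_id] = self.messages_today.get(user_id, 0) + 1
        cost = tokens_used / 1000 * PRICE_PER_1K_TOKENS_USD
        self.cost_usd[user_id] = self.cost_usd.get(user_id, 0.0) + cost


def trim_history(history: list[dict], keep: int = MAX_HISTORY_MESSAGES) -> list[dict]:
    """Keep only the most recent messages so each request stays small."""
    return history[-keep:] if keep > 0 else []
```

Calling `check` before every model request and `record` after it gives both the hard stop (limits, kill switch) and the per-user cost data needed to spot a single user driving up the bill.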

Why it matters

  • This is a production-grade caveat for vibe-coding: AI features are easy to demo and much harder to operate.
  • It belongs in app-tool-stack because model routing, AI SDKs, PostHog-style analytics, and kill switches are part of the real AI-app stack.
  • It also supports app-product-craft: cost, safety, and form factor shape the user experience.
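
Of the stack pieces listed above, model routing is the easiest to sketch. The example below is an assumption-heavy illustration: the model names are placeholders, and a simple length and keyword heuristic stands in for whatever the routing layer actually does (the source only says it is cheap and escalates when needed).

```python
# Placeholder model identifiers; swap in real model names from your provider.
CHEAP_MODEL = "small-fast-model"
LARGE_MODEL = "large-capable-model"

# Hypothetical signals that a request needs the larger model.
ESCALATION_HINTS = ("plan", "reschedule", "summarize", "analyze")


def pick_model(message: str, tool_count: int = 0) -> str:
    """Route to the cheap model by default; escalate only on clear signals."""
    text = message.lower()
    needs_large = (
        len(text.split()) > 60          # long, multi-part requests
        or tool_count > 2               # many tools likely in play
        or any(hint in text for hint in ESCALATION_HINTS)
    )
    return LARGE_MODEL if needs_large else CHEAP_MODEL
```

Keeping the default path on the cheaper model means the expensive model only shows up in the bill for requests that are likely to need it.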