Most AI-focused startups shouldn’t train models, fine-tune them, or otherwise make significant hardware investments (e.g., GPUs) before finding product-market fit. (GPUs for inference are, of course, fine.) Training first and validating later is the wrong sequence for most startups. Why?
- Training a model from scratch creates long feedback cycles. Before product-market fit, startups need to iterate fast and change direction quickly.
- You’re unlikely to be able to predict emergent behaviors in fine-tuned models. If your product depends on them, it might not work (see the human-in-the-loop era of AI chatbots).
- Model architectures are changing too quickly for startups to realistically catch up with heavily funded research institutions.
- “Do things that don’t scale.”
- Foundation models plus a few tricks should be enough to validate a particular use case (see the sketch after this list).
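As a concrete illustration of that last point, here’s a minimal sketch of the “foundation model plus a few tricks” approach: few-shot prompting a hosted model to triage support tickets, a hypothetical use case. It assumes the OpenAI Python SDK (openai>=1.0) and an `OPENAI_API_KEY` in the environment; the model name, labels, and examples are all illustrative.

```python
from openai import OpenAI

client = OpenAI()

# The "trick": a handful of in-context examples stands in for a
# fine-tuned classifier while you test whether the use case has legs.
FEW_SHOT = """Classify the support ticket as one of: billing, bug, feature_request.

Ticket: "I was charged twice this month."
Label: billing

Ticket: "The export button crashes the app."
Label: bug
"""

def triage(ticket: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any hosted foundation model works here
        messages=[{"role": "user", "content": f'{FEW_SHOT}\nTicket: "{ticket}"\nLabel:'}],
        temperature=0,  # deterministic-ish labels make evaluation easier
    )
    return resp.choices[0].message.content.strip()

print(triage("Please add dark mode."))  # expected: feature_request
```

If a prompt like this reaches usable accuracy on real tickets, the use case is validated without a single GPU-hour of training; if it doesn’t, you’ve learned that cheaply too.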
There are exceptions: if your startup’s value proposition is fine-tuning models for customers, the investment makes sense. Even so, heavier bets on training custom models are usually best made after product-market fit.
The original quip comes from Stanislas Polu on Twitter.