Companies are excited to add AI to their applications. They just don’t know how. Talking to customers surfaces the same requests again and again. It remains to be seen if these products are faster horses or carriages in disguise.
Fine-tuned models. Custom models might make sense after a point, but most companies would be better off proving the value add with off-the-shelf models first. Fine-tuned models require a significant commitment: training pipelines, serving infrastructure, load balancing, GPUs, monitoring, data cleaning, and more. The underlying models are also changing quickly enough that customers (and vendors) bear the cost of keeping up with open-source projects and well-funded tech giants.
Restricted access to hosted models. Developers copy and paste questions into Google or post questions on StackOverflow that reveal proprietary technology. Non-technical users do the same. AI models will increase productivity, and there’s probably a trade-off between making your employees more productive and not leaking trade secrets. Some policies probably make sense, such as ensuring employees opt out of having their data used for training, but the productivity gains likely outweigh the costs in most cases.
Completely self-hosted infrastructure. Companies are protective of their data, and they want none of it to leave their environment. What used to be an on-premises data center is now an AWS account owned by the customer. Often, the maintenance cost isn’t accurately reflected in the cost equation. It’s costly to self-host. Even with a managed service provider, it’s expensive. And you have to trust that managed service provider when it comes to security anyway (how else will they run the service on your infrastructure?).
Serverless GPUs. On the other hand, some companies want to own the endpoints but not the infrastructure. Countless startups resell GPUs in “serverless” form by autoscaling capacity up and down. My take is that you are in one of two positions: (1) you are experimenting, or AI is not core to your business, and you should outsource to a hosted model provider, or (2) you need to own the GPUs directly, at least through AWS or another cloud provider. In the second case, the spend is probably large enough to justify a few engineers who just manage the infrastructure through the cloud provider.
QA tools. Companies are looking for an outsourced QA framework to track changes to inference pipelines (prompts, models, etc.). The problem is that I don’t think customers or vendors are qualified to assess quality at a horizontal level, across every use case. Plus, the underlying abstractions are moving far too quickly to build this product well today.
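To make “tracking changes to an inference pipeline” concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `PipelineConfig` fields, the `fingerprint` and `changed_fields` helpers, and the placeholder model name are hypothetical and not taken from any vendor’s product. It only records that a prompt or model setting changed; it says nothing about whether the outputs got better, which is the part that is hard to assess horizontally.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical description of one version of an inference pipeline."""
    model: str
    prompt_template: str
    temperature: float = 0.0
    max_tokens: int = 256


def fingerprint(config: PipelineConfig) -> str:
    """Return a short, stable hash of the config so versions can be compared."""
    # sort_keys makes the serialization independent of field order.
    payload = json.dumps(asdict(config), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def changed_fields(old: PipelineConfig, new: PipelineConfig) -> dict:
    """Map each field that differs to its (old, new) pair of values."""
    before, after = asdict(old), asdict(new)
    return {k: (before[k], after[k]) for k in before if before[k] != after[k]}


if __name__ == "__main__":
    v1 = PipelineConfig(model="model-a", prompt_template="Summarize: {text}")
    v2 = PipelineConfig(model="model-a", prompt_template="Summarize briefly: {text}")
    print(fingerprint(v1), fingerprint(v2))
    print(changed_fields(v1, v2))
```

Storing the fingerprint next to each logged response is one way to tell which pipeline version produced it. Deciding whether version two is actually better still needs evaluation that is specific to the use case.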