Maligned #6 - Small Models, Big Impact
Here’s what’s worth knowing.
Small models are having a moment
There’s a clear trend toward smaller, more efficient models that can run on modest hardware. Mistral, Microsoft’s Phi series, and several others are releasing models in the 3-8 billion parameter range that punch well above their weight. For most production applications, you don’t need a 400B parameter model. You need something fast, cheap, and good enough. The 7B class of models is hitting that sweet spot for a growing number of use cases, from customer support to code completion to document summarization. The practical impact is significant: you can run these on a single GPU, latency drops, costs plummet, and you can actually iterate quickly.
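To make the "single GPU" claim concrete, here’s a back-of-envelope sketch of the memory needed just to hold model weights at common precisions. This is a rule of thumb (parameters times bytes per parameter), not a real profiler: it ignores the KV cache, activations, and framework overhead, so actual usage runs higher.

```python
# Rough GPU memory needed to store model weights alone.
# Real inference needs extra headroom for KV cache and activations.

BYTES_PER_PARAM = {"fp16": 2, "int8": 1, "int4": 0.5}

def weight_memory_gb(params_billion: float, precision: str) -> float:
    """Approximate GiB of memory just to hold the weights."""
    bytes_needed = params_billion * 1e9 * BYTES_PER_PARAM[precision]
    return bytes_needed / 2**30

for size in (7, 70, 400):
    line = ", ".join(
        f"{p}: {weight_memory_gb(size, p):.1f} GiB" for p in BYTES_PER_PARAM
    )
    print(f"{size}B model -> {line}")
```

A 7B model in fp16 needs roughly 13 GiB of weights, which fits comfortably on one consumer or data-center GPU, while a 400B model in fp16 needs hundreds of GiB and a multi-GPU rig before it serves a single request.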
OpenAI’s enterprise strategy is getting clearer
OpenAI has been rolling out enterprise features at a steady clip: custom model training, data residency options, admin controls, audit logging. They’re running the enterprise SaaS playbook that Salesforce and others perfected. The interesting tension is between their research ambitions and their enterprise revenue needs. Enterprise customers want stability and predictability; research teams want to ship new capabilities fast. Those two things are in direct conflict, and how OpenAI manages that tension will determine whether they hold their enterprise market share as competitors catch up on model quality.
The evaluation problem nobody is solving
We still don’t have great ways to evaluate language models for real-world tasks. Benchmarks are gamed almost as soon as they’re published. Human evaluation is expensive and inconsistent. The result is that companies are making model selection decisions based on vibes, marketing materials, and whichever benchmark happens to favor the model they already chose. I’d love to see more investment in domain-specific evaluation frameworks that actually measure what matters for specific use cases. A few startups are working on this, but it’s not getting the attention or funding it deserves.
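A domain-specific evaluation doesn’t have to be elaborate. A minimal sketch, with entirely hypothetical test cases and a stub in place of a real model call: each case pairs a prompt with a pass/fail check that encodes what "good enough" means for that use case, rather than a generic leaderboard metric.

```python
# Minimal domain-specific eval harness: score model outputs against
# task-specific checks. The cases and stub model are illustrative.

from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # task-specific pass/fail criterion

def run_eval(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output passes its check."""
    passed = sum(case.check(model(case.prompt)) for case in cases)
    return passed / len(cases)

# Hypothetical support-ticket triage cases.
cases = [
    EvalCase("Classify: 'I was charged twice.'",
             lambda out: "billing" in out.lower()),
    EvalCase("Classify: 'The app crashes on login.'",
             lambda out: "technical" in out.lower()),
]

def stub_model(prompt: str) -> str:  # stand-in for a real API call
    return "billing" if "charged" in prompt else "technical issue"

print(f"pass rate: {run_eval(stub_model, cases):.0%}")
```

Swapping the stub for real API calls turns this into a repeatable model-selection test: run the same cases against each candidate and compare pass rates on *your* task, not on a benchmark someone else may have trained against.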
Energy consumption is becoming a board-level conversation
Data center energy consumption for AI workloads is becoming impossible to ignore. Multiple companies reported significant increases in their energy usage attributable to AI inference, and the projections for the next two years are staggering. This isn’t just an environmental concern, though it is that. It’s a cost concern and a capacity concern. In some regions, there simply isn’t enough power available to support the data center buildout that current AI demand projections require.
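The cost side is easy to see with arithmetic. A sketch of the annual electricity bill for an inference fleet, with every input an illustrative assumption (fleet size, board power, utilization, PUE, and price are placeholders, not reported figures):

```python
# Back-of-envelope annual electricity use and cost for a GPU fleet.
# All inputs below are illustrative assumptions, not reported data.

def annual_energy_cost(num_gpus: int, watts_per_gpu: float,
                       utilization: float, pue: float,
                       usd_per_kwh: float) -> tuple[float, float]:
    """Return (MWh per year, USD per year) for the fleet."""
    avg_watts = num_gpus * watts_per_gpu * utilization * pue
    mwh = avg_watts * 24 * 365 / 1e6  # watt-hours -> MWh
    return mwh, mwh * 1000 * usd_per_kwh

mwh, usd = annual_energy_cost(
    num_gpus=10_000,
    watts_per_gpu=700,   # H100-class board power
    utilization=0.6,     # assumed average load
    pue=1.3,             # assumed data center overhead
    usd_per_kwh=0.08,    # assumed industrial rate
)
print(f"{mwh:,.0f} MWh/year, ${usd:,.0f}/year")
```

Even this modest hypothetical fleet draws tens of thousands of MWh a year, on the order of a small town’s consumption, which is why the conversation is moving from sustainability reports to board meetings and grid-capacity planning.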
See you next week.
Maligned - AI news by Mal