The pattern is already visible
In many small and mid-sized companies, the AI conversation starts too early with the tool and too late with the work itself. Someone sees a Copilot demo, somebody else tests ChatGPT, another person mentions Gemini, and for a few weeks it feels like the company is moving. Then reality shows up: nobody has defined which task should improve, who will use it, or how value will be measured.
That pattern is not anecdotal. IBM recently highlighted an MIT finding that is hard to ignore: roughly 95% of generative AI pilots fail to turn into initiatives with real business value. That figure matches what many companies experience outside vendor slides: fast enthusiasm, vague use cases and licenses that later become difficult to justify.
Reference figure
[Infographic: most pilots do not reach measurable business value. IBM Think cites MIT’s NANDA initiative and sets a difficult baseline: roughly 95% of generative AI pilots fail to consolidate measurable value, and 84% of initiatives remain far from the enterprise scale IBM describes. Graphic prepared from IBM Think, which in turn references MIT NANDA and the IBM Institute for Business Value.]
Before buying anything
Four prior checkpoints worth settling before buying licenses, launching a pilot or presenting AI as a universal answer.
Which specific task should improve and how much time it wastes today.
Who will use it and where human review remains mandatory.
What data the tool may access and what it must never touch.
Which metric will decide whether it stays or goes.
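The four checkpoints above amount to a go/no-go test: if any answer is missing, the purchase conversation is premature. A minimal sketch of that test, with hypothetical field names and example data chosen purely for illustration:

```python
# Hypothetical pre-purchase checklist. Field names and the sample
# proposal below are illustrative, not a real vendor or template.

CHECKPOINTS = [
    "task",                 # which specific task should improve
    "hours_lost_per_week",  # how much time it wastes today
    "users",                # who will use it
    "human_review",         # where human review remains mandatory
    "data_allowed",         # what data the tool may access
    "data_forbidden",       # what it must never touch
    "success_metric",       # what decides whether it stays or goes
]

def ready_to_pilot(proposal: dict) -> tuple[bool, list[str]]:
    """Go only if every checkpoint has a concrete answer."""
    missing = [c for c in CHECKPOINTS if not proposal.get(c)]
    return (len(missing) == 0, missing)

# A proposal that skipped data boundaries and the success metric:
draft = {
    "task": "first drafts of client proposals",
    "hours_lost_per_week": 6,
    "users": ["sales team"],
    "human_review": "every outgoing proposal",
}
go, missing = ready_to_pilot(draft)
# go is False; missing names the unanswered checkpoints
```

The point of the sketch is the discipline, not the code: a proposal with empty fields is still novelty, not implementation.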
The mistake is rarely technical
The main error usually happens before deployment. Terralogic makes the point clearly: many businesses adopt AI because competitors are doing it, not because they have a concrete operational need. Once that is the starting point, everything becomes fuzzy. The tool may look current, but nobody can explain what work it is meant to improve.
A more useful starting point is the opposite one. Not “should we buy Copilot, ChatGPT, Gemini or something else?” but “where are we actually losing time?” Email drafting, proposal writing, meeting summaries, document retrieval or repetitive support responses are all valid places to look. Once the use case is concrete, the conversation stops being about trend and starts being about work.
AI on top of disorder does not fix disorder
There is another expectation worth cutting early: the idea that AI will somehow repair a messy environment by itself. It usually does not. If documents are scattered, if processes are poorly designed or if nobody trusts the underlying data, AI works on top of that same situation. It may speed up parts of it, but it can also scale confusion.
The World Economic Forum makes a similar point: many AI initiatives stall because companies try to unlock value before fixing the processes underneath. Put more plainly, a solid template, a clean integration or a better document structure often solves more than adding another assistant on top of an already weak workflow.
Without limits and metrics, it becomes an expensive experiment
The next part that often gets skipped is governance: what information the tool can handle, which outputs must always be reviewed, and which responses should never be automated because they require judgement or accountability. That work does not slow the project down. It gives it a chance to survive.
APQC puts it well: if the problem is not clearly defined and success is not measured explicitly, enterprise AI projects tend to become expensive experiments. For a smaller company, that is a useful test. If nobody can say what will be measured in eight weeks, the project is probably still novelty, not implementation.
Where it does fit
AI can be very useful when it enters a narrow part of the workflow and has a decent operational base behind it. First drafts, classification, meeting summaries, document support, response preparation and context-based search are all sensible use cases when supervision remains in place.
The meaningful question is not whether a company “has AI”. The meaningful question is whether a specific task is now faster, cleaner or more consistent than before. If the answer is yes, expand it. If the answer is no, stop. That discipline matters more than any promise about transformation.

