Dominic Plouffe (CTO)

Big data + agents. Less hype, more systems.

Category: Business Intelligence

  • The State of LLMs in Business Today: What’s Working, What’s Not, and What Comes Next

    Most businesses are no longer asking whether large language models, or LLMs, are useful. They are asking where they fit, what they can be trusted to do, and how much human oversight they still need. That shift matters. The early excitement was about what these tools might do. The current conversation is about what they actually do well in daily work.

    For analysts, virtual assistants, and BI-heavy teams, the answer is practical. LLMs are already helping people draft emails, summarize long documents, search internal knowledge, answer routine customer questions, and speed up repetitive workflows. They are also creating new problems: wrong answers that sound confident, unclear ownership, privacy concerns, and hidden costs that show up after the pilot phase.

    The teams getting value from LLMs are not treating them as magic. They are using them for specific tasks where speed matters, the output can be checked, and the risk of a mistake is manageable. The teams struggling with them are often trying to use them as replacements for judgment, process, or clean data. That does not work.

    Where LLMs Are Already Useful

    The strongest use cases are the ones that reduce time spent on first drafts and routine reading. LLMs are good at turning rough input into something structured enough to work with. That is why drafting is one of the first places they show up. A manager can paste notes from a meeting and get a clean follow-up email. An analyst can turn bullet points into a report summary. A virtual assistant can create a polished response from a few key facts.

    Summarization is another clear win. Many teams deal with long documents, call transcripts, policy updates, tickets, or research notes. An LLM can cut that down to the parts that matter most. It will not always choose the right details, but it can save a lot of reading time when the goal is to get oriented fast.

    Search is changing too. Traditional search works best when you know the right keyword. LLM-based search is better when the question is messy. A user can ask, “What is our refund policy for annual plans in Europe?” and get an answer that pulls from several documents instead of a list of links. For internal knowledge bases, that is a real improvement.
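The retrieval step behind that kind of answer can be sketched in a few lines. This is a toy illustration, not a production design: it uses keyword overlap where a real system would use an embedding model, and the document store, document names, and policy text are all invented for the example.

```python
# Hypothetical internal knowledge base: document name -> text.
DOCS = {
    "refund-policy-eu": "Annual plans in Europe can be refunded within 30 days of purchase.",
    "refund-policy-us": "Monthly plans in the US are refundable within 14 days.",
    "shipping-policy": "Standard shipping takes five to seven business days.",
}

def tokens(text: str) -> set[str]:
    """Lowercase words with trailing punctuation stripped."""
    return {w.strip(".,?!") for w in text.lower().split()}

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query (stand-in for embedding similarity)."""
    ranked = sorted(DOCS, key=lambda name: len(tokens(query) & tokens(DOCS[name])), reverse=True)
    return ranked[:k]

def build_context(query: str) -> str:
    """Concatenate the top documents into the context an LLM would answer from."""
    return "\n".join(DOCS[name] for name in retrieve(query))
```

The point of the sketch is the shape, not the scoring: messy questions get matched against several documents, and the model answers from the retrieved text instead of returning a list of links.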

    Customer support is one of the most visible business use cases. LLMs can handle common questions, explain basic steps, and route cases to the right team. They are especially useful when the support volume is high and the questions repeat. The model does not need to solve every issue. It just needs to reduce queue time and handle the easy cases cleanly.

    Workflow automation is where the value starts to compound. An LLM can read an incoming message, classify the request, draft a reply, extract the needed fields, and send the task to the right system. In practice, that means less copy-paste work and fewer manual handoffs. The model is not replacing the workflow. It is handling the parts that are repetitive and text-heavy.
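That classify-extract-route loop is simple enough to sketch. Here the "model" is replaced by hard-coded rules so the example runs on its own; in a real system the classification and drafting steps would be LLM calls, and the routing table and field names are assumptions for illustration.

```python
import re

# Hypothetical routing table: request type -> team queue.
ROUTES = {"refund": "billing", "bug": "engineering", "question": "support"}

def classify(message: str) -> str:
    """Rule-based stand-in for the LLM classification call."""
    text = message.lower()
    if "refund" in text or "charged" in text:
        return "refund"
    if "error" in text or "crash" in text:
        return "bug"
    return "question"

def extract_fields(message: str) -> dict:
    """Pull the structured fields a downstream system needs (an order ID here)."""
    order = re.search(r"order\s+#?(\d+)", message, re.IGNORECASE)
    return {"order_id": order.group(1) if order else None}

def handle(message: str) -> dict:
    """One pass of the pipeline: classify, extract, route. A person still reviews the draft."""
    kind = classify(message)
    return {"type": kind, "queue": ROUTES[kind], "fields": extract_fields(message)}
```

Notice that the workflow itself stays in ordinary code; the model only fills in the text-heavy steps.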

    A simple workflow diagram showing an LLM sitting between email, documents, support tickets, and internal systems, with arrows for drafting, summarizing, searching, and routing tasks.

    What LLMs Still Do Poorly

    The biggest issue is accuracy. LLMs can produce answers that sound complete and are still wrong. That is not a small flaw. In business settings, a wrong policy answer, a bad calculation, or a false summary can create real work for the team that has to clean it up.

    They also struggle with consistency. Ask the same question twice and you may get slightly different answers. Change the wording a little and the output can shift more than you expect. For tasks that require strict repeatability, that is a problem. A model can help prepare the work, but it should not be the only system of record.

    Control is another gap. Most business processes run on clear rules: which source is trusted, what happens when information is missing, and who approves the final output. LLMs are flexible, which is useful, but that flexibility makes them harder to govern than a fixed rules engine or a standard report.

    They also miss narrow context when the surrounding detail is what matters. A model may understand a policy in general terms and still miss a clause that changes the answer for a specific customer segment, region, or contract type. In other words, it can sound right while skipping the detail that matters most.

    Finally, they are not naturally good at accountability. If a dashboard is wrong, someone can trace the data source. If an LLM gives a poor answer, the path from input to output is often less transparent. That makes review, logging, and source grounding more important than they are in many other tools.

    The Main Adoption Patterns

    Most companies are adopting LLMs in one of three ways. The first is the standalone chat tool. This is the version many people know best: a public interface where a user types a prompt and gets a response. It is fast to try and useful for individual productivity. People use it to draft text, brainstorm ideas, rewrite content, and summarize material.

    Standalone chat is also the easiest place to create shadow usage. Employees start using public tools for work content without a clear policy on what can be pasted in. That creates privacy and compliance risk if sensitive client data, financial data, or internal plans enter a system the company does not control.

    The second pattern is embedded AI inside existing software. This is now showing up in CRM systems, help desks, productivity suites, document tools, and analytics platforms. The advantage is simple: the AI is already where the work happens. A support agent can draft a reply inside the ticketing system. A spreadsheet user can ask for a formula suggestion without leaving the file. A BI user can ask a question in plain English and get a chart or summary.

    Embedded AI tends to be easier to adopt than a separate tool because it fits into a known workflow. It also reduces the friction of training people on a new interface. The downside is that the model is only as useful as the product’s integration and guardrails. If the embedded feature cannot access the right data or cannot explain where its answer came from, the convenience disappears fast.

    The third pattern is the custom internal copilot. This is a company-built assistant connected to internal documents, systems, and permissions. It may answer HR questions, help sales teams find product information, or support analysts by pulling from internal reports and notes. These copilots are attractive because they can be tailored to the business and limited to approved data sources.

    They are also harder to build well. A useful internal copilot needs clean permissions, good retrieval from internal content, clear boundaries, and ongoing maintenance. If the knowledge base is outdated or badly organized, the copilot will surface bad answers faster than a human search process would. The tool does not fix messy information. It exposes it.

    Privacy, Governance, and Human Review Are Not Optional

    For business use, data privacy is one of the first questions to settle. Teams need to know what data can be sent to a model, where that data is stored, and whether it is used for training. That is not a legal footnote. It affects whether the tool can be used on customer records, contracts, internal financials, or employee data.

    Governance matters just as much. Someone has to decide which use cases are approved, which models are allowed, who can configure them, and how outputs are reviewed. Without those rules, adoption becomes inconsistent. One team uses a consumer chatbot. Another uses an embedded feature with different controls. A third builds its own workflow with no logging at all. That is not a strategy.

    Cost is easy to underestimate. LLM usage can look cheap in a demo and become expensive at scale. A few test prompts are not the real bill. Real costs appear when hundreds of users run daily queries, when documents are large, when output must be reprocessed, or when the company adds retrieval, logging, and security controls around the model.

    Prompt quality also matters, but not in the mystical way people sometimes describe it. Good prompts are simply clear instructions. They specify the task, the audience, the format, the source material, and the limits. A vague prompt gives a vague answer. A precise prompt reduces the amount of cleanup needed after the model responds.

    Human review is still required for many business tasks. That does not mean the model is useless. It means the model should draft, classify, extract, or summarize, and a person should verify the result when the stakes are high. In practice, the best systems use LLMs to reduce effort, not to remove accountability.

    How to Judge a Use Case Without Getting Distracted by the Hype

    A good way to evaluate LLM use cases is to start with risk, not novelty. Ask what happens if the model is wrong. If the answer is “a minor edit,” that is a good sign. If the answer is “a customer gets the wrong financial guidance,” the use case needs stronger controls or a different design.

    ROI should be measured in time saved, error reduction, and throughput, not just in excitement. A useful pilot often reduces repetitive work for a specific team. For example, if support agents spend ten minutes summarizing each ticket before routing it, an LLM that cuts that in half can create real value. The gain is easy to see and easy to measure.
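The ticket example works out to concrete numbers. The per-ticket figures come from the paragraph above; the daily ticket volume is a hypothetical stand-in for a mid-sized support team.

```python
# From the example above: ten minutes of summarization per ticket, cut in half.
minutes_before = 10
minutes_after = 5

# Hypothetical volume for a mid-sized support team.
tickets_per_day = 300

saved_minutes = (minutes_before - minutes_after) * tickets_per_day
saved_hours = saved_minutes / 60
print(f"{saved_hours:.1f} agent-hours saved per day")  # 25.0 agent-hours saved per day
```

Numbers like these are what make a pilot defensible: the gain is stated in hours, not in enthusiasm.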

    Operational readiness is the third test. Some teams have the data quality, process discipline, and review capacity to support an LLM workflow. Others do not. If the source documents are scattered across folders, if nobody owns the knowledge base, or if approvals are already slow, adding an LLM will not fix the underlying process.

    One practical filter is to separate tasks into three buckets:

    • Low risk, high volume: drafting, summarization, classification, internal search, and simple customer replies.
    • Medium risk, controlled review: sales enablement content, analyst support, policy Q&A, and workflow routing.
    • High risk, strict oversight: legal, financial, medical, compliance, and anything that directly affects external commitments.
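The three buckets above can even be encoded as a triage rule, which makes the policy easy to apply consistently. The bucket contents mirror the list; the default is an assumption worth making explicit: anything unrecognized is treated as high risk.

```python
# Risk buckets from the list above, keyed by task type.
BUCKETS = {
    "low": {"drafting", "summarization", "classification", "internal search", "simple replies"},
    "medium": {"sales enablement", "analyst support", "policy q&a", "workflow routing"},
    "high": {"legal", "financial", "medical", "compliance", "external commitments"},
}

def triage(task: str) -> str:
    """Map a task type to its risk bucket; unknown tasks default to strict oversight."""
    task = task.lower()
    for bucket, tasks in BUCKETS.items():
        if task in tasks:
            return bucket
    return "high"  # when in doubt, require human-led review
```

Writing the policy down as code forces the team to decide where each task actually belongs instead of arguing case by case.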

    The first bucket is where most companies should start. It gives teams experience with the tools without putting the business in a fragile position. The second bucket can work when source data is solid and review is built in. The third bucket needs careful design, and in some cases it should stay human-led.

    What’s Coming Next

    Multimodal models are already changing how people use LLMs. These systems can work with text, images, tables, charts, audio, and sometimes video. For business users, that means a model can review a screenshot of a dashboard, read a scanned document, or summarize a meeting recording. The value is not just that the model handles more formats. It is that more of the messy real-world input becomes usable.

    Agentic workflows are the next step after simple chat. In a basic chat setup, the user asks a question and gets a response. In an agentic workflow, the model can take a sequence of actions: search for information, compare records, draft a response, update a ticket, and flag exceptions. That is powerful, but it also raises the stakes. The more steps the model can take on its own, the more important permissions, audit logs, and stop conditions become.
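The control points that matter in an agentic setup, an action allowlist, a step cap, and an audit log, can be sketched independently of any particular model. This is a skeleton under stated assumptions: the action names are invented, and a real agent would choose its next action dynamically rather than receiving a fixed plan.

```python
# Guardrails: which actions the agent may take, and how many steps before it stops.
ALLOWED_ACTIONS = {"search", "draft", "update_ticket"}
MAX_STEPS = 5

def run_agent(planned_steps: list[str]) -> list[tuple[str, str]]:
    """Execute a planned action sequence under an allowlist, a step cap, and an audit log."""
    log = []
    for i, action in enumerate(planned_steps):
        if i >= MAX_STEPS:
            log.append(("stopped", "step limit reached"))
            break
        if action not in ALLOWED_ACTIONS:
            log.append(("flagged", action))  # exception goes to a human reviewer
            break
        log.append(("executed", action))
    return log
```

The useful property is that every path through the loop leaves a record: executed, flagged, or stopped. That is the audit trail that simple chat never needed and autonomous workflows cannot do without.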

    Enterprise integration will matter more than model size. The winning systems will not just answer questions. They will connect to document stores, CRMs, ticketing systems, ERP tools, and analytics platforms. A model that can see the right internal context and act inside the right workflow will be more useful than a slightly smarter model that sits on its own.

    There is also a shift toward narrower, better-controlled deployments. Many companies are moving away from “let everyone experiment” and toward approved use cases with defined data sources, review steps, and measurable outcomes. That is a healthier pattern. It reduces risk and makes it easier to see what the tools are actually doing for the business.

    The near future will probably not be defined by one giant breakthrough. It will be defined by better fit. LLMs will get more useful where the work is text-heavy, repetitive, and connected to business systems. They will stay weak where precision, traceability, and hard control matter more than speed.

    What Smart Teams Are Doing Now

    The most effective teams are not asking whether to “adopt AI.” They are asking which tasks are worth speeding up, which tasks need better controls, and which tasks should stay human-reviewed. That is a more useful question, and it leads to better decisions.

    If you are evaluating LLMs in your own workflow, start small and specific. Pick one process that is repetitive, text-heavy, and easy to check. Measure the time saved. Look at the error rate. Decide who reviews the output. Then decide whether the result is good enough to scale.

    That approach is slower than buying into the hype, but it is also how teams end up with tools people actually use. The value is not in having an AI feature. The value is in making real work faster, cleaner, and easier to manage.