Smart language models – an alternative way of using Natural Language Processing – are the key to accurate AI, and can drive real value in the engineering enterprise

Google published the demo for its new AI chatbot, Bard, on Monday, the 6th of February 2023. By Wednesday, its parent company Alphabet had lost $100 billion in market value. How did that come about? And what can we do to ensure that future AI chatbots aren’t prone to such catastrophic lapses in judgement? To counter the dangers of rampant misinformation, what we need are smart language models, as opposed to Google and Microsoft’s sprawling Large Language Models (LLMs) – models that prize domain-specific knowledge, thorough training in fields like scientific research, and factual validation above all else.

Smart language models are the key to accurate AI and, in time, will decide the winners and losers of this AI arms race.

Clash of the Titans: Microsoft vs Google

Much has been said about the competition, and collaboration, between Microsoft, OpenAI, Google, and DeepMind to capture the hundred-billion-dollar search market using integrated generative AI models like ChatGPT. It was this competition that drove Google to release Bard in response to Microsoft’s much-anticipated announcement: that it would use OpenAI’s technology to enhance Bing and take on Google on its home turf.

The purpose of a search engine is to answer a user’s question, so chatbots that are known to get facts wrong have a serious impact on the businesses relying on them. The main issue is that many users’ questions will have an aspect of domain-specificity to them – whether that be in science, medicine, or other technical subjects.

This means that a chatbot may not have enough relevant training data on a specialised subject, or may miss crucial semantic links – take, for example, the difference between the ‘colour’ of a quark, and ‘colour’ as we use it in regular speech. These models have huge datasets to back them up, but bigger isn’t always better.

What are smart language models?

Smart language models (SLMs) are an alternative way of using Natural Language Processing, a sibling of the Large Language Models produced by companies like Google and OpenAI. LLMs have some powerful upsides – emergent capabilities; extensive general knowledge; and plausible, ‘human-sounding’ text – but there are a variety of engineering approaches, including SLMs, that can drive value in the enterprise.

When you drill down into specifics, unsolved questions about LLMs bubble to the surface. How do we ensure bots like ChatGPT are telling the truth and not ‘hallucinating’? How can we build knowledge validation into LLMs? And when these models do cite sources, how do we know they’re the right ones? The sheer size of their knowledge bases inevitably leads to inscrutable decision-making.

Solving these issues for specialised domains and business applications requires substantial investment. Otherwise, these models are unusable. That’s where smart language models come in.

Because they use a much smaller set of training data, SLMs can drive real value for enterprises. They keep down compute costs through more efficient training processes, whilst also focusing on domain-specific accuracy. The latter is vital because it decreases the margin of error when it comes to factual validation – reassuring leaders worried about the consequences of integrating faulty AI into their business-critical processes.

Write what you know: domain-specific training

Aside from factuality concerns, LLMs incur immense running costs and rely on volumes of data that may not even exist in certain fields. As an industry, we must take one thing at a time and look at the end result, rather than falling in love with a technology and then deciding where to use it.

Smart language models, composed of millions of parameters as opposed to billions, adopt this approach. They start with the business use-case and then work backwards to build a model that can complete that task with a high degree of accuracy, based on its comprehensive training in that field.

When hiring a microbiologist, you wouldn’t want someone who mistakes ‘generation time’ for the time taken for a single microorganism to be created, when it in fact means the time taken for a population to double in number. That degree of semantic knowledge is vital, and something LLMs currently lack because they are pre-trained, not fine-tuned, on these details.

Work smart, not hard

One thing is clear: while the novelty of LLM-driven chatbots has captured international attention, businesses can’t afford to integrate them into their core functions. Most applications to date have focused on creative ideation or content creation, because LLMs simply cannot be trusted not to hallucinate.

Smart language models, built on a foundation of factual validation and domain-specific understanding, are the way forward. By focusing on quality training and improved fact-checking software, we can make AI reliable for the critical tasks on which a business – and an economy – depends. SLMs can do all this while driving down costs, providing organisations with an alternative to LLMs that is smarter, more accurate and more accessible.

This piece was written and provided by Victor Botev, CTO and co-founder of Iris.ai
