Make Your LLM Smarter: A Grounding Experience

Suchismita Sahu
5 min read · Nov 23, 2024


Give Your Generative AI Model More Specific Information Beyond Grammar And Syntax

Large Language Models (LLMs) that you use “off-the-shelf” create pretty generic output. That’s the going assumption. It holds for a lot of use cases (and depends on your ability to craft an effective prompt), but the underlying issue is similar to the difference between having access to just grammar books and having access to domain-specific textbooks and news on top of that.

If you’ve ever worked in a country where your mother tongue isn’t the dominant language, the following example will be familiar. Several colleagues and friends have shared the same observation with me over the years. Two colleagues were having a conversation about a work-related topic in English rather than their native language. When I asked one of them why, they said: “Yes, I speak French. But I’ve never worked in this domain in France and don’t know the right terminology. We talk about it in English every day.”

With LLMs, the situation is actually quite similar.

Strengths And Weaknesses Of LLMs

LLMs are extremely good at predicting the next word in a sentence. They have been trained on vast bodies of text from the internet up until a certain point in time. For example, GPT-3.5 and GPT-4 have been trained on data up until January 2022. Any information after that cut-off date is unknown to these models: scientific breakthroughs, world events, politics, etc. This limits their usefulness when you prompt them about a recent topic they have no data on. One example: the superconductor material ‘reddmatter’ that was in the headlines this spring (see screenshot).

(Screenshot: OpenAI ChatGPT)

But despite missing current information, LLMs are still a powerful new technology. For example, an LLM can process and hold the contents of an entire book (e.g. Anthropic’s Claude model), and as a user you can ask questions about that book. LLMs can also translate text into an action or into another language, style, or tone. But an LLM typically doesn’t have deep industry-, domain-, or company-specific context. Hence, its responses can seem rather generic.

On the flip side, though, languages like English haven’t changed dramatically in the 18 months since the last update of the training data. So LLMs remain highly useful for understanding and generating language-based input and output, just not on the latest news or the deepest domain expertise.

Whether or not generic output is an issue will highly depend on your use case. Respond to an e-mail with the help of an LLM? If it’s about whether and when you’ll meet, for example, a generic reply might just be “good enough”. If it’s about a complex sales deal or a customer support escalation that requires additional details and finesse in handling the relationship, you’d likely want more context and specificity in the reply — and a human to edit it before sending it off.

Approaches For Guiding LLMs

Whether the output an LLM generates is sufficient depends on the use case you want to apply it to. If you need to generate output that is highly domain-specific or based on proprietary data (i.e. training data that is not publicly available on the internet), you will likely need to look beyond just prompting the LLM. The same applies to situations in which the LLM is likely to generate factual inaccuracies and you need to trace the information back to a known source.

Let’s recap the most common approaches to guiding an LLM to generate output:

  1. Prompting: Submit natural-language instructions to get the model to create an output. This is the method of choice when the output doesn’t need to be highly domain-specific or depend on the latest information (see the sketch after this list). Examples: concepts, historic events, known facts
  2. Fine-tuning: Add data to the training set and adjust the model weights. This is a highly resource-intensive process that requires both data and deep data science expertise. Example: a finance-specific foundation model
  3. Retrieval-Augmented Generation (RAG): Provide additional content and context for the LLM to draw from when generating an output. This approach promises a middle ground: the LLM gets the latest information, and you can trace the answer back to its source. Example: customer support documentation and purchase history
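
To make the prompting approach concrete, here is a minimal sketch assuming the OpenAI Python client; the model name and the question are illustrative placeholders, not a recommendation:

```python
# Minimal prompting sketch using the OpenAI Python client (v1+).
# The model name and the question are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

response = client.chat.completions.create(
    model="gpt-4",  # any chat-capable model works here
    messages=[
        {"role": "user", "content": "Summarize the concept of superconductivity in two sentences."}
    ],
)
print(response.choices[0].message.content)
```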

Depending on how much external knowledge you need and how domain-specific the output should be, the following 2x2 provides an overview of the approaches and which to use when. As an example, let’s look at RAG, which balances a high demand for external knowledge with low domain specificity.

(Figure: Approaches for guiding LLMs)

RAG: Fresh Data For Your LLM

A significant challenge when using LLMs is their tendency to hallucinate: when they lack accurate information, they generate plausible-sounding output that is factually wrong. To mitigate this behavior, Retrieval-Augmented Generation (RAG), an approach for providing fresh data to otherwise static, pre-trained LLMs, was introduced in 2020. It follows these steps (aka “grounding”):

Preparation

  1. Convert target data (e.g. support documents) into vectors
  2. Store vectors in vector database
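
Here is a minimal sketch of the preparation phase, assuming the sentence-transformers package and a plain in-memory store; the embedding model and the documents are placeholders, and a production setup would use a dedicated vector database:

```python
# Preparation: convert target data into vectors and store them.
# Assumes the sentence-transformers package; the embedding model
# and the documents are illustrative placeholders.
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer("all-MiniLM-L6-v2")

documents = [
    "To reset your password, open Settings and choose 'Reset password'.",
    "Refunds are processed within 5 business days of approval.",
    "Premium subscribers can reach support via live chat 24/7.",
]

# Each document becomes one vector; a real system would chunk long texts first.
doc_vectors = model.encode(documents, normalize_embeddings=True)
vector_store = {"vectors": np.asarray(doc_vectors), "texts": documents}
```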

Use

  1. Convert user query (prompt) into vectors
  2. Perform semantic search, comparing query vectors (user) with stored vectors (target data)
  3. Augment user prompt with retrieved information
  4. Submit augmented prompt to LLM for output generation
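
And a sketch of the use phase, continuing from the preparation sketch above; the query and the prompt template are placeholders:

```python
# Use: embed the user query, retrieve the closest documents via
# semantic search, and augment the prompt before calling the LLM.
query = "How long does a refund take?"
query_vector = model.encode([query], normalize_embeddings=True)[0]

# With normalized embeddings, cosine similarity is a plain dot product.
scores = vector_store["vectors"] @ query_vector
top_k = scores.argsort()[::-1][:2]  # indices of the two best matches
retrieved = "\n".join(vector_store["texts"][i] for i in top_k)

augmented_prompt = (
    "Answer the question using only the context below.\n"
    f"Context:\n{retrieved}\n\n"
    f"Question: {query}"
)
# The augmented prompt is then submitted to the LLM for output generation,
# e.g. via the chat-completions call shown in the prompting sketch above.
```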

(Figure: Overview of the Retrieval-Augmented Generation approach, Lewis et al., 2020)

The benefits of this approach are:

  • Access to current information (→ since model training)
  • Use of specific information (→ vs generic training data)
  • Reduction of hallucination (→ more specific context)
  • Traceability of information (→ known content source)

Prompting is the easiest approach for generating output. However, augmenting prompts via RAG provides more contextual and specific information, keeps costs lower than fine-tuning, and reduces the risk of hallucinations and factual inaccuracies.
