RAG for Text Generation Processes in Businesses

Iñaki Peeters

AI Solutions Analyst

Welcome to the fourth and last part of this series on Retrieval-Augmented Generation for text generation processes in businesses. In the previous part, we looked at the possibilities and tools to deploy a cloud-based RAG system to ensure scalability throughout the organisation. In this last part we’ll dive into some possible extensions to build on top of a conventional RAG system.
State-of-the-art machine model, connected to boxes brimming with documents, with a human validator on top, directing the model towards the right box with documents, hyperrealistic digital art – Dall-E 3

The Need for Accuracy

The last cluster of business cases we identified in the first part of this series concerns companies that are looking to deploy a RAG system for text generation processes where the produced output has to be 100% correct. We identified the following use cases:

  • Pharma and biotech companies compiling regulatory documents using RAG.

  • RAG assisted writing for Vehicle Design Specifications documents in the automotive industry.

  • Software companies drafting Software Requirements Specification documents using a RAG system.

This search for accuracy points to additional requirements needed to fully tailor the RAG system to the specific needs of your company. RAG is a very useful system as is; however, to fully leverage its potential, some additional extensions can be built on top of your RAG system. Now that we’re all RAG experts, we can give free rein to our innovative minds and think about how to build on the power of RAG… Save yourself the effort: at Faktion, we’ve already been thinking about this, and we’ve summarised our findings for you.

Faktion’s Approach towards ‘Full Productisation’

Solely having an operational AI application is not sufficient to stay ahead of competitors. At Faktion, we believe that ‘productising AI applications’, by combining Machine Learning and Software Engineering, truly makes the difference for businesses. An AI product is a seamlessly integrated software solution that embeds complex AI technology into a simple, user-friendly, robust, and scalable platform. We believe that every business owner should be able to manage, validate, and leverage the power of an AI system throughout its entire lifetime, without necessarily having a technical background.

Going back to the automation of text generation processes (see Part 1) by leveraging the power of RAG (see Part 2), we saw that RAG isn’t a one-size-fits-all solution for any business. We identified three in-house key competences needed to deploy a fully operational RAG solution. First of all, elaborate LLM and NLP skills are required to design the RAG system. Besides that, data quality optimisation skills are needed. The input for the RAG system’s vector database is subject to specific formatting requirements to make the most out of the system. Since both internal and external data sources are used, the preprocessing goes far beyond conventional data cleaning: web scraping, advanced querying and structuring issues, and overall data quality optimisation are just some of the hurdles to overcome. Last but certainly not least, MLOps and LLMOps capabilities are necessary to transform the RAG system into a no-code environment, where any business user can leverage the power of the application.

Turn Lemons into Lemonade: Two Possible Extensions to Build on Top of Your RAG System

At Faktion, we identified two possible extensions to build on top of your RAG system. They allow for seamless integration, let you take full advantage of the system, and turn it into a fully fledged solution for automating text generation processes. These extensions help answer the following questions:

  • What happens if I’m not satisfied with the generated output?

  • How can I make sure that the system doesn’t make the same mistakes in the future?

Let’s have a look at the visual flow of the RAG system. Looks familiar, right? Previously (see Part 2), we only looked at the upper part of this visual. We’ve added two extensions on top of the RAG system to answer the two questions above.

RAG System Extensions

Extension 1: Human Validation

As with humans, nobody’s perfect, and even RAG isn’t. You might find yourself disappointed by the generated output and ask yourself: “What now?” Well, the first extension incorporates a ‘human-in-the-loop’. This human validation component might look like the figure below. In the Microsoft Word environment, the business user can access the output of the RAG system. A custom-built AI Assistant Word plug-in allows the user to validate the generated paragraph and the various data sources that were used to construct the text. The user can browse between several potential suggestions. Each suggested output can be validated and corrections can be made. Validation happens by assigning scores to the different suggested outputs. The user can also correct the output directly in the AI Assistant plug-in, or request another version by specifying which part is of suboptimal quality or which characteristics need to change. In addition, the internal and external sources used to generate the output are listed, and the domain expert can indicate whether each source is relevant. If not, the information captured by that source is removed and the suggestion is updated until the needs of the business user are met.

Human Validation Extension
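To make this flow more tangible, here is a minimal sketch, in plain Python with hypothetical field names, of the feedback record such a plug-in could send back to the RAG system:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SourceFeedback:
    source_id: str   # identifier of the retrieved chunk
    relevant: bool   # the domain expert's relevance judgement

@dataclass
class ValidationRecord:
    suggestion_id: str
    generated_text: str
    score: int                            # e.g. 1 (poor) to 5 (excellent)
    corrected_text: Optional[str] = None  # in-place correction, if any
    change_request: Optional[str] = None  # free-text request for a new version
    sources: list = field(default_factory=list)

# Example: the validator scores a suggestion and flags one source as irrelevant
record = ValidationRecord(
    suggestion_id="sugg-001",
    generated_text="The device operates at 2.4 GHz...",
    score=3,
    change_request="Too technical; simplify for end users.",
    sources=[SourceFeedback("spec-sheet-12", relevant=True),
             SourceFeedback("press-release-7", relevant=False)],
)

# Sources flagged as irrelevant are dropped before the suggestion is regenerated
kept_sources = [s.source_id for s in record.sources if s.relevant]
print(kept_sources)  # ['spec-sheet-12']
```

A record like this captures everything the extension needs: the score, an optional correction or change request, and the per-source relevance judgements.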

Extension 2: Continuous Learning & Improvement

In line with the saying ‘Use it or lose it!’, we believe that you shouldn’t let your valuable feedback go to waste. The second extension introduces a learning component to the system. To make the RAG system more efficient and to limit the future validation work for business users, the system should be able to incorporate human feedback and use it to make even better suggestions. The feedback is kept in what we envision as a ‘system memory’ and can later be used to improve the output of future queries. First of all, the scores given to the different suggestions by the human validator feed an internal reward model. This boils down to rewarding the system for outputs with a good score and withholding the reward for outputs with a bad score, comparable to teaching a dog a new trick and rewarding it with a treat when it succeeds.
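As a rough illustration of that principle (a stand-in for a learned reward model, with made-up names), the validator’s scores could be mapped to rewards and stored in the system memory like this:

```python
def reward(score: int, threshold: int = 4) -> float:
    """Binary reward: 1.0 for well-scored outputs, 0.0 otherwise.

    A real reward model would be a learned scorer over (prompt, output)
    pairs; this threshold rule only illustrates the treat-for-a-trick idea.
    """
    return 1.0 if score >= threshold else 0.0

# The (prompt, output, reward) triples form the 'system memory'
system_memory = [
    {"prompt": "Summarise clause 4.2 of the specification.",
     "output": "Clause 4.2 requires that...",
     "reward": reward(5)},  # scored 5 by the validator -> rewarded
]
```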

On the one hand, feedback on the different sources can be used to improve the retrieval component of the RAG system. On the other hand, the validation of the generated text and potential corrections can be used to improve the generation component of the RAG system. The figure below provides an overview of the different ways to incorporate feedback in your RAG system.

Feedback Mechanism

It’s important to note that you can do essentially two things with the feedback. Either you add it to future prompts, or you use it to fine-tune your LLM. The former means providing examples of ‘good’ and ‘bad’ paragraphs generated in the past and adding them to new prompts through prompt engineering. In essence, it boils down to appending ‘oh hey, by the way, here are 10 good examples and here are 10 bad examples’ to your prompt. The same reasoning holds for including examples of ‘good’ and ‘bad’ sources from previous outputs in the prompt.
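In code, this few-shot augmentation can be as simple as the sketch below (the function and field names are illustrative, not part of any particular framework):

```python
def augment_prompt(task: str, good: list, bad: list, k: int = 3) -> str:
    """Append up to k validated good and bad examples to a new prompt."""
    lines = [task, "", "Examples of paragraphs previously rated as good:"]
    lines += [f"- {g}" for g in good[:k]]
    lines += ["", "Examples of paragraphs rated as bad (avoid this style):"]
    lines += [f"- {b}" for b in bad[:k]]
    return "\n".join(lines)

prompt = augment_prompt(
    "Write the safety section of the user manual for model X200.",
    good=["Keep the device away from water at all times."],
    bad=["Ingress protection per IEC 60529 shall be observed."],
)
print(prompt)
```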

The second option is to use this feedback to fine-tune your model. Although this requires additional effort, it has one important advantage: by fine-tuning your model with the feedback from the human validation component, you add contextual understanding to the model. Let me explain with an example. Suppose the LLM in your RAG system is an 18-year-old student who starts medical school tomorrow, and suppose he or she has access to all the textbooks. When you ask the student to write a medical paper, the student can retrieve information from the textbooks and write the paper (RAG); however, he or she will most probably lack contextual understanding of the medical jargon. By completing medical school, the student is trained in understanding and using that jargon: we could say the student is fine-tuned and able to write a better medical paper. The same reasoning holds when fine-tuning your model for improved textual understanding. The source feedback can be used to fine-tune the retrieval component, and the output validation can be used to fine-tune the generation component.
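How the feedback is packaged depends entirely on the training framework you use, but as a minimal sketch (with assumed field names), the two feedback streams could be exported as fine-tuning datasets like this:

```python
import json

# Validated and corrected outputs become supervised pairs for the generator
generation_pairs = [
    {"prompt": "Draft the dosage paragraph for product Y.",
     "completion": "Administer 5 mg once daily..."},  # the human-corrected text
]

# Source relevance judgements become (query, positive, negative) triples,
# the shape that contrastive retriever training typically expects
retrieval_triples = [
    {"query": "dosage of product Y",
     "positive": "text of the chunk marked relevant ...",
     "negative": "text of the chunk marked irrelevant ..."},
]

with open("generator_finetune.jsonl", "w") as f:
    for pair in generation_pairs:
        f.write(json.dumps(pair) + "\n")

with open("retriever_finetune.jsonl", "w") as f:
    for triple in retrieval_triples:
        f.write(json.dumps(triple) + "\n")
```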

That’s a Wrap: Productising a RAG System into a Scalable Platform

To create value at scale with your RAG system, the inevitable next step after operationalising the application is embedding it into a simple, user-friendly, no-code, and scalable platform. This platform makes the RAG system accessible and operable for any technical or non-technical business user. With such a platform, any business user can:

  • leverage the power of the RAG system by interacting, extracting insights, and taking actions based on the insights

  • stay in control of the features and configurations

  • manage the model throughout its lifetime, which includes inception and training, deployment, maintenance and retraining (fine-tuning), and eventually retirement

By productising the RAG system and its extensions into an integrable platform, businesses can go from a generic, off-the-shelf solution to one that is tailored to their specific needs.

In the Land of Data, Quality is King: Lessons in Data Classification

Data classification is the process of categorising data based on its attributes, characteristics, or content. It involves organising data into distinct groups or classes to make it more manageable, searchable, and secure. Proper data classification is a key prerequisite for an optimally functioning RAG system. Let me make this clear with an example. Suppose you work for a consumer electronics manufacturer. With every new product launch, your company needs to create a user manual. Because this is a time-consuming, manual process, your boss wants to deploy a RAG system to assist in user manual generation. It is important that all data that can be used to generate these manuals is classified in a structured way, for example per product category: smartphones, tablets, laptops, and so on. This classification allows for efficient and relevant information retrieval, improves the accuracy and quality of the generated output, and avoids information overload.

Of course, manually classifying or labelling documents and other data types is a tedious job. Faktion’s IDQO toolbox offers a solution to automate classification tasks. Using a classification model, existing data and new incoming data can be assigned to the right category, allowing for relevant and efficient information retrieval in the RAG system. Where the classifier alone falls short, additional tooling is available to increase its accuracy: data enrichment through search response or an active learning approach can boost accuracy to the required levels.
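As a simple illustration of the idea (a baseline sketch, not the IDQO toolbox itself), a classifier that routes documents into product categories could look like this:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# A tiny labelled sample: document snippets mapped to product categories
docs = ["battery life of the new handset",
        "stylus pressure sensitivity on the slate",
        "keyboard backlight settings",
        "5G antenna placement in the handset"]
labels = ["smartphones", "tablets", "laptops", "smartphones"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(docs, labels)

# New incoming documents get a predicted category, which can be stored as
# metadata on the vector-database chunks and used to filter retrieval
print(clf.predict(["touchscreen calibration for the new handset"]))
```

In a real setup, the predicted category would be attached as metadata to each chunk in the vector database, so the retriever only searches within the relevant product line.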

To Make a Long Story Short…

In this series of blog posts, we discussed one of today’s hottest topics in the field of generative AI: context-aware generative AI with Retrieval-Augmented Generation. In the first part, we identified three clusters of business cases where a RAG system can provide substantial value. In part two, we looked at the architecture of a basic RAG system. In part three, we discussed the tooling available to facilitate the deployment of a RAG system in your business’s cloud environment. In part four, we identified two possible extensions to build on top of a RAG system to fully leverage its power. Throughout the series, we paid special attention to data quality implications and identified important considerations for deploying your RAG system. It is clear that context-aware generative AI with Retrieval-Augmented Generation has immense potential to elevate the power of LLMs by giving the models access to unseen data in an efficient and cost-effective way. As Alexander Ratner, CEO of SnorkelAI, wittily puts it: “We are moving from GPT-X to GPT-You”. Tailoring generative AI to your specific needs could indeed be the next big thing.

Get in touch!
