
The RAG architecture is a widely used alternative for customizing artificial intelligence models. With this approach, you can develop an exclusive chat for your company, adding specific business information to the knowledge of a Large Language Model (LLM) like GPT, for example.

This is an alternative to fine-tuning, another way to customize an AI. Check out the main features and possibilities of the RAG architecture below!

✍️ This article was written in collaboration with cwiser and data scientist Wesllei Heckler.


Context enrichment

RAG stands for Retrieval-Augmented Generation. This architecture expands the context in which an artificial intelligence searches for an answer to a query.

AI applications like ChatGPT and Gemini are based on Large Language Models (LLMs). These models are typically trained on a massive amount of public information available on the web.

Using the RAG architecture, we can add references in various formats to the database the model consults. This way, the AI works with a richer context, drawing on what it learned from the web as well as on the documents provided, such as reports, contact lists, spreadsheets, source code and more.

This technique is especially recommended for scenarios that require specific data or knowledge, such as support chatbots, question-answering (Q&A) systems and situations that require real-time information. This way, the AI model can answer questions about topics not covered in the data used for its training, increasing the accuracy of responses in these scenarios.
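
As a rough illustration of context enrichment, the sketch below shows how retrieved passages might be placed in front of a user's question before it is sent to an LLM. The function name `build_augmented_prompt`, the prompt wording and the sample passages are assumptions made for this example, not part of any specific product.

```python
def build_augmented_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved passages and the user's question into one prompt."""
    # Number each passage so the model (and a human reader) can refer to it.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical passages, as if retrieved from uploaded company documents.
passages = [
    "Support is available on weekdays from 9:00 to 18:00.",
    "Tickets are opened through the internal help desk portal.",
]
prompt = build_augmented_prompt("When is support available?", passages)
print(prompt)
```

The resulting string is what would be passed to the model, so the answer can draw on the company's documents without the model itself being changed.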

Practical use of RAG architecture

One example is the AI chat built for CWI employees. The application allows the creation of different workspaces and, in each one, you can upload documents in .pdf, .txt, .docx and .csv formats, as well as software projects.

This enriches the context/prompt with data from a specific scenario. Generally, the documents are only available to the workspace in which they were uploaded, for security reasons. However, technically, it is possible to make a document available to different environments simultaneously. 

One of the workspaces tested in this chat is aimed at questions from CWI employees regarding Marketing topics. In it, we included material with guidelines for publishing articles on the Content Hub, so that employees could interact with the AI about the subject and clarify their questions.

Quality assurance is the focus of another workspace in this chat. Our team members can upload software projects to the environment and configure the application to assist in generating unit tests for specific classes of the code.

The chat enabled a 45% increase in productivity in quality assurance (QA) and test automation activities, and a 30% increase in development and programming tasks.

Features of RAG architecture

One of the benefits of RAG is that it requires less computing power than, for example, fine-tuning techniques. With this approach, it’s possible to customize an AI for your company’s context without a significant investment of time and resources in servers.

Another characteristic of this architecture is its reliance on vectors, which are data transformed into numerical representations (embeddings). All information is partitioned into chunks whose size varies depending on the scenario, and these chunks are stored in a vector database, such as VectorDB.
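
The partitioning step can be sketched as follows. This is a minimal example that splits text by word count with a small overlap between neighbouring chunks; the function name `split_into_chunks` and the default sizes are assumptions for illustration, and real applications often split by tokens, sentences or document sections instead.

```python
def split_into_chunks(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words, so a sentence cut at a
    boundary still appears whole in at least one chunk.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# Hypothetical document text: 100 placeholder words.
document = " ".join(f"w{i}" for i in range(100))
chunks = split_into_chunks(document, chunk_size=40, overlap=10)
print(len(chunks), "chunks")
```

Each chunk would then be converted into an embedding and stored in the vector database together with its original text.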

Questions submitted to a chat with RAG architecture are also transformed into vectors by an algorithm, and this numerical representation is compared with the others in the database. The application runs a similarity search and returns the representations closest to the user's question, typically quite accurately.

It’s also worth addressing a point that sometimes causes concern: that the AI used in the application “learns” the company’s information and uses it in responses outside the context of its applications. However, the artificial intelligence model does not learn from the documents; they are only retrieved and added to the prompt at query time. If asked about any information from the documents in an environment that does not contain them, it won’t be able to respond correctly.
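
The similarity search, and the fact that documents stay inside the workspace where they were uploaded, can be sketched like this. Everything here is an assumption made for illustration: the `WorkspaceStore` class is an in-memory stand-in for a vector database, and `embed` uses simple word counts in place of a real embedding model.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class WorkspaceStore:
    """In-memory stand-in for a vector database, with one index per workspace."""

    def __init__(self) -> None:
        self._chunks: dict[str, list[tuple[str, Counter]]] = {}

    def add(self, workspace: str, chunk: str) -> None:
        self._chunks.setdefault(workspace, []).append((chunk, embed(chunk)))

    def search(self, workspace: str, question: str, top_k: int = 2) -> list[tuple[float, str]]:
        """Return up to top_k (score, chunk) pairs from this workspace only."""
        query = embed(question)
        scored = [
            (cosine_similarity(query, vector), chunk)
            for chunk, vector in self._chunks.get(workspace, [])
        ]
        # Drop chunks that share nothing with the question.
        scored = [item for item in scored if item[0] > 0]
        return sorted(scored, key=lambda item: item[0], reverse=True)[:top_k]


store = WorkspaceStore()
store.add("marketing", "Articles for the Content Hub must be reviewed before publishing.")
store.add("marketing", "Each article needs a cover image and a short summary.")
store.add("qa", "Unit tests should cover every public method of a class.")

question = "Who reviews articles before publishing?"
print(store.search("marketing", question))  # finds the Content Hub guideline
print(store.search("qa", question))         # nothing relevant in this workspace
```

The same question returns a relevant chunk in the workspace that holds the document and nothing in the one that does not, which is why the model cannot answer about documents outside the environment where they were uploaded.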

When not to use

Every technique related to AI personalization has its peculiarities and more suitable use cases. The RAG architecture, for example, is not recommended in situations where there are many rules to be considered, such as teaching a new language to the model (whether programming or a spoken language).

In summary, the technique is not suitable for introducing capabilities that the AI model did not previously possess, but it is excellent for answering questions about the documents and databases provided to the application.


Case: AI to prevent setbacks based on summaries

A system developed by CWI for a real estate group, for example, applies the RAG architecture. The software uses artificial intelligence to summarize documents, but it’s not the summary itself that truly generates value for our client.

From the summaries, the group can anticipate impactful movements and act to prevent certain setbacks from occurring. In other words, an application developed with AI and RAG architecture directly and positively impacts the financial results of the company.

Count on CWI to utilize this architecture in AI customization projects for your company. Explore other case studies and get in touch with us!
