Fine-tuning: the process of specializing and customizing an artificial intelligence
If you’re considering the need for customized artificial intelligence for your company, the fine-tuning process is one option you should consider.
The definitions from the Cambridge and Merriam-Webster dictionaries provide a good starting point for understanding what the term means: to fine-tune is “to make small changes to something so that it works as well as possible” (Cambridge), or “to adjust precisely so as to bring to the highest level of performance or effectiveness” (Merriam-Webster).
In the context of a neural network, this is what we do when we have a pre-trained model, like a Large Language Model (LLM), and we want to customize and specialize it in some way.
OpenAI’s GPT, Google’s Gemini and LaMDA, and Meta’s LLaMA are some of the LLMs on the market. With fine-tuning, we can leverage everything GPT, for instance, has already been trained to do and adapt it to meet your company’s specific needs, including training on proprietary data.
Keep reading to learn what to consider when deciding if this is the best process for your situation!
When to Choose Fine-Tuning
The fine-tuning process in artificial intelligence involves training a model on a smaller and more specific dataset to perform a particular task. The result is an AI tailored to the needs of the user.
In simple terms, there are three main stages in fine-tuning. The first involves preparing and uploading the training data. The second is training the model on that data to produce the adjusted version. Finally, the third stage requires analyzing the results and, if necessary, going back to the first stage.
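The first stage, preparing the training data, can be sketched in a few lines. The snippet below builds a small JSONL file in the chat format OpenAI’s fine-tuning documentation describes (one JSON object per line, each with a `messages` list of system/user/assistant turns). The company name and question-answer pairs are hypothetical placeholders, and real fine-tuning datasets need many more examples than this.

```python
import json

# Hypothetical training examples for a support-bot fine-tune.
# Each line of the JSONL file is one conversation: a system prompt,
# a user message, and the assistant reply we want the model to learn.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a helpful support assistant for Acme Corp."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Go to Settings > Security and click 'Reset password'."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a helpful support assistant for Acme Corp."},
        {"role": "user", "content": "What are your support hours?"},
        {"role": "assistant", "content": "Our team is available Monday to Friday, 9am to 6pm."},
    ]},
]

# Write one JSON object per line — the format expected when uploading
# a training file for chat-model fine-tuning.
with open("training_data.jsonl", "w", encoding="utf-8") as f:
    for example in examples:
        f.write(json.dumps(example, ensure_ascii=False) + "\n")
```

Once a file like this is uploaded, the second stage (launching the training job) and the third (evaluating the resulting model) are run through the provider’s fine-tuning API or tooling.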
These stages require financial investment, time, and effort, which must be weighed against other approaches to enhancing artificial intelligence. Consider cost, quality, and ease of use, keeping in mind the quality of the data available to improve the AI.
OpenAI, for instance, recommends trying to improve responses with other techniques before opting for fine-tuning. The creators of ChatGPT suggest:
- prompt engineering,
- prompt chaining (breaking complex tasks into multiple prompts),
- and function calling.
Another technique is Retrieval Augmented Generation (RAG). If this architecture and prompt optimization don’t provide the required solution, then it’s time to consider fine-tuning. Some cases cited by OpenAI where fine-tuning might improve results include:
- Defining style, tone, format or other qualitative aspects;
- Improving reliability in producing the desired outcome;
- Correcting failures to follow complex instructions;
- Handling activities that involve many specific edge cases;
- Executing a new skill or task that’s hard to articulate quickly.
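The RAG alternative mentioned above can be illustrated with a minimal sketch: retrieve the most relevant document for a query, then inject it into the prompt as context. The documents, queries, and the word-overlap scoring below are all simplified placeholders; production RAG systems typically use vector embeddings rather than word matching.

```python
# Minimal sketch of Retrieval Augmented Generation (RAG):
# pick the document most relevant to the query, then build a prompt
# that grounds the model's answer in that retrieved context.
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping takes 3 to 5 business days within the country.",
]

def retrieve(query: str) -> str:
    # Toy relevance score: number of words shared with the query.
    q_words = set(query.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_rag_prompt(query: str) -> str:
    context = retrieve(query)
    return f"Context: {context}\n\nQuestion: {query}\nAnswer:"

print(build_rag_prompt("What is the refund policy for returns"))
```

Because the context is fetched at request time, RAG handles fresh or frequently changing information — exactly the situations where fine-tuning falls short, as noted below.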
Situations that require real-time data or where information changes rapidly aren’t solved with fine-tuning. Customizing a model is a process that takes days, and once it’s completed, the data may have changed, leaving the model with outdated knowledge.
Understanding the Anatomy of a Neural Network
Fine-tuning is done on specific layers of a neural network. Typically, the deeper layers of the model are adjusted while keeping the initial layers fixed.
This approach is taken because the initial layers capture more generic features, such as basic sentence structure and common patterns in language. These features are typically kept from the original model without being updated during training. The deeper the layer, the more task-specific the patterns it captures.
To prevent drastic changes to already-learned features, one fine-tuning strategy is to use a lower learning rate. This makes the process more stable, allowing the model to retain previously learned knowledge.
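The two ideas above — freezing early layers and taking small update steps on the deeper ones — can be sketched with plain numbers. The weights, gradients, and layer count below are toy values for illustration; in practice this is done with a deep learning framework by disabling gradient tracking on the frozen parameters.

```python
# Toy sketch of selective fine-tuning: early layers are frozen,
# deeper layers are updated with a deliberately small learning rate.
layers = [1.0, 1.0, 1.0, 1.0]      # pretrained weights, shallow -> deep
gradients = [0.5, 0.5, 0.5, 0.5]   # gradients from the new task's loss
freeze_up_to = 2                    # keep the first two layers fixed
learning_rate = 0.01                # small step to avoid drastic changes

for i in range(len(layers)):
    if i < freeze_up_to:
        continue                    # frozen: generic features stay intact
    layers[i] -= learning_rate * gradients[i]

print(layers)  # early layers unchanged, deeper ones nudged slightly
```

The small learning rate keeps each update gentle, so the deeper layers adapt to the new task without overwriting what the model already knows.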
Benefits of the Fine-Tuning Process
It’s possible to improve AI model outputs by including instructions, examples and demonstrations of how to perform a task in a prompt. This technique is known as “few-shot learning.”
However, according to OpenAI:
“Fine-tuning enhances few-shot learning by training on many more examples than can fit into a prompt, allowing you to achieve better results across a wide range of tasks. Once a model has been fine-tuned, you won’t need to provide as many examples in the prompt, reducing costs and allowing for lower-latency requests.”
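To make the trade-off in that quote concrete, here is a sketch of few-shot prompting: demonstrations are packed into the prompt itself, so every request pays for those example tokens. The translation pairs and the Q/A template are hypothetical; after fine-tuning on such examples, the deployed prompt could shrink to just the new input.

```python
# Few-shot prompting: demonstrations travel inside every prompt.
# Fine-tuning moves these examples into training data instead,
# allowing shorter (cheaper, lower-latency) prompts at inference time.
few_shot_examples = [
    ("Translate to French: hello", "bonjour"),
    ("Translate to French: thank you", "merci"),
]

def build_prompt(query: str) -> str:
    parts = [f"Q: {q}\nA: {a}" for q, a in few_shot_examples]
    parts.append(f"Q: {query}\nA:")  # the actual request comes last
    return "\n\n".join(parts)

prompt = build_prompt("Translate to French: goodbye")
print(prompt)
```

A fine-tuned model would be sent only the final question, since the demonstrated behavior is already baked into its weights.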
Learn more about these and other fine-tuning benefits in the following sections!
Higher accuracy
When a Large Language Model (LLM) provides incorrect or fabricated information, it’s called a hallucination. With the fine-tuning process, the model is more likely to respond based on the data used for its adjustment.
Resource savings
Compared to the investment required to train a neural network from scratch, fine-tuning requires significantly less time and computational resources. Additionally, in the case of fine-tuning GPT, OpenAI highlights the savings in tokens due to shorter prompts.
Higher quality
The creators of ChatGPT also point out the improved quality of results as a benefit of fine-tuning, compared to using prompts alone. They note that with fine-tuning, you can train with more examples than can fit in a prompt, as mentioned earlier.
Greater agility
The increase in agility occurs in two ways. One is the previously mentioned lower latency in requests. Additionally, the training process itself is faster, since the fundamental features have already been learned.
CWI performs fine-tuning to deliver the best results with artificial intelligence tools. We can integrate with major LLMs, whether as a service or on-premise, and switch between models to determine which best meets your business needs.
Count on us to support your growth!
BOUCHARD, Louis-François. How to Improve your LLM? Find the Best & Cheapest Solution. Accessed on: April 12, 2024.
BUHL, Nikolaj. Training vs. Fine-tuning: What is the Difference? Accessed on: April 10, 2024.
FINE-TUNE. In: Merriam-Webster. Accessed on: April 10, 2024.
FINE-TUNING. In: Cambridge English Dictionary. Accessed on: April 10, 2024.
OPENAI. Fine-tuning. Accessed on: April 09, 2024.
SANTOS, André. IA Sob Medida: A Arte do Fine-Tuning em LLMs. Accessed on: April 11, 2024.
TORRES, Moisés Barrios. O que é o fine-tuning (ajuste fino) e como ele funciona nas redes neurais? Accessed on: April 09, 2024.