What is fine-tuning? How to fine-tune the GPT-3.5 model?
Fine-tuning is an effective way to make ChatGPT's output match our expectations. OpenAI recently released the fine-tuning feature for GPT-3.5 models. In this introductory article, we will first introduce the concept of fine-tuning, then explain how to use OpenAI's Fine-tuning API, and finally use practical examples to explain how to do fine-tuning well.
What is Fine-tuning? What are the benefits?
In the field of machine learning, fine-tuning is the process of taking a model that has already been trained and adjusting it so that its output better matches your expectations. Because we build on an existing model, we do not need to train a new one from scratch, which saves us the high cost of training.
Fine-tuning is simple: you prepare pairs of training examples, feed them to the Fine-tuning API, and you're done. By pairs, I mean inputs paired with outputs, such as "Who won the men's singles gold medal in badminton at the Tokyo Olympics?" together with what you consider the ideal output for that input, e.g., "The men's singles gold medalist at the Tokyo Olympics is Danish badminton player Viktor Axelsen".
Fine-tuning brings the following benefits. First, a well fine-tuned model can produce better output than prompting alone. Second, you can get the desired output with shorter instructions, which reduces token usage, which in turn lowers cost and speeds up the response.
This may be a bit abstract, so let's take an example. Suppose you're building a smart customer service bot for an airport, and you want the bot to answer questions based on airport regulations. If you simply plug in ChatGPT as is, it might not work as well as it should. When you type "Can I bring pork products into the country?" you'll get the following reply:
The answer to this question depends on the laws and regulations of the country you are in and the country you are entering. In general, pork products may be restricted or prohibited from entering certain countries, especially those where swine fever or other pork diseases are present. If you plan to bring pork products into a country, it is recommended that you check with your local customs or immigration office for information and regulations. In addition, if you are unsure whether you can bring certain items into the country, it is best not to take the risk of violating local laws and regulations.
This reply is actually quite good, but it's a bit too broad and not specific to airport regulations. If you want ChatGPT to be airport-specific, you need to be more explicit in your instructions, e.g., "You are now an expert on airport regulations, and I'm going to the airport, so can I bring pork products into the country?"
However, there are a couple of problems with this. One is that your instructions become longer, so token consumption goes up; the other is that even with more precise instructions, the output may still be too broad, which makes it less suitable for enterprise-level scenarios.
What fine-tuning gives you is that, once the model is fine-tuned, the short instruction "Can I bring pork products into the country?" is enough to get an answer based on Taoyuan Airport's regulations. Now that you know what fine-tuning can do, I'm sure you can't wait to bring its benefits to your enterprise products. Let's take a look at how it is done.
How to use the Fine-tuning API?
We can fine-tune a model with the Fine-tuning API provided by OpenAI. What follows is a brief walkthrough; for more information, refer to the official OpenAI documentation.
Fine-tuning can be broken down into the following three steps:
Prepare the training material
Train a fine-tuned model
Use the fine-tuned model
Let's first look at how to prepare the training material. The Fine-tuning API currently accepts JSONL. Each training example provides a conversation in the following format (spread over several lines here for readability; in the JSONL file, each example occupies a single line):
{
"messages": [
{ "role": "system", "content": "" },
{ "role": "user", "content": "" },
{ "role": "assistant", "content": "" }
]
}
Using the airport example above, you can provide the following training example:
{
"messages": [
{ "role": "system", "content": "You're the airport's intelligent assistant, and you specialize in answering any airport-related questions and regulations." },
{ "role": "user", "content": "Can I bring pork products into the country?" },
{ "role": "assistant", "content": "According to the Reference Table of Quarantine Provisions for Animals, Plants or Their Products Commonly Carried by Inbound Travelers, travelers carrying meat or processed meat (including vacuum-packed meat) into the country are required to declare it for quarantine to the Bureau of Prevention and Inspection (BPI). The products may not be brought in if they lack an animal quarantine certificate from the exporting country or if they fail the BPI's quarantine inspection." }
]
}
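Hand-written training files are easy to get wrong, so it can help to generate and check them with a script. The following is a minimal sketch in Python, assuming only the chat format described above. The helper names (make_example, validate_example, to_jsonl) are hypothetical and not part of any OpenAI library, and the checks are our own sanity checks rather than OpenAI's official validation rules.

```python
import json

# Roles that appear in the training format described in this article.
ALLOWED_ROLES = {"system", "user", "assistant"}


def make_example(system, user, assistant):
    """Build one training example in the chat format."""
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }


def validate_example(example):
    """Return a list of problems found in one example (empty if none)."""
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: must be an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unknown role {msg.get('role')!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i}: missing or empty 'content'")
    has_answer = any(
        isinstance(m, dict) and m.get("role") == "assistant" for m in messages
    )
    if not has_answer:
        problems.append("no assistant message to learn from")
    return problems


def to_jsonl(examples):
    """Serialize examples as JSONL: one compact JSON object per line."""
    lines = (json.dumps(ex, ensure_ascii=False) for ex in examples)
    return "\n".join(lines) + "\n"
```

Running validate_example over every example before writing the file catches common slips, such as a message that has a role but no "content" key.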
After you have prepared your training material, you can upload it via the File API provided by OpenAI and get the id of the file, which you will need when calling the Fine-tuning API.
curl https://api.openai.com/v1/files \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-F "purpose=fine-tune" \
-F "file=@path_to_your_file"
Then you create a fine-tuning job, which uses the training file to fine-tune the base model. This can be done through the API. The TRAINING_FILE_ID below is the id of the file you uploaded above, and model is the base model you want to fine-tune, which is gpt-3.5-turbo-0613 in this example.
curl https://api.openai.com/v1/fine_tuning/jobs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"training_file": "TRAINING_FILE_ID",
"model": "gpt-3.5-turbo-0613"
}'
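The same request can be issued from a script, which also makes it easier to check on the job afterwards. Below is a minimal sketch using only the Python standard library. The endpoint and request body mirror the curl call above. The function names are hypothetical, and the set of terminal status values and the "status" field are assumptions on our part, so check them against the official documentation.

```python
import json
import time
import urllib.request

API_BASE = "https://api.openai.com/v1"
# Assumption: statuses at which a job stops changing.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def build_job_request(training_file_id, api_key, model="gpt-3.5-turbo-0613"):
    """Prepare (but do not send) the POST that creates a fine-tuning job."""
    body = json.dumps({"training_file": training_file_id, "model": model})
    return urllib.request.Request(
        f"{API_BASE}/fine_tuning/jobs",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON reply (network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(fetch_job, poll_seconds=30, sleep=time.sleep):
    """Poll until the job reaches a terminal status, then return it.

    fetch_job is any zero-argument callable that returns the job as a dict,
    which keeps this loop independent of how the job is retrieved.
    """
    while True:
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(poll_seconds)
```

In practice you would call send(build_job_request(file_id, api_key)) once, and then pass wait_for_job a function that retrieves the job by its id. Keeping the retrieval behind fetch_job lets you try out the polling logic without touching the network.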
After completing the above steps, you need to wait for OpenAI to finish the fine-tuning job; once it is done, you can use the model. Usage is the same as with the ChatGPT API, except that the model name is that of your fine-tuned model, which includes your org_id.
curl https://api.openai.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "ft:gpt-3.5-turbo:org_id",
"messages": [
{
"role": "system",
"content": "You're the airport's intelligent assistant, and you specialize in answering any airport-related questions and regulations."
},
{
"role": "user",
"content": "Hi, I have some entry related questions."
}
]
}'
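For use inside an application, the call above can be wrapped in two small functions: one that builds the request and one that pulls the answer out of the response. This is a sketch using only the Python standard library. The function names are hypothetical, and extract_reply assumes the usual chat completions response shape, where the text sits under choices[0].message.content.

```python
import json
import urllib.request

CHAT_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(model, system_prompt, user_message, api_key):
    """Prepare (but do not send) a chat request to a fine-tuned model."""
    payload = {
        "model": model,  # e.g. the "ft:..." name of your fine-tuned model
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_reply(response_json):
    """Return the assistant's text from a decoded chat response.

    Assumes the response has the shape choices[0].message.content.
    """
    return response_json["choices"][0]["message"]["content"]
```

Sending the prepared request with urllib.request.urlopen and decoding the body with json.load gives the dictionary that extract_reply expects. Note that the system prompt should match the one used in the training material.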
Because OpenAI has designed the Fine-tuning API well, fine-tuning is easy once you have prepared the training material. However, there are still some details to pay attention to. Let's go through them in the next section.
Notes on using the Fine-tuning API
The benefit of fine-tuning is that it makes the model more steerable, so that it can be customized to your use case. However, this comes with two costs. The first is the cost of the API itself: the cost of fine-tuning plus the cost of using the fine-tuned model is about 6-7 times that of the original GPT-3.5 model.
The second cost is labor. Take the projects we have helped enterprises with in the past: fine-tuning requires several parties to work together, including the team responsible for the model, the product team, and the business team. In a customer service scenario, which most people are familiar with, you need senior front-line customer service staff to decide what counts as a good answer and to organize that information for the team in charge of the model, plus a product manager to coordinate the whole process.
This rarely succeeds in one pass; it usually takes several iterations. If the output is still not as good as expected after fine-tuning, you need to call a team meeting, re-examine the training material, and spend time making corrections before another round of fine-tuning. Going from project kickoff to production takes a month at the fastest, and often at least a quarter. Such time and labor costs should never be ignored.
Although a fine-tuned model may outperform GPT-4 in some cases, this is not guaranteed. The original GPT-3.5 model costs less than one tenth as much as GPT-4, but after fine-tuning the cost rises to about one third of GPT-4's. Once you include the labor cost of fine-tuning, the fine-tuned GPT-3.5 may end up more expensive, so if it still does not perform as well as GPT-4 with embedding, you might as well use GPT-4 with embedding.
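To make this trade-off concrete, here is a small sketch of the arithmetic. The prices are relative units derived only from the ratios quoted in this article (fine-tuned GPT-3.5 at roughly 7 times the base model, and about one third of GPT-4), not actual OpenAI prices. The labor cost figure is a purely hypothetical placeholder, and the calculation ignores any difference in prompt length between the two approaches.

```python
# Relative per-token prices, derived from the ratios quoted in the article.
# These are illustrative units, NOT real price-list values.
BASE_GPT35 = 1.0
FINE_TUNED_GPT35 = 7.0 * BASE_GPT35  # "about 6-7 times" the base model
GPT4 = 3.0 * FINE_TUNED_GPT35        # fine-tuned is "about one third" of GPT-4


def total_cost(unit_price, tokens, one_off_cost=0.0):
    """Usage cost plus any one-off cost (e.g. labor), in relative units."""
    return unit_price * tokens + one_off_cost


def break_even_tokens(one_off_cost, cheap_price, expensive_price):
    """Tokens needed before the cheaper option repays its one-off cost."""
    if cheap_price >= expensive_price:
        raise ValueError("cheap_price must be lower than expensive_price")
    return one_off_cost / (expensive_price - cheap_price)


# Hypothetical labor cost of a fine-tuning project, in the same units.
LABOR_COST = 1_400_000

tokens_needed = break_even_tokens(LABOR_COST, FINE_TUNED_GPT35, GPT4)
print(f"Fine-tuning pays off after about {tokens_needed:,.0f} tokens")
```

Below that volume, GPT-4 is the cheaper option overall under these assumptions; above it, the fine-tuned model wins on cost, provided its output quality is acceptable.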
Some people may ask how to choose between fine-tuning and embedding, and how to judge which one is better. The answer depends on the context and usually requires someone who knows the business well (in a customer service scenario, for example, a senior customer service agent). In general, fine-tuning improves the controllability of the model, so that it leans towards the tone of voice you prefer, while embedding lets you keep adding new information on the fly. In fact, the two do not conflict and can be used together. In the end, it all comes down to whether the benefits are worth the cost.
Therefore, our suggestion is to start with a small-scale test, compare the costs and benefits carefully, and make sure fine-tuning is really worthwhile before doing it on a large scale.