CN117708307B - Method and device for fusing micro-tuning and Adapter of large language model - Google Patents


Publication number
CN117708307B
CN117708307B (application CN202410170139.4A)
Authority
CN
China
Prior art keywords
question, answer, dialogue, adapter, lora
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410170139.4A
Other languages
Chinese (zh)
Other versions
CN117708307A (en)
Inventor
王震
高德宏
马宇飞
蔡晓妍
杨黎斌
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority claimed from application CN202410170139.4A
Publication of CN117708307A
Application granted
Publication of CN117708307B
Legal status: Active (granted)


Classifications

    • Y: General tagging of new technological developments; general tagging of cross-sectional technologies spanning over several sections of the IPC; technical subjects covered by former USPC cross-reference art collections [XRACs] and digests
    • Y02: Technologies or applications for mitigation or adaptation against climate change
    • Y02T: Climate change mitigation technologies related to transportation
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a large language model fine-tuning and Adapter fusion method and device, relating to the field of deep learning. The method addresses the heavy manual data-collection effort and poor data quality involved in constructing existing multi-functional datasets. The method comprises the following steps: collecting a plurality of question-answer datasets and dialogue datasets from a set network platform; performing LoRA-adapter fine-tuning on the question-answer datasets and the dialogue datasets respectively to obtain, in turn, a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function; obtaining an ideal loss function, ideal fusion weights and first ideal parameters of the question-answer datasets and dialogue datasets in an ideal state; obtaining the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters; and obtaining the general LoRA-adapter from the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.

Description

Method and device for large language model fine-tuning and Adapter fusion
Technical Field
The invention relates to the field of deep learning, and in particular to a large language model fine-tuning and Adapter fusion method and device.
Background
Training large language models has important scientific-research and application value: it can improve the performance of natural language processing tasks, enhance the interactive experience of dialogue systems, and advance scientific research, technological innovation and the broader development of artificial intelligence. By training on a massive corpus, a large language model can learn rich linguistic knowledge and grammar rules, and thus performs better on natural language processing tasks such as machine translation, text generation and text classification. These models are able to understand and generate more accurate and fluent natural language. Large language models can also be used to build intelligent dialogue systems that provide more natural, accurate and personalized replies through dialogue with users. A trained model can understand and generate human language, better meeting users' needs and improving the interactive experience of a dialogue system. Training a large language model requires processing massive amounts of data and enormous computing resources, which is of great significance in promoting scientific research and technological innovation. Many technical challenges must be solved in the process, such as data processing, model design and training algorithms, and solving these challenges has a positive driving effect on research and development in related fields.
The conventional scheme currently adopted for large language model training is to collect a large amount of instruction fine-tuning data, fuse it into one large-scale dataset, and fine-tune an open-source large language model on that dataset. However, fusing multiple datasets into one multi-functional dataset is problematic: on the one hand, contradictions may exist between different datasets and the quality of the data is difficult to evaluate; on the other hand, these datasets consist of examples of various specific tasks such as mathematics, coding, role playing and creative writing. If such datasets are blended and the model is fine-tuned on the blended dataset, the performance of the large language model may degrade, sometimes severely.
Disclosure of Invention
The embodiment of the invention provides a large language model fine-tuning and Adapter fusion method and device, which can prevent the performance degradation caused by conflicts between different datasets in the semantic space.
The embodiment of the invention provides a large language model fine-tuning and Adapter fusion method, which comprises the following steps:
Collecting a plurality of question-answer data sets and dialogue data sets from a set network platform, and respectively performing LoRA-adapter fine tuning on the question-answer data sets and the dialogue data sets to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;
Obtaining an ideal loss function of the question-answer datasets and the dialogue datasets in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and the initial fusion weights associated with the fine-tuning of each LoRA-adapter, and obtaining the ideal fusion weights and first ideal parameters corresponding to the ideal loss function from the minimum value of the ideal loss function; wherein the first ideal parameters represent all the LoRA-adapters added to the question-answer large language model and the dialogue large language model, respectively;
according to the ideal loss function, fine tuning is carried out on the question-answer LoRA-adapter corresponding to each question-answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set, so as to respectively obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters;
and obtaining the general LoRA-adapter according to the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.
Preferably, the performing LoRA-adapter fine tuning on the question-answer dataset sequentially obtains a question-answer large language model and a question-answer negative log likelihood loss function, which specifically includes:
Training the question-answer data set to obtain a question-answer LoRA-adapter, and obtaining the question-answer large language model according to the question-answer LoRA-adapter and the question-answer data set;
obtaining the question-answer negative log-likelihood loss function according to the question-answer large language model and the token of the question-answer large language model;
the question-answer dataset, the question-answer large language model, and the question-answer negative log-likelihood loss function are as follows:

$$Q_i=\{(s_{i,j},q_{i,j},r_{i,j})\}_{j=1}^{|Q_i|}$$

$$p_\theta\left(r_{i,j}\mid s_{i,j},q_{i,j};A_{Q_i}\right)=\prod_{k=1}^{|r_{i,j}|}p_\theta\left(r_k\mid s_{i,j},q_{i,j},r_{<k};A_{Q_i}\right)$$

$$\mathcal{L}_{Q_i}\left(A_{Q_i}\right)=-\sum_{j=1}^{|Q_i|}\log p_\theta\left(r_{i,j}\mid s_{i,j},q_{i,j};A_{Q_i}\right)$$

where $Q_i$ represents the $i$-th question-answer dataset; $s_{i,j}$, $q_{i,j}$ and $r_{i,j}$ represent the $j$-th system message, question and reply of the $i$-th question-answer dataset, respectively; $|Q_i|$ represents the length of the question-answer dataset $Q_i$; $A_{Q_i}$ represents the question-answer LoRA-adapter trained on the question-answer dataset $Q_i$; $|r_{i,j}|$ represents the length of $r_{i,j}$; $r_k$ represents the $k$-th token generated by the large language model; $p_\theta$ represents the large language model; $\theta$ represents the frozen parameters of the large language model; and $\mathcal{L}_{Q_i}$ represents the question-answer negative log-likelihood loss function.
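To make the loss concrete, the following is a minimal Python sketch of the question-answer negative log-likelihood; the per-token probabilities are placeholder numbers standing in for the model outputs $p_\theta(r_k\mid\cdot)$, not calls to an actual large language model.

```python
import math

def qa_nll_loss(token_probs):
    """NLL of one reply r_{i,j}: the sum over its tokens of
    -log p(r_k | s_{i,j}, q_{i,j}, r_<k)."""
    return -sum(math.log(p) for p in token_probs)

def dataset_qa_loss(replies_token_probs):
    """Loss over the whole question-answer dataset Q_i: the sum of the
    per-reply negative log-likelihoods."""
    return sum(qa_nll_loss(probs) for probs in replies_token_probs)

# Toy dataset with two replies; probabilities are placeholders.
loss = dataset_qa_loss([[0.9, 0.8], [0.5]])
```

A perfectly predicted reply (all probabilities 1.0) contributes zero loss; less likely gold tokens increase it.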
Preferably, the performing LoRA-adapter fine tuning on the dialogue data set sequentially obtains a dialogue large language model and a dialogue negative log likelihood loss function, which specifically includes:
Training the dialogue data set to obtain a dialogue LoRA-adapter, and obtaining a dialogue large language model according to the dialogue LoRA-adapter and the dialogue data set;
obtaining the dialogue negative log likelihood loss function according to the dialogue large language model and the token of the dialogue large language model;
The dialogue dataset, dialogue large language model, and the dialogue negative log-likelihood loss function are as follows:

$$C_i=\left\{\left(q_{i,j}^{(1)},r_{i,j}^{(1)},\ldots,q_{i,j}^{(T)},r_{i,j}^{(T)}\right)\right\}_{j=1}^{|C_i|}$$

$$p_\theta\left(R_j\mid Q_j;A_{C_i}\right)=\prod_{t=1}^{T}p_\theta\left(r_{i,j}^{(t)}\mid q_{i,j}^{(1)},r_{i,j}^{(1)},\ldots,q_{i,j}^{(t)};A_{C_i}\right)$$

$$\mathcal{L}_{C_i}\left(A_{C_i}\right)=-\sum_{j=1}^{|C_i|}\sum_{t=1}^{T}\log p_\theta\left(r_{i,j}^{(t)}\mid q_{i,j}^{(1)},r_{i,j}^{(1)},\ldots,q_{i,j}^{(t)};A_{C_i}\right)$$

where $C_i$ denotes the $i$-th dialogue dataset; $q_{i,j}^{(T)}$ the $j$-th query of the $i$-th dialogue dataset in the $T$-th round; $r_{i,j}^{(T)}$ the $j$-th reply of the $i$-th dialogue dataset in the $T$-th round; $|C_i|$ the length of the dialogue dataset $C_i$; $A_{C_i}$ the dialogue LoRA-adapter trained on the dialogue dataset $C_i$; $Q_j$ all tokens belonging to the user queries; $R_j$ the target (reply) tokens; $\mathcal{L}_{C_i}$ the dialogue negative log-likelihood loss function; $p_\theta$ the large language model; and $\theta$ the frozen parameters of the large language model.
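The dialogue loss differs from the question-answer loss mainly in that only reply tokens (the targets $R_j$) contribute, while user-query tokens ($Q_j$) are masked out. A dependency-free Python sketch; the probabilities and mask below are illustrative placeholders, not model output.

```python
import math

def dialogue_nll_loss(token_probs, is_reply_token):
    """Multi-round dialogue NLL: only tokens belonging to the replies
    (target tokens R_j) contribute; user-query tokens Q_j are skipped."""
    return -sum(math.log(p)
                for p, is_reply in zip(token_probs, is_reply_token)
                if is_reply)

# One toy round: two query tokens (masked out) then two reply tokens.
probs = [0.3, 0.4, 0.9, 0.8]
mask  = [False, False, True, True]
loss = dialogue_nll_loss(probs, mask)
```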
Preferably, the ideal loss function is as follows:

$$L(A,\omega)=\sum_{i=1}^{M}\omega_{Q_i}\mathcal{L}_{Q_i}\left(A_{Q_i}\right)+\sum_{i=1}^{N}\omega_{C_i}\mathcal{L}_{C_i}\left(A_{C_i}\right)$$

The minimum of the ideal loss function is as follows:

$$\left(A^{*},\omega^{*}\right)=\underset{A,\omega}{\arg\min}\,L(A,\omega)$$

where $L$ represents the ideal loss function; $\omega_{Q_i}$ the initial fusion weight of $A_{Q_i}$ obtained by fine-tuning on the question-answer dataset $Q_i$; $\omega_{C_i}$ the initial fusion weight of $A_{C_i}$ obtained by fine-tuning on the dialogue dataset $C_i$; $A^{*}$ all the first ideal parameters; $\omega^{*}$ all the ideal fusion weights; $A$ the first ideal parameters; and $\omega$ the ideal fusion weights.
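The weighted sum defining the ideal loss is straightforward to express in code; this sketch takes the per-adapter losses as already-computed numbers rather than evaluating real models.

```python
def ideal_loss(qa_losses, dialogue_losses, qa_weights, dialogue_weights):
    """Ideal loss L(A, w): the fusion-weighted sum of the M question-answer
    losses L_{Q_i} and the N dialogue losses L_{C_i}."""
    assert len(qa_losses) == len(qa_weights)
    assert len(dialogue_losses) == len(dialogue_weights)
    return (sum(w * l for w, l in zip(qa_weights, qa_losses)) +
            sum(w * l for w, l in zip(dialogue_weights, dialogue_losses)))
```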
Preferably, the first ideal parameter is as follows:

$$A=\left\{A_{Q_1},\ldots,A_{Q_M},A_{C_1},\ldots,A_{C_N}\right\}$$

The ideal fusion weights are as follows:

$$\omega=\left\{\omega_{Q_1},\ldots,\omega_{Q_M},\omega_{C_1},\ldots,\omega_{C_N}\right\}$$

where $A$ represents the first ideal parameters; $A_{Q_M}$ the first ideal parameter obtained by fine-tuning on the question-answer dataset $Q_M$; $A_{C_N}$ the first ideal parameter obtained by fine-tuning on the dialogue dataset $C_N$; $M$ the number of question-answer datasets; $N$ the number of dialogue datasets; $\omega$ the ideal fusion weights; $\omega_{Q_i}$ the ideal fusion weight of $A_{Q_i}$; and $\omega_{C_i}$ the ideal fusion weight of $A_{C_i}$.
Preferably, the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter, and the optimal fusion parameters are as follows:

$$\left(A_{Q}^{**},A_{C}^{**},\omega^{**}\right)=\underset{A,\omega}{\arg\min}\left[\sum_{i=1}^{M}\omega_{Q_i}\mathcal{L}_{Q_i}\left(A_{Q_i}\right)+\sum_{i=1}^{N}\omega_{C_i}\mathcal{L}_{C_i}\left(A_{C_i}\right)\right]$$

where $A_{Q}^{**}$ represents the optimal parameters of the question-answer LoRA-adapters; $A_{C}^{**}$ the optimal parameters of the dialogue LoRA-adapters; $\omega^{**}$ the optimal fusion parameters; $\mathcal{L}_{Q_i}$ the question-answer negative log-likelihood loss function; $\mathcal{L}_{C_i}$ the dialogue negative log-likelihood loss function; $\omega_{Q_i}$ the initial fusion weight obtained by fine-tuning on the question-answer dataset $Q_i$; $\omega_{C_i}$ the initial fusion weight obtained by fine-tuning on the dialogue dataset $C_i$; and $A_{C_i}$ the dialogue LoRA-adapter trained on the dialogue dataset $C_i$.
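Once the optimal adapter parameters and fusion parameters are found, the general LoRA-adapter amounts to a weighted combination of the individual low-rank updates. The sketch below assumes the combination rule $\Delta W=\sum_i \omega_i B_i A_i$, a common way to merge LoRA updates that the patent does not spell out explicitly, and uses rank-1 factors to stay dependency-free.

```python
def merge_lora_adapters(adapters, weights):
    """Fuse several trained LoRA-adapters into one general adapter via a
    weighted sum of their low-rank updates: delta_W = sum_i w_i * B_i @ A_i.
    Each adapter is a (B, A) pair given here as rank-1 factors (B a column
    vector, A a row vector) to keep the sketch free of dependencies."""
    d = len(adapters[0][0])  # output dimension
    k = len(adapters[0][1])  # input dimension
    delta_w = [[0.0] * k for _ in range(d)]
    for (b, a), w in zip(adapters, weights):
        for i in range(d):
            for j in range(k):
                delta_w[i][j] += w * b[i] * a[j]
    return delta_w

# Two toy adapters fused with weights 0.7 and 0.3.
merged = merge_lora_adapters([([1.0, 0.0], [1.0, 1.0]),
                              ([0.0, 1.0], [2.0, 0.0])],
                             [0.7, 0.3])
```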
The embodiment of the invention provides a large language model fine-tuning and Adapter fusion device, which comprises the following components:
a first obtaining unit for collecting a plurality of question-answer data sets and dialogue data sets from the set network platform; performing LoRA-adapter fine tuning on the question-answer dataset and the dialogue dataset respectively to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;
The second obtaining unit is used for obtaining an ideal loss function of the question-answer datasets and the dialogue datasets in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and the initial fusion weights associated with the fine-tuning of each LoRA-adapter, and for obtaining the ideal fusion weights and first ideal parameters corresponding to the ideal loss function from the minimum value of the ideal loss function; wherein the first ideal parameters represent all the LoRA-adapters added to the question-answer large language model and the dialogue large language model, respectively;
The third obtaining unit is configured to fine tune the question answer LoRA-adapter corresponding to each question-answer dataset and the dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function, so as to obtain an optimal parameter of the question answer LoRA-adapter, an optimal parameter of the dialogue LoRA-adapter and an optimal fusion parameter respectively;
And a fourth obtaining unit, configured to obtain a general LoRA-adapter according to the optimal parameter of the question-answer LoRA-adapter, the optimal parameter of the dialogue LoRA-adapter, and the optimal fusion parameter.
The embodiment of the invention provides a computer device comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to execute the large language model fine-tuning and Adapter fusion method described in any one of the above.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to execute the large language model fine-tuning and Adapter fusion method described in any one of the above.
The embodiment of the invention provides a large language model fine-tuning and Adapter fusion method and device, wherein the method comprises: collecting a plurality of question-answer datasets and dialogue datasets from a set network platform; performing LoRA-adapter fine-tuning on the question-answer datasets and the dialogue datasets respectively to obtain, in turn, a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function; obtaining an ideal loss function of the question-answer datasets and dialogue datasets in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and the initial fusion weights associated with the fine-tuning of each LoRA-adapter, and obtaining the ideal fusion weights and first ideal parameters corresponding to the ideal loss function from the minimum value of the ideal loss function, wherein the first ideal parameters represent all the LoRA-adapters added to the question-answer large language model and the dialogue large language model, respectively; fine-tuning the question-answer LoRA-adapter corresponding to each question-answer dataset and the dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function, so as to obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters; and obtaining the general LoRA-adapter from the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.
According to the method, a plurality of instruction fine-tuning datasets are constructed, the quantization technique of QLoRA is used to save GPU (Graphics Processing Unit) consumption, and a large language model training scheme is provided that saves computing-resource cost while maintaining high quality. Meanwhile, a multi-LoRA-adapter fusion scheme based on Grid-Search (a hyperparameter search method) optimization is designed to fuse the trained LoRA-adapters. By fusing LoRA-adapters, semantic-space conflicts caused by dataset fusion can be effectively avoided, and the generalization performance of the large language model across multiple tasks is improved. This solves the prior-art problem of performance degradation caused by conflicts between different datasets in the semantic space.
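A minimal sketch of the Grid-Search optimization mentioned above: every combination of candidate fusion weights is evaluated and the combination minimizing a validation objective is kept. The objective here is a toy function; in the method it would be the fused loss of the weighted LoRA-adapters on held-out data.

```python
import itertools

def grid_search_weights(loss_fn, n_adapters,
                        grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Exhaustively try every weight combination from `grid` (one weight
    per LoRA-adapter) and return the combination with the lowest loss."""
    best_w, best_loss = None, float("inf")
    for w in itertools.product(grid, repeat=n_adapters):
        loss = loss_fn(w)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss

# Toy objective whose minimum lies at weights (0.5, 0.75).
best_w, best_loss = grid_search_weights(
    lambda w: (w[0] - 0.5) ** 2 + (w[1] - 0.75) ** 2, n_adapters=2)
```

The grid grows exponentially with the number of adapters, which is why the candidate set per weight is kept small.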
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of the large language model fine-tuning and Adapter fusion method provided by an embodiment of the invention;
FIG. 2 is a schematic structural diagram of the large language model fine-tuning and Adapter fusion device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Training a large language model not only aims at improving the performance of natural language processing tasks and improving the interactive experience of a dialogue system, but also has richer value and significance in scientific research and application fields.
Firstly, a large language model can learn rich language knowledge and grammar rules by training a massive corpus. The models can understand and generate more accurate and smoother natural language, and provide better performance for natural language processing tasks such as machine translation, text generation, text classification and the like. In the machine translation task, the large language model can more accurately understand the meaning of the source language and generate a more natural target language translation result. In a text generation task, a model can generate text content that is more logical and coherent. In the text classification task, the model can judge the type of the text more accurately, and the classification accuracy is improved.
Second, a large language model can be used to build an intelligent dialog system that provides a more natural, accurate and personalized reply by having a dialog with the user. This capability is very useful for chat robots, intelligent customer service, etc. in everyday life. The trained model can understand and generate human language, so that the requirements of users can be better met, and the interactive experience of a dialogue system is improved. The dialogue system can generate personalized replies through the model, so that the user can feel the same interaction experience as human dialogue, and the satisfaction degree of the user is enhanced.
In addition, training a large language model requires processing massive amounts of data and huge computing resources, which is of great significance in promoting scientific research and technological innovation. In training large language models, many technical challenges need to be addressed, such as data processing, model design, training algorithms, etc. The resolution of these challenges not only can drive the development of language models, but also helps in research and development of related fields. For example, by improving and optimizing the model, the efficiency and performance of the model can be improved, and technical support is provided for development and application of other natural language processing tasks.
Finally, training a large language model can provide intelligent natural language processing services and promote the development of the universality of artificial intelligence technology. These models can be applied to various fields such as education, medical treatment, finance, and the like. In the education field, the model can be used for assisting learning, intelligent answering and the like, and personalized learning resources and communication platforms are provided. In the medical field, the model can be used for assisting doctor diagnosis, intelligent medical record and the like, and the quality and efficiency of medical service are improved. In the financial field, the model can be used for intelligent customer service, risk management and the like, and more personalized and efficient financial services are provided.
In summary, training a large language model has important scientific research and application values, not only can improve the performance of natural language processing tasks and improve the interactive experience of a dialogue system, but also can promote the universality of scientific research, technical innovation and artificial intelligence development. By training a large language model, an intelligent natural language processing technology can be applied to various fields, and better intelligent service and solution can be provided for society.
Under the conventional scheme adopted by current large language model training, combining multiple datasets into one multi-functional dataset appears infeasible: contradictions may exist between different datasets, and the quality of the data is difficult to evaluate. Based on this, the embodiment of the invention provides an efficient training method for constructing a high-quality, highly capable large language model. Multiple open-source datasets on the Huggingface platform are cleaned and organized to obtain several different knowledge question-answer datasets and dialogue datasets; then one LoRA-adapter is trained independently on each dataset with QLoRA (quantized low-rank adaptation); finally the fusion weights of the LoRA-adapters are dynamically optimized using Grid-Search.
FIG. 1 is a schematic flow chart of the large language model fine-tuning and Adapter fusion method provided by an embodiment of the invention; as shown in FIG. 1, the method comprises the following steps:
Step 101, collecting a plurality of question and answer data sets and dialogue data sets from a set network platform; performing LoRA-adapter fine tuning on the question-answer dataset and the dialogue dataset respectively to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;
Step 102, obtaining an ideal loss function of the question-answer datasets and the dialogue datasets in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and the initial fusion weights associated with the fine-tuning of each LoRA-adapter, and obtaining the ideal fusion weights and first ideal parameters corresponding to the ideal loss function from the minimum value of the ideal loss function; wherein the first ideal parameters represent all the LoRA-adapters added to the question-answer large language model and the dialogue large language model, respectively;
Step 103, fine tuning the question-answer LoRA-adapter corresponding to each question-answer dataset and the dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function to respectively obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters;
And 104, obtaining the general LoRA-adapter according to the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters.
It should be noted that the embodiment of the invention provides a large language model fine-tuning and Adapter fusion method whose execution body is a processor.
In step 101, a plurality of question-answer data sets and dialogue data sets are collected from a setting network platform, where the setting network platform may be a Huggingface community, and in the embodiment of the present invention, the setting network platform is not specifically limited.
Specifically, after collecting a plurality of datasets from the set network platform, the datasets need to be cleaned, finally yielding a plurality of question-answer datasets and a plurality of dialogue datasets. In the embodiment of the present invention, the cleaning rules for the datasets are as follows:
1) Delete dialogues generated with ChatGPT (Chat Generative Pre-trained Transformer)-3.5-Turbo, keeping only dialogue instances generated with GPT-4; 2) delete dialogues in which GPT-4 refused to answer or replied only with a direct disclaimer; 3) delete dialogues in which the GPT-4 answer is empty or GPT-4 missed the answer; 4) delete dialogues containing toxic or illegal information; 5) delete dialogues containing the strings "OpenAI" or "ChatGPT", or replace such content with information that ensures the model has the correct identity; 6) delete user questions whose similarity to a reference question is greater than 85%; 7) split lengthy dialogue instances into dialogues that fit the model's maximum context length.
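The rules above can be sketched as a single filtering pass. The field names, the Jaccard word-overlap similarity, and the truncation stand-in for rule 7 are all illustrative assumptions, not details taken from the patent.

```python
import re

def similarity(a, b):
    """Toy Jaccard similarity over words; a placeholder for whatever
    similarity metric is actually used for rule 6."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / max(len(wa | wb), 1)

def clean_dialogues(dialogues, reference_questions, max_len=4096):
    """Apply the cleaning rules to a list of dialogue dicts with
    hypothetical fields: model, question, answer, toxic, illegal."""
    kept = []
    for d in dialogues:
        if d.get("model") != "GPT-4":                        # rule 1
            continue
        if not d.get("answer"):                              # rules 2-3
            continue
        if d.get("toxic") or d.get("illegal"):               # rule 4
            continue
        if re.search(r"OpenAI|ChatGPT", d["answer"]):        # rule 5 (delete variant)
            continue
        if any(similarity(d["question"], q) > 0.85
               for q in reference_questions):                # rule 6
            continue
        kept.append(d if len(d["answer"]) <= max_len else    # rule 7 (truncate
                    {**d, "answer": d["answer"][:max_len]})  # as a stand-in for splitting)
    return kept
```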
Through the above cleaning rules, the plurality of question-answer datasets and dialogue datasets required by the embodiment of the invention can finally be obtained.
In an embodiment of the invention, the question-answer dataset may be represented as { Q 1,Q2,…,QM }, and the dialogue dataset may be represented as { C 1,C2,…,CN }.
Specifically, the question-answer dataset may be represented by formula (1):

$$Q_i=\{(s_{i,j},q_{i,j},r_{i,j})\}_{j=1}^{|Q_i|}\tag{1}$$

where $s$ represents the system message, $q$ the user's question (query), and $r$ the artificial intelligence's reply; $Q_i$ represents the $i$-th question-answer dataset; $s_{i,j}$, $q_{i,j}$ and $r_{i,j}$ represent the $j$-th system message, question and reply of the $i$-th question-answer dataset, respectively; and $|Q_i|$ represents the length of the question-answer dataset $Q_i$.
In the embodiment of the invention, when the obtained question-answer dataset is subjected to LoRA-adapter fine-tuning, given a specific example $(s_{i,j},q_{i,j},r_{i,j})$, the large language model should learn to generate the corresponding reply $r_{i,j}$ from the system message $s_{i,j}$ and the query $q_{i,j}$. This process yields the question-answer large language model, as shown in formula (2):

$$p_\theta\left(r_{i,j}\mid s_{i,j},q_{i,j};A_{Q_i}\right)=\prod_{k=1}^{|r_{i,j}|}p_\theta\left(r_k\mid s_{i,j},q_{i,j},r_{<k};A_{Q_i}\right)\tag{2}$$

where $A_{Q_i}$ represents the question-answer LoRA-adapter trained on the question-answer dataset $Q_i$; $p_\theta$ the large language model; $|r_{i,j}|$ the length of $r_{i,j}$; $r_k$ the $k$-th token generated by the large language model; $\theta$ the frozen parameters of the large language model; and $r_{<k}$ all tokens of the reply preceding the $k$-th token.
Further, a question-answer negative log-likelihood loss function is obtained according to the question-answer large language model and the token of the question-answer large language model.
Wherein the question-answer negative log-likelihood loss function is as follows:

$$\mathcal{L}_{Q_i}\left(A_{Q_i}\right)=-\sum_{j=1}^{|Q_i|}\log p_\theta\left(r_{i,j}\mid s_{i,j},q_{i,j};A_{Q_i}\right)\tag{3}$$

where $\mathcal{L}_{Q_i}$ represents the question-answer negative log-likelihood loss function; $s_{i,j}$ the $j$-th system message of the $i$-th question-answer dataset; $q_{i,j}$ the $j$-th question of the $i$-th question-answer dataset; and $r_{i,j}$ the $j$-th reply of the $i$-th question-answer dataset.
Accordingly, the dialogue dataset may be represented by formula (4):

$$C_i=\left\{\left(q_{i,j}^{(1)},r_{i,j}^{(1)},\ldots,q_{i,j}^{(T)},r_{i,j}^{(T)}\right)\right\}_{j=1}^{|C_i|}\tag{4}$$

where the dialogue dataset comprises a plurality of dialogue instances with $T$ rounds; $C_i$ represents the $i$-th dialogue dataset; $q_{i,j}^{(T)}$ the $j$-th query of the $i$-th dialogue dataset in the $T$-th round; $r_{i,j}^{(T)}$ the $j$-th reply of the $i$-th dialogue dataset in the $T$-th round; and $|C_i|$ the length of the dialogue dataset $C_i$.
In embodiments of the present invention, when the dialogue dataset is subjected to LoRA-adapter fine-tuning, the large language model learns to predict each reply $r_{i,j}^{(t)}$ from the dialogue history and the queries before the given round $T$. This process yields the dialogue large language model, as shown in formula (5):

$$p_\theta\left(R_j\mid Q_j;A_{C_i}\right)=\prod_{t=1}^{T}p_\theta\left(r_{i,j}^{(t)}\mid q_{i,j}^{(1)},r_{i,j}^{(1)},\ldots,q_{i,j}^{(t)};A_{C_i}\right)\tag{5}$$

where $A_{C_i}$ represents the dialogue LoRA-adapter trained on the dialogue dataset $C_i$; $Q_j$ all tokens belonging to the user queries; and $R_j$ the target (reply) tokens.
Further, the dialogue negative log-likelihood loss function is obtained from the dialogue large language model and its tokens, and is as follows:

$$\mathcal{L}_{C_i}\left(A_{C_i}\right)=-\sum_{j=1}^{|C_i|}\sum_{t=1}^{T}\log p_\theta\left(r_{i,j}^{(t)}\mid q_{i,j}^{(1)},r_{i,j}^{(1)},\ldots,q_{i,j}^{(t)};A_{C_i}\right)\tag{6}$$

where $\mathcal{L}_{C_i}$ represents the dialogue negative log-likelihood loss function; $|C_i|$ the length of the dialogue dataset $C_i$; $p_\theta$ the large language model; $\theta$ the frozen parameters of the large language model; and $A_{C_i}$ the dialogue LoRA-adapter trained on the dialogue dataset $C_i$.
It should be noted that, in the embodiment of the present invention, for LoRA-adapter fusion, the loss function of each fine-tuned LoRA-adapter is given a trainable weight, and the weighted loss functions of all LoRA-adapters are jointly fine-tuned, where the fusion weights can be expressed as formula (7):

$$\omega=\left\{\omega_{Q_1},\ldots,\omega_{Q_M},\omega_{C_1},\ldots,\omega_{C_N}\right\}\tag{7}$$
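One way to keep the per-adapter loss weights trainable is to parametrize them through a softmax over unconstrained logits so they stay positive and sum to one. This normalization is an assumption of this sketch; the patent does not state how the weights are constrained.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of free parameters."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_fused_loss(logits, adapter_losses):
    """Fused loss with trainable fusion weights: each fine-tuned
    LoRA-adapter's loss is scaled by its softmax weight.  Returns the
    fused loss and the weights themselves."""
    weights = softmax(logits)
    return sum(w * l for w, l in zip(weights, adapter_losses)), weights
```

Because the weights are functions of free logits, a gradient-based optimizer (or the Grid-Search described later) can adjust them alongside or instead of the adapter parameters.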
In step 102, an ideal loss function of the question-answer dataset and the dialogue dataset in an ideal state can be obtained according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and the initial fusion weights included based on each LoRA-adapter fine tuning, wherein the ideal loss function is as follows:
Wherein L represents an ideal loss function, Representing a question-answer negative log-likelihood loss function,/>Representing a dialogue negative log likelihood loss function,/>Representing fine-tuning the acquisition/>, on the question-answer dataset Q i Initial fusion weights of,/>Representing fine-tuning of acquisition/>, on dialog dataset C i Is used to determine the initial fusion weights of (a).
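Numerically, the ideal loss is simply a dot product of fusion weights with the per-adapter losses. A minimal sketch (names are illustrative):

```python
def ideal_loss(qa_losses, dlg_losses, qa_weights, dlg_weights):
    """L = sum_i w_i^Q * L_{Q_i} + sum_i w_i^C * L_{C_i}.

    qa_losses/dlg_losses   -- per-adapter NLL loss values (hypothetical).
    qa_weights/dlg_weights -- the corresponding fusion weights.
    """
    assert len(qa_losses) == len(qa_weights)
    assert len(dlg_losses) == len(dlg_weights)
    return (sum(w * l for w, l in zip(qa_weights, qa_losses))
            + sum(w * l for w, l in zip(dlg_weights, dlg_losses)))

# M = 2 question-answer datasets, N = 1 dialogue dataset:
print(ideal_loss([1.0, 2.0], [4.0], [0.5, 0.25], [0.25]))  # 0.5 + 0.5 + 1.0 = 2.0
```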
Further, the ideal fusion weights and the first ideal parameters corresponding to the ideal loss function are obtained from the minimum of the ideal loss function, which is expressed by the following formula:

A^{*}, \omega^{*} = \operatorname{argmin}_{A, \omega} L(A, \omega)

where A^{*} denotes all ideal LoRA-adapters, \omega^{*} denotes all ideal fusion weights, and argmin denotes the values of the parameters A and \omega at which L(A, \omega) attains its minimum.

In an embodiment of the present invention, all ideal LoRA-adapters may also be referred to as all first ideal parameters, which denote all LoRA-adapters added to the question-answer large language model and the dialogue large language model, respectively, where the first ideal parameters are expressed by the following formula:

A = \{A_1^{Q}, \ldots, A_M^{Q}, A_1^{C}, \ldots, A_N^{C}\}

where A denotes the first ideal parameters, A_M^{Q} denotes the first ideal parameter obtained by fine-tuning on question-answer dataset Q_M, A_N^{C} denotes the first ideal parameter obtained by fine-tuning on dialogue dataset C_N, M denotes the number of question-answer datasets, and N denotes the number of dialogue datasets.

Specifically, \omega denotes the ideal fusion weights of all LoRA-adapters, i.e., of the question-answer LoRA-adapters and the dialogue LoRA-adapters, and can be expressed by the following formula:

\omega = \{\omega_1^{Q}, \ldots, \omega_M^{Q}, \omega_1^{C}, \ldots, \omega_N^{C}\}

where \omega denotes the ideal fusion weights, \omega_i^{Q} denotes the ideal fusion weight of A_i^{Q}, and \omega_i^{C} denotes the ideal fusion weight of A_i^{C}.
In step 103, fine tuning is performed on the question and answer LoRA-adapter corresponding to each question and answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set according to the ideal loss function, so as to obtain the optimal parameters of the question and answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters respectively.
In practical applications, the first ideal parameters and the ideal fusion weights are fine-tuned sequentially for efficiency and simplicity. In the first stage, each LoRA-adapter (all first ideal parameters) is fine-tuned on its own question-answer dataset or dialogue dataset; that is, formula (8) is split into the two formulas shown below, from which the optimal parameters of the question-answer LoRA-adapters and the optimal parameters of the dialogue LoRA-adapters are obtained:

A_i^{Q*} = \operatorname{argmin}_{A_i^{Q}} \mathcal{L}_{Q_i}(A_i^{Q})

A_i^{C*} = \operatorname{argmin}_{A_i^{C}} \mathcal{L}_{C_i}(A_i^{C})

where A_i^{Q*} denotes the optimal parameters of the question-answer LoRA-adapter, A_i^{C*} denotes the optimal parameters of the dialogue LoRA-adapter, A_i^{C} denotes the dialogue LoRA-adapter trained on dialogue dataset C_i, A_i^{Q} denotes the question-answer LoRA-adapter trained on question-answer dataset Q_i, \mathcal{L}_{Q_i} denotes the question-answer negative log-likelihood loss function, and \mathcal{L}_{C_i} denotes the dialogue negative log-likelihood loss function.
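Because the first-stage objective decouples, each adapter can be optimised against its own loss in isolation. A toy sketch with scalar "adapters" and quadratic stand-in losses — everything here is illustrative; real LoRA-adapters are low-rank weight matrices trained by gradient descent on the NLL losses above:

```python
def fine_tune_adapter(loss_grad, a0, lr=0.1, steps=200):
    """Minimise one adapter's own loss independently (stage one).

    loss_grad -- gradient of that adapter's loss w.r.t. its parameter;
                 a scalar stand-in for the LoRA-adapter weights.
    """
    a = a0
    for _ in range(steps):
        a -= lr * loss_grad(a)
    return a

# Toy losses L_i(a) = (a - target_i)^2, one per dataset; each adapter is
# trained only on its own dataset, exactly as in the split of formula (8).
targets = [3.0, -1.0, 0.5]  # stand-ins for the optima on Q_1, Q_2, C_1
best = [fine_tune_adapter(lambda a, t=t: 2 * (a - t), 0.0) for t in targets]
print([round(a, 4) for a in best])  # [3.0, -1.0, 0.5]
```

The point of the sketch is the independence: no adapter's update ever reads another adapter's parameters, which is what makes the first stage embarrassingly parallel.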
Further, in the second stage only the ideal fusion weights are fine-tuned, while the base large language model and the first ideal parameters are frozen, yielding the optimal fusion parameters as follows:

\omega^{**} = \operatorname{argmin}_{\omega} \sum_{i=1}^{M} \omega_i^{Q} \mathcal{L}_{Q_i}(A_i^{Q*}) + \sum_{i=1}^{N} \omega_i^{C} \mathcal{L}_{C_i}(A_i^{C*})

where \omega^{**} denotes the optimal fusion parameters, \omega_i^{Q} denotes the initial fusion weight of A_i^{Q} obtained by fine-tuning on question-answer dataset Q_i, \omega_i^{C} denotes the initial fusion weight of A_i^{C} obtained by fine-tuning on dialogue dataset C_i, \mathcal{L}_{Q_i} denotes the question-answer negative log-likelihood loss function, \mathcal{L}_{C_i} denotes the dialogue negative log-likelihood loss function, and \omega denotes the ideal fusion weights.
It should be noted that in practical applications, when the number of question-answer datasets and dialogue datasets is small, some simple and fast algorithms may be used to optimize the ideal fusion weights.
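One such simple algorithm is an exhaustive grid search over the fusion weights: with the base model and all adapters frozen, each loss term is a precomputed constant, so every candidate weight vector can be scored cheaply. A sketch — restricting the weights to the probability simplex is an assumption made here for illustration; the patent does not fix a normalisation:

```python
from itertools import product

def grid_search_weights(losses, step=0.1):
    """Brute-force search for fusion weights (stage two).

    losses -- frozen per-adapter loss values (hypothetical constants).
    step   -- grid resolution for each weight.
    """
    n = len(losses)
    ticks = [round(k * step, 10) for k in range(int(1 / step) + 1)]
    best_w, best_l = None, float("inf")
    for w in product(ticks, repeat=n):
        if abs(sum(w) - 1.0) > 1e-9:  # keep only weights summing to 1
            continue
        total = sum(wi * li for wi, li in zip(w, losses))
        if total < best_l:
            best_w, best_l = w, total
    return best_w, best_l

# Three frozen adapters with hypothetical per-adapter losses:
best_w, best_l = grid_search_weights([1.2, 0.7, 0.9])
print(best_w, best_l)  # (0.0, 1.0, 0.0) 0.7
```

Because the objective is linear in the weights once the losses are constants, the search concentrates all weight on the best adapter; in a real system each candidate weighting would instead be scored on held-out validation data, which breaks that linearity.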
In step 104, the general LoRA-adapter is obtained from the optimal parameters of the question-answer LoRA-adapters, the optimal parameters of the dialogue LoRA-adapters, and the optimal fusion parameters.
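The patent does not spell out the combination operator for step 104. One common realisation, shown here purely as an assumption, is a weighted sum of the adapters' weight deltas using the optimal fusion parameters:

```python
def merge_adapters(adapter_deltas, fusion_weights):
    """Combine optimal adapters into one general LoRA-adapter.

    adapter_deltas -- one flat list of weight deltas per adapter
                      (toy stand-ins for the low-rank LoRA updates).
    fusion_weights -- the optimal fusion parameters, one per adapter.
    """
    merged = [0.0] * len(adapter_deltas[0])
    for delta, w in zip(adapter_deltas, fusion_weights):
        for k, d in enumerate(delta):
            merged[k] += w * d  # elementwise weighted sum of deltas
    return merged

# Two adapters over a 3-parameter toy layer, fused with weights 0.75 / 0.25:
general = merge_adapters([[1.0, 0.0, 2.0], [0.0, 4.0, 2.0]], [0.75, 0.25])
print(general)  # [0.75, 1.0, 2.0]
```

The merged deltas would then be added to the frozen base weights at inference time, so a single adapter serves both the question-answer and dialogue tasks.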
In summary, the embodiments of the present invention provide a method and device for fusing fine-tuning and Adapter of a large language model. By fusing LoRA-adapters, the method effectively avoids the semantic-space conflicts caused by dataset fusion while improving the generalization performance of the large language model across multiple tasks, thereby solving the prior-art problem of performance degradation caused by conflicts between different datasets in the semantic space.
Based on the same inventive concept, an embodiment of the present invention provides a large language model fine-tuning and Adapter fusion device. Since the principle by which the device solves the technical problem is similar to that of the large language model fine-tuning and Adapter fusion method, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
As shown in fig. 2, the apparatus mainly includes a first obtaining unit 201, a second obtaining unit 202, a third obtaining unit 203, and a fourth obtaining unit 204.
A first obtaining unit 201, configured to collect a plurality of question-answer data sets and dialogue data sets from a set network platform, and perform LoRA-adapter fine tuning on the question-answer data sets and the dialogue data sets respectively, so as to obtain a question-answer large language model, a question-answer negative log likelihood loss function, a dialogue large language model and a dialogue negative log likelihood loss function in sequence;
A second obtaining unit 202, configured to obtain an ideal loss function of the question-answer dataset and the dialogue dataset in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function, and initial fusion weights included based on fine tuning of each LoRA-adapter, and obtain an ideal fusion weight and a first ideal parameter corresponding to the ideal loss function according to a minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;
A third obtaining unit 203, configured to fine tune a question-answer LoRA-adapter corresponding to each question-answer dataset and a dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function, to obtain an optimal parameter of the question-answer LoRA-adapter, an optimal parameter of the dialogue LoRA-adapter, and an optimal fusion parameter respectively;
A fourth obtaining unit 204, configured to obtain a general LoRA-adapter according to the optimal parameter of the question-answer LoRA-adapter, the optimal parameter of the dialogue LoRA-adapter, and the optimal fusion parameter.
It should be understood that the units included in the above large language model fine-tuning and Adapter fusion device are divided only logically according to the functions implemented by the device; in practical applications, these units may be merged or further split. The functions implemented by the device provided in this embodiment correspond one-to-one with the large language model fine-tuning and Adapter fusion method provided in the foregoing embodiment; the more detailed processing flow implemented by the device has been described in detail in the foregoing method embodiment and is not repeated here.
Another embodiment of the present invention further provides a computer device, including a processor and a memory, where the memory stores computer program code comprising computer instructions. When the processor executes the computer instructions, the computer device performs the steps of the large language model fine-tuning and Adapter fusion method in the method flow shown in the foregoing method embodiment.
Another embodiment of the present invention further provides a computer-readable storage medium storing computer instructions, where the computer instructions, when executed on a computer device, cause the computer device to perform the steps of the large language model fine-tuning and Adapter fusion method in the method flow shown in the foregoing method embodiment.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. A method for fusing fine-tuning and Adapter of a large language model, characterized by comprising the following steps:
Collecting a plurality of question-answer data sets and dialogue data sets from a set network platform, and respectively performing LoRA-adapter fine tuning on the question-answer data sets and the dialogue data sets to sequentially obtain a question-answer large language model, a question-answer negative log-likelihood loss function, a dialogue large language model and a dialogue negative log-likelihood loss function;
Obtaining an ideal loss function of the question-answer dataset and the dialogue dataset in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are included based on fine tuning of each LoRA-adapter, and obtaining an ideal fusion weight and a first ideal parameter which correspond to the ideal loss function according to the minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;
according to the ideal loss function, fine tuning is carried out on the question-answer LoRA-adapter corresponding to each question-answer data set and the dialogue LoRA-adapter corresponding to each dialogue data set, so as to respectively obtain the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters;
obtaining a general LoRA-adapter according to the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter and the optimal fusion parameters;
The ideal loss function, the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter, and the optimal fusion parameters are as follows:

L = \sum_{i=1}^{M} \omega_i^{Q} \mathcal{L}_{Q_i}(A_i^{Q}) + \sum_{i=1}^{N} \omega_i^{C} \mathcal{L}_{C_i}(A_i^{C})

A_i^{Q*} = \operatorname{argmin}_{A_i^{Q}} \mathcal{L}_{Q_i}(A_i^{Q})

A_i^{C*} = \operatorname{argmin}_{A_i^{C}} \mathcal{L}_{C_i}(A_i^{C})

\omega^{**} = \operatorname{argmin}_{\omega} \sum_{i=1}^{M} \omega_i^{Q} \mathcal{L}_{Q_i}(A_i^{Q*}) + \sum_{i=1}^{N} \omega_i^{C} \mathcal{L}_{C_i}(A_i^{C*})

where L denotes the ideal loss function, \omega_i^{Q} denotes the initial fusion weight of A_i^{Q} obtained by fine-tuning on question-answer dataset Q_i, \omega_i^{C} denotes the initial fusion weight of A_i^{C} obtained by fine-tuning on dialogue dataset C_i, A_i^{Q*} denotes the optimal parameters of the question-answer LoRA-adapter, A_i^{C*} denotes the optimal parameters of the dialogue LoRA-adapter, \omega^{**} denotes the optimal fusion parameters, \mathcal{L}_{Q_i} denotes the question-answer negative log-likelihood loss function, \mathcal{L}_{C_i} denotes the dialogue negative log-likelihood loss function, A_i^{Q} denotes the question-answer LoRA-adapter trained on question-answer dataset Q_i, and A_i^{C} denotes the dialogue LoRA-adapter trained on dialogue dataset C_i.
2. The method of claim 1, wherein performing LoRA-adapter fine-tuning on the question-answer dataset to sequentially obtain the question-answer large language model and the question-answer negative log-likelihood loss function specifically comprises:
Training the question-answer data set to obtain a question-answer LoRA-adapter, and obtaining the question-answer large language model according to the question-answer LoRA-adapter and the question-answer data set;
obtaining the question-answer negative log-likelihood loss function according to the question-answer large language model and the token of the question-answer large language model;
the question-answer dataset, the question-answer large language model, and the question-answer negative log-likelihood loss function are as follows:

Q_i = \{(s_{i,j}, q_{i,j}, r_{i,j})\}_{j=1}^{|Q_i|}

p_\theta(r_{i,j} \mid s_{i,j}, q_{i,j}; A_i^{Q}) = \prod_{k=1}^{|r_{i,j}|} p_\theta(r_k \mid s_{i,j}, q_{i,j}, r_{<k}; A_i^{Q})

\mathcal{L}_{Q_i}(A_i^{Q}) = -\sum_{j=1}^{|Q_i|} \sum_{k=1}^{|r_{i,j}|} \log p_\theta(r_k \mid s_{i,j}, q_{i,j}, r_{<k}; A_i^{Q})

where Q_i denotes the i-th question-answer dataset, s_{i,j} denotes the j-th system information of the i-th question-answer dataset, q_{i,j} denotes the j-th question of the i-th question-answer dataset, r_{i,j} denotes the j-th reply of the i-th question-answer dataset, |Q_i| denotes the length of question-answer dataset Q_i, A_i^{Q} denotes the question-answer LoRA-adapter trained on question-answer dataset Q_i, p_\theta denotes the large language model, |r_{i,j}| denotes the length of r_{i,j}, r_k denotes the k-th token generated by the large language model, \theta denotes the frozen parameters of the large language model, and \mathcal{L}_{Q_i} denotes the question-answer negative log-likelihood loss function.
3. The method of claim 1, wherein performing LoRA-adapter fine-tuning on the dialogue dataset sequentially obtains a dialogue large language model and a dialogue negative log likelihood loss function, specifically comprising:
Training the dialogue data set to obtain a dialogue LoRA-adapter, and obtaining a dialogue large language model according to the dialogue LoRA-adapter and the dialogue data set;
obtaining the dialogue negative log likelihood loss function according to the dialogue large language model and the token of the dialogue large language model;
The dialogue dataset, the dialogue large language model, and the dialogue negative log-likelihood loss function are as follows:

C_i = \{(q_{j,1}^{i}, r_{j,1}^{i}, \ldots, q_{j,T}^{i}, r_{j,T}^{i})\}_{j=1}^{|C_i|}

p_\theta(R_j \mid Q_j; A_i^{C}) = \prod_{t=1,\, x_t \in R_j}^{|c_i^{j}|} p_\theta(x_t \mid x_{<t}; A_i^{C})

\mathcal{L}_{C_i}(A_i^{C}) = -\sum_{j=1}^{|C_i|} \sum_{t=1,\, x_t \in R_j}^{|c_i^{j}|} \log p_\theta(x_t \mid x_{<t}; A_i^{C})

where C_i denotes the i-th dialogue dataset, q_{j,T}^{i} denotes the j-th query of the i-th dialogue dataset in round T, r_{j,T}^{i} denotes the j-th reply of the i-th dialogue dataset in round T, |C_i| denotes the length of dialogue dataset C_i, A_i^{C} denotes the dialogue LoRA-adapter trained on dialogue dataset C_i, Q_j denotes all tokens belonging to the user queries, R_j denotes the target tokens, |c_i^{j}| denotes the number of tokens contained in the j-th item of dialogue dataset C_i, \mathcal{L}_{C_i} denotes the dialogue negative log-likelihood loss function, p_\theta denotes the large language model, and \theta denotes the frozen parameters of the large language model.
4. The method of claim 1, wherein the minimum of the ideal loss function is as follows:

A^{*}, \omega^{*} = \operatorname{argmin}_{A, \omega} L(A, \omega)

where \omega_i^{Q} denotes the initial fusion weight of A_i^{Q} obtained by fine-tuning on question-answer dataset Q_i, \omega_i^{C} denotes the initial fusion weight of A_i^{C} obtained by fine-tuning on dialogue dataset C_i, A^{*} denotes all first ideal parameters, \omega^{*} denotes all ideal fusion weights, A denotes the first ideal parameters, and \omega denotes the ideal fusion weights.
5. The method of claim 1, wherein the first ideal parameters are as follows:

A = \{A_1^{Q}, \ldots, A_M^{Q}, A_1^{C}, \ldots, A_N^{C}\}

and the ideal fusion weights are as follows:

\omega = \{\omega_1^{Q}, \ldots, \omega_M^{Q}, \omega_1^{C}, \ldots, \omega_N^{C}\}

where A denotes the first ideal parameters, A_M^{Q} denotes the first ideal parameter obtained by fine-tuning on question-answer dataset Q_M, A_N^{C} denotes the first ideal parameter obtained by fine-tuning on dialogue dataset C_N, M denotes the number of question-answer datasets, N denotes the number of dialogue datasets, \omega denotes the ideal fusion weights, \omega_i^{Q} denotes the ideal fusion weight of A_i^{Q}, and \omega_i^{C} denotes the ideal fusion weight of A_i^{C}.
6. A large language model fine-tuning and Adapter fusion device, characterized by comprising:
The first obtaining unit is used for collecting a plurality of question-answer data sets and dialogue data sets from a set network platform, respectively performing LoRA-adapter fine tuning on the question-answer data sets and the dialogue data sets, and sequentially obtaining a question-answer large language model, a question-answer negative log likelihood loss function, a dialogue large language model and a dialogue negative log likelihood loss function;
the second obtaining unit is used for obtaining an ideal loss function of the question-answer dataset and the dialogue dataset in an ideal state according to the question-answer negative log-likelihood loss function, the dialogue negative log-likelihood loss function and initial fusion weights which are included based on fine adjustment of each LoRA-adapter, and obtaining an ideal fusion weight and a first ideal parameter which correspond to the ideal loss function according to the minimum value of the ideal loss function; wherein the first ideal parametric representation is added to all LoRA-adapters of the question-answer large language model and the dialogue large language model, respectively;
The third obtaining unit is configured to fine tune the question answer LoRA-adapter corresponding to each question-answer dataset and the dialogue LoRA-adapter corresponding to each dialogue dataset according to the ideal loss function, so as to obtain an optimal parameter of the question answer LoRA-adapter, an optimal parameter of the dialogue LoRA-adapter and an optimal fusion parameter respectively;
A fourth obtaining unit, configured to obtain a general LoRA-adapter according to the optimal parameter of the question-answer LoRA-adapter, the optimal parameter of the dialogue LoRA-adapter, and the optimal fusion parameter;
The ideal loss function, the optimal parameters of the question-answer LoRA-adapter, the optimal parameters of the dialogue LoRA-adapter, and the optimal fusion parameters are as follows:

L = \sum_{i=1}^{M} \omega_i^{Q} \mathcal{L}_{Q_i}(A_i^{Q}) + \sum_{i=1}^{N} \omega_i^{C} \mathcal{L}_{C_i}(A_i^{C})

A_i^{Q*} = \operatorname{argmin}_{A_i^{Q}} \mathcal{L}_{Q_i}(A_i^{Q})

A_i^{C*} = \operatorname{argmin}_{A_i^{C}} \mathcal{L}_{C_i}(A_i^{C})

\omega^{**} = \operatorname{argmin}_{\omega} \sum_{i=1}^{M} \omega_i^{Q} \mathcal{L}_{Q_i}(A_i^{Q*}) + \sum_{i=1}^{N} \omega_i^{C} \mathcal{L}_{C_i}(A_i^{C*})

where L denotes the ideal loss function, \omega_i^{Q} denotes the initial fusion weight of A_i^{Q} obtained by fine-tuning on question-answer dataset Q_i, \omega_i^{C} denotes the initial fusion weight of A_i^{C} obtained by fine-tuning on dialogue dataset C_i, A_i^{Q*} denotes the optimal parameters of the question-answer LoRA-adapter, A_i^{C*} denotes the optimal parameters of the dialogue LoRA-adapter, \omega^{**} denotes the optimal fusion parameters, \mathcal{L}_{Q_i} denotes the question-answer negative log-likelihood loss function, \mathcal{L}_{C_i} denotes the dialogue negative log-likelihood loss function, A_i^{Q} denotes the question-answer LoRA-adapter trained on question-answer dataset Q_i, and A_i^{C} denotes the dialogue LoRA-adapter trained on dialogue dataset C_i.
7. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the large language model fine-tuning and Adapter fusion method of any one of claims 1-5.
8. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the large language model fine-tuning and Adapter fusion method of any one of claims 1-5.
CN202410170139.4A 2024-02-06 2024-02-06 Method and device for fusing micro-tuning and Adapter of large language model Active CN117708307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410170139.4A CN117708307B (en) 2024-02-06 2024-02-06 Method and device for fusing micro-tuning and Adapter of large language model

Publications (2)

Publication Number Publication Date
CN117708307A CN117708307A (en) 2024-03-15
CN117708307B true CN117708307B (en) 2024-05-14

Family

ID=90144771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410170139.4A Active CN117708307B (en) 2024-02-06 2024-02-06 Method and device for fusing micro-tuning and Adapter of large language model


Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115712709A (en) * 2022-11-18 2023-02-24 哈尔滨工业大学 Multi-modal dialog question-answer generation method based on multi-relationship graph model
CN116975241A (en) * 2023-09-20 2023-10-31 广东技术师范大学 Liver cancer auxiliary diagnosis and question-answering method, system and medium based on large language model
CN117033608A (en) * 2023-09-28 2023-11-10 中国电子科技集团公司第十研究所 Knowledge graph generation type question-answering method and system based on large language model
CN117171326A (en) * 2023-09-20 2023-12-05 宜宾电子科技大学研究院 Rapid construction method of financial question-answering algorithm and life cycle management platform
WO2023235346A1 (en) * 2022-06-03 2023-12-07 Google Llc Prompting machine-learned models using chains of thought
CN117216234A (en) * 2023-08-18 2023-12-12 腾讯科技(深圳)有限公司 Artificial intelligence-based speaking operation rewriting method, device, equipment and storage medium
CN117217289A (en) * 2023-10-09 2023-12-12 北银金融科技有限责任公司 Banking industry large language model training method
CN117290492A (en) * 2023-11-27 2023-12-26 深圳市灵智数字科技有限公司 Knowledge base question-answering method and device, electronic equipment and storage medium
CN117371527A (en) * 2023-11-01 2024-01-09 中国科学院计算技术研究所 Multi-mode entity linking method and system based on large model
CN117453925A (en) * 2023-10-24 2024-01-26 腾讯科技(深圳)有限公司 Knowledge migration method, apparatus, device, readable storage medium and program product
CN117455009A (en) * 2023-10-27 2024-01-26 腾讯科技(深圳)有限公司 Federal learning method, federal prediction method, apparatus, device, and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188182B (en) * 2019-05-31 2023-10-27 中国科学院深圳先进技术研究院 Model training method, dialogue generating method, device, equipment and medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Zoie Zhao. More human than human: LLM-generated narratives outperform human-LLM interleaved narratives. C&C '23: Proceedings of the 15th Conference on Creativity and Cognition. 2023. Full text. *
A Definition Extraction Method Based on the BiLSTM Model; Yang Ping; Xie Zhipeng; Computer Engineering; 2020-03-31; Vol. 46, No. 3; full text *
Research on Personalized Chatbots Based on Deep Learning; Wang Qianming; Li Yin; Computer Technology and Development; 2020-04-30; Vol. 30, No. 4; full text *

Also Published As

Publication number Publication date
CN117708307A (en) 2024-03-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant