CN113555008A - Parameter adjusting method and device for model


Info

Publication number: CN113555008A
Application number: CN202010307414.4A
Authority: CN (China)
Legal status: Pending
Prior art keywords: model, hyper-parameter combination, combination information, target
Other languages: Chinese (zh)
Inventor: 朱晓如
Assignee (current and original): Alibaba Group Holding Ltd
Application filed by Alibaba Group Holding Ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L15/26 Speech to text systems


Abstract

The application discloses an automatic parameter adjusting method for a model, which comprises the following steps: obtaining any group of hyper-parameter combination information from original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if the first target model does not include a target model meeting a preset performance condition, obtaining second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained. The method can flexibly and quickly realize continuous iteration of the model and improve the accuracy of the target model.

Description

Parameter adjusting method and device for model
Technical Field
The application relates to the technical field of computers, and in particular to a parameter adjusting method and device for a model, an electronic device, and a storage device. The application also relates to a method and a device for obtaining a speech recognition model, an electronic device, and a storage device. The application further relates to a speech recognition method and device, an electronic device, and a storage device.
Background
With the continuous development of Deep Learning (DL) technology, trained models are applied in different application scenarios, bringing convenience to people's daily lives. For example, trained models applied to speech recognition, semantic understanding, speech synthesis, or search recommendation can conveniently meet people's various needs.
At present, when a model is trained, different models are generally trained by manually adjusting the hyper-parameters of the model. For example, for an initial base model to be trained, first hyper-parameter combination information is usually set manually, and a target model corresponding to the first hyper-parameter combination information is obtained by training with training data; then, second hyper-parameter combination information is set manually, and a target model corresponding to the second hyper-parameter combination is obtained by training; after at least one target model is obtained through such manual adjustment, a model meeting the performance requirement is selected from the at least one target model for use. There are also automatic parameter tuning methods for models, which generally simulate manual tuning with a computing device: after obtaining a plurality of pieces of hyper-parameter combination information for an initial base model, the computing device obtains, in series or in parallel, at least one target model corresponding to the different pieces of hyper-parameter combination information, and selects a model meeting the performance requirement from the at least one target model for use.
However, with the continuous development of deep learning technology, training a model usually occupies considerable computing resources, and the time consumed by a single round of training keeps increasing. In the prior art, parameter tuning for a model therefore usually suffers from a relatively fixed flow that greatly increases time consumption; moreover, when the target model or hyper-parameter combination obtained by such training is deployed online, the performance of the newly deployed model is often inferior to that of a model continuously trained online with actual data. Existing automatic parameter tuning methods for models therefore suffer from low flexibility and relatively low accuracy.
Disclosure of Invention
The embodiment of the application provides a parameter adjusting method for a model, aiming to solve the problems of low flexibility and low accuracy in the prior art.
The embodiment of the application provides a parameter adjusting method for a model, which comprises the following steps: obtaining any group of hyper-parameter combination information from original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if no target model meeting a preset performance condition exists in the first target model, obtaining second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained.
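As an aid to reading these steps, the following minimal Python sketch shows one possible shape of the loop; all function names and data structures (tune, train, evaluate, the list-of-groups layout) are illustrative assumptions rather than the patent's own implementation:

```python
# A minimal sketch of the claimed tuning loop; train, evaluate and the data
# structures are illustrative assumptions, not taken from the patent text.
def tune(original_groups, initial_base_model, train, evaluate, threshold):
    """original_groups: list of groups; each group is a list of
    hyper-parameter combinations (dicts of name -> value)."""
    base_model = initial_base_model
    best_model, best_score = None, float("-inf")
    for group in original_groups:          # first, second, ... combination info
        for combo in group:
            model = train(base_model, combo)
            score = evaluate(model)
            if score > best_score:
                best_model, best_score = model, score
        if best_score >= threshold:        # preset performance condition met
            return best_model
        base_model = best_model            # next round re-bases on the best model so far
    return best_model
```

The key difference from a plain grid search is the last line of the loop body: each new group of hyper-parameter combinations is trained against the best model obtained so far rather than the initial base model.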
Optionally, the method further includes: obtaining first performance information corresponding to the first target model; and judging, according to the first performance information, whether a target model meeting the preset performance condition exists in the first target model.
Optionally, the judging, according to the first performance information, whether a target model meeting the preset performance condition exists in the first target model includes: if the first performance information contains performance information not smaller than a preset performance threshold, judging that the target model exists in the first target model.
Optionally, the method further includes: obtaining the target model according to the model corresponding, in the first target model, to the performance information not smaller than the preset performance threshold.
Optionally, the obtaining, according to the original hyper-parameter combination information and the first target model, second hyper-parameter combination information and a second base model to be trained includes: obtaining the second hyper-parameter combination information from the original hyper-parameter combination information, wherein the second hyper-parameter combination information is any group of hyper-parameter combination information in the original hyper-parameter combination information other than the first hyper-parameter combination information; selecting a model meeting a preset model screening condition from the first target model; and obtaining the second base model to be trained according to the model meeting the preset model screening condition.
Optionally, the obtaining the second hyper-parameter combination information from the original hyper-parameter combination information includes: obtaining second initial hyper-parameter combination information from the original hyper-parameter combination information, wherein the second initial hyper-parameter combination information is any group of hyper-parameter combination information in the original hyper-parameter combination information other than the first hyper-parameter combination information; selecting, from the first hyper-parameter combination information, hyper-parameter combination information meeting a preset hyper-parameter screening condition; and obtaining the second hyper-parameter combination information according to the second initial hyper-parameter combination information and the hyper-parameter combination information meeting the preset hyper-parameter screening condition.
Optionally, the selecting, from the first hyper-parameter combination information, hyper-parameter combination information meeting the preset hyper-parameter screening condition includes: obtaining first performance information corresponding to the first target model; obtaining, from the first performance information, performance information whose value meets a preset first numerical condition; and obtaining the hyper-parameter combination information meeting the preset hyper-parameter screening condition according to the hyper-parameter combination information corresponding, in the first hyper-parameter combination information, to the performance information meeting the preset first numerical condition.
Optionally, the selecting a model meeting the preset model screening condition from the first target model includes: obtaining first performance information corresponding to the first target model; obtaining, from the first performance information, performance information whose value meets a preset second numerical condition; and obtaining the model meeting the preset model screening condition according to the model corresponding, in the first target model, to the performance information meeting the preset second numerical condition.
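A minimal sketch of such screening, assuming the performance information is a list of numeric scores aligned with the combinations and models, and that both numerical conditions amount to keeping the largest values (the names and the Top-K reading are assumptions):

```python
def screen_top_k(combos, models, scores, k=2):
    """One reading of the preset numerical conditions: keep the k combinations
    and models whose performance information has the largest values."""
    ranked = sorted(zip(combos, models, scores), key=lambda t: t[2], reverse=True)
    kept = ranked[:k]
    return [c for c, _, _ in kept], [m for _, m, _ in kept]

# e.g. screen_top_k(["p1", "p2", "p3"], ["m1", "m2", "m3"], [0.6, 0.9, 0.7], k=2)
# -> (["p2", "p3"], ["m2", "m3"])
```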
Optionally, the obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained includes: training the second base model to be trained according to the second hyper-parameter combination information to obtain a second target model; and if a model meeting the preset performance condition exists in the second target model, obtaining the target model according to the model meeting the preset performance condition.
Optionally, the first base model to be trained includes an initial base model to be trained configured by a user.
Optionally, the method further includes: obtaining an initial base model to be trained; if the original hyper-parameter combination information does not meet a preset grouping training condition, training the initial base model to be trained according to the original hyper-parameter combination information to obtain performance information representing the performance of the trained models; obtaining, from the performance information, performance information whose value meets a preset second numerical condition; and obtaining target empirical hyper-parameter combination information according to the hyper-parameter combination information corresponding, in the original hyper-parameter combination information, to the performance information meeting the preset second numerical condition, wherein the target empirical hyper-parameter combination information serves as empirical hyper-parameters for continued optimization of the initial base model to be trained.
Optionally, the method further includes: obtaining original training data, wherein the training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model includes: training the first base model to be trained according to the first hyper-parameter combination information and the original training data to obtain the first target model; and the obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained includes: training the second base model to be trained according to the second hyper-parameter combination information and the original training data to obtain the target model.
Optionally, if the original training data meets a preset data splitting condition, the method further includes: splitting the original training data according to the preset data splitting condition to obtain at least one group of original grouped training data; and the training the first base model to be trained according to the first hyper-parameter combination information to obtain the first target model further includes: obtaining any one group of original grouped training data from the at least one group of original grouped training data as first training data; and training the first base model to be trained according to the first hyper-parameter combination information and the first training data to obtain the first target model.
Optionally, the obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained includes: obtaining second training data from the at least one group of original grouped training data, wherein the second training data is any one group of original grouped training data in the at least one group other than the first training data; and training the second base model to be trained according to the second hyper-parameter combination information and the second training data to obtain the target model.
Optionally, if the amount of the original training data is not smaller than a preset training data threshold, it is determined that the original training data meets the preset data splitting condition.
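A small sketch of this splitting rule, under the assumption that the condition is simply a size threshold and the groups are contiguous slices (both assumptions, not specified beyond the optional limitation above):

```python
def split_training_data(samples, split_threshold, group_size):
    """Split the original training data into groups only when its size is not
    smaller than the preset threshold (the assumed splitting condition)."""
    if len(samples) < split_threshold:     # splitting condition not met
        return [samples]
    return [samples[i:i + group_size] for i in range(0, len(samples), group_size)]

groups = split_training_data(list(range(1000)), split_threshold=500, group_size=100)
assert len(groups) == 10 and len(groups[0]) == 100
```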
Optionally, the method further includes: obtaining first performance information corresponding to the first target model and second performance information corresponding to the target model; and obtaining performance change information corresponding to the original hyper-parameter combination information according to the first performance information and the second performance information.
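One possible reading, sketched below with assumed data structures: the performance change is the per-combination difference between the two rounds' scores:

```python
def performance_change(first_performance, second_performance):
    """Per-combination change between the two rounds; positive values mean
    the second round improved on the first (an assumed representation)."""
    return {combo: second_performance[combo] - first_performance[combo]
            for combo in first_performance if combo in second_performance}

# e.g. performance_change({"p3": 0.5}, {"p3": 0.75}) -> {"p3": 0.25}
```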
Optionally, the method further includes: obtaining a trigger operation for starting the parameter adjusting operation; and in response to the trigger operation, executing the step of obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information.
Optionally, the original hyper-parameter combination information includes information of at least one of the following hyper-parameter combinations: a hyper-parameter combination configured by the user and corresponding to the initial base model to be trained, and a hyper-parameter combination obtained from the historical hyper-parameter combinations corresponding to the initial base model to be trained.
The embodiment of the present application further provides a method for obtaining a speech recognition model, including: obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model; training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model; if no target speech recognition model meeting a preset performance condition exists in the first target speech recognition model, obtaining second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
An embodiment of the present application further provides a speech recognition method, including: obtaining speech information to be recognized; and inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is obtained by using the above method for obtaining a speech recognition model.
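Since the patent does not define the model's programming interface, the following sketch merely illustrates the recognize-and-return flow with an assumed transcribe() method and a placeholder model:

```python
# Illustrative only: a trained target speech recognition model is assumed
# here to expose transcribe(); real decoding is replaced by a placeholder.
class TargetSpeechRecognitionModel:
    def transcribe(self, audio_samples):
        return "recognized text"           # placeholder for real decoding

def recognize(model, audio_samples):
    """Input the speech information to be recognized and return the
    corresponding target recognition information (the transcript)."""
    return model.transcribe(audio_samples)

print(recognize(TargetSpeechRecognitionModel(), audio_samples=[0.0, 0.1, -0.2]))
```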
Optionally, the method is applied to a server, and the method further includes: providing the target recognition information to a client.
Optionally, the method further includes: obtaining service content information corresponding to the target recognition information; and providing the service content information to the client.
Optionally, the method is applied to a client, and the method further includes: obtaining the target recognition information; and displaying or playing the target recognition information.
Optionally, the method further includes: obtaining service content information corresponding to the target recognition information; and displaying or playing the service content information.
Optionally, the client includes a computing device that provides nearest-edge services through edge computing.
Optionally, the computing device includes at least one of the following: a smart speaker device, a vehicle-mounted navigation device, and a translation device.
The embodiment of the present application further provides a parameter adjusting device for a model, including: a first information obtaining unit, configured to obtain any group of hyper-parameter combination information from original hyper-parameter combination information as first hyper-parameter combination information; a first target model obtaining unit, configured to train a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; a second information obtaining unit, configured to judge whether a target model meeting a preset performance condition exists in the first target model and, if not, obtain second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model; and a target model obtaining unit, configured to obtain the target model according to the second hyper-parameter combination information and the second base model to be trained.
An embodiment of the present application further provides an electronic device, including:
a processor;
a memory for storing a program of the parameter adjusting method for a model, wherein after the device is powered on and the program of the parameter adjusting method for a model is run by the processor, the following steps are executed:
obtaining any group of hyper-parameter combination information from original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if no target model meeting a preset performance condition exists in the first target model, obtaining second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained.
The embodiment of the present application further provides a storage device storing a program of the parameter adjusting method for a model, wherein when the program is run by a processor, the following steps are executed:
obtaining any group of hyper-parameter combination information from original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if no target model meeting a preset performance condition exists in the first target model, obtaining second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained.
The embodiment of the present application further provides an apparatus for obtaining a speech recognition model, including: a first speech information obtaining unit, configured to obtain any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model; a first target speech recognition model obtaining unit, configured to train a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model; a second speech information obtaining unit, configured to judge whether a target speech recognition model meeting a preset performance condition exists in the first target speech recognition model and, if not, obtain second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model; and a target speech recognition model obtaining unit, configured to obtain the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
An embodiment of the present application further provides an electronic device, including:
a processor;
a memory for storing a program of the method for obtaining a speech recognition model, wherein after the device is powered on and the program of the method for obtaining a speech recognition model is run by the processor, the following steps are executed:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model; training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model; if no target speech recognition model meeting a preset performance condition exists in the first target speech recognition model, obtaining second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
The embodiment of the present application further provides a storage device storing a program of the method for obtaining a speech recognition model, wherein when the program is run by a processor, the following steps are executed:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model; training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model; if no target speech recognition model meeting a preset performance condition exists in the first target speech recognition model, obtaining second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
An embodiment of the present application further provides a speech recognition apparatus, including: an obtaining unit, configured to obtain speech information to be recognized; and a recognition unit, configured to input the speech information to be recognized into a target speech recognition model and obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is obtained by using the above method for obtaining a speech recognition model.
An embodiment of the present application further provides an electronic device, including:
a processor;
a memory for storing a program of the speech recognition method, wherein after the device is powered on and the program of the speech recognition method is run by the processor, the following steps are executed:
obtaining speech information to be recognized; and inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is obtained by using the above method for obtaining a speech recognition model.
An embodiment of the present application further provides a storage device storing a program of the speech recognition method, wherein when the program is run by a processor, the following steps are executed:
obtaining speech information to be recognized; and inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is obtained by using the above method for obtaining a speech recognition model.
Compared with the prior art, the method has the following advantages:
the embodiment of the application provides an automatic parameter adjusting method for a model, which comprises the following steps: obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if the first target model does not comprise a target model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second basic model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained. Compared with the existing automatic parameter adjusting method, after the first target model is obtained through the training of the first hyper-parameter combination information, if the first target model is judged not to include the target model, second hyper-parameter combination information is obtained according to the original hyper-parameter combination information and the first target model obtained in the previous training, a second basic model to be trained corresponding to the second hyper-parameter combination information is obtained, that is, on the basis of the upper round of training, the hyper-parameter combination information and the basic model to be trained for the second round of training are obtained, so that the continuous iteration of the model can be flexibly and rapidly realized, and, because the second basic model to be trained used in the second round of training is the model obtained according to the first target model obtained in the first round of training, the continuous optimization of the model can be realized, and the accuracy of the target model is increased.
The embodiment of the present application further provides a method for obtaining a speech recognition model, including: obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model; training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model; if no target speech recognition model meeting a preset performance condition exists in the first target speech recognition model, obtaining second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained. The method can flexibly and quickly realize continuous iteration of the speech recognition model to be trained, and thus obtain a target speech recognition model with higher accuracy.
An embodiment of the present application further provides a speech recognition method, including: obtaining speech information to be recognized; and inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is obtained by using the above method for obtaining a speech recognition model. The method can improve the recognition accuracy for the speech information to be recognized.
Drawings
Fig. 1-a is a schematic view of a first application scenario of a parameter tuning method for a model according to a first embodiment of the present application.
Fig. 1-B is a schematic view of a second application scenario of a parameter tuning method for a model according to a first embodiment of the present application.
Fig. 2 is a schematic diagram of a framework of a parameter adjusting method for a model in the prior art according to a first embodiment of the present application.
Fig. 3 is a flowchart of a parameter tuning method for a model according to a first embodiment of the present application.
Fig. 4 is a schematic block diagram of a parameter tuning method for a model according to a first embodiment of the present application.
Fig. 5 is a flowchart of a method for obtaining a speech recognition model according to a second embodiment of the present application.
Fig. 6 is a flowchart of a speech recognition method according to a third embodiment of the present application.
Fig. 7 is a schematic diagram of a parameter adjusting apparatus for a model according to a fourth embodiment of the present application.
Fig. 8 is a schematic diagram of an electronic device according to a fifth embodiment of the present application.
Fig. 9 is a schematic diagram of an apparatus for obtaining a speech recognition model according to a seventh embodiment of the present application.
Fig. 10 is a schematic diagram of a speech recognition apparatus according to a tenth embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. However, the present application can be implemented in many ways other than those described herein, and those skilled in the art may make similar modifications without departing from the spirit of the present application; the application is therefore not limited to the specific implementations disclosed below.
In order to help those skilled in the art better understand the solution of the present application, a specific application scenario of an embodiment of the present application, based on the parameter adjusting method for a model provided by the present application, is described in detail below. The parameter adjusting method for a model provided in the first embodiment of the present application may be applied to an interaction scenario between a client and a server, as shown in fig. 1-A, which is a schematic view of a first application scenario of the parameter adjusting method for a model provided in the first embodiment of the present application.
In specific implementation, when a user needs to train a model for processing certain application data, or to further optimize the performance of such a model, a hyper-parameter combination for the model is generally obtained in advance and set into the model, thereby obtaining an optimized target model corresponding to the hyper-parameter combination. The model for processing certain application data may be a speech recognition model, a semantic understanding model, a speech synthesis model, or a search recommendation model. Specifically, after a client receives original hyper-parameter combination information configured by a user and obtains a trigger operation for starting the parameter adjusting operation, the client sends the original hyper-parameter combination information to a server in response to the trigger operation. The server obtains the original hyper-parameter combination information corresponding to the model to be trained, obtains any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information, and trains a first base model to be trained corresponding to the model to be trained according to the first hyper-parameter combination information to obtain a first target model. Then, if no target model meeting a preset performance condition exists in the first target model, the server obtains second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model, and obtains a target model meeting the preset performance condition according to the second hyper-parameter combination information and the second base model to be trained. After obtaining the target model, the server may provide the target model to the client, or may further train and optimize the weight parameters of the target model on the basis of the target model using the corresponding training data, thereby optimizing the performance of the model and improving the accuracy of its processing results.
Fig. 1-B is a schematic view of a second application scenario of a parameter tuning method for a model according to a first embodiment of the present application. As shown in fig. 1-B, in a speech recognition scenario, for example, in an interaction scenario between a user and a smart speaker device, an interaction scenario between the user and a navigation device, or an interaction scenario between the user and an instant translation device, after obtaining speech information to be recognized, the client sends the speech information to be recognized to a server, and after obtaining the speech information to be recognized, the server inputs the speech information to be recognized into a target speech recognition model obtained by pre-training, and obtains target recognition information corresponding to the speech information to be recognized through the target speech recognition model, where the target speech recognition model may be obtained by the server in advance through the following steps: obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the voice recognition model; training a first to-be-trained voice recognition model according to the first hyper-parameter combination information to obtain a first target voice recognition model; if the first target voice recognition model does not have a target voice recognition model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second to-be-trained voice recognition model according to the original hyper-parameter combination information and the first target voice recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
Of course, with continuous technical progress, the method may also be deployed on the client or the server alone. For example, when the method is deployed directly in the client, after the client obtains the hyper-parameter combination information configured by the user and obtains the trigger operation for starting the parameter adjusting operation, it directly uses the method to perform the parameter adjusting operation on the initial base model configured by the user, thereby obtaining and outputting the target model. The trigger operation may be an operation triggered manually by the user in real time, or an operation preset by the user and executed at a predetermined time. In addition, a step of interacting with the user may be added to the implementation of the method; for example, after the first target model is obtained, user configuration information may further be obtained, and the second hyper-parameter combination information and the second base model to be trained may be obtained according to the user configuration information, the original hyper-parameter combination information, and the first target model.
The client may be a mobile terminal device such as a mobile phone or a tablet computer, or a commonly used computer device. The server side is generally a server, which may be a physical server or a cloud server, and is not specifically limited here.
The original hyper-parameter combination information is information of at least one hyper-parameter combination corresponding to the model to be trained, and may be an identifier that uniquely identifies a certain hyper-parameter combination, where the hyper-parameter combination includes the value of at least one hyper-parameter corresponding to the model to be trained. In deep learning, when training a model, the hyper-parameters of the model, such as the learning rate, the regularization coefficient, the dropout rate, the weight decay, the convolution kernel size, and the like, need to be set in advance; when a model to be trained is trained to obtain a target model, at least one hyper-parameter combination corresponding to the model to be trained needs to be configured in advance by the user according to the characteristics of the model, and each hyper-parameter combination includes at least one hyper-parameter.
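For illustration, a hyper-parameter combination can be represented as a dictionary of hyper-parameter names and values, and the original combination information as the Cartesian product of candidate values; the names and values below are examples, not taken from the patent:

```python
from itertools import product

# Hypothetical representation: each hyper-parameter combination is a dict, and
# the Cartesian product of the candidate values forms the original combination
# information.
search_space = {
    "learning_rate": [1e-3, 1e-4],
    "regularization": [0.0, 0.01],
    "dropout": [0.1, 0.3],
    "kernel_size": [3, 5],
}
names = list(search_space)
combinations = [dict(zip(names, values)) for values in product(*search_space.values())]
print(len(combinations))   # 16 combinations, each identifiable by its index
```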
It should be noted that the above application scenarios are only specific examples of the parameter adjusting method for a model provided in the first embodiment of the present application, and are provided to facilitate understanding of the method, not to limit it.
Before specifically describing the method provided in the first embodiment of the present application, a brief description is first given of a prior-art parameter adjusting method for a model. Fig. 2 is a schematic diagram of the framework of the prior-art parameter adjusting method for a model, as provided in the first embodiment of the present application.
As can be seen from fig. 2, a prior-art parameter adjusting method for a model includes "user configuration", that is, initial data configured by a user, for example, a parameter adjusting strategy, a model training script command, at least one group of original hyper-parameter combination information, a flow end condition, and a parallel task data configuration. The parameter adjusting strategy generally includes Bayesian optimization, grid search, random search, and the like; the model training script command is an instruction for automatically adjusting parameters over the initial data configured by the user; the original hyper-parameter combination information is the combination information of at least one hyper-parameter corresponding to the model to be trained; and the parallel task data configuration means that, in specific implementation, the original hyper-parameter combination information is generally split into a plurality of groups, and the split groups are used in parallel to train the model to be trained. After the user initiates a parameter adjusting task for the model through this configuration, a scheduling processor in the method screens out several pieces of hyper-parameter combination information from the original hyper-parameter combination information configured by the user according to the selected parameter adjusting strategy, where the hyper-parameter combination corresponding to each piece of hyper-parameter combination information is input into the model training script command configured by the user, and the model training script command initiates training tasks in parallel. Training task management trains the model to be trained by calling computing resources, obtains an intermediate target model corresponding to each piece of hyper-parameter combination information together with its performance information, and returns the intermediate target models and the corresponding performance information to the scheduling processor, where the computing resources refer to resources such as a Central Processing Unit (CPU) and a Graphics Processing Unit (GPU). Then, the scheduling processor judges, according to the obtained performance information, whether a target model meeting the performance condition has been obtained; if not, the next piece of hyper-parameter combination information is selected according to the parameter adjusting strategy and the corresponding training continues. Through such cyclic optimization, a target model meeting the performance condition and the hyper-parameter combination information corresponding to the target model are obtained.
As described above, the target model and the corresponding hyper-parameter combination information obtained by the prior-art parameter adjusting method are only the output of a single iteration; that is, the target model and its hyper-parameter combination information are merely screened from the intermediate target models obtained from the individual pieces of hyper-parameter combination information. Moreover, as deep learning develops, obtaining the target model takes a long time, and the performance of a target model obtained with this method is lower than that of a model tuned through online training; that is, its accuracy is inferior to that of the model running online. As a result, the target model obtained by this method can rarely be put into practical use; only the hyper-parameter combination corresponding to the obtained target model can be used as empirical hyper-parameters for training and optimizing the actual model running online. In other words, the above prior-art method suffers from low flexibility and low accuracy when adjusting the parameters of a model.
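The single-iteration character of this prior-art flow can be sketched as follows (random search as the parameter adjusting strategy, assumed names throughout); note that every training task starts from the same initial model, which is precisely what the method of this embodiment changes:

```python
import random

# Sketch of the prior-art flow of fig. 2 under assumed names: a scheduler
# repeatedly picks combinations with a strategy (random search here), trains
# each one from the SAME initial model, and keeps the best result, so the
# output is effectively a single iteration over the search space.
def prior_art_tune(all_combos, initial_model, train, evaluate, threshold, batch=4):
    best = (None, None, float("-inf"))     # (combo, model, score)
    remaining = list(all_combos)
    while remaining and best[2] < threshold:
        picked = random.sample(remaining, min(batch, len(remaining)))
        for combo in picked:
            remaining.remove(combo)
            model = train(initial_model, combo)  # every task starts from the initial model
            score = evaluate(model)
            if score > best[2]:
                best = (combo, model, score)
    return best  # the best intermediate target model and its combination
```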
The parameter adjusting method for a model according to the first embodiment of the present application improves upon the above prior-art method to solve these problems. Fig. 3 and fig. 4 are, respectively, a flowchart and a framework diagram of the parameter adjusting method for a model according to the first embodiment of the present application. The method provided by the first embodiment of the present application is described below with reference to fig. 3 and 4.
Step S301, obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information.
As shown in fig. 4, when the method according to the first embodiment of the present application is implemented, the user likewise needs to configure initial data through the user configuration module, for example, a parameter adjusting strategy, a model training script command, at least one group of original hyper-parameter combination information, a first flow end condition, a second flow end condition, and a parallel task data configuration.
After the original hyper-parameter combination information configured by the user is obtained, in order to save computing resources and flexibly adjust the amount of training data during model training, the original hyper-parameter combination information can be split into at least one group of hyper-parameter combination information, and any one group is then used as the first hyper-parameter combination information for subsequent processing.
It should be noted that the original hyper-parameter combination information in the first embodiment of the present application may be information of at least one hyper-parameter combination corresponding to the initial base model to be trained, or information of hyper-parameter combinations obtained from historical hyper-parameter combinations corresponding to the initial base model to be trained. In addition, "first" and "second" in the first hyper-parameter combination information and the second hyper-parameter combination information in the subsequent processing are relative designations used to distinguish different pieces of hyper-parameter combination information. For example, if the original hyper-parameter combination information consists of 100 combinations split into 10 groups, then when the second group of hyper-parameter combination information is processed, the first group plays the role of the first hyper-parameter combination information and the second group that of the second hyper-parameter combination information; when the third group is processed, the second group plays the role of the first hyper-parameter combination information and the third group that of the second hyper-parameter combination information, as illustrated in the sketch after this paragraph. In addition, in the first embodiment of the present application, the model whose parameters are to be adjusted is taken to be a speech recognition model, and the original hyper-parameter combination information is taken to be the original speech hyper-parameter combination information corresponding to the speech recognition model, as an example; of course, the method may also be applied to other models, for example, semantic understanding models, speech synthesis models, or search recommendation models.
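The sketch below illustrates this splitting and the rolling "first"/"second" designation with assumed names:

```python
def split_into_groups(combo_info, group_size):
    """Assumed splitting: e.g. 100 combinations into 10 groups of 10."""
    return [combo_info[i:i + group_size]
            for i in range(0, len(combo_info), group_size)]

groups = split_into_groups([f"combo{i}" for i in range(1, 101)], group_size=10)
# While processing groups[1], groups[0] plays the role of the "first"
# hyper-parameter combination information and groups[1] the "second";
# while processing groups[2], the roles shift by one group, and so on.
for first_info, second_info in zip(groups, groups[1:]):
    pass  # each round re-bases on the models trained with first_info
```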
Step S302, training a first base model to be trained according to the first hyper-parameter combination information, and obtaining a first target model.
In the method provided in the first embodiment of the present application, after a group of hyper-parameter combination information and the corresponding base model to be trained are obtained, the prior-art parameter adjusting method for a model described above is used as a subtask of the method of this embodiment; see the "subtask system" shown in fig. 4. The training of the first base model to be trained according to the first hyper-parameter combination information to obtain the first target model proceeds as follows: the first scheduling processor shown in fig. 4 selects, according to the pre-subtask-run configuration, the training data corresponding to the first hyper-parameter combination information and the corresponding first base model to be trained, takes the obtained data as the input parameters of the subtask system, and obtains the first target model through the subtask system.
It should be noted that the first base model to be trained includes an initial base model to be trained configured by the user, where the initial base model to be trained is the initial model configured by the user in the user configuration module; that is, when the method described in the first embodiment of the present application is first run, the first base model to be trained corresponding to the first hyper-parameter combination information is the initial base model to be trained.
For example, when adjusting the parameters of a speech recognition model, if the original hyper-parameter combination information corresponding to the speech recognition model consists of 100 combinations split into ten groups, then when the first group of hyper-parameter combination information is used as the initial first hyper-parameter combination information to train the first base model to be trained, the first base model to be trained is configured as the initial speech recognition model to be trained in the "pre-subtask-run configuration".
It should be noted that the above is only one specific embodiment provided in the first embodiment of the present application; in specific implementation, the first base model to be trained may also be trained according to the first hyper-parameter combination information by other methods, which are not described here again.
After step S302, step S303 is executed: if no target model meeting a preset performance condition exists in the first target model, obtaining second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model.
In the above steps, a first target model corresponding to the first hyper-parameter combination information is obtained, that is, at least one intermediate target model corresponding to the at least one piece of hyper-parameter combination information in a certain group. In the method provided in the first embodiment of the present application, each time the "subtask system" finishes and returns at least one intermediate target model, it is necessary to judge, according to the first flow end condition configured by the user, whether a target model meeting the user's requirement has been obtained. The first flow end condition may be a performance condition preset by the user, for example, an accuracy threshold on the recognition accuracy of the speech recognition model.
The first embodiment of the present application provides a method for judging whether a target model meeting the preset performance condition exists in the first target model, which specifically includes: obtaining first performance information corresponding to the first target model; and judging, according to the first performance information, whether a target model meeting the preset performance condition exists in the first target model.
The first performance information corresponds to the first target model, that is, to the at least one intermediate target model obtained by training with each group of hyper-parameter combination information, and specifically characterizes the performance of the trained intermediate target models. For example, for a speech recognition model, the corresponding performance information may generally be an accuracy value characterizing the accuracy of the model's processing results, a response speed value characterizing the model's processing speed, or a statistic characterizing the model's overall performance, for example a statistic obtained as a weighted average of the accuracy value, the response speed value, and other values, characterizing the model's performance across multiple dimensions.
Specifically, the judging, according to the first performance information, whether a target model meeting the preset performance condition exists in the first target model includes: if the first performance information contains performance information not smaller than a preset performance threshold, judging that the target model exists in the first target model. That is, if the first performance information contains performance information not smaller than the preset performance threshold, a target model meeting the user's requirement is considered to have been obtained and the flow end condition is reached, so the target model can be obtained from the model corresponding, in the first target model, to that performance information; for example, that model is used directly as the target model. It should be noted that the preset performance condition and the preset performance threshold may be set according to the actual situation and are not specifically limited here.
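A minimal sketch of this judgment, assuming the performance information is a list of scores aligned with the intermediate target models (names are illustrative):

```python
def find_target_model(models, performance_info, threshold):
    """Return a model whose performance information is not smaller than the
    preset performance threshold, or None when the flow must continue."""
    for model, score in zip(models, performance_info):
        if score >= threshold:
            return model
    return None
```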
Of course, if no model meeting the preset performance condition exists in the first target model, the second hyper-parameter combination information and the second base model to be trained can be obtained on the basis of the first target model, according to the original hyper-parameter combination information and the first target model, as described below.
The obtaining second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model includes: obtaining the second hyper-parameter combination information from the original hyper-parameter combination information, wherein the second hyper-parameter combination information is any group of hyper-parameter combination information in the original hyper-parameter combination information other than the first hyper-parameter combination information; selecting a model meeting a preset model screening condition from the first target model; and obtaining the second base model to be trained according to the model meeting the preset model screening condition.
Referring to the "subtask result processing" shown in fig. 4: in the prior art, every round of training is performed on the initial base model to be trained configured by the user. To increase the accuracy of the obtained target model, the first embodiment of the present application instead provides a way of obtaining the second hyper-parameter combination information and the second base model to be trained in which, at the end of each round of training, the input parameters of the "subtask system" for the next round are obtained on the basis of the previous round.
Specifically, the obtaining the second hyper-parameter combination information from the original hyper-parameter combination information includes: acquiring second initial hyper-parameter combination information from the original hyper-parameter combination information, wherein the second initial hyper-parameter combination information is any group of hyper-parameter combination information except the first hyper-parameter combination information in the original hyper-parameter combination information; selecting hyper-parameter combination information meeting preset hyper-parameter screening conditions from the first hyper-parameter combination information; and acquiring the second hyper-parameter combination information according to the second initial hyper-parameter combination information and the hyper-parameter combination information meeting the preset hyper-parameter screening condition.
The "super-parameter screening" shown in fig. 4 is to improve the accuracy of the target model obtained by training, after at least one intermediate target model is obtained by training through at least one piece of super-parameter combination information in the first super-parameter combination information, and when the second super-parameter combination information is obtained, the super-parameter combination information with better performance effect of the model obtained by training in the first super-parameter combination information can also be used as the data in the second super-parameter combination information, so that the method can further train the model to be trained in the current round on the basis of the result obtained by the previous round of training, and each round of training in the method can be tightly combined and is not mutually independent.
Wherein, from the first hyper-parameter combination information, selecting the hyper-parameter combination information meeting the preset hyper-parameter screening condition comprises: obtaining first performance information corresponding to the first target model; acquiring performance information with a numerical value meeting a preset first numerical value condition from the first performance information; and acquiring the hyper-parameter combination information meeting the preset hyper-parameter screening condition according to the hyper-parameter combination information corresponding to the performance information meeting the preset first numerical condition in the first hyper-parameter combination information.
In the first embodiment of the present application, the performance information whose value meets the preset first numerical condition may be the several largest values in the first performance information; that is, the Top-K pieces of performance information with relatively good performance effect are selected from the first performance information.
For example, when adjusting parameters for a speech recognition model, suppose the original hyper-parameter combination information corresponding to the speech recognition model contains 100 combinations, divided into ten groups. The first hyper-parameter combination information is the first group, e.g. (paragroup1, …, paragroup10), and the first performance information of the corresponding first target model is a set of evaluation scores characterizing model performance, e.g. (score1, …, score10). When the second hyper-parameter combination information is obtained, the second group may be used as the second initial hyper-parameter combination information, e.g. (paragroup11, …, paragroup20), and the first hyper-parameter combination information corresponding to the Top-K largest values in the first performance information is selected; for example, if the Top-2 largest values are score3 and score6, the corresponding first hyper-parameter combination information is paragroup3 and paragroup6. The second hyper-parameter combination information may then be (paragroup3, paragroup6, paragroup11, …, paragroup20).
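The Top-K screening and merging in this example can be sketched in Python as follows; the variable names and concrete scores are illustrative assumptions used only to reproduce the example above.

    def build_second_combination(first_groups, first_scores, next_groups, k=2):
        """Carry the Top-K best-scoring hyper-parameter combinations of the
        previous round over into the next round's combination information."""
        top_k = sorted(range(len(first_scores)),
                       key=lambda i: first_scores[i], reverse=True)[:k]
        return [first_groups[i] for i in top_k] + list(next_groups)

    first_groups = [f"paragroup{i}" for i in range(1, 11)]
    scores = [0.70, 0.72, 0.91, 0.68, 0.74, 0.89, 0.71, 0.73, 0.66, 0.75]
    second_initial = [f"paragroup{i}" for i in range(11, 21)]
    second = build_second_combination(first_groups, scores, second_initial)
    # -> ['paragroup3', 'paragroup6', 'paragroup11', ..., 'paragroup20']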
Correspondingly, to further improve the accuracy of the obtained target model, compared with the prior art in which each round of training starts from the initial base model configured by the user, in each round of training of the method provided in the first embodiment of the present application the base model to be trained may be a model screened from the intermediate target models obtained in the previous round that satisfies the preset model screening condition. That is, selecting a model satisfying the preset model screening condition from the first target model includes: obtaining first performance information corresponding to the first target model; acquiring, from the first performance information, performance information whose value meets a preset second numerical condition; and obtaining the model satisfying the preset model screening condition according to the model in the first target model corresponding to that performance information. The preset second numerical condition may be the maximum value among the values in the first performance information; that is, through the "initial model setting" shown in fig. 4, the better-performing model among the intermediate target models obtained in the previous round is used as the second base model to be trained in the current round.
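A minimal sketch of this "initial model setting" step, assuming performance is summarized as a single numeric value per intermediate model (names are illustrative):

    def select_base_model(intermediate_models, performances):
        """Use the model whose performance value is the maximum in the first
        performance information as the base model for the current round."""
        best = max(range(len(performances)), key=lambda i: performances[i])
        return intermediate_models[best]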
It should be noted that, in specific implementation, the preset model screening condition, the preset hyper-parameter screening condition, the preset first numerical condition, and the preset second numerical condition may be set according to an actual situation, and are not particularly limited herein.
The above describes in detail how to obtain the second hyper-parameter combination information and the second base model to be trained according to the original hyper-parameter combination information and the first target model when no target model satisfying the preset performance condition exists in the first target model.
After step S303, step S304 is executed to obtain the target model according to the second hyper-parameter combination information and the second base model to be trained.
After the second hyper-parameter combination information and the second basic model to be trained of the current round are obtained on the basis of the previous round of training, the target model can be obtained according to the second hyper-parameter combination information and the second basic model to be trained.
Specifically, the obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained includes: training the second basic model to be trained according to the second hyper-parameter combination information to obtain a second target model; and if the second target model has a model meeting the preset performance condition, obtaining the target model according to the model meeting the preset performance condition.
That is, on the basis of the previous round of training, the second base model to be trained in the sub-task system continues to be trained according to the second hyper-parameter combination information, and it is judged whether a target model satisfying the preset performance condition exists in the obtained second target model; if not, iteration continues on the basis of the current round until a target model satisfying the preset performance condition is obtained.
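The overall round-by-round iteration described in steps S301 to S304 can be sketched as follows; train and evaluate are hypothetical stand-ins supplied by the caller, and carrying over only the single best combination (rather than Top-K) is a simplification made here for brevity.

    def tune(base_model, grouped_hparams, threshold, train, evaluate):
        """grouped_hparams: the original hyper-parameter combination information,
        already split into groups; returns a target model satisfying the preset
        performance condition, or None if every group has been exhausted."""
        model, carried = base_model, []
        remaining = list(grouped_hparams)
        while remaining:
            group = carried + remaining.pop(0)          # this round's combinations
            trained = [train(model, hp) for hp in group]
            perfs = [evaluate(m) for m in trained]
            best = max(range(len(perfs)), key=lambda i: perfs[i])
            if perfs[best] >= threshold:                # preset condition met
                return trained[best]                    # the target model
            model = trained[best]                       # next round's base model
            carried = [group[best]]                     # best hyper-parameters kept
        return None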
In addition, it should be noted that the method further includes: obtaining an initial base model to be trained; if the original hyper-parameter combination information does not meet the preset grouping training condition, training the initial base model to be trained according to the original hyper-parameter combination information to obtain performance information characterizing the performance of the trained models; acquiring, from the performance information, performance information whose value meets a preset second numerical condition; and acquiring target experience hyper-parameter combination information according to the hyper-parameter combination information in the original hyper-parameter combination information corresponding to the performance information meeting the preset second numerical condition, wherein the target experience hyper-parameter combination information is used as experience hyper-parameters when the initial base model to be trained is subsequently optimized.
That is to say, when the method provided in the first embodiment of the present application is used to tune an initial base model to be trained configured by a user, if the hyper-parameter space corresponding to that model is small, that is, the number of hyper-parameter combinations is small, training can be completed in a single round, in which case a target model satisfying the preset performance condition may not be obtained. For this situation, in specific implementation, the hyper-parameter combination information with better performance effect can be selected according to the performance information corresponding to the obtained intermediate target models and used as experience hyper-parameters for subsequently optimizing the model, that is, as the experience hyper-parameters for adjusting and optimizing the initial base model to be trained in daily training.
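A hedged sketch of extracting such experience hyper-parameters when the search space is too small for grouped rounds (all names, and the choice of k, are assumptions):

    def extract_experience_hparams(base_model, all_hparams, train, evaluate, k=3):
        """Train over every combination once and keep the k best-performing
        combinations as experience hyper-parameters for later optimization."""
        perfs = [evaluate(train(base_model, hp)) for hp in all_hparams]
        top_k = sorted(range(len(perfs)), key=lambda i: perfs[i], reverse=True)[:k]
        return [all_hparams[i] for i in top_k]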
In a specific implementation of the method provided in the first embodiment of the present application, the same original training data may be used in every round of training the base model to be trained; that is, the method further includes: acquiring original training data. The training of the first base model to be trained with the first hyper-parameter combination information to obtain the first target model then includes: training the first base model to be trained according to the first hyper-parameter combination information and the original training data to obtain the first target model. The obtaining of the target model according to the second hyper-parameter combination information and the second base model to be trained includes: training the second base model to be trained according to the second hyper-parameter combination information and the original training data to obtain the target model.
Referring to "training data screening" as shown in fig. 4, in order to solve the problem of long time consumption and insufficient flexibility of the prior art method, a first embodiment of the present application provides a flexible implementation method, specifically, if the original training data meets a preset data splitting condition, the method further includes: splitting the original training data according to the preset data splitting condition to obtain at least one group of original packet training data; the training of the first base model to be trained according to the first hyper-parameter combination information to obtain the first target model further comprises: acquiring any one group of original packet training data from the at least one group of original packet training data as first training data; training the first base model to be trained according to the first hyper-parameter combination information and the first training data to obtain the first target model; the obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained includes: obtaining second training data from the at least one group of original packet training data; and training the second basic model to be trained according to the second hyper-parameter combination information and the second training data to obtain the target model, wherein the second training data is any one group of original grouped training data except the first training data in the at least one group of original grouped training data. Wherein if the number of the original training data is not less than a preset training data threshold, it is determined that the original training data satisfies the preset data splitting condition
That is, in specific implementation, if the original training data satisfies a preset data splitting condition, for example, the number of pieces of original training data is not less than a preset training data threshold, the original training data may be split according to a preset data splitting method, for example, into the same number of groups as there are groups of original hyper-parameter combination information; then, during each round of training, each group of hyper-parameter combination information can use the corresponding group of training data to train the base model to be trained.
For example, when adjusting parameters for a speech recognition model, suppose the original hyper-parameter combination information corresponding to the speech recognition model contains 100 combinations split into ten groups, i.e. 10 hyper-parameter combinations are trained per round. If there are 50 million pieces of original training data and all 50 million are used in every round of training, training is extremely time-consuming. To reduce training time, the 50 million pieces of original training data can be split into 10 groups; in this way, only 5 million pieces of training data are used per round, while a full iteration still covers all 50 million pieces. Moreover, if a target model satisfying the preset performance condition is obtained in some iteration, training can be terminated early, further reducing training time.
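An illustrative sketch of this "training data screening" step, assuming a simple even split (remainder items are dropped here for simplicity; a real implementation would distribute them):

    def split_training_data(data, num_groups, min_size):
        """Split only when the preset data splitting condition holds, i.e. the
        number of pieces of original training data is not less than the preset
        training data threshold; otherwise keep the data as a single group."""
        if len(data) < min_size:
            return [data]
        size = len(data) // num_groups
        return [data[i * size:(i + 1) * size] for i in range(num_groups)]

    # e.g. 50 million utterances -> 10 groups of 5 million each per round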
It should be noted that, the above is only a specific embodiment provided in the first embodiment of the present application, and in specific implementation, the preset data splitting condition, the preset training data threshold, and the selection of the training data in each training round may be set according to actual situations, and are not particularly limited herein.
As can be seen from the above description, in the process of obtaining the target model by the method of the first embodiment of the present application, performance information corresponding to the different intermediate target models can be obtained, for example, the first performance information corresponding to the first target model and the second performance information corresponding to the second target model. Therefore, after the target model is obtained, the performance change information corresponding to each piece of hyper-parameter combination information in the original hyper-parameter combination information can be derived from the performance information associated with it, so that when models are actually trained later, their hyper-parameter combinations can be set with reference to this performance change information, facilitating subsequent manual optimization and improving the accuracy of the model's processing results. Accordingly, the method provided in the first embodiment of the present application further includes: obtaining first performance information corresponding to the first target model and second performance information corresponding to the target model; and acquiring performance change information corresponding to the original hyper-parameter combination information according to the first performance information and the second performance information.
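For illustration, the performance change information could be derived as a per-combination score delta between rounds; the dictionary structure below is an assumption, not a prescribed format:

    def performance_change(first_perf: dict, second_perf: dict) -> dict:
        """Map each hyper-parameter combination to the change in its performance
        value between two rounds, for reference during later manual tuning."""
        return {hp: second_perf[hp] - first_perf[hp]
                for hp in first_perf.keys() & second_perf.keys()}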
In summary, the parameter tuning method for a model provided in the first embodiment of the present application includes: obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if the first target model does not include a target model satisfying preset performance conditions, acquiring second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained. Compared with existing automatic parameter tuning methods, after the first target model is obtained by training with the first hyper-parameter combination information, if it is judged that the first target model does not include the target model, the second hyper-parameter combination information and the corresponding second base model to be trained are obtained according to the original hyper-parameter combination information and the first target model produced by the previous round. That is, the hyper-parameter combination information and the base model for the next round of training are obtained on the basis of the previous round, so that continuous iteration of the model can be realized flexibly and quickly. Moreover, because the second base model to be trained used in the second round is obtained from the first target model of the first round, the model is continuously optimized and the accuracy of the target model is increased.
Corresponding to the parameter tuning method for a model provided in the first embodiment of the present application, the second embodiment of the present application further provides a method for obtaining a speech recognition model; please refer to fig. 5, which is a flowchart of the method for obtaining a speech recognition model provided in the second embodiment of the present application. Some steps have been described in detail in the first embodiment, so the description here is relatively brief; for the relevant points, reference may be made to the corresponding descriptions in the first embodiment, and the processing procedure described below is only exemplary.
Step S501, obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from the original hyper-parameter combination information corresponding to the voice recognition model.
Step S502, training a first to-be-trained speech recognition model according to the first hyper-parameter combination information, and obtaining a first target speech recognition model.
Step S503, if there is no target speech recognition model satisfying the preset performance condition in the first target speech recognition model, obtaining second hyper-parameter combination information and a second to-be-trained speech recognition model according to the original hyper-parameter combination information and the first target speech recognition model.
Step S504, the target speech recognition model is obtained according to the second hyper-parameter combination information and the second speech recognition model to be trained.
Corresponding to the methods provided in the first and second embodiments of the present application, the third embodiment of the present application further provides a speech recognition method; please refer to fig. 6, which is a flowchart of the speech recognition method provided in the third embodiment of the present application. Some steps have been described in detail in the first and second embodiments, so the description here is relatively brief; for the relevant points, reference may be made to the corresponding descriptions in the first and second embodiments, and the processing procedure described below is only schematic.
Step S601, acquiring the speech information to be recognized.
Step S602, inputting the speech information to be recognized into a target speech recognition model, and obtaining target recognition information corresponding to the speech information to be recognized, where the target speech recognition model is a model obtained by using an obtaining method of the speech recognition model.
When the method is applied to a server, after the server obtains the target recognition information corresponding to the to-be-recognized voice information sent by a client, it can provide the target recognition information to the client so that the client can display it. Alternatively, the server can obtain service content information corresponding to the to-be-recognized voice information and provide it to the client, so that the client provides services to the user by displaying or playing the service content information.
For example, in a smart home environment, a user may say the to-be-recognized voice message "please play today's weather" to a smart speaker device. The smart speaker device receives the voice message and sends it to a cloud server connected to the device. The cloud server recognizes the voice message using a target speech recognition model obtained by the above method for obtaining a speech recognition model, obtains the corresponding target recognition information, retrieves the related service content information "sunny, breezy, current temperature 20 degrees" according to the content of the target recognition information, and provides the service content information to the smart speaker device. After obtaining the service content information sent by the cloud server, the smart speaker device may directly display it on its display screen or play it directly, so that the service can be provided to the user quickly and accurately and the user experience is improved.
In addition, to further increase the response speed of the client computing device, the method may also be applied independently to computing devices that provide nearest-end services to the user through edge computing. For example, the target speech recognition model obtained by the method for obtaining a speech recognition model can be deployed standalone in computing devices that interact directly with the user, such as smart speaker devices, navigation devices, or translation devices. In this way, after the client obtains the to-be-recognized voice information sent by the user, the built-in target speech recognition model recognizes it quickly, so that the obtained target recognition information can be displayed to the user, or the service content information corresponding to the target recognition information can be obtained directly and provided to the user by display or playback, thereby providing the nearest-end service quickly. For example, the target speech recognition model may be deployed separately in a vehicle-mounted navigation device; after the user utters the to-be-recognized voice information "navigate to xxx destination", the device recognizes the voice information, obtains the corresponding navigation path through its built-in navigation path calculation module, and displays the obtained navigation paths to the user so that the user can select an appropriate one.
It should be noted that the speech recognition method provided in the third embodiment of the present application is illustrated here with the smart speaker device and the vehicle-mounted navigation device; in specific implementation, the method may also be applied, as needed, to other computing devices that provide different service contents, such as instant translation devices and information query devices, which are not described in detail here.
Corresponding to the parameter adjusting method for the model provided in the first embodiment of the present application, a parameter adjusting device for the model is also provided in the fourth embodiment of the present application, please refer to fig. 7, which is a schematic diagram of the parameter adjusting device for the model provided in the fourth embodiment of the present application. A parameter adjusting device for a model provided in a fourth embodiment of the present application includes the following components:
a first information obtaining unit 701, configured to obtain, from the original hyper-parameter combination information, any one group of hyper-parameter combination information as first hyper-parameter combination information.
A first target model obtaining unit 702, configured to train a first to-be-trained base model according to the first hyper-parameter combination information, so as to obtain a first target model.
A second information obtaining unit 703 is configured to determine whether a target model meeting a preset performance condition exists in the first target model, and if not, obtain second hyper-parameter combination information and a second to-be-trained base model according to the original hyper-parameter combination information and the first target model.
An object model obtaining unit 704, configured to obtain the object model according to the second hyper-parameter combination information and the second basic model to be trained.
Optionally, the apparatus further includes a performance information obtaining unit, specifically configured to obtain first performance information corresponding to the first target model; and judging whether a target model meeting the preset performance condition exists in the first target model or not according to the first performance information.
Optionally, the performance information obtaining unit includes a determining subunit, configured to determine whether performance information that is not less than a preset performance threshold exists in the first performance information, and if so, determine that the target model exists in the first target model.
Optionally, the performance information obtaining unit further includes a target model obtaining subunit, configured to obtain the target model according to a model in the first target model corresponding to the performance information that is not less than the preset performance threshold.
Optionally, the obtaining, according to the original hyper-parameter combination information and the first target model, second hyper-parameter combination information and a second basic model to be trained includes: obtaining the second hyper-parameter combination information from the original hyper-parameter combination information, wherein the second hyper-parameter combination information is any group of hyper-parameter combination information except the first hyper-parameter combination information in the original hyper-parameter combination information; selecting a model meeting preset model screening conditions from the first target model; and obtaining the second basic model to be trained according to the model meeting the preset model screening condition.
Optionally, the obtaining the second hyper-parameter combination information from the original hyper-parameter combination information includes: acquiring second initial hyper-parameter combination information from the original hyper-parameter combination information, wherein the second initial hyper-parameter combination information is any group of hyper-parameter combination information except the first hyper-parameter combination information in the original hyper-parameter combination information; selecting hyper-parameter combination information meeting preset hyper-parameter screening conditions from the first hyper-parameter combination information; and acquiring the second hyper-parameter combination information according to the second initial hyper-parameter combination information and the hyper-parameter combination information meeting the preset hyper-parameter screening condition.
Optionally, the selecting, from the first hyper-parameter combination information, hyper-parameter combination information that meets a preset hyper-parameter screening condition includes: obtaining first performance information corresponding to the first target model; acquiring performance information with a numerical value meeting a preset first numerical value condition from the first performance information; and acquiring the hyper-parameter combination information meeting the preset hyper-parameter screening condition according to the hyper-parameter combination information corresponding to the performance information meeting the preset first numerical condition in the first hyper-parameter combination information.
Optionally, the selecting a model satisfying a preset model screening condition from the first target model includes: obtaining first performance information corresponding to the first target model; acquiring performance information with a numerical value meeting a preset second numerical value condition from the first performance information; and obtaining a model meeting the preset model screening condition according to a model corresponding to the performance information meeting the preset second numerical condition in the first target model.
Optionally, the obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained includes: training the second basic model to be trained according to the second hyper-parameter combination information to obtain a second target model; and if the second target model has a model meeting the preset performance condition, obtaining the target model according to the model meeting the preset performance condition.
Optionally, the first base model to be trained includes an initial base model to be trained configured by a user.
Optionally, the apparatus further includes a target experience hyper-parameter combination information obtaining unit, configured to obtain an initial basic model to be trained; if the original hyper-parameter combination information does not meet the preset grouping training condition, training the initial basic model to be trained according to the original hyper-parameter combination information to obtain performance information for representing the performance of the model obtained by training; acquiring performance information with a numerical value meeting a preset second numerical value condition from the performance information; and acquiring target experience hyper-parameter combination information according to hyper-parameter combination information corresponding to the performance information meeting the preset second numerical condition in the original hyper-parameter combination information, wherein the target experience hyper-parameter combination information is used as experience hyper-parameters when the initial basic model to be trained is continuously optimized.
Optionally, the training a first base model to be trained by using the first hyper-parameter combination information to obtain a first target model includes: training the first base model to be trained according to the first hyper-parameter combination information and the original training data to obtain the first target model; the obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained includes: and training the second basic model to be trained according to the second hyper-parameter combination information and the original training data to obtain the target model.
Optionally, if the original training data meets a preset data splitting condition, the apparatus further includes a splitting training unit, configured to split the original training data according to the preset data splitting condition, so as to obtain at least one group of original grouped training data; the training of the first base model to be trained according to the first hyper-parameter combination information to obtain the first target model further comprises: acquiring any one group of original grouped training data from the at least one group of original grouped training data as first training data; training the first base model to be trained according to the first hyper-parameter combination information and the first training data to obtain the first target model; the obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained includes: obtaining second training data from the at least one group of original grouped training data; and training the second basic model to be trained according to the second hyper-parameter combination information and the second training data to obtain the target model, wherein the second training data is any one group of original grouped training data except the first training data in the at least one group of original grouped training data.
Optionally, if the number of the original training data is not less than a preset training data threshold, it is determined that the original training data meets the preset data splitting condition.
Optionally, the apparatus further includes a performance change information obtaining unit, configured to obtain first performance information corresponding to a first target model, and obtain second performance information corresponding to the target model; and acquiring performance change information corresponding to the original hyper-parameter combination information according to the first performance information and the second performance information.
Corresponding to the method for tuning a model provided in the first embodiment of the present application, a fifth embodiment of the present application further provides an electronic device, please refer to fig. 8, which is a schematic diagram of an electronic device provided in the fifth embodiment of the present application. A fifth embodiment of the present application provides an electronic device including:
a processor 801;
a memory 802 for storing a program of the parameter adjusting method for a model, wherein after the device is powered on and the program of the parameter adjusting method for a model is run by the processor, the following steps are executed:
obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if the first target model does not have a target model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second basic model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained.
Corresponding to the parameter adjusting method for the model provided in the first embodiment of the present application, a sixth embodiment of the present application further provides a storage device, since the embodiment of the storage device is substantially similar to the embodiment of the method, the description is relatively simple, and the relevant points can be referred to the partial description of the embodiment of the method, and the embodiment of the storage device described below is only illustrative. A storage device according to a sixth embodiment of the present application stores a program for a model parameter adjustment method, where the program is executed by a processor to perform the following steps:
obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information; training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model; if the first target model does not have a target model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second basic model to be trained according to the original hyper-parameter combination information and the first target model; and obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained.
Corresponding to the method for obtaining a speech recognition model provided in the second embodiment of the present application, a seventh embodiment of the present application further provides a device for obtaining a speech recognition model, please refer to fig. 9, which is a schematic diagram of the device for obtaining a speech recognition model provided in the seventh embodiment of the present application. A seventh embodiment of the present application provides an apparatus for obtaining a speech recognition model, including:
a first speech information obtaining unit 901, configured to obtain any one group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model.
A first target speech recognition model obtaining unit 902, configured to train a first to-be-trained speech recognition model according to the first hyper-parameter combination information, so as to obtain a first target speech recognition model.
A second speech information obtaining unit 903, configured to judge whether a target speech recognition model satisfying a preset performance condition exists in the first target speech recognition model, and if not, obtain second hyper-parameter combination information and a second to-be-trained speech recognition model according to the original hyper-parameter combination information and the first target speech recognition model.
A target speech recognition model obtaining unit 904, configured to obtain the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
Corresponding to the method for obtaining a speech recognition model provided in the second embodiment of the present application, the eighth embodiment of the present application further provides an electronic device, which is substantially similar to the method embodiment, so that the description is simple, and the relevant points can be referred to part of the description of the method embodiment, and the electronic device embodiments described below are only schematic. An eighth embodiment of the present application provides an electronic device including:
a processor;
a memory for storing a program of an obtaining method of a speech recognition model, the apparatus performing the following steps after being powered on and running the program of the obtaining method of the speech recognition model by the processor:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the voice recognition model; training a first to-be-trained voice recognition model according to the first hyper-parameter combination information to obtain a first target voice recognition model; if the first target voice recognition model does not have a target voice recognition model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second to-be-trained voice recognition model according to the original hyper-parameter combination information and the first target voice recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
Corresponding to the method for obtaining a speech recognition model provided in the second embodiment of the present application, the ninth embodiment of the present application further provides a storage device, since the storage device embodiment is substantially similar to the method embodiment, the description is relatively simple, and the relevant points can be referred to the partial description of the method embodiment, and the storage device embodiment described below is only illustrative. A storage device according to a ninth embodiment of the present application stores a program of a method for obtaining a speech recognition model, the program being executed by a processor and executing the steps of:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the voice recognition model; training a first to-be-trained voice recognition model according to the first hyper-parameter combination information to obtain a first target voice recognition model; if the first target voice recognition model does not have a target voice recognition model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second to-be-trained voice recognition model according to the original hyper-parameter combination information and the first target voice recognition model; and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
In correspondence with a speech recognition method provided by the third embodiment of the present application, a speech recognition apparatus is also provided by the tenth embodiment of the present application, please refer to fig. 10, which is a schematic diagram of a speech recognition apparatus provided by the tenth embodiment of the present application. A speech recognition apparatus according to a tenth embodiment of the present application includes:
an obtaining unit 1001 is configured to obtain voice information to be recognized.
A recognition unit 1002, configured to input the speech information to be recognized into a target speech recognition model, and obtain target recognition information corresponding to the speech information to be recognized, where the target speech recognition model is a model obtained by using the obtaining method of the speech recognition model.
Corresponding to a speech recognition method provided by the third embodiment of the present application, the eleventh embodiment of the present application further provides an electronic device, which is substantially similar to the method embodiment, so that the description is relatively simple, and the relevant points can be referred to the partial description of the method embodiment, and the electronic device embodiments described below are only schematic. An electronic device provided in an eleventh embodiment of the present application includes:
a processor;
a memory for storing a program of a speech recognition method, the apparatus performing the following steps after being powered on and running the program of the speech recognition method by the processor:
acquiring voice information to be recognized; and inputting the voice information to be recognized into a target voice recognition model, and acquiring target recognition information corresponding to the voice information to be recognized, wherein the target voice recognition model is obtained by using the voice recognition model acquisition method.
Corresponding to a speech recognition method provided by the third embodiment of the present application, the twelfth embodiment of the present application further provides a storage device, since the storage device embodiment is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to part of the description of the method embodiment, and the storage device embodiment described below is only illustrative. A storage device according to a twelfth embodiment of the present application stores a program of a speech recognition method, the program being executed by a processor to perform the steps of:
acquiring voice information to be recognized; and inputting the voice information to be recognized into a target voice recognition model, and acquiring target recognition information corresponding to the voice information to be recognized, wherein the target voice recognition model is obtained by using the voice recognition model acquisition method.
Although the present application has been described with reference to the preferred embodiments, it is not intended to limit the present application, and those skilled in the art can make variations and modifications without departing from the spirit and scope of the present application, therefore, the scope of the present application should be determined by the claims that follow.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (34)

1. A parameter adjusting method for a model is characterized by comprising the following steps:
obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information;
training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model;
if the first target model does not have a target model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second basic model to be trained according to the original hyper-parameter combination information and the first target model;
and obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained.
2. The method for tuning parameters of a model according to claim 1, further comprising:
obtaining first performance information corresponding to the first target model;
and judging whether a target model meeting the preset performance condition exists in the first target model or not according to the first performance information.
3. The method for tuning parameters of a model according to claim 2, wherein the determining whether the target model satisfying the preset performance condition is included in the first target model according to the first performance information includes:
and if the first performance information contains performance information not smaller than a preset performance threshold, judging that the target model exists in the first target model.
4. The method for tuning parameters of a model according to claim 3, further comprising:
and obtaining the target model according to a model corresponding to the performance information not less than a preset performance threshold in the first target model.
5. The parameter tuning method for the model according to claim 1, wherein the obtaining of the second hyper-parameter combination information and the second base model to be trained according to the original hyper-parameter combination information and the first target model comprises:
obtaining the second hyper-parameter combination information from the original hyper-parameter combination information, wherein the second hyper-parameter combination information is any group of hyper-parameter combination information except the first hyper-parameter combination information in the original hyper-parameter combination information;
selecting a model meeting preset model screening conditions from the first target model;
and obtaining the second basic model to be trained according to the model meeting the preset model screening condition.
6. The method of tuning parameters for a model according to claim 5, wherein said obtaining the second hyper-parameter combination information from the original hyper-parameter combination information comprises:
acquiring second initial hyper-parameter combination information from the original hyper-parameter combination information, wherein the second initial hyper-parameter combination information is any group of hyper-parameter combination information except the first hyper-parameter combination information in the original hyper-parameter combination information;
selecting hyper-parameter combination information meeting preset hyper-parameter screening conditions from the first hyper-parameter combination information;
and acquiring the second hyper-parameter combination information according to the second initial hyper-parameter combination information and the hyper-parameter combination information meeting the preset hyper-parameter screening condition.
7. The parameter tuning method for the model according to claim 6, wherein the selecting the hyper-parameter combination information satisfying a preset hyper-parameter screening condition from the first hyper-parameter combination information includes:
obtaining first performance information corresponding to the first target model;
acquiring performance information with a numerical value meeting a preset first numerical value condition from the first performance information;
and acquiring the hyper-parameter combination information meeting the preset hyper-parameter screening condition according to the hyper-parameter combination information corresponding to the performance information meeting the preset first numerical condition in the first hyper-parameter combination information.
8. The parameter adjusting method for the model according to claim 5, wherein the selecting the model satisfying a preset model screening condition from the first target model comprises:
obtaining first performance information corresponding to the first target model;
acquiring performance information with a numerical value meeting a preset second numerical value condition from the first performance information;
and obtaining a model meeting the preset model screening condition according to a model corresponding to the performance information meeting the preset second numerical condition in the first target model.
9. The method for tuning parameters of a model according to claim 1, wherein the obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained comprises:
training the second basic model to be trained according to the second hyper-parameter combination information to obtain a second target model;
and if the second target model has a model meeting the preset performance condition, obtaining the target model according to the model meeting the preset performance condition.
10. The method for tuning parameters of a model according to claim 1, wherein the first base model to be trained comprises an initial base model to be trained configured by a user.
11. The method for tuning parameters of a model according to claim 1, further comprising:
obtaining an initial basic model to be trained;
if the original hyper-parameter combination information does not meet the preset grouping training condition, training the initial basic model to be trained according to the original hyper-parameter combination information to obtain performance information for representing the performance of the model obtained by training;
acquiring performance information with a numerical value meeting a preset second numerical value condition from the performance information;
and acquiring target experience hyper-parameter combination information according to hyper-parameter combination information corresponding to the performance information meeting the preset second numerical condition in the original hyper-parameter combination information, wherein the target experience hyper-parameter combination information is used as experience hyper-parameters when the initial basic model to be trained is continuously optimized.
12. The method for tuning parameters of a model according to claim 1, further comprising:
acquiring original training data;
the training of the first base model to be trained by using the first hyper-parameter combination information to obtain the first target model comprises: training the first base model to be trained according to the first hyper-parameter combination information and the original training data to obtain the first target model;
the obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained includes: and training the second basic model to be trained according to the second hyper-parameter combination information and the original training data to obtain the target model.
13. The method of tuning a model according to claim 12, wherein if the original training data satisfies a preset data splitting condition, the method further comprises:
splitting the original training data according to the preset data splitting condition to obtain at least one group of original grouped training data;
the training of the first base model to be trained according to the first hyper-parameter combination information to obtain the first target model further comprises: acquiring any one group of original grouped training data from the at least one group of original grouped training data as first training data; training the first base model to be trained according to the first hyper-parameter combination information and the first training data to obtain the first target model;
the obtaining the target model according to the second hyper-parameter combination information and the second basic model to be trained includes: obtaining second training data from the at least one group of original grouped training data; and training the second basic model to be trained according to the second hyper-parameter combination information and the second training data to obtain the target model, wherein the second training data is any one group of original grouped training data except the first training data in the at least one group of original grouped training data.
14. The method for tuning parameters of a model according to claim 13, wherein the original training data is determined to satisfy the preset data splitting condition if the amount of the original training data is not less than a preset training data threshold.
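For claims 13 and 14, a minimal sketch of the splitting step follows; the threshold value, the group count, and the even grouping are illustrative assumptions rather than limitations of the claims.

```python
# Illustrative sketch of claims 13-14: split the original training data into
# groups only when the splitting condition (an assumed size threshold) holds;
# the first base model trains on one group, later base models on the others.

def split_training_data(original_data, split_threshold=10000, num_groups=4):
    if len(original_data) < split_threshold:
        return [original_data]  # splitting condition not met: single group
    group_size = -(-len(original_data) // num_groups)  # ceiling division
    return [original_data[i:i + group_size]
            for i in range(0, len(original_data), group_size)]

groups = split_training_data(list(range(20000)))
first_training_data = groups[0]   # for the first base model to be trained
remaining_groups = groups[1:]     # second training data is drawn from these
```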
15. The method for tuning parameters of a model according to claim 1, further comprising:
obtaining first performance information corresponding to the first target model, and obtaining second performance information corresponding to the target model;
and acquiring performance change information corresponding to the original hyper-parameter combination information according to the first performance information and the second performance information.
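Claim 15 leaves the form of the performance change information open; the simple difference computed below is one assumed formulation.

```python
# Illustrative sketch of claim 15: derive performance-change information for
# the original hyper-parameter combination information from the performance
# of the first target model and of the final target model.

def performance_change_info(first_performance, second_performance):
    return {
        "first": first_performance,    # e.g. accuracy of the first target model
        "second": second_performance,  # e.g. accuracy of the target model
        "change": second_performance - first_performance,
    }
```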
16. The method for tuning parameters of a model according to claim 1, further comprising:
acquiring a trigger operation for starting a parameter adjusting operation;
and in response to the trigger operation, executing the step of obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as the first hyper-parameter combination information.
17. The method for tuning parameters of a model according to claim 10, wherein the original hyper-parameter combination information comprises information of at least one of the following hyper-parameter combinations: a hyper-parameter combination configured by the user for the initial base model to be trained, and a hyper-parameter combination obtained from historical hyper-parameter combinations corresponding to the initial base model to be trained.
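One assumed way to assemble the original hyper-parameter combination information from the two sources named in claim 17 is sketched below; user_config and history are illustrative placeholders.

```python
# Illustrative sketch of claim 17: merge the user-configured combination with
# combinations taken from the history of the initial base model to be trained.

def build_original_combinations(user_config=None, history=()):
    combos = []
    if user_config is not None:
        combos.append(user_config)  # combination configured by the user
    combos.extend(history)          # historical hyper-parameter combinations
    return combos
```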
18. A method for obtaining a speech recognition model, comprising:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model;
training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model;
if the first target speech recognition model does not include a target speech recognition model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model;
and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
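The iterative structure of claim 18 can be sketched as a loop; train, evaluate, and meets_condition stand in for logic the claim leaves open, and reusing the trained model as the next model to be trained is one assumed reading of the claim.

```python
# Illustrative sketch of claim 18 (one assumed reading, not the definitive
# method): draw a combination, train, and if no trained model meets the
# performance condition, derive the next combination and the next model to
# be trained from the results so far.

import random

def obtain_target_model(original_combinations, initial_model,
                        train, evaluate, meets_condition):
    combos = list(original_combinations)
    model_to_train = initial_model
    combo = random.choice(combos)              # "any group" of combinations
    while True:
        model = train(model_to_train, combo)   # first/second target model
        if meets_condition(evaluate(model)):
            return model                       # target model found
        combos.remove(combo)
        if not combos:
            return None                        # search space exhausted
        combo = random.choice(combos)          # second combination info
        model_to_train = model                 # second model to be trained
```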
19. A speech recognition method, comprising:
acquiring speech information to be recognized;
inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is a model obtained by the method for obtaining a speech recognition model according to claim 18.
20. The speech recognition method of claim 19, wherein the method is applied to a server, and further comprising:
and providing the target recognition information to a client.
21. The speech recognition method of claim 20, further comprising:
acquiring service content information corresponding to the target recognition information;
and providing the service content information to the client.
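A minimal sketch of the server-side flow of claims 19 to 21 follows; the recognize() interface, the content table, and the response payload are assumptions of the sketch, not recited features.

```python
# Illustrative sketch of claims 19-21: recognize speech with the target
# model, provide the recognition information to the client, and look up the
# corresponding service content.

SERVICE_CONTENT = {  # assumed mapping from recognition text to content
    "play music": "Starting your playlist.",
    "navigate home": "Routing to your home address.",
}

def handle_speech_request(audio_bytes, target_model):
    # target_model.recognize is an assumed interface of the trained model.
    recognition = target_model.recognize(audio_bytes)
    content = SERVICE_CONTENT.get(recognition)
    return {"recognition": recognition, "service_content": content}
```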
22. The speech recognition method of claim 19, wherein the method is applied to a client, and further comprising:
acquiring the target recognition information;
and displaying or playing the target recognition information.
23. The speech recognition method of claim 22, further comprising:
acquiring service content information corresponding to the target recognition information;
and displaying or playing the service content information.
24. The speech recognition method of claim 22, wherein the client comprises a computing device that provides nearby services through edge computing.
25. The speech recognition method of claim 24, wherein the computing device comprises at least one of: a smart speaker device, a vehicle navigation device, and a translation device.
26. A parameter adjusting device for a model, comprising:
a first information obtaining unit, configured to obtain any one group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information;
a first target model obtaining unit, configured to train a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model;
a second information obtaining unit, configured to determine whether a target model meeting a preset performance condition exists in the first target model, and if not, obtain second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model;
and a target model obtaining unit, configured to obtain the target model according to the second hyper-parameter combination information and the second base model to be trained.
27. An electronic device, comprising:
a processor;
a memory for storing a program of the parameter adjusting method for a model, wherein after the device is powered on and the program of the parameter adjusting method for a model is run by the processor, the following steps are performed:
obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information;
training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model;
if the first target model does not include a target model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model;
and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained.
28. A storage device storing a program of the parameter adjusting method for a model, wherein the program, when executed by a processor, performs the following steps:
obtaining any group of hyper-parameter combination information from the original hyper-parameter combination information as first hyper-parameter combination information;
training a first base model to be trained according to the first hyper-parameter combination information to obtain a first target model;
if the first target model does not include a target model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second base model to be trained according to the original hyper-parameter combination information and the first target model;
and obtaining the target model according to the second hyper-parameter combination information and the second base model to be trained.
29. An apparatus for obtaining a speech recognition model, comprising:
a first speech information obtaining unit, configured to obtain any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model;
a first target speech recognition model obtaining unit, configured to train a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model;
a second speech information obtaining unit, configured to determine whether a target speech recognition model meeting a preset performance condition exists in the first target speech recognition model, and if not, obtain second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model;
and a target speech recognition model obtaining unit, configured to obtain the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
30. An electronic device, comprising:
a processor;
a memory for storing a program of the method for obtaining a speech recognition model, wherein after the device is powered on and the program of the method for obtaining a speech recognition model is run by the processor, the following steps are performed:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model;
training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model;
if the first target speech recognition model does not include a target speech recognition model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model;
and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
31. A storage device storing a program of the method for obtaining a speech recognition model, wherein the program, when executed by a processor, performs the following steps:
obtaining any group of hyper-parameter combination information as first hyper-parameter combination information from original hyper-parameter combination information corresponding to the speech recognition model;
training a first speech recognition model to be trained according to the first hyper-parameter combination information to obtain a first target speech recognition model;
if the first target speech recognition model does not include a target speech recognition model meeting preset performance conditions, acquiring second hyper-parameter combination information and a second speech recognition model to be trained according to the original hyper-parameter combination information and the first target speech recognition model;
and obtaining the target speech recognition model according to the second hyper-parameter combination information and the second speech recognition model to be trained.
32. A speech recognition apparatus, comprising:
an acquisition unit, configured to acquire speech information to be recognized;
a recognition unit, configured to input the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is a model obtained by the method for obtaining a speech recognition model according to claim 18.
33. An electronic device, comprising:
a processor;
a memory for storing a program of the speech recognition method, wherein after the device is powered on and the program of the speech recognition method is run by the processor, the following steps are performed:
acquiring speech information to be recognized;
inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is a model obtained by the method for obtaining a speech recognition model according to claim 18.
34. A storage device storing a program of a speech recognition method, wherein the program, when executed by a processor, performs the following steps:
acquiring speech information to be recognized;
inputting the speech information to be recognized into a target speech recognition model to obtain target recognition information corresponding to the speech information to be recognized, wherein the target speech recognition model is a model obtained by the method for obtaining a speech recognition model according to claim 18.
CN202010307414.4A 2020-04-17 2020-04-17 Parameter adjusting method and device for model Pending CN113555008A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010307414.4A CN113555008A (en) 2020-04-17 2020-04-17 Parameter adjusting method and device for model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010307414.4A CN113555008A (en) 2020-04-17 2020-04-17 Parameter adjusting method and device for model

Publications (1)

Publication Number Publication Date
CN113555008A true CN113555008A (en) 2021-10-26

Family

ID=78100909

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010307414.4A Pending CN113555008A (en) 2020-04-17 2020-04-17 Parameter adjusting method and device for model

Country Status (1)

Country Link
CN (1) CN113555008A (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992888A (en) * 2017-11-29 2018-05-04 深圳市智物联网络有限公司 The recognition methods of operation of industrial installation and server
US20200043468A1 (en) * 2018-07-31 2020-02-06 Nuance Communications, Inc. System and method for performing automatic speech recognition system parameter adjustment via machine learning
CN109242001A (en) * 2018-08-09 2019-01-18 百度在线网络技术(北京)有限公司 Image processing method, device and readable storage medium storing program for executing
CN109242105A (en) * 2018-08-17 2019-01-18 第四范式(北京)技术有限公司 Tuning method, apparatus, equipment and the medium of hyper parameter in machine learning model
EP3620996A1 (en) * 2018-09-04 2020-03-11 Siemens Aktiengesellschaft Transfer learning of a machine-learning model using a hyperparameter response model
CN109213805A (en) * 2018-09-07 2019-01-15 东软集团股份有限公司 A kind of method and device of implementation model optimization
CN110163367A (en) * 2018-09-29 2019-08-23 腾讯科技(深圳)有限公司 A kind of model compression method and apparatus
WO2020073694A1 (en) * 2018-10-10 2020-04-16 腾讯科技(深圳)有限公司 Voiceprint identification method, model training method and server
CN109657805A (en) * 2018-12-07 2019-04-19 泰康保险集团股份有限公司 Hyper parameter determines method, apparatus, electronic equipment and computer-readable medium
CN110188862A (en) * 2019-04-12 2019-08-30 北京迈格威科技有限公司 Searching method, the device, system of model hyper parameter for data processing
CN110110861A (en) * 2019-05-09 2019-08-09 北京市商汤科技开发有限公司 Determine method and apparatus, the storage medium of model hyper parameter and model training
KR20190096311A (en) * 2019-06-04 2019-08-19 엘지전자 주식회사 A device for generating a temperature prediction model and a method for providing a simulation environment
CN110443126A (en) * 2019-06-27 2019-11-12 平安科技(深圳)有限公司 Model hyper parameter adjusts control method, device, computer equipment and storage medium
CN110472778A (en) * 2019-07-29 2019-11-19 上海电力大学 A kind of short-term load forecasting method based on Blending integrated study
CN110766090A (en) * 2019-10-30 2020-02-07 腾讯科技(深圳)有限公司 Model training method, device, equipment, system and storage medium
CN110942090A (en) * 2019-11-11 2020-03-31 北京迈格威科技有限公司 Model training method, image processing method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination