CN109754068A - Transfer learning method and terminal device based on deep learning pre-training model


Info

Publication number
CN109754068A
Authority
CN
China
Prior art keywords
training
data
model
data set
new
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811473650.2A
Other languages
Chinese (zh)
Inventor
许国杰
刘川
吴又奎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Hengyun Co Ltd
Original Assignee
Zhongke Hengyun Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Hengyun Co Ltd filed Critical Zhongke Hengyun Co Ltd
Priority to CN201811473650.2A priority Critical patent/CN109754068A/en
Publication of CN109754068A publication Critical patent/CN109754068A/en
Pending legal-status Critical Current

Landscapes

  • Electrically Operated Instructional Devices (AREA)

Abstract

The present invention belongs to the field of model construction techniques and provides a transfer learning method and terminal device based on a deep learning pre-training model. The method comprises: dividing a data set into a training data set and a test data set; retraining an acquired pre-training model according to the data in the training data set to obtain a new model; detecting the generalization performance of the new model according to the data in the test data set to obtain a detection result; and, when the detection result reaches a pre-set level value, determining that the new model is a model that satisfies the application. This scheme can solve the prior-art problems that, when facing a particular problem in a certain field, data of the scale needed to build a model is often unavailable, and that building a new model is time-consuming and resource-intensive.

Description

Transfer learning method and terminal device based on deep learning pre-training model
Technical field
The invention belongs to the field of model construction techniques and more particularly relates to a transfer learning method and terminal device based on a deep learning pre-training model.
Background technique
Under the traditional machine learning framework, the learning task is to construct a new model on the basis of given, sufficient training data. However, current machine learning research faces the following key difficulties:
1. Large-scale data are needed to train a new model, and large-scale data for a specific field are often difficult to obtain.
2. Training is time-consuming. A deep learning model is a large-scale neural network with many layers, so the time spent on training is long; the more complex the neural network and the more data there are, the more time the training process requires.
3. Training consumes resources. A neural network usually requires a large number of labeled samples, and the large amount of data together with the responses of each layer of the network consumes a large amount of memory. In addition, traditional machine learning usually assumes that the training data and the test data obey the same data distribution. In many cases, however, this same-distribution assumption does not hold: for example, the training data may be outdated, which forces us to label a large amount of new training data to meet the needs of training, and labeling new data requires considerable manpower and material resources. Even when we possess a large amount of training data under a different distribution, discarding those data entirely is also very wasteful.
Summary of the invention
In view of this, embodiments of the present invention provide a transfer learning method based on a deep learning pre-training model and a terminal device, which can solve the prior-art problems that, when facing a particular problem in a certain field, data of the scale needed to build a model is often unavailable, and that building a new model is time-consuming and resource-intensive.
A first aspect of the embodiments of the present invention provides a transfer learning method based on a deep learning pre-training model, comprising:
dividing a data set into a training data set and a test data set;
retraining an acquired pre-training model according to the data in the training data set to obtain a new model;
detecting the generalization performance of the new model according to the data in the test data set to obtain a detection result;
when the detection result reaches a pre-set level value, determining that the new model is a model that satisfies the application.
In one embodiment, the training data set and the test data set are two mutually exclusive data sets with consistent data distributions.
In one embodiment, the test data set is obtained by sampling from the data set by way of stratified sampling, and the data in the data set other than the test data set form the training data set.
In one embodiment, the training data set contains more data than the test data set.
In one embodiment, the proportion of the data of the data set accounted for by the data in the training data set lies in the interval [2/3, 4/5].
In one embodiment, dividing the data set into a training data set and a test data set comprises:
performing N random divisions of the data set to obtain N groups of training data sets and corresponding test data sets, where N is greater than or equal to 1.
In one embodiment, retraining the acquired pre-training model according to the data in the training data set to obtain a new model comprises:
retraining, according to the data in the training data set, the training weights in the rear-end levels of the acquired pre-training model to obtain new weights;
adjusting, according to the data in the training data set, the parameters in the rear-end levels of the pre-training model to obtain new parameters;
obtaining the new model from the training weights that remain unchanged in the front-end levels of the pre-training model, the new weights, and the new parameters.
A second aspect of the embodiments of the present invention provides a transfer learning device based on a deep learning pre-training model, comprising:
a division module, configured to divide a data set into a training data set and a test data set;
a training module, configured to retrain an acquired pre-training model according to the data in the training data set to obtain a new model;
a test module, configured to detect the generalization performance of the new model according to the data in the test data set to obtain a detection result;
a determining module, configured to determine, when the detection result reaches a pre-set level value, that the new model is a model that satisfies the application.
A third aspect of the embodiments of the present invention provides a terminal device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the transfer learning method based on a deep learning pre-training model.
A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the transfer learning method based on a deep learning pre-training model are implemented.
Compared with the prior art, the embodiments of the present invention have the following beneficial effects: the provided scheme obtains a new model by adjusting part of the layers of a deep learning pre-training model according to the data of a new field, and then assesses and adjusts the new model so that it can be applied to a practical problem. This solves the prior-art problems that a new model must be learned on the basis of given, sufficient training data and that, when the needed new model is a large-scale neural network, training is time-consuming and resource-intensive.
Detailed description of the invention
In order to describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings needed for the embodiments or the description of the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from these drawings without any creative labor.
Fig. 1 is a schematic flowchart of a transfer learning method based on a deep learning pre-training model provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another transfer learning method based on a deep learning pre-training model provided by an embodiment of the present invention;
Fig. 3 is an example diagram of a transfer learning device based on a deep learning pre-training model provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of a terminal device provided by an embodiment of the present invention.
Specific embodiment
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present invention. However, it will be clear to those skilled in the art that the present invention may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted, lest unnecessary details interfere with the description of the present invention.
In order to illustrate the technical solutions of the present invention, specific embodiments are described below.
An embodiment of the present invention provides a transfer learning method based on a deep learning pre-training model. As shown in Fig. 1, the method comprises the following steps.
Step 101: divide a data set into a training data set and a test data set.
Optionally, the training data set and the test data set are two mutually exclusive data sets with consistent data distributions. For example, the data set D is divided into two mutually exclusive data sets with consistent distributions, one being the training data set S and the other the test data set T, such that the union of S and T is D and the intersection of S and T is empty.
Optionally, keeping the data distributions of S and T consistent avoids introducing additional bias through the data-division process, which would influence the final result.
Further, in order to guarantee the consistency of the data distribution, this application samples the data by way of stratified sampling. Specifically, the test data set is obtained by sampling from the data set by way of stratified sampling, and the data in the data set other than the test data set form the training data set. For example, suppose the data set D contains m1 positive samples and m2 negative samples, and that S accounts for a proportion p of D, so that T accounts for a proportion (1 - p). Then m1 * p samples can be drawn from the m1 positive samples as the positive samples of the training data set, m2 * p samples can be drawn from the m2 negative samples as the negative samples of the training data set, and the remaining samples form the test data set.
Optionally, the training data set contains more data than the test data set.
When the data set D is divided, if the training data set S contains many samples and is close to D, the newly trained model may be very close to the model that would be trained on D itself, but T is then small, which may make the assessment result insufficiently accurate and stable; if S contains few samples, the newly trained model may differ greatly from the model that would be trained on D. Therefore, specifically, the proportion of the data of the data set accounted for by the training data set lies in the interval [2/3, 4/5], so that the proportion accounted for by the test data set lies in the interval [1/5, 1/3].
Further, when dividing the data set, the data set can be subjected to N random divisions to obtain N groups of training data sets and corresponding test data sets, where N is greater than or equal to 1. In this way the N detection results obtained can be averaged as the final assessment result, which is more accurate and better suits rapid application in different fields. A sketch of such a division is given below.
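By way of illustration only, the following Python sketch realizes this division step under the assumption that scikit-learn is used; the application does not prescribe any particular library, and the features, labels, and parameter values are hypothetical placeholders.

    # A minimal sketch of step 101, assuming scikit-learn. StratifiedShuffleSplit
    # keeps the positive/negative ratio of the data set D in both S and T
    # (stratified sampling), and n_splits realizes the N random divisions.
    import numpy as np
    from sklearn.model_selection import StratifiedShuffleSplit

    X = np.random.rand(1000, 32)       # placeholder features for data set D
    y = np.random.randint(0, 2, 1000)  # placeholder positive/negative labels

    splitter = StratifiedShuffleSplit(
        n_splits=5,        # N random divisions (N >= 1)
        train_size=4 / 5,  # training proportion within [2/3, 4/5]
        random_state=0,
    )
    splits = []
    for train_idx, test_idx in splitter.split(X, y):
        # S (training data set) and T (test data set) are mutually exclusive
        # and both preserve the class distribution of D.
        splits.append((train_idx, test_idx))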
Step 102: retrain the acquired pre-training model according to the data in the training data set to obtain a new model.
Optionally, as shown in Fig. 2, this step comprises the following sub-steps.
Step 1021: retrain, according to the data in the training data set, the training weights in the rear-end levels of the pre-training model to obtain new weights.
Optionally, before this step, the method may further comprise: acquiring a pre-training model, the pre-training model comprising training weights.
In deep learning, computing resources are often limited or the training data set is small, yet good and stable results are still desired. In that case, already-trained models, i.e. pre-training models, can be acquired first, and a new model can be obtained by directly retraining a pre-training model rather than training a new model from scratch, which saves a large amount of manpower and material resources.
The source model of a pre-training model is selected from available models; many research institutions have released models trained on very large data sets, and all of these can serve as candidate source models. The pre-training model acquired in this scheme is a pre-training model with training weights.
Further optionally, deep learning continuously adjusts parameters through forward computation and backpropagation so as to extract optimal features and achieve the purpose of prediction. The front-end levels of a model are commonly used to capture high-level structure in the input data, such as image edges and main bodies; the rear-end levels are commonly used to capture the information that helps make the final decision, such as the detailed information for distinguishing the target output.
After the pre-training model is acquired, the whole model does not need to be retrained; it is sufficient to train only some of its layers. The weights of some initial layers of the model are kept unchanged, and the subsequent layers are retrained to obtain new weights. That is, according to the data in the training data set, the training weights in the rear-end levels of the acquired pre-training model are retrained to obtain new weights.
During model adjustment, multiple attempts can be made: the pre-training model is adjusted according to the N different groups of training data, so that the best allocation between frozen layers and retrained layers can be found according to the results. A minimal sketch of this sub-step follows.
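The sketch below, by way of illustration only, assumes PyTorch and a torchvision ResNet-18 as the pre-training model; neither choice, nor the number of output classes, is prescribed by the application.

    # A minimal sketch of step 1021, assuming PyTorch and a torchvision
    # ResNet-18 with trained weights as the pre-training model.
    import torch
    import torch.nn as nn
    from torchvision import models

    model = models.resnet18(pretrained=True)  # pre-training model with weights

    # Freeze the front-end levels: their training weights remain unchanged.
    for param in model.parameters():
        param.requires_grad = False

    # Replace the rear-end level for the new task (10 classes is a
    # hypothetical number chosen for illustration).
    model.fc = nn.Linear(model.fc.in_features, 10)

    # Only the rear-end parameters are handed to the optimizer, so only the
    # retrained layers obtain new weights.
    optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3, momentum=0.9)
    criterion = nn.CrossEntropyLoss()

    def retrain_step(inputs, labels):
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()   # backpropagation
        optimizer.step()  # adjust the rear-end weights
        return loss.item()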
Step 1022: adjust, according to the data in the training data set, the parameters in the rear-end levels of the pre-training model to obtain new parameters.
Optionally, by fine-tuning the parameters of the pre-training model, the trained model is applied to a similar task, or to a different task with only nuanced differences.
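Continuing the sketch above, and again by way of illustration only, such fine-tuning can be realized by unfreezing a rear-end level and adjusting it with a smaller learning rate; the layer names and learning rates are illustrative assumptions rather than values prescribed by the application.

    # A minimal sketch of step 1022 under the same PyTorch/ResNet-18
    # assumptions: a rear-end level (layer4) is unfrozen and fine-tuned with
    # a smaller learning rate than the new output layer, so its pre-trained
    # parameters are adjusted rather than relearned from scratch.
    for param in model.layer4.parameters():
        param.requires_grad = True

    fine_tune_optimizer = torch.optim.SGD(
        [
            {"params": model.layer4.parameters(), "lr": 1e-4},  # gentle adjustment
            {"params": model.fc.parameters(), "lr": 1e-3},      # new rear layer
        ],
        momentum=0.9,
    )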
Step 1023: obtain the new model from the training weights that remain unchanged in the front-end levels of the pre-training model, the new weights, and the new parameters.
It should be understood that applying a new model is a process of loop iteration: only through continuous adjustment and tuning of the model can it adapt to online data and business objectives, and only then can the most effective new model be found.
Step 103: detect the generalization performance of the new model according to the data in the test data set to obtain a detection result.
After the new model is obtained, its performance needs to be assessed. This application detects the generalization performance of the new model so as to obtain an assessment of the new model.
Generalization performance, i.e. generalization ability, refers to the adaptability of a machine learning algorithm to fresh samples. The goal of learning is to discover the rules that lie behind the data, so that for data outside the training set that obey the same rules, the trained network can also give suitable outputs; this capability is called generalization ability.
Optionally, in order to obtain a more accurate detection result, the generalization performance of the new model is detected with the data in the N groups of test data sets, N detection results are obtained, and the average of the N detection results is taken as the final detection result, as sketched below.
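The following sketch, by way of illustration only and under the same PyTorch assumptions, detects the accuracy of the new model on each of the N test data sets and averages the results; accuracy is one possible detection metric, not one prescribed by the application.

    # A minimal sketch of step 103, assuming PyTorch and one data loader per
    # test data set T_i. The accuracy on each of the N test data sets is
    # detected and the mean is taken as the final detection result.
    import torch

    @torch.no_grad()
    def detect_generalization(model, test_loaders):
        model.eval()
        results = []
        for loader in test_loaders:  # one loader per test data set
            correct = total = 0
            for inputs, labels in loader:
                preds = model(inputs).argmax(dim=1)
                correct += (preds == labels).sum().item()
                total += labels.numel()
            results.append(correct / total)
        return sum(results) / len(results)  # average of the N detection results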
Step 104: when the detection result reaches the pre-set level value, determine that the new model is a model that satisfies the application.
When the new model reaches the set pre-set level value, it can be brought online and put into production, yielding a transfer-learning model for use by enterprises and users.
Optionally, as shown in Fig. 2, before this step the method further comprises the following steps.
Step 105: detect whether the detection result reaches the pre-set level value.
Step 106: when the detection result does not reach the pre-set level value, determine that the new model does not satisfy the application and continue to adjust the model, i.e. execute step 1021, until the assessment result satisfies the application.
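The loop formed by steps 104 to 106 can be sketched as follows, by way of illustration only, continuing the sketches above; the pre-set level value of 0.9 and the training data loader are hypothetical placeholders.

    # A minimal sketch of steps 104-106, reusing retrain_step and
    # detect_generalization from the sketches above. PRESET_LEVEL and
    # train_loader are hypothetical placeholders.
    PRESET_LEVEL = 0.9

    result = detect_generalization(model, test_loaders)
    while result < PRESET_LEVEL:
        # Step 106: the new model does not yet satisfy the application,
        # so return to step 1021 and keep adjusting the rear-end levels.
        model.train()
        for inputs, labels in train_loader:
            retrain_step(inputs, labels)
        result = detect_generalization(model, test_loaders)
    # Step 104: the detection result reaches the pre-set level value, so the
    # new model satisfies the application.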
The transfer learning method based on a deep learning pre-training model provided by the embodiments of the present invention divides a data set into mutually exclusive training and test data sets with consistent distributions; retrains an acquired pre-training model according to the data in the training data set to obtain a new model; detects the generalization performance of the new model according to the data in the test data set to obtain a detection result; and, when the detection result reaches a pre-set level value, determines that the new model is a model that satisfies the application. This can solve the prior-art problems that, when facing a particular problem in a certain field, the large-scale data needed to build a model is often unavailable, and that when a new model must be learned on the basis of given, sufficient training data and that model is a large-scale neural network, training is time-consuming and resource-intensive. By retraining part of the layers of a pre-training model through transfer learning for a certain category of data, the relationships captured in the obtained new model can also be easily applied to different problems in the same field.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic and should not constitute any limitation on the implementation process of the embodiments of the present invention.
An embodiment of the present invention provides a transfer learning device based on a deep learning pre-training model. As shown in Fig. 3, the device comprises a division module 301, a training module 302, a test module 303, and a determining module 304.
The division module 301 is configured to divide a data set into a training data set and a test data set.
Optionally, as described above for step 101, the training data set and the test data set are two mutually exclusive data sets with consistent data distributions: the test data set is obtained from the data set by stratified sampling, the remaining data form the training data set, the training data set contains more data than the test data set and accounts for a proportion of the data set in the interval [2/3, 4/5], and the data set can be subjected to N random divisions (N greater than or equal to 1) to obtain N groups of training data sets and corresponding test data sets, whose N detection results can be averaged as the final assessment result.
The training module 302 is configured to retrain the acquired pre-training model according to the data in the training data set to obtain a new model.
Optionally, the training module 302 is specifically configured to: retrain, according to the data in the training data set, the training weights in the rear-end levels of the acquired pre-training model to obtain new weights; adjust, according to the data in the training data set, the parameters in the rear-end levels of the pre-training model to obtain new parameters; and obtain the new model from the training weights that remain unchanged in the front-end levels of the pre-training model, the new weights, and the new parameters.
As described above for step 102, a pre-training model with training weights is selected from the available models released by research institutions and does not need to be retrained in its entirety: the weights of its initial layers are kept unchanged while the subsequent layers are retrained to obtain new weights, and the pre-training model can be adjusted according to the N different groups of training data so as to find the best allocation between frozen layers and retrained layers.
The test module 303 is configured to detect the generalization performance of the new model according to the data in the test data set to obtain a detection result.
Optionally, in order to obtain a more accurate detection result, the test module 303 detects the generalization performance of the new model with the data in the N groups of test data sets, obtains N detection results, and takes the average of the N detection results as the final detection result.
The determining module 304 is configured to determine, when the detection result reaches the pre-set level value, that the new model is a model that satisfies the application.
When the detection result does not reach the pre-set level value, it is determined that the new model does not satisfy the application, and the training module 302 continues to adjust the model until the assessment result of the obtained new model satisfies the application.
In the transfer learning device based on a deep learning pre-training model provided by this embodiment, the training module adjusts a deep learning pre-training model according to the data of a new field through transfer learning to obtain a new model, and the test module then assesses the obtained new model, so that the new model determined by the determining module to satisfy the application can also be easily applied to different problems in the same field.
Fig. 4 is a schematic diagram of the terminal device provided by an embodiment of the present invention. As shown in Fig. 4, the terminal device 4 of this embodiment comprises a processor 401, a memory 402, and a computer program 403 stored in the memory 402 and runnable on the processor 401, such as a transfer learning program based on a deep learning pre-training model. When executing the computer program 403, the processor 401 implements the steps in the above embodiments of the transfer learning method based on a deep learning pre-training model, such as steps 101 to 104 shown in Fig. 1 or steps 101 to 106 shown in Fig. 2, and realizes the functions of the modules in the above device embodiments, such as the functions of modules 301 to 304 shown in Fig. 3.
Illustratively, the computer program 403 may be divided into one or more modules that are stored in the memory 402 and executed by the processor 401 to carry out the present invention. The one or more modules may be a series of computer program instruction segments capable of completing specific functions, the instruction segments describing the execution process of the computer program 403 in the transfer learning device based on a deep learning pre-training model or in the terminal device 4. For example, the computer program 403 may be divided into the division module 301, the training module 302, the test module 303, and the determining module 304; the specific functions of these modules are shown in Fig. 3 and are not repeated here.
The terminal device 4 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 401 and the memory 402. Those skilled in the art will understand that Fig. 4 is only an example of the terminal device 4 and does not constitute a limitation on it; the terminal device may include more or fewer components than illustrated, combine certain components, or use different components, and may, for example, also include input/output devices, network access devices, buses, and so on.
The processor 401 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 402 may be an internal storage unit of the terminal device 4, such as a hard disk or memory of the terminal device 4. The memory 402 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the terminal device 4. Further, the memory 402 may include both an internal storage unit and an external storage device of the terminal device 4. The memory 402 is used to store the computer program and other programs and data required by the terminal device 4, and may also be used to temporarily store data that has been output or is to be output.
It will be clear to those skilled in the art that, for convenience and brevity of description, the division into the above functional units and modules is only used as an example; in practical applications, the above functions may be allocated to different functional units and modules as needed, i.e. the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above. The specific names of the functional units and modules are only for the convenience of distinguishing them from each other and are not intended to limit the protection scope of this application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts that are not detailed or recorded in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are only schematic: the division into modules or units is only a logical functional division, and other division manners are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purposes of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the present invention may implement all or part of the processes in the methods of the above embodiments by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can realize the steps of each of the above method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, etc. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be added to or subtracted from as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present invention rather than to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments can still be modified, or some of the technical features therein can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and should all be included within the protection scope of the present invention.

Claims (10)

1. A transfer learning method based on a deep learning pre-training model, characterized by comprising:
dividing a data set into a training data set and a test data set;
retraining an acquired pre-training model according to the data in the training data set to obtain a new model;
detecting the generalization performance of the new model according to the data in the test data set to obtain a detection result;
when the detection result reaches a pre-set level value, determining that the new model is a model that satisfies the application.
2. The transfer learning method based on a deep learning pre-training model according to claim 1, characterized in that the training data set and the test data set are two mutually exclusive data sets with consistent data distributions.
3. The transfer learning method based on a deep learning pre-training model according to claim 2, characterized in that the test data set is obtained by sampling from the data set by way of stratified sampling, and the data in the data set other than the test data set form the training data set.
4. The transfer learning method based on a deep learning pre-training model according to claim 3, characterized in that the training data set contains more data than the test data set.
5. The transfer learning method based on a deep learning pre-training model according to claim 4, characterized in that the proportion of the data of the data set accounted for by the data in the training data set lies in the interval [2/3, 4/5].
6. The transfer learning method based on a deep learning pre-training model according to any one of claims 1 to 5, characterized in that dividing the data set into a training data set and a test data set comprises:
performing N random divisions of the data set to obtain N groups of training data sets and corresponding test data sets, where N is greater than or equal to 1.
7. The transfer learning method based on a deep learning pre-training model according to claim 6, characterized in that retraining the acquired pre-training model according to the data in the training data set to obtain a new model comprises:
retraining, according to the data in the training data set, the training weights in the rear-end levels of the acquired pre-training model to obtain new weights;
adjusting, according to the data in the training data set, the parameters in the rear-end levels of the pre-training model to obtain new parameters;
obtaining the new model from the training weights that remain unchanged in the front-end levels of the pre-training model, the new weights, and the new parameters.
8. A transfer learning device based on a deep learning pre-training model, characterized by comprising:
a division module, configured to divide a data set into a training data set and a test data set;
a training module, configured to retrain an acquired pre-training model according to the data in the training data set to obtain a new model;
a test module, configured to detect the generalization performance of the new model according to the data in the test data set to obtain a detection result;
a determining module, configured to determine, when the detection result reaches a pre-set level value, that the new model is a model that satisfies the application.
9. A terminal device comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that, when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 7 are implemented.
CN201811473650.2A 2018-12-04 2018-12-04 Transfer learning method and terminal device based on deep learning pre-training model Pending CN109754068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811473650.2A CN109754068A (en) 2018-12-04 2018-12-04 Transfer learning method and terminal device based on deep learning pre-training model


Publications (1)

Publication Number Publication Date
CN109754068A true CN109754068A (en) 2019-05-14

Family

ID=66403533

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811473650.2A Pending CN109754068A (en) 2018-12-04 2018-12-04 Transfer learning method and terminal device based on deep learning pre-training model

Country Status (1)

Country Link
CN (1) CN109754068A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108399420A (en) * 2018-01-30 2018-08-14 北京理工雷科电子信息技术有限公司 A kind of visible light naval vessel false-alarm elimination method based on depth convolutional network
CN108805137A (en) * 2018-04-17 2018-11-13 平安科技(深圳)有限公司 Extracting method, device, computer equipment and the storage medium of livestock feature vector
CN108875590A (en) * 2018-05-25 2018-11-23 平安科技(深圳)有限公司 BMI prediction technique, device, computer equipment and storage medium

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110598737A (en) * 2019-08-06 2019-12-20 深圳大学 Online learning method, device, equipment and medium of deep learning model
CN112434717A (en) * 2019-08-26 2021-03-02 杭州海康威视数字技术股份有限公司 Model training method and device
CN112434717B (en) * 2019-08-26 2024-03-08 杭州海康威视数字技术股份有限公司 Model training method and device
CN110532314A (en) * 2019-08-30 2019-12-03 国家电网有限公司 The method and terminal device of High-Voltage Electrical Appliances quality testing
CN110688288A (en) * 2019-09-09 2020-01-14 平安普惠企业管理有限公司 Automatic testing method, device, equipment and storage medium based on artificial intelligence
CN110688288B (en) * 2019-09-09 2023-11-07 新疆北斗同创信息科技有限公司 Automatic test method, device, equipment and storage medium based on artificial intelligence
CN110929877A (en) * 2019-10-18 2020-03-27 平安科技(深圳)有限公司 Model establishing method, device, equipment and storage medium based on transfer learning
CN110929877B (en) * 2019-10-18 2023-09-15 平安科技(深圳)有限公司 Model building method, device, equipment and storage medium based on transfer learning
CN111191558A (en) * 2019-12-25 2020-05-22 深圳市优必选科技股份有限公司 Robot and face recognition teaching method and storage medium thereof
CN111191558B (en) * 2019-12-25 2024-02-02 深圳市优必选科技股份有限公司 Robot and face recognition teaching method and storage medium thereof
CN113127614A (en) * 2020-01-16 2021-07-16 微软技术许可有限责任公司 Providing QA training data and training QA model based on implicit relevance feedback
CN111898650A (en) * 2020-07-08 2020-11-06 国网浙江省电力有限公司杭州供电公司 Marketing and distribution data automatic clustering analysis equipment and method based on deep learning
CN112712213A (en) * 2021-01-15 2021-04-27 上海交通大学 Method and system for predicting energy consumption of deep migration learning of centralized air-conditioning house
CN112712213B (en) * 2021-01-15 2023-07-04 上海交通大学 Method and system for predicting deep migration learning energy consumption of concentrated air conditioning house
CN113094994A (en) * 2021-04-12 2021-07-09 上海电享信息科技有限公司 Power battery prediction method based on big data migration learning
CN114121161B (en) * 2021-06-04 2022-08-05 深圳太力生物技术有限责任公司 Culture medium formula development method and system based on transfer learning
CN114121161A (en) * 2021-06-04 2022-03-01 东莞太力生物工程有限公司 Culture medium formula development method and system based on transfer learning
WO2024016637A1 (en) * 2022-07-22 2024-01-25 中控技术股份有限公司 Method for constructing parameter setting model and industrial process control method
CN115114863A (en) * 2022-08-23 2022-09-27 苏州清研精准汽车科技有限公司 Battery pack Y capacitance prediction method and device, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination