CN108460455A - Model processing method and device - Google Patents

Model processing method and device

Info

Publication number
CN108460455A
Authority
CN
China
Prior art keywords
model
training
target
network model
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810103695.4A
Other languages
Chinese (zh)
Inventor
张翀 (Zhang Chong)
黄鹏 (Huang Peng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chengdu Xiaoduo Tech Co Ltd
Original Assignee
Chengdu Xiaoduo Tech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Xiaoduo Tech Co Ltd
Priority to CN201810103695.4A
Publication of CN108460455A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G06F 16/355 - Class or cluster creation or modification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

An embodiment of the present invention provides a model processing method and device. The model processing method includes: training an initial network model with multiple groups of data to obtain a pre-trained model, where the initial network model includes a target structure formed of multiple layers and a collocation structure formed of multiple layers, and the pre-trained model is the network model corresponding to the target structure after the initial network model has been trained; inputting target data into the pre-trained model and computing intermediate output data; and training a target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.

Description

Model processing method and device
Technical field
The present invention relates to the field of data processing, and in particular to a model processing method and device.
Background art
With the development of computer technology, machine learning has been widely applied. Existing approaches mainly train a network model in a loop on a large amount of input data, so that the network model learns to recognize specific features. However, the data volume of each type is large, and every time a new feature needs to be learned, the recognition model must be trained from scratch, which leads to a heavy data-processing load and a long training time.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a model processing method and device.
A model processing method provided in an embodiment of the present invention is applied to an electronic terminal that stores an initial network model and a target network model. The model processing method includes:
training the initial network model with multiple groups of data to obtain a pre-trained model, where the initial network model includes a target structure formed of multiple layers and a collocation structure formed of multiple layers, and the pre-trained model is the network model corresponding to the target structure after the initial network model has been trained;
inputting target data into the pre-trained model and computing intermediate output data; and
training the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.
An embodiment of the present invention also provides a model processing method, applied to an electronic terminal that stores in advance a target network model and a pre-trained model trained with multiple groups of data. The model processing method includes:
inputting target data into the pre-trained model to obtain intermediate output data; and
training the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model, so that a target feature in data to be recognized is recognized by the recognition model.
An embodiment of the present invention also provides a model processing device, applied to an electronic terminal that stores an initial network model and a target network model. The model processing device includes:
a pre-training module, configured to train the initial network model with multiple groups of data to obtain a pre-trained model, where the initial network model includes a target structure formed of multiple layers and a collocation structure formed of multiple layers, and the pre-trained model is the network model corresponding to the target structure after the initial network model has been trained;
a computing module, configured to input target data into the pre-trained model and compute intermediate output data; and
a target training module, configured to train the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.
An embodiment of the present invention also provides a model processing device, applied to an electronic terminal that stores in advance a target network model and a pre-trained model trained with multiple groups of data. The model processing device includes:
a computing module, configured to input target data into the pre-trained model to obtain intermediate output data; and
a target training module, configured to train the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model, so that a target feature in data to be recognized is recognized by the recognition model.
Compared with the prior art, the model processing method and device of the embodiments of the present invention first train a pre-trained model with multiple groups of data in advance. When a recognition model is needed, only the target network model has to be trained to obtain a splice model, and splicing the pre-trained model with the splice model yields the recognition model. Since the complete original model underlying the recognition model does not have to be trained each time, the training workload required to obtain a recognition model is reduced and the efficiency of model training is improved.
To make the above objects, features and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention more clearly, the drawings needed for the embodiments are briefly described below. It should be understood that the following drawings illustrate only certain embodiments of the present invention and therefore should not be regarded as limiting its scope. A person of ordinary skill in the art can derive other related drawings from these drawings without creative effort.
Fig. 1 is a block diagram of an electronic terminal provided by a preferred embodiment of the present invention.
Fig. 2 is a flowchart of a model processing method provided by a preferred embodiment of the present invention.
Fig. 3 is a detailed flowchart of step S101 of the model processing method provided by a preferred embodiment of the present invention.
Fig. 4 is a schematic diagram of the training flow of the pre-trained model in the model processing method provided by a preferred embodiment of the present invention.
Fig. 5 is a flowchart of a model processing method provided by another preferred embodiment of the present invention.
Fig. 6 is a schematic diagram of the intermediate-data output flow in the model processing method provided by a preferred embodiment of the present invention.
Fig. 7 is a schematic diagram of the training flow of the splice model in the model processing method provided by a preferred embodiment of the present invention.
Fig. 8 is a functional block diagram of the model processing device provided by a preferred embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments of the present invention, as generally described and illustrated in the drawings, can be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present invention provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings. Meanwhile, in the description of the present invention, the terms "first", "second" and the like are used only to distinguish descriptions and are not to be understood as indicating or implying relative importance.
As shown in Fig. 1, which is a block diagram of an electronic terminal 100, the electronic terminal 100 includes a model processing device 110, a memory 111, a storage controller 112, a processor 113, a peripheral interface 114, an input/output unit 115 and a display unit 116. A person skilled in the art will appreciate that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the electronic terminal 100. For example, the electronic terminal 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. The electronic terminal 100 described in this embodiment may be a personal computer, a processing server, a mobile electronic device or another computing device with data-processing capability.
The memory 111, the storage controller 112, the processor 113, the peripheral interface 114, the input/output unit 115 and the display unit 116 are electrically connected to one another, directly or indirectly, to enable the transmission or exchange of data. For example, these elements can be electrically connected to one another through one or more communication buses or signal lines. The model processing device 110 includes at least one software function module that can be stored in the memory 111 in the form of software or firmware or solidified in the operating system (OS) of the electronic terminal 100. The processor 113 is configured to execute executable modules stored in the memory, such as the software function modules or computer programs included in the model processing device 110.
The memory 111 may be, but is not limited to, a random access memory (RAM), a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), and the like. The memory 111 is configured to store programs, and the processor 113 executes the programs after receiving an execution instruction. The method performed by the electronic terminal 100 and defined by the flow disclosed in any embodiment of the present invention can be applied in the processor 113 or implemented by the processor 113.
The processor 113 may be an integrated circuit chip with signal-processing capability. The processor 113 may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP) and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. It can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The peripheral interface 114 couples various input/output devices to the processor 113 and the memory 111. In some embodiments, the peripheral interface 114, the processor 113 and the storage controller 112 can be implemented in a single chip. In other examples, they can each be implemented by an independent chip.
The input/output unit 115 is used for the user to input data. The input/output unit 115 may be, but is not limited to, a mouse, a keyboard, and the like.
The display unit 116 provides an interactive interface (for example, a user operation interface) between the electronic terminal 100 and the user, or is used to display image data for the user's reference. In this embodiment, the display unit may be a liquid crystal display or a touch display. If it is a touch display, it may be a capacitive touch screen or a resistive touch screen supporting single-point and multi-point touch operations. Supporting single-point and multi-point touch operations means that the touch display can sense touch operations generated simultaneously at one or more positions on the touch display and hand the sensed touch operations over to the processor for computation and processing.
Referring to Fig. 2, it is a flowchart of the model processing method applied to the electronic terminal shown in Fig. 1, provided by a preferred embodiment of the present invention. The electronic terminal stores an initial network model and a target network model. The specific flow shown in Fig. 2 is described in detail below.
Step S101: train the initial network model with multiple groups of data to obtain a pre-trained model.
In this embodiment, the initial network model includes a target structure formed of multiple layers and a collocation structure formed of multiple layers, and the pre-trained model is the network model corresponding to the target structure after the initial network model has been trained.
In this embodiment, each layer of the initial network model includes parameters to be determined, which are determined by training with the multiple groups of data.
In this embodiment, the multiple groups of data may be data of multiple fields stored in advance in the electronic terminal, for example, weather data, everyday questions and historical-knowledge data.
Step S102: input target data into the pre-trained model and compute intermediate output data.
In this embodiment, step S101 can be executed once in advance and the resulting pre-trained model saved. When the pre-trained model is needed, it is reloaded and the target data is input into it for computation. In this embodiment, the pre-trained model can be used many times, and step S101 does not have to be executed before every execution of step S102. That is, after step S101 has been executed once, the resulting pre-trained model is saved for repeated use, as sketched below.
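By way of illustration only, the following is a minimal sketch of this persist-once, reuse-many-times pattern, assuming a Keras-style API; the file path and the build_and_train_fn helper are hypothetical examples rather than part of the patent:

```python
# A minimal sketch of saving the pre-trained model once (step S101) and
# reloading it for every later use (step S102). Assumes a Keras-style API;
# the path and the build_and_train_fn helper are hypothetical examples.
import os
from tensorflow import keras

PRETRAINED_PATH = "pretrained_model.h5"  # hypothetical storage location

def get_pretrained_model(build_and_train_fn):
    """Return the pre-trained model, training it only if no saved copy exists."""
    if os.path.exists(PRETRAINED_PATH):
        return keras.models.load_model(PRETRAINED_PATH)  # reuse the saved model
    model = build_and_train_fn()    # execute step S101 once
    model.save(PRETRAINED_PATH)     # persist for repeated later use
    return model
```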
In this embodiment, the target data may be sample data collected for a certain field. For example, the target data may be everyday-language sample data, briefing-type data, and so on.
Step S103: train the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.
In the model processing method of this embodiment of the present invention, a pre-trained model is first trained with multiple groups of data in advance. When a recognition model is needed, only the target network model has to be trained to obtain a splice model, and splicing the pre-trained model with the splice model yields the recognition model. Since the complete original model underlying the recognition model does not have to be trained each time, the training workload required to obtain a recognition model is reduced and the efficiency of model training is improved.
In this embodiment, each of the multiple groups of data includes multiple sentences. As shown in Fig. 3, step S101 includes step S1011 and step S1012.
Step S1011: convert each sentence in the multiple groups of data into numerical form to obtain a vector of a specified length.
In this embodiment, the electronic terminal first recognizes the characters in each sentence and converts them into numbers.
In one embodiment, a sentence can be converted into numbers in the following way.
First, the character string is cleaned to remove invalid characters; the invalid characters include special characters and URLs. Second, the digits are normalized: all digits are converted to a designated character, for example '@'. Third, the sentence is segmented into characters and into words. Finally, the character and word features are converted into numbers according to the indices of a dictionary, and a single sample shorter than 35 is padded to a length of 35. The dictionary assigns a unique number to each character or word feature and is the same globally. For example, the open-source word-segmentation library jieba can be used.
This is illustrated below with a specific example sentence.

Example sentence: "hello，今天天气不错，温度20度&_&https://www.***.com" (roughly, "hello, the weather today is pretty good, temperature 20 degrees"). Removing the invalid characters and normalizing the digits in the example sentence yields: "hello今天天气不错温度@度".

Segmenting the sentence by character gives: "hello今天天气不错温度@度" => ['h','e','l','l','o','今','天','天','气','不','错','温','度','度'].

Segmenting the sentence by word gives: "hello今天天气不错温度@度" => ['hello','今天','天天','天气','今天天气','不错','温度','@','度'].

The characters and words after segmentation are looked up in the dictionary to obtain the corresponding numbers. This step produces data of shape (n_sample, 35), which is the training data of the model, where n_sample is the total number of samples after oversampling. In one example, n_sample can be obtained by oversampling to 1000; that is, if the number of samples corresponding to a category is less than 1000, that category is oversampled to 1000. The method is: ratio = (1000 - current sample count) / current sample count, and the available samples are then duplicated at this ratio to make the count up to 1000.

Looking up the indices of the above characters and words in the dictionary yields the vectors: ['h','e','l','l','o','今','天','天','气','不','错','温','度','度'] => [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1695 2793 473 473 440 477 32 32 398 3 181 1538 459 459] and ['hello','今天','天天','天气','今天天气','不错','温度','@','度'] => [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2508 10926 10622 22894 79562 12070 18442 459].
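As a non-limiting sketch of the preprocessing just described, the following assumes Python with the jieba library; the cleaning regular expressions and the unknown-token index 0 are illustrative stand-ins rather than the patent's exact implementation:

```python
# A minimal sketch of the preprocessing above: clean the string, normalize
# digits to '@', segment by character and by word (jieba full mode yields the
# overlapping words such as 今天/天天/天气/今天天气 seen in the example),
# look tokens up in a global dictionary, and left-pad each vector to 35.
import re
import jieba

MAX_LEN = 35  # fixed sample length from the description

def clean(sentence: str) -> str:
    sentence = re.sub(r"https?://\S+", "", sentence)  # remove URLs
    sentence = re.sub(r"[&_，,]", "", sentence)       # remove special characters
    return re.sub(r"\d+", "@", sentence)              # normalize digits to '@'

def index_and_pad(tokens: list, vocab: dict) -> list:
    """Map tokens to dictionary indices and left-pad with zeros to MAX_LEN."""
    indices = [vocab.get(t, 0) for t in tokens][:MAX_LEN]
    return [0] * (MAX_LEN - len(indices)) + indices

def sentence_to_vectors(sentence: str, vocab: dict):
    cleaned = clean(sentence)
    char_vec = index_and_pad(list(cleaned), vocab)                      # by character
    word_vec = index_and_pad(jieba.lcut(cleaned, cut_all=True), vocab)  # by word
    return char_vec, word_vec
```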
Furthermore, the multiple groups of data can also be classified, so that label data is added for each class. The label datum corresponding to each sample is the number corresponding to its category name; during training it is converted into one-hot form. For example, if the number of categories class_num is 10, the label data obtained after processing has shape (n_sample, 10); these are the training labels.
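The oversampling-to-1000 rule and the one-hot conversion can be sketched as follows; this is a numpy-based illustration in which the target of 1000 samples per category and class_num = 10 are the example values from the description:

```python
# A minimal sketch of the oversampling rule and the one-hot labels described
# above. The target of 1000 samples per category and class_num = 10 are the
# example values given in the description.
import numpy as np

def oversample(samples: list, target: int = 1000) -> list:
    """Duplicate available samples until the category reaches `target`."""
    if len(samples) >= target:
        return samples
    extra = target - len(samples)  # ratio = (1000 - current) / current, applied
    return samples + [samples[i % len(samples)] for i in range(extra)]

def to_one_hot(label_ids: list, class_num: int = 10) -> np.ndarray:
    """Convert category numbers to label data of shape (n_sample, class_num)."""
    labels = np.zeros((len(label_ids), class_num))
    labels[np.arange(len(label_ids)), label_ids] = 1
    return labels
```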
Step S1012: input the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for training to obtain the pre-trained model.
In this embodiment, step S1012 includes: a. inputting the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for iterative computation; b. computing the value of the loss function of the initial network model and adjusting the parameters to be determined in each layer of the initial network model, so that the average value of the loss function corresponding to the adjusted parameters decreases; and repeating steps a and b until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, where the target structure with the parameters set by the last adjustment is the pre-trained model.
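The stopping rule of steps a and b can be illustrated with the following minimal sketch; the train_one_pass callable, the threshold eps and the patience count are assumptions, not values prescribed by the patent:

```python
# A minimal sketch of the iteration of steps a and b: run a training pass,
# track the average loss, and stop once the change in the average is below a
# preset value for a preset number of consecutive iterations. train_one_pass
# is an assumed callable that performs one pass and returns the average loss.
def train_until_converged(train_one_pass, eps: float = 1e-4, patience: int = 3):
    prev_loss = float("inf")
    stable = 0
    while stable < patience:
        avg_loss = train_one_pass()          # steps a and b: iterate and adjust
        if abs(prev_loss - avg_loss) < eps:  # average loss stopped decreasing?
            stable += 1
        else:
            stable = 0
        prev_loss = avg_loss
    return prev_loss
```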
In this embodiment, by first training the initial network model with multiple groups of data, a pre-trained model applicable to multiple fields is obtained. When a recognition model for recognizing a target feature is needed, only part of the structure has to be trained, which reduces the amount of computation during training and improves the efficiency of obtaining a recognition model.
As shown in Fig. 4, the flow of training a specific model to obtain the pre-trained model is described below. The initial network model includes a target structure formed of multiple layers and a collocation structure formed of multiple layers. In the example shown in Fig. 4, the target structure includes an embedding layer, an LSTM layer and a CNN layer; the collocation structure includes an FCNN layer and a softmax layer. First, the processed multiple groups of data are input into the initial network model for computation. In this embodiment, when the multiple groups of data are input into the initial network model for the first time, the values of the parameters to be determined are initial default values. The matrix formed by the vectors of the sentences in the multiple groups of data is input into the initial network model for iterative computation. At the output of the softmax layer, the value of the loss function of each layer is computed, and the parameters to be determined of each layer are adjusted so that the average value of the loss function over all layers decreases; this completes one iteration. Then the above process is repeated: the value of the loss function is computed again, the parameters to be determined are adjusted, and it is judged whether the loss has stopped decreasing. Training stops when adjusting the parameters to be determined can no longer reduce the average value of the loss function, which yields the pre-trained model. In this example, the pre-trained model is formed by the embedding layer, the LSTM layer and the CNN layer with their parameters determined.
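A minimal Keras-style sketch of the Fig. 4 structure follows; the layer sizes, the vocabulary size and the use of Conv1D plus pooling for the CNN layer are illustrative assumptions, and only the division into target structure and collocation structure is taken from the description:

```python
# A minimal Keras-style sketch of the initial network model of Fig. 4: a
# target structure (embedding -> LSTM -> CNN) followed by a collocation
# structure (FCNN -> softmax). All sizes are illustrative assumptions; the
# pooling layer is added only to flatten the CNN output for the FCNN layer.
from tensorflow import keras
from tensorflow.keras import layers

def build_initial_model(vocab_size=100000, max_len=35, class_num=10):
    inputs = keras.Input(shape=(max_len,))
    # target structure: kept as the pre-trained model after training
    x = layers.Embedding(vocab_size, 128)(inputs)       # embedding layer
    x = layers.LSTM(128, return_sequences=True)(x)      # LSTM layer
    x = layers.Conv1D(128, 3, activation="relu")(x)     # CNN layer
    features = layers.GlobalMaxPooling1D()(x)
    # collocation structure: discarded once pre-training is finished
    h = layers.Dense(128, activation="relu")(features)  # FCNN layer
    outputs = layers.Dense(class_num, activation="softmax")(h)  # softmax layer
    full_model = keras.Model(inputs, outputs)
    target_structure = keras.Model(inputs, features)    # the pre-trained model
    return full_model, target_structure
```

After full_model has been trained on the multiple groups of data, target_structure shares its weights and can be saved as the pre-trained model.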
Referring to Fig. 5, it is a flowchart of the model processing method applied to the electronic terminal shown in Fig. 1, provided by a preferred embodiment of the present invention. The electronic terminal stores in advance a target network model and a pre-trained model trained with multiple groups of data. The specific flow shown in Fig. 5 is described in detail below.
Step S201: input target data into the pre-trained model to obtain intermediate output data.
Step S202: train the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model, so that a target feature in data to be recognized is recognized by the recognition model.
Step S201 in this embodiment is similar to step S102 in the previous method embodiment, and step S202 in this embodiment is similar to step S103 in the previous method embodiment. For the relevant description in this embodiment, reference may be made to the description in the previous embodiment, which is not repeated here.
In the model processing method of this embodiment of the present invention, a pre-trained model is first trained with multiple groups of data in advance. When a recognition model is needed, only the target network model has to be trained to obtain a splice model, and splicing the pre-trained model with the splice model yields the recognition model. Since the complete original model underlying the recognition model does not have to be trained each time, the training workload required to obtain a recognition model is reduced and the efficiency of model training is improved.
In this embodiment, training the target network model with the intermediate output data to obtain the splice model includes: c. inputting the intermediate output data into the target network model for iterative computation; d. computing the value of the loss function of the target network model and adjusting the parameters to be determined in each layer of the target network model, so that the average value of the loss function of each layer corresponding to the adjusted parameters decreases; and repeating steps c and d until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, where the target network model with the parameters set by the last adjustment is the splice model.
As shown in Figs. 6 and 7, the flow of training a specific model to obtain the splice model is described below. As shown in Fig. 6, the target data is input into the pre-trained model and the intermediate output data is computed. In the example shown in Fig. 6, the pre-trained model includes an embedding layer, an LSTM layer and a CNN layer. As shown in Fig. 7, the intermediate output data is input into the target network model for training. The target network model in the example shown in Fig. 7 includes three FCNN layers. In this example, when the intermediate output data is input into the target network model for the first time, the values of the parameters to be determined of the target network are initial default values, which a person skilled in the art can set as required. Then the intermediate output data is input into the target network model for iterative computation. At the output of the FCNN layers, the value of the loss function of the target network model is computed, and the parameters to be determined of each layer are adjusted so that the average value of the loss function over all layers decreases; this completes one iteration. Then the above process is repeated: the value of the loss function is computed again, the parameters to be determined are adjusted, and it is judged whether the loss has stopped decreasing. Training stops when adjusting the parameters to be determined can no longer reduce the average value of the loss function, which yields the splice model. In this example, the splice model is formed by the three FCNN layers with their parameters determined.
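A minimal sketch of the Fig. 6 and Fig. 7 flow, under the same Keras-style assumptions as above (the layer sizes, optimizer and epoch count are illustrative, not prescribed):

```python
# A minimal sketch of Figs. 6 and 7: compute the intermediate output data
# with the frozen pre-trained model (step S201), train a three-layer FCNN
# splice model on it (step S202), then splice the two into the recognition
# model. pretrained is the saved target structure; x_target and y_target are
# the target data and its one-hot labels; all sizes are assumptions.
from tensorflow import keras
from tensorflow.keras import layers

def train_and_splice(pretrained, x_target, y_target, class_num=10):
    pretrained.trainable = False                 # the pre-trained part stays fixed
    intermediate = pretrained.predict(x_target)  # step S201

    splice_model = keras.Sequential([            # three FCNN layers as in Fig. 7
        layers.Dense(128, activation="relu",
                     input_shape=intermediate.shape[1:]),
        layers.Dense(64, activation="relu"),
        layers.Dense(class_num, activation="softmax"),
    ])
    splice_model.compile(optimizer="adam", loss="categorical_crossentropy")
    splice_model.fit(intermediate, y_target, epochs=10)  # step S202

    # splicing: the recognition model is the pre-trained model followed by
    # the splice model
    recognition_model = keras.Model(pretrained.input,
                                    splice_model(pretrained.output))
    return recognition_model
```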
Referring to Fig. 8, it is a functional block diagram of the model processing device 110 shown in Fig. 1, provided by a preferred embodiment of the present invention. The model processing device 110 includes a pre-training module 1101, a computing module 1102 and a target training module 1103.
The pre-training module 1101 is configured to train the initial network model with multiple groups of data to obtain a pre-trained model, where the initial network model includes a target structure formed of multiple layers and a collocation structure formed of multiple layers, and the pre-trained model is the network model corresponding to the target structure after the initial network model has been trained.
The computing module 1102 is configured to input target data into the pre-trained model and compute intermediate output data.
The target training module 1103 is configured to train the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.
In this embodiment, each of the multiple groups of data includes multiple sentences, and the pre-training module 1101 includes a data conversion unit and a data training unit.
The data conversion unit is configured to convert each sentence in the multiple groups of data into numerical form to obtain a vector of a specified length.
The data training unit is configured to input the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for training to obtain the pre-trained model.
In this embodiment, the data training unit obtains the pre-trained model through training in the following way:
an initial iteration computation subunit is configured to input the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model and compute the value of the loss function in each layer of the initial network model;
an initial parameter adjustment subunit is configured to adjust the parameters to be determined in each layer of the initial network model, so that the average value of the loss function of each layer corresponding to the adjusted parameters decreases; and
the initial iteration computation subunit and the initial parameter adjustment subunit are executed repeatedly until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, where the target structure with the parameters set by the last adjustment is the pre-trained model.
For other details of this embodiment, further reference may be made to the description in the method embodiments above, which is not repeated here.
In the model processing device of this embodiment of the present invention, a pre-trained model is first trained with multiple groups of data in advance. When a recognition model is needed, only the target network model has to be trained to obtain a splice model, and splicing the pre-trained model with the splice model yields the recognition model. Since the complete original model underlying the recognition model does not have to be trained each time, the training workload required to obtain a recognition model is reduced and the efficiency of model training is improved.
A preferred embodiment of the present invention also provides another model processing device as shown in Fig. 1. The model processing device provided in this embodiment is similar to the model processing device provided in the previous embodiment, except that in this embodiment the pre-trained model is stored in advance in the executing electronic terminal 100. The model processing device includes a computing module 1102 and a target training module 1103.
The computing module 1102 is configured to input target data into the pre-trained model to obtain intermediate output data.
The target training module 1103 is configured to train the target network model with the intermediate output data to obtain a splice model, where the pre-trained model and the splice model, after being spliced, form a recognition model, so that a target feature in data to be recognized is recognized by the recognition model.
In this embodiment, the target training module obtains the splice model through training in the following way:
a target iteration computation unit is configured to input the intermediate output data into the target network model and compute the value of the loss function in each layer of the target network model;
a target parameter adjustment unit is configured to adjust the parameters to be determined in each layer of the target network model, so that the average value of the loss function of each layer corresponding to the adjusted parameters decreases; and
the target iteration computation unit and the target parameter adjustment unit are executed repeatedly until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, where the target network model with the parameters set by the last adjustment is the splice model.
For other details of this embodiment, further reference may be made to the description in the method embodiments above, which is not repeated here.
In the model processing device of this embodiment of the present invention, a pre-trained model is first trained with multiple groups of data in advance. When a recognition model is needed, only the target network model has to be trained to obtain a splice model, and splicing the pre-trained model with the splice model yields the recognition model. Since the complete original model underlying the recognition model does not have to be trained each time, the training workload required to obtain a recognition model is reduced and the efficiency of model training is improved.
In the several embodiments provided in this application, it should be understood that the disclosed device and method can also be implemented in other ways. The device embodiments described above are merely illustrative. For example, the flowcharts and block diagrams in the drawings show the possible architectures, functions and operations of the devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment or a part of code, and the module, program segment or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or action, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
If the function is implemented in the form of a software function module and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc. It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device including a series of elements includes not only those elements but also other elements not explicitly listed, or also includes elements inherent to such a process, method, article or device. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The above are only preferred embodiments of the present invention and are not intended to limit it; for those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall be included in the protection scope of the present invention. It should be noted that similar reference numerals and letters denote similar items in the drawings; therefore, once an item is defined in one drawing, it does not need to be further defined and explained in subsequent drawings.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can easily be conceived by a person familiar with this technical field within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A model processing method, applied to an electronic terminal, wherein the electronic terminal stores an initial network model and a target network model, the model processing method comprising:
training the initial network model with multiple groups of data to obtain a pre-trained model, the initial network model comprising a target structure formed of multiple layers and a collocation structure formed of multiple layers, the pre-trained model being the network model corresponding to the target structure after the initial network model has been trained;
inputting target data into the pre-trained model and computing intermediate output data; and
training the target network model with the intermediate output data to obtain a splice model, wherein the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.
2. The model processing method according to claim 1, wherein each of the multiple groups of data comprises multiple sentences, and training the initial network model with the multiple groups of data to obtain the pre-trained model comprises:
converting each sentence in the multiple groups of data into numerical form to obtain a vector of a specified length; and
inputting the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for training to obtain the pre-trained model.
3. The model processing method according to claim 2, wherein inputting the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for training to obtain the pre-trained model comprises:
a. inputting the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for iterative computation;
b. computing the value of the loss function of the initial network model and adjusting the parameters to be determined in each layer of the initial network model, so that the average value of the loss function corresponding to the adjusted parameters decreases; and
repeating steps a and b until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, wherein the target structure with the parameters set by the last adjustment is the pre-trained model.
4. A model processing method, applied to an electronic terminal, wherein the electronic terminal stores in advance a target network model and a pre-trained model trained with multiple groups of data, the model processing method comprising:
inputting target data into the pre-trained model to obtain intermediate output data; and
training the target network model with the intermediate output data to obtain a splice model, wherein the pre-trained model and the splice model, after being spliced, form a recognition model, so that a target feature in data to be recognized is recognized by the recognition model.
5. The model processing method according to claim 4, wherein training the target network model with the intermediate output data to obtain the splice model comprises:
c. inputting the intermediate output data into the target network model for iterative computation;
d. computing the value of the loss function of the target network model and adjusting the parameters to be determined in each layer of the target network model, so that the average value of the loss function of each layer corresponding to the adjusted parameters decreases; and
repeating steps c and d until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, wherein the target network model with the parameters set by the last adjustment is the splice model.
6. A model processing device, applied to an electronic terminal, wherein the electronic terminal stores an initial network model and a target network model, the model processing device comprising:
a pre-training module, configured to train the initial network model with multiple groups of data to obtain a pre-trained model, the initial network model comprising a target structure formed of multiple layers and a collocation structure formed of multiple layers, the pre-trained model being the network model corresponding to the target structure after the initial network model has been trained;
a computing module, configured to input target data into the pre-trained model and compute intermediate output data; and
a target training module, configured to train the target network model with the intermediate output data to obtain a splice model, wherein the pre-trained model and the splice model, after being spliced, form a recognition model for recognizing a target feature in data to be recognized.
7. The model processing device according to claim 6, wherein each of the multiple groups of data comprises multiple sentences, and the pre-training module comprises:
a data conversion unit, configured to convert each sentence in the multiple groups of data into numerical form to obtain a vector of a specified length; and
a data training unit, configured to input the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for training to obtain the pre-trained model.
8. The model processing device according to claim 7, wherein the data training unit obtains the pre-trained model through training in the following way:
an initial iteration computation subunit is configured to input the matrix formed by the vectors of the sentences in the multiple groups of data into the initial network model for iterative computation;
an initial parameter adjustment subunit is configured to compute the value of the loss function of the initial network model and adjust the parameters to be determined in each layer of the initial network model, so that the average value of the loss function corresponding to the adjusted parameters decreases; and
the initial iteration computation subunit and the initial parameter adjustment subunit are executed repeatedly until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, wherein the target structure with the parameters set by the last adjustment is the pre-trained model.
9. A model processing device, applied to an electronic terminal, wherein the electronic terminal stores in advance a target network model and a pre-trained model trained with multiple groups of data, the model processing device comprising:
a computing module, configured to input target data into the pre-trained model to obtain intermediate output data; and
a target training module, configured to train the target network model with the intermediate output data to obtain a splice model, wherein the pre-trained model and the splice model, after being spliced, form a recognition model, so that a target feature in data to be recognized is recognized by the recognition model.
10. The model processing device according to claim 9, wherein the target training module obtains the splice model through training in the following way:
a target iteration computation unit is configured to input the intermediate output data into the target network model for iterative computation;
a target parameter adjustment unit is configured to compute the value of the loss function of the target network model and adjust the parameters to be determined in each layer of the target network model, so that the average value of the loss function of each layer corresponding to the adjusted parameters decreases; and
the target iteration computation unit and the target parameter adjustment unit are executed repeatedly until the difference between the average values computed in a preset number of consecutive iterations is less than a preset value, wherein the target network model with the parameters set by the last adjustment is the splice model.
CN201810103695.4A 2018-02-01 2018-02-01 Model processing method and device Pending CN108460455A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810103695.4A CN108460455A (en) Model processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810103695.4A CN108460455A (en) Model processing method and device

Publications (1)

Publication Number Publication Date
CN108460455A (en) 2018-08-28

Family

ID=63239310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810103695.4A Model processing method and device

Country Status (1)

Country Link
CN (1) CN108460455A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105020B (en) * 2018-10-29 2024-03-29 西安宇视信息科技有限公司 Feature representation transfer learning method and related device
CN109243493A (en) * 2018-10-30 2019-01-18 南京工程学院 Infant cry emotion recognition method based on an improved long short-term memory network
CN109243493B (en) * 2018-10-30 2022-09-16 南京工程学院 Infant cry emotion recognition method based on an improved long short-term memory network
CN111274422A (en) * 2018-12-04 2020-06-12 北京嘀嘀无限科技发展有限公司 Model training method, image feature extraction method and device, and electronic equipment
CN109685120A (en) * 2018-12-11 2019-04-26 中科恒运股份有限公司 Rapid training method for a classification model with limited data, and terminal device
CN111221963A (en) * 2019-11-19 2020-06-02 成都晓多科技有限公司 Domain migration method for an intelligent customer service data training model
CN111221963B (en) * 2019-11-19 2023-05-12 成都晓多科技有限公司 Domain migration method for an intelligent customer service data training model

Similar Documents

Publication Publication Date Title
CN108460455A Model processing method and device
CN109934706A Transaction risk control method, apparatus and equipment based on a graph structure model
CN110334357A Named entity recognition method, apparatus, storage medium and electronic equipment
WO2021073390A1 Data screening method and apparatus, device and computer-readable storage medium
CN107391545A Method for classifying users, input method and device
CN108305158A Method, apparatus and equipment for training a risk control model and for risk control
CN109598517B Commodity clearance processing, object processing and category prediction method and device therefor
CN109446328A Text recognition method, device and storage medium
CN104361415B Method and device for selecting display information
CN107341173A Information processing method and device
CN113449187A Product recommendation method, device, equipment and storage medium based on dual portraits
CN108509407A Text semantic similarity calculation method, device and user terminal
CN110263161A Information processing method, device and equipment
CN110019790A Text recognition, text monitoring, data object recognition and data processing methods
CN110276382A Listener clustering method, apparatus and medium based on spectral clustering
CN113592605B Product recommendation method, device, equipment and storage medium based on similar products
CN109582792A Text classification method and device
CN107515896A Resource recommendation method, device and equipment
CN110033382A Insurance business processing method, device and equipment
CN110321430A Domain name recognition and domain name recognition model generation method, device and storage medium
CN110457470A Text classification model learning method and device
US20220335209A1 Systems, apparatus, articles of manufacture, and methods to generate digitized handwriting with user style adaptations
CN109255629A Customer grouping method and device, electronic equipment, and readable storage medium
CN115392237A Emotion analysis model training method, device, equipment and storage medium
CN113505273B Data sorting method, device, equipment and medium based on repeated data screening

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 1903, Unit 2, Building 19, No. 2, No. 1 Huayang Street, Tianfu New District, Chengdu, Sichuan 610000

Applicant after: Chengdu Xiaoduo Technology Co., Ltd.

Address before: 2207, Block A, New Hope International, No. 19 Tianfu Third Street, Chengdu High-tech Zone, Sichuan 610000

Applicant before: CHENGDU XIAODUO TECH CO., LTD.

RJ01 Rejection of invention patent application after publication

Application publication date: 2018-08-28