CN109886105A

CN109886105A - Price tag recognition method, system and storage medium based on multi-task learning

Info

Publication number: CN109886105A
Application number: CN201910033930.XA
Authority: CN
Inventors: 牟永强; 严蕤; 韩冉; 孙超; 刘荣杰; 黄耀鸿; 郭怡适
Original assignee: Guangzhou Carpenter Data Technology Co Ltd
Current assignee: Guangzhou Carpenter Data Technology Co Ltd
Priority date: 2019-01-15
Filing date: 2019-01-15
Publication date: 2019-06-14
Anticipated expiration: 2039-01-15
Also published as: CN109886105B

Abstract

The invention discloses a price tag recognition method, system and storage medium based on multi-task learning. The method comprises a recognition model training step and a price tag recognition step. By training and recognizing the integer part and the fractional part of the price on a price tag separately, the invention not only reduces the difficulty of neural network training but also makes it possible to determine the position of the decimal point, which improves recognition accuracy. The invention can be widely applied in the field of deep learning technology.

Description

Price tag recognition method, system and storage medium based on multi-task learning

Technical field

The present invention relates to the field of deep learning technology, and in particular to a price tag recognition method, system and storage medium based on multi-task learning.

Background art

In recent years, with the rapid development of deep learning technology, its applications have become more and more widespread, from the traditional security field to the smart retail field that has emerged in recent years. Channel verification is an important link in fast-moving consumer goods (FMCG) new retail. The traditional operating modes are mainly on-site verification by sales representatives and third-party verification, and both have their own shortcomings, such as large human error, long verification cycles and data errors that cannot be traced. Image recognition technology based on deep learning offers high precision and lets the scene be reconstructed afterwards, which makes it well suited to the channel verification business scenario. Image-based channel verification mainly covers two recognition tasks: SKU recognition and price tag recognition. Because price is an important component of sales data, image-based price tag recognition must be very accurate, yet it is easily affected by the design style of the price tag and by the quality of the photograph, for example by blur or a strongly oblique shooting angle.

Most existing price tag recognition techniques treat the parts of the price, namely the integer part, the fractional part and the decimal point, as a single whole and recognize them together without distinction. However, price tags come in many varieties and are easily affected by illumination and shooting angle, so their features in the image are not distinct and recognition is difficult. Even a sequence recognition algorithm that uses context struggles to locate the decimal point, so accuracy is not high.

Summary of the invention

To solve the above technical problems, the object of the present invention is to provide a price tag recognition method, system and storage medium based on multi-task learning that are low in difficulty and high in accuracy.

The technical solution adopted in one aspect of the present invention is:

A price tag recognition method based on multi-task learning, comprising a recognition model training step and a price tag recognition step, wherein

The recognition model training step comprises the following steps:

Detecting the price information on a price tag according to the image data of the price tag;

Labeling the integer part and the fractional part of the price information based on a preset data format, according to the price information on the price tag, to obtain price data;

Applying augmentation to the labeled price data;

Inputting the augmented price data into a preset neural network model for training, to obtain a recognition model;

The price tag recognition step comprises the following steps:

Detecting the position of the price tag in an image to be recognized by means of an object detection model, to obtain price tag image data;

Preprocessing the price tag image data;

Recognizing the preprocessed price tag image data by means of the recognition model, to obtain an integer part recognition result and a fractional part recognition result of the price information.

Further, the recognition model training step also comprises the following steps:

Acquiring a shelf image;

Detecting the acquired shelf image to determine the position of the price tag region in the shelf image;

Cropping the image data of the price tag according to the position of the price tag region.

Further, in the step of applying augmentation to the labeled price data, the augmentation includes brightness adjustment, contrast adjustment, rotation, scaling, translation, noise addition, simulated motion blur and simulated spatial blur.
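A few of the listed augmentations can be sketched with NumPy as follows. This is a minimal illustration rather than the patent's implementation: the function names, the parameter ranges, the grayscale 2-D image layout and the horizontal box kernel used for motion blur are all assumptions, and rotation, scaling and spatial blur are left out for brevity.

```python
import numpy as np

def adjust_brightness(img, delta):
    """Shift every pixel by `delta` and clip to the 8-bit range."""
    return np.clip(img.astype(np.float32) + delta, 0, 255).astype(np.uint8)

def adjust_contrast(img, factor):
    """Scale each pixel's deviation from the image mean by `factor`."""
    pixels = img.astype(np.float32)
    mean = pixels.mean()
    return np.clip((pixels - mean) * factor + mean, 0, 255).astype(np.uint8)

def add_gaussian_noise(img, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation `sigma`."""
    noise = rng.normal(0.0, sigma, size=img.shape)
    return np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)

def translate(img, dx, dy, fill=0):
    """Shift the image by (dx, dy) pixels; uncovered pixels become `fill`."""
    h, w = img.shape[:2]
    out = np.full_like(img, fill)
    x0, x1 = max(0, -dx), min(w, w - dx)
    y0, y1 = max(0, -dy), min(h, h - dy)
    out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = img[y0:y1, x0:x1]
    return out

def motion_blur_horizontal(img, length):
    """Simulate horizontal motion blur by averaging `length` shifted copies."""
    left = length // 2
    padded = np.pad(img.astype(np.float32),
                    ((0, 0), (left, length - 1 - left)), mode="edge")
    acc = np.zeros(img.shape, dtype=np.float32)
    for k in range(length):
        acc += padded[:, k:k + img.shape[1]]
    return np.round(acc / length).astype(np.uint8)

def augment(img, rng):
    """Apply a random combination of the transforms above (assumed ranges)."""
    out = adjust_brightness(img, rng.uniform(-30, 30))
    out = adjust_contrast(out, rng.uniform(0.7, 1.3))
    out = translate(out, int(rng.integers(-5, 6)), int(rng.integers(-3, 4)))
    out = add_gaussian_noise(out, 5.0, rng)
    if rng.random() < 0.5:
        out = motion_blur_horizontal(out, 5)
    return out
```

Calling `augment(img, np.random.default_rng(0))` on a 2-D `uint8` array returns a new array of the same shape and type, so each labeled crop can be expanded into several training samples that share one label.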

Further, the step of inputting the augmented price data into a preset neural network model for training to obtain a recognition model comprises the following steps:

Computing on the augmented price data with a convolutional neural network and an LSTM network, to obtain the integer part and the fractional part of the price;

Computing the feature vectors of the integer part and the fractional part separately with LSTM networks;

Optimizing the feature vectors computed by the LSTM networks with an Attention mechanism;

Computing the loss functions of the integer part and the fractional part separately from the optimized feature vectors;

Obtaining the recognition model according to the computed loss functions.

Further, the step of computing on the augmented price data with a convolutional neural network and an LSTM network to obtain the integer part and the fractional part of the price comprises the following steps:

Normalizing the augmented price data to obtain pictures to be trained on;

Extracting the feature map of each picture with the convolutional neural network;

Reshaping the feature map;

Computing on the reshaped feature map with the LSTM network, to obtain the integer part and the fractional part of the price.

Further, the step of computing the feature vectors of the integer part and the fractional part separately with LSTM networks comprises the following steps:

Inputting the feature map of the integer part and the feature map of the fractional part into several LSTM networks respectively;

Computing the temporal information between features with the LSTM networks;

Generating, from the computed temporal information, a feature map capable of expressing the temporal information.

Further, the step of optimizing the feature vectors computed by the LSTM networks with an Attention mechanism comprises the following steps:

Normalizing the context attention parameter of the Attention mechanism with softmax, to obtain a weight parameter;

Weighting the feature vectors computed by the LSTM networks with the weight parameter, to obtain a new feature vector.

The technical solution adopted in another aspect of the present invention is:

A price tag recognition system based on multi-task learning, comprising a training module and a recognition module, wherein

The training module comprises:

A first detection unit, configured to detect the price information on a price tag according to the image data of the price tag;

A labeling unit, configured to label the integer part and the fractional part of the price information based on a preset data format, according to the price information on the price tag, to obtain price data;

An augmentation unit, configured to apply augmentation to the labeled price data;

A training unit, configured to input the augmented price data into a preset neural network model for training, to obtain a recognition model;

The recognition module comprises:

A second detection unit, configured to detect the position of the price tag in an image to be recognized by means of an object detection model, to obtain price tag image data;

A preprocessing unit, configured to preprocess the price tag image data;

A recognition unit, configured to recognize the preprocessed price tag image data by means of the recognition model, to obtain an integer part recognition result and a fractional part recognition result of the price information.

The technical solution adopted in another aspect of the present invention is:

A price tag recognition system based on multi-task learning, comprising:

At least one processor;

At least one memory, configured to store at least one program;

When the at least one program is executed by the at least one processor, the at least one processor is caused to implement the price tag recognition method based on multi-task learning.

The technical solution adopted in another aspect of the present invention is:

A storage medium storing processor-executable instructions which, when executed by a processor, are used to perform the price tag recognition method based on multi-task learning.

The beneficial effects of the present invention are: by training and recognizing the integer part and the fractional part of the price on a price tag separately, the invention not only reduces the difficulty of neural network training but also makes it possible to determine the position of the decimal point, which improves recognition accuracy.

Brief description of the drawings

Fig. 1 is a flow chart of the steps of an embodiment of the present invention;

Fig. 2 is a schematic diagram of the network structure of an embodiment of the present invention.

Detailed description of the embodiments

The present invention is further explained and illustrated below with reference to the accompanying drawings and specific embodiments. The step numbers in the embodiments are set only for convenience of explanation and impose no restriction on the order of the steps; the execution order of the steps in the embodiments can be adapted according to the understanding of those skilled in the art.

An embodiment of the present invention provides a price tag recognition method based on multi-task learning, comprising a recognition model training step and a price tag recognition step, wherein

The recognition model training step comprises the following steps:

Applying augmentation to the labeled price data;

The price tag recognition step comprises the following steps:

Preprocessing the price tag image data;

As a further preferred embodiment, the recognition model training step also comprises the following steps:

Acquiring a shelf image;

As a further preferred embodiment, in the step of applying augmentation to the labeled price data, the augmentation includes brightness adjustment, contrast adjustment, rotation, scaling, translation, noise addition, simulated motion blur and simulated spatial blur.

As a further preferred embodiment, the step of inputting the augmented price data into a preset neural network model for training to obtain a recognition model comprises the following steps:

As a further preferred embodiment, the step of computing on the augmented price data with a convolutional neural network and an LSTM network to obtain the integer part and the fractional part of the price comprises the following steps:

Reshaping the feature map;

As a further preferred embodiment, the step of computing the feature vectors of the integer part and the fractional part separately with LSTM networks comprises the following steps:

As a further preferred embodiment, the step of optimizing the feature vectors computed by the LSTM networks with an Attention mechanism comprises the following steps:

Corresponding to the method, an embodiment of the present invention also provides a price tag recognition system based on multi-task learning, comprising a training module and a recognition module, wherein

The training module comprises:

The recognition module comprises:

A preprocessing unit, configured to preprocess the price tag image data;

Corresponding to the method, an embodiment of the present invention also provides a price tag recognition system based on multi-task learning, comprising:

At least one processor;

At least one memory, configured to store at least one program;

Corresponding to the method, an embodiment of the present invention also provides a storage medium storing processor-executable instructions which, when executed by a processor, are used to perform the price tag recognition method based on multi-task learning.

The specific implementation steps of the price tag recognition method based on multi-task learning are described in detail below with reference to Fig. 1:

(1) Collect price tag data. The price tag data in the present invention come mainly from real shelf images; the data are rich and varied, covering a wide range of design styles, angle changes and illumination changes;

(2) Run detection on the collected shelf images to find the position of the price tag region;

(3) Label the training data. According to the price information on the detected price tags, manually label the integer part and the fractional part of each price tag. During labeling, each part is padded to 9 characters; for example, for the price 100.05 the integer part and the fractional part are labeled 100AAAAAA and 05AAAAAAA respectively;

(4) In this embodiment, the above steps produced about 40,000 price tag samples. To further increase the diversity of the data, the present invention applied augmentation to this real data, mainly covering brightness, contrast, rotation, scaling, translation, added noise, simulated motion blur and simulated spatial blur;

(5) Feed the data into the pre-designed neural network model and train the model until it converges;

(6) In actual use, first detect the position of the price tag with the object detection model and crop the picture; after preprocessing, input it into the model trained in step (5), which outputs the integer part and the fractional part of the price tag separately; then remove the padding characters and finally combine the two parts to output the price information on the price tag.
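The labeling format in step (3) and the padding removal in step (6) can be sketched as a pair of helper functions. The names `encode_price`, `decode_price` and `to_class_indices` are hypothetical, and the 11-symbol alphabet (ten digits plus the padding character A) is an assumption, chosen because it would match the K = 11 classes given for the loss function in this description.

```python
from typing import List, Tuple

PAD = "A"                        # padding character from the example in step (3)
FIELD_LEN = 9                    # each part is padded to 9 characters
ALPHABET = "0123456789" + PAD    # assumed class set: 10 digits + padding

def encode_price(price: str) -> Tuple[str, str]:
    """Split a price such as '100.05' into padded integer and fractional labels."""
    integer, _, fraction = price.partition(".")
    if len(integer) > FIELD_LEN or len(fraction) > FIELD_LEN:
        raise ValueError("each part must fit in %d characters" % FIELD_LEN)
    return integer.ljust(FIELD_LEN, PAD), fraction.ljust(FIELD_LEN, PAD)

def to_class_indices(label: str) -> List[int]:
    """Map each label character to its class index in ALPHABET."""
    return [ALPHABET.index(ch) for ch in label]

def decode_price(integer_label: str, fraction_label: str) -> str:
    """Strip trailing padding from both parts and join them with a decimal point."""
    integer = integer_label.rstrip(PAD)
    fraction = fraction_label.rstrip(PAD)
    return integer + "." + fraction if fraction else integer

labels = encode_price("100.05")
print(labels)                   # ('100AAAAAA', '05AAAAAAA')
print(decode_price(*labels))    # 100.05
```

Because the two parts are separate fixed-length labels, the decimal point never has to be recognized as a character: its position is implied by where the integer label ends and the fractional label begins.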

Referring to the network structure shown in Fig. 2, step (5) specifically comprises the following steps:

S1. First normalize each training picture to 96 × 200 and extract its features with a convolutional neural network to obtain a feature map of size 12 × 25 × 64. Then reshape the feature map to 25 × 768 and feed it as the feature input to the LSTM network, which computes the integer part and the fractional part in the price tag picture;
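The reshape in S1 can be read as turning each of the 25 feature-map columns into one 768-dimensional time step (12 × 64 = 768). The sketch below shows one way to do this in NumPy; moving the width axis to the front before flattening is an assumption, since the text only states the input and output sizes, and `feature_map_to_sequence` is a hypothetical name.

```python
import numpy as np

H, W, C = 12, 25, 64   # feature map size given in step S1

def feature_map_to_sequence(fmap):
    """Turn an (H, W, C) feature map into W time steps of H*C features each."""
    if fmap.shape != (H, W, C):
        raise ValueError("expected a feature map of shape (12, 25, 64)")
    # Put the width axis first so that each time step is one image column,
    # then flatten that column's H x C values into a single vector.
    return fmap.transpose(1, 0, 2).reshape(W, H * C)

fmap = np.arange(H * W * C, dtype=np.float32).reshape(H, W, C)
seq = feature_map_to_sequence(fmap)
print(seq.shape)   # (25, 768)
```

Reading the columns left to right matches the reading order of the digits on the tag, which is what lets a sequence model consume the feature map one horizontal position at a time.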

S2. For the integer part in the picture, the feature map extracted by the preceding network is fed sequentially into 25 LSTM networks, each of which takes a 1 × 768 feature vector as input. The LSTM network connects the 25 feature vectors both from left to right and from right to left and computes the temporal information between the features, finally yielding a 25 × 768 feature map that can express the sequence information;
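The two-direction pass in S2 might look like the following NumPy sketch of a bidirectional LSTM. It assumes a standard LSTM cell, a hidden size of 384 per direction so that the concatenated output is 25 × 768, and random weights in place of trained ones; none of these details are given in the text, and `lstm_pass`, `bidirectional_lstm` and `random_params` are hypothetical names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_pass(xs, Wx, Wh, b):
    """Run one LSTM direction over xs of shape (T, D); return (T, hidden) states."""
    hidden = Wh.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    outputs = []
    for x in xs:
        gates = x @ Wx + h @ Wh + b          # all four gates at once, shape (4*hidden,)
        i, f, g, o = np.split(gates, 4)      # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def bidirectional_lstm(xs, fwd_params, bwd_params):
    """Concatenate a left-to-right pass and a right-to-left pass per time step."""
    left_to_right = lstm_pass(xs, *fwd_params)
    right_to_left = lstm_pass(xs[::-1], *bwd_params)[::-1]
    return np.concatenate([left_to_right, right_to_left], axis=1)

rng = np.random.default_rng(0)
D, HIDDEN = 768, 384   # HIDDEN = 384 is an assumption (2 * 384 = 768)

def random_params():
    """Placeholder weights; a trained model would supply these."""
    return (rng.normal(0, 0.01, (D, 4 * HIDDEN)),
            rng.normal(0, 0.01, (HIDDEN, 4 * HIDDEN)),
            np.zeros(4 * HIDDEN))

seq = rng.normal(size=(25, D))
features = bidirectional_lstm(seq, random_params(), random_params())
print(features.shape)   # (25, 768)
```

Each output row then combines what the network has seen to the left and to the right of that horizontal position.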

S3. The feature vectors computed by the LSTM network have a problem: whatever the length of the input sequence, it is encoded into a single fixed-length vector representation, and decoding is limited to that fixed-length representation, which limits the performance of the model. To represent the features more accurately, the present invention introduces an Attention mechanism. Its basic design idea is to retain the intermediate outputs of the LSTM encoder for the input sequence, train a model to learn selectively from these inputs, and associate the output sequence with them when the model produces its output. The Attention mechanism proposed by the present invention is implemented as follows:

First, the input feature vectors are c = {c_1, c_2, ..., c_L}, L = 25, where c_i denotes the feature at one spatial position computed by the LSTM network and L denotes the length of the input feature sequence.

Then, the context attention parameter e of the Attention mechanism is set:

e_i = f_ATT(h, c_i)

where f_ATT(·) denotes a mapping function, implemented in this embodiment as a multi-layer network; c_i denotes the feature vector of the i-th channel; and h denotes the hidden state parameter of the multi-layer network.

Then, e is normalized with softmax to obtain the weight parameter a:

a_i = \exp(e_i) / \sum_{k=1}^{L} \exp(e_k)

The feature c^t obtained after the Attention mechanism in this embodiment can then be expressed as:

c^t = \sum_{i=1}^{L} a_i c_i

Finally, the feature vector c^t obtained after the Attention mechanism is output.
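A minimal NumPy sketch of this attention step, under stated assumptions: f_ATT is modelled as a single tanh layer followed by a projection to a scalar, which is a common additive-attention form and not necessarily the multi-layer network used in the embodiment; the weights are obtained by softmax over the scores, and c^t is taken as the weighted sum of the c_i. The parameter names `W_c`, `W_h` and `v` are hypothetical.

```python
import numpy as np

def softmax(e):
    """Normalize scores into non-negative weights that sum to 1."""
    e = e - e.max()              # subtract the maximum for numerical stability
    w = np.exp(e)
    return w / w.sum()

def attention_pool(c, h, W_c, W_h, v):
    """Pool L feature vectors into one.

    c: (L, D) LSTM outputs c_1..c_L
    h: (Hd,) hidden state of the attention network
    Returns (c_t, a): the pooled (D,) feature and the (L,) weights.
    """
    # Assumed form of f_ATT(h, c_i): v . tanh(W_c c_i + W_h h)
    e = np.tanh(c @ W_c + h @ W_h) @ v   # one score per position, shape (L,)
    a = softmax(e)                       # weight parameter a
    c_t = a @ c                          # weighted sum over the L positions
    return c_t, a

rng = np.random.default_rng(0)
c = rng.normal(size=(25, 768))           # L = 25 positions, 768 features each
h = rng.normal(size=(16,))
W_c = rng.normal(size=(768, 32))
W_h = rng.normal(size=(16, 32))
v = rng.normal(size=(32,))
c_t, a = attention_pool(c, h, W_c, W_h, v)
print(c_t.shape, a.shape)                # (768,) (25,)
```

Positions with larger scores contribute more to the pooled vector, which is how the mechanism can down-weight columns that contain only background.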

S4. After learning through the Attention mechanism, a 1 × 768 feature vector c^t is obtained. The output of each LSTM network unit is weighted, the feature vectors of the 9 digits of the integer part are computed separately, and the classification loss is then computed with the softmax loss;

The network loss function L_int of the integer part is:

z_i,j = ω_i,j x + b_i,j

where M = 9 indicates that the losses of the 9 digits output for the integer part are computed separately; N denotes the number of samples participating in training; y_i,j denotes the target category; s_i,j denotes the softmax output at the j-th position of the integer part for the i-th training sample; e^{z_i,j} denotes the exponential mapping of z_i,j; e^{z_i,k} denotes the exponential mapping of z_i,k; z_i,j denotes the response value of the neuron; K = 11 denotes the number of classification categories; ω_i,j and b_i,j denote the network parameters; and x is the feature vector of the network.
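The symbol definitions above describe a softmax cross-entropy accumulated over the M digit positions. A form consistent with those definitions, given here as a reconstruction in which the 1/N averaging and the reading of y_i,j as a one-hot target are assumptions, would be:

```latex
L_{int} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{i,j} \log s_{i,j},
\qquad
s_{i,j} = \frac{e^{z_{i,j}}}{\sum_{k=1}^{K} e^{z_{i,k}}},
\qquad
z_{i,j} = \omega_{i,j}\, x + b_{i,j}
```

In this reading the second index of z runs over the K = 11 classes inside the softmax, whereas the second index of y and s runs over the M = 9 digit positions, following the symbol list above.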

S5. The algorithm for the fractional part is the same as for the integer part, and its loss function L_dec is computed in the same way as that of the integer part, namely:

During network training, the present invention computes the loss functions of the integer part and the fractional part, and the total loss function L is then:

L = L_int + L_dec;
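The total loss can be sketched as the sum of two per-digit softmax cross-entropies, one per task. This assumes each head outputs M = 9 positions with K = 11 class scores per position, that targets are given as class indices, and that the loss is averaged over the batch; the function names are hypothetical.

```python
import numpy as np

M, K = 9, 11   # digit positions per part and classes per position (from S4)

def digit_cross_entropy(logits, targets):
    """Softmax cross-entropy summed over positions and averaged over samples.

    logits:  (N, M, K) neuron responses z
    targets: (N, M) integer class indices
    """
    z = logits - logits.max(axis=-1, keepdims=True)              # stabilize exp
    log_s = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))    # log softmax
    n_idx = np.arange(logits.shape[0])[:, None]
    m_idx = np.arange(logits.shape[1])[None, :]
    picked = log_s[n_idx, m_idx, targets]                        # (N, M)
    return -picked.sum(axis=1).mean()

def total_loss(int_logits, int_targets, dec_logits, dec_targets):
    """L = L_int + L_dec: both tasks contribute with equal weight."""
    l_int = digit_cross_entropy(int_logits, int_targets)
    l_dec = digit_cross_entropy(dec_logits, dec_targets)
    return l_int + l_dec
```

With all-zero logits every class is equally likely, so each of the 18 positions contributes log(11) and `total_loss` evaluates to 18 · log(11); this gives a simple sanity check for an untrained model.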

Finally, the present invention back-propagates the error and dynamically updates the network parameters until the network converges, at which point training is complete and the final recognition model is obtained.

In summary, to address the difficulty of determining the decimal point position in existing price tag recognition techniques, the present invention proposes recognizing the integer part and the fractional part of a price tag separately. The main process is divided into the following stages: the first stage extracts CNN features from the input image; the second stage uses an LSTM to build a sequence representation of the features from the previous stage; the third stage uses the attention mechanism to eliminate background interference; the final stage models the integer part and the fractional part with multi-task learning. The present invention proposes splitting the data on a price tag into two parts that are recognized separately, namely the integer and the decimal, and learning them with an end-to-end multi-task training strategy, which not only reduces the difficulty of recognition but also makes it possible to determine the position of the decimal point.

The above is a description of preferred implementations of the present invention, but the present invention is not limited to the described embodiments. Those skilled in the art can make various equivalent variations or substitutions without departing from the spirit of the present invention, and these equivalent variations or substitutions all fall within the scope defined by the claims of the present application.

Claims

1. A price tag recognition method based on multi-task learning, characterized in that it comprises a recognition model training step and a price tag recognition step, wherein

The recognition model training step comprises the following steps:

Labeling the integer part and the fractional part of the price information based on a preset data format, according to the price information on the price tag, to obtain price data;

Applying augmentation to the labeled price data;

The price tag recognition step comprises the following steps:

Detecting the position of the price tag in an image to be recognized by means of an object detection model, to obtain price tag image data;

Preprocessing the price tag image data;

Recognizing the preprocessed price tag image data by means of the recognition model, to obtain an integer part recognition result and a fractional part recognition result of the price information.

2. The price tag recognition method based on multi-task learning according to claim 1, characterized in that the recognition model training step also comprises the following steps:

Acquiring a shelf image;

3. The price tag recognition method based on multi-task learning according to claim 1, characterized in that, in the step of applying augmentation to the labeled price data, the augmentation includes brightness adjustment, contrast adjustment, rotation, scaling, translation, noise addition, simulated motion blur and simulated spatial blur.

4. The price tag recognition method based on multi-task learning according to claim 1, characterized in that the step of inputting the augmented price data into a preset neural network model for training to obtain a recognition model comprises the following steps:

Computing on the augmented price data with a convolutional neural network and an LSTM network, to obtain the integer part and the fractional part of the price;

5. The price tag recognition method based on multi-task learning according to claim 4, characterized in that the step of computing on the augmented price data with a convolutional neural network and an LSTM network to obtain the integer part and the fractional part of the price comprises the following steps:

Reshaping the feature map;

Computing on the reshaped feature map with the LSTM network, to obtain the integer part and the fractional part of the price.

6. The price tag recognition method based on multi-task learning according to claim 4, characterized in that the step of computing the feature vectors of the integer part and the fractional part separately with LSTM networks comprises the following steps:

7. The price tag recognition method based on multi-task learning according to claim 4, characterized in that the step of optimizing the feature vectors computed by the LSTM networks with an Attention mechanism comprises the following steps:

Normalizing the context attention parameter of the Attention mechanism with softmax, to obtain a weight parameter;

Weighting the feature vectors computed by the LSTM networks with the weight parameter, to obtain a new feature vector.

8. A price tag recognition system based on multi-task learning, characterized in that it comprises a training module and a recognition module, wherein

The training module comprises:

A labeling unit, configured to label the integer part and the fractional part of the price information based on a preset data format, according to the price information on the price tag, to obtain price data;

A training unit, configured to input the augmented price data into a preset neural network model for training, to obtain a recognition model;

The recognition module comprises:

A second detection unit, configured to detect the position of the price tag in an image to be recognized by means of an object detection model, to obtain price tag image data;

A preprocessing unit, configured to preprocess the price tag image data;

A recognition unit, configured to recognize the preprocessed price tag image data by means of the recognition model, to obtain an integer part recognition result and a fractional part recognition result of the price information.

9. A price tag recognition system based on multi-task learning, characterized in that it comprises:

At least one processor;

At least one memory, configured to store at least one program;

When the at least one program is executed by the at least one processor, the at least one processor is caused to implement the price tag recognition method based on multi-task learning according to any one of claims 1-7.

10. A storage medium storing processor-executable instructions, characterized in that the processor-executable instructions, when executed by a processor, are used to perform the price tag recognition method based on multi-task learning according to any one of claims 1-7.