CN105893968B - End-to-end text-independent writer identification method based on deep learning - Google Patents

End-to-end text-independent writer identification method based on deep learning

Info

Publication number
CN105893968B
CN105893968B (application number CN201610202734.7A)
Authority
CN
China
Prior art keywords
character
text
pseudo
sample
stroke
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610202734.7A
Other languages
Chinese (zh)
Other versions
CN105893968A (en)
Inventor
Jin Lianwen (金连文)
Yang Weixin (杨维信)
Liu Manfei (刘曼飞)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Sign Digital Technology Co ltd
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN201610202734.7A priority Critical patent/CN105893968B/en
Publication of CN105893968A publication Critical patent/CN105893968A/en
Application granted granted Critical
Publication of CN105893968B publication Critical patent/CN105893968B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/32Digital ink
    • G06V30/333Preprocessing; Feature extraction
    • G06V30/347Sampling; Contour coding; Stroke extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention provides an end-to-end text-independent writer identification method based on deep learning, with the following steps: A, preprocess the online handwritten text and generate pseudo-character samples; B, compute the path integral feature images of the pseudo-character samples; C, train a deep neural network model on samples of known writers; D, use the deep neural network model of step C to automatically identify samples whose writer is undetermined. The method processes online text lines automatically, requires no manual extraction of character features, and efficiently realizes text-independent online writer identification.

Description

End-to-end text-independent writer identification method based on deep learning
Technical field
The invention belongs to the field of deep learning and artificial intelligence, and more particularly relates to a technique that learns features from text-independent online handwritten documents, entered for example on a handwriting tablet or touch device, in order to distinguish their writers.
Background art
Handwriting is an important piece of evidence in identity authentication, forensic examination and the like. In recent years, the popularity of mobile terminals and touch tablets has raised the attention paid to online handwriting writer verification. However, because the data are insufficient and no particularly good features of handwriting samples have been found, writer identification accuracy still needs to be improved. Previous recognition methods are all based on pattern recognition and related computer image processing techniques: feature descriptions and distance computations are designed by hand and character features are extracted manually, so the identification accuracy is not high.
Summary of the invention
In order to solve the technical problems existing in the prior art, the present invention provides an end-to-end text-independent writer identification method based on deep learning. The method processes online text lines automatically, requires no manual extraction of character features, and efficiently realizes text-independent online writer identification.
The present invention is realized by the following technical scheme: an end-to-end text-independent writer identification method based on deep learning, characterized by comprising the following steps: A, preprocess the online handwritten text and generate pseudo-character samples; B, compute the path integral feature images of the pseudo-character samples; C, train a deep neural network model on samples of known writers; D, use the deep neural network model of step C to automatically identify samples whose writer is undetermined.
Preferably, step A is specifically: A1, resampling each stroke of the online handwritten text to obtain online handwritten text segments with uniform sampling-point density;
A2, performing stroke segmentation on the resampled online handwritten text to obtain a path set composed of shorter, finer stroke segments;
A3, performing character segmentation on the stroke-segmented text to generate pseudo-characters;
A4, randomly removing stroke segments from each segmented pseudo-character;
A5, normalizing the size of the pseudo-characters;
A6, applying affine transformations to generate pseudo-character samples.
Preferably, in step A1, resampling computes the number of trajectory points of the new sample from a preset trajectory-point density parameter and the original number of strokes;
In step A2, stroke segmentation first performs corner detection and then cuts the stroke at the corner points, producing new, shorter stroke segments;
In step A3, character segmentation takes out stroke segments one by one in order; as soon as the width of the character assembled from the extracted stroke segments exceeds the average character height, the last stroke segment taken is treated as the beginning of a new character;
In step A4, the number of pseudo-characters obtained by randomly removing stroke segments from each character is a function of the total number of stroke segments si;
In step A5, normalization scales the figure obtained by mapping each pseudo-character onto a two-dimensional plane: whichever of the width and height is longer, the longest side is transformed to a fixed target length and, keeping the aspect ratio unchanged, the short side is multiplied by the corresponding scaling factor;
In step A6, the affine transformations include rotation, stretching and slanting of the entire character path.
Preferably, step B is specifically: B1, computing a group of path integral features for each pseudo-character sample;
B2, reassembling each group of path integral features into different path integral feature maps according to features of the same dimension;
B3, padding the margins of the path integral feature maps with blank pixels.
Preferably, when step B1 computes the path integral features, it is assumed that a finite-length stroke segment P is a path in two-dimensional space, P: [0, T] → R², whose trajectory times satisfy 0 < τ1 < ... < τk < T, where τi denotes the i-th intermediate time point and the positive integer i satisfies 1 ≤ i ≤ k; the k-th order path integral feature of P over the time interval [0, T] is then P^(k)_{0,T} = ∫_{0<τ1<...<τk<T} dP_{τ1} ⊗ ... ⊗ dP_{τk}; when P is a straight line, with Δ_{0,T} denoting the path displacement, P^(k)_{0,T} = (Δ_{0,T})^{⊗k}/k!, and a general path can be obtained by piecewise computation; computing the n-th order path integral feature means truncating the path integral features at order n, and the resulting feature set is S_n(P) = (1, P^(1)_{0,T}, ..., P^(n)_{0,T}), giving a path integral feature of 2^(n+1) dimensions;
In step B2, each dimension of the path integral feature is turned into a separate path integral feature map, so each pseudo-character sample has 2^(n+1) path integral feature maps, including the two-dimensional image of the path itself;
In step B3, each path integral feature map is first set to a size of z × z pixels and placed at the center of an image of size Z × Z, with z < Z ≤ 3z; these path integral feature maps are then input into the deep neural network model described in step C for training.
Preferably, step C is specifically:
C1, designing the number of layers of the deep neural network, the number of convolution kernels of the convolutional layers and the number of neurons of the fully connected layers;
C2, extracting feature images from all samples of the training set as the input of the deep neural network model for training;
C3, when the accuracy of the network on the training set converges and no longer rises, stopping training and saving the deep neural network model parameters.
Preferably, in step C1, the deep neural network comprises five convolutional layers, each followed by a max-pooling layer;
In step C2, training the deep neural network comprises repeated iterations of two steps, forward and backward: the network error is first obtained with the feed-forward pass, and the network parameters are then updated with the back-propagation algorithm, iteratively optimizing the network parameters.
Preferably, step D is specifically: D1, inputting the preprocessed path integral feature images into the deep neural network model and computing the candidate-item probability table corresponding to each pseudo-character sample obtained after random stroke removal;
D2, adding up the candidate-item probability tables of the multiple pseudo-character samples of each character and averaging them to obtain the candidate-item probability table of that character; adding up the candidate-item probability tables of all characters of the text and averaging them to obtain the candidate-item probability table of the text;
D3, selecting the candidate with the highest score according to the candidate-item probability table of the text and determining it as the writer.
Preferably, in step D1, each text is segmented into multiple characters and each pseudo-character generates multiple pseudo-character samples; these pseudo-character samples carry the same label, and a candidate-item probability table is computed for each of them as an independent sample;
In step D2, the probability averaging includes averaging over the multiple characters of a text and averaging over the multiple pseudo-character samples of a character: each dimension of the candidate-item probability tables of all pseudo-character samples of a character is summed and averaged to obtain the candidate-item average probability table of that character, and each dimension of the candidate-item average probability tables of all characters of a text segment is summed and averaged to obtain the candidate-item probability table of that text;
In step D3, the candidate with the highest score is selected according to the averaged candidate-item probability table of the characters of the text and determined as the writer of the text.
As can be seen from the above technical scheme, the end-to-end text-independent writer identification method based on deep learning of the present invention mainly comprises a preprocessing procedure for the online handwritten text, a deep neural network model training procedure, and an automatic identification procedure. The stroke-segment splitting method used in preprocessing, the random stroke-segment removal method, and the first use of path integral features for writer identification are the key innovations of the invention. Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The preprocessing includes text segmentation, sample augmentation and enhancement of the text-independent generalization ability; the preprocessing makes the invention applicable to texts of various lengths: long texts, short texts, and even single characters.
2. Removing stroke segments generates abundant training samples, which prevents over-fitting when training the deep neural network; it is also used at test time to generate multiple pseudo-characters for identification, improving the recognition rate.
3. The invention is the first to propose path integral features for the writer identification task and the first to use a deep convolutional neural network for writer identification. The path integral features extract effective features usable for writer identification; aided by the feature learning of the deep neural network, the recognition rate reaches 95.72% (Chinese) and 98.51% (English). The deep neural network can identify handwriting samples of different lengths from a writer, with high accuracy and robustness.
Detailed description of the invention
Fig. 1 is the flow chart of the invention;
Fig. 2 is a schematic diagram of randomly dropping stroke segments in the invention; 30 pseudo-characters of one character after random stroke-segment dropping are chosen as the example;
Fig. 3 shows the visualization of the path integral features in the invention.
Specific embodiment
The present invention will now be described in further detail with reference to the embodiments and the accompanying drawings, but the specific embodiments of the invention are not limited thereto.
Embodiment
The present invention mainly addresses the identification of the writer of online text and its concrete implementation. Using preprocessing that splits the online text and randomly removes stroke segments, a completely text-independent end-to-end writer identification method based on deep learning is established. The invention places no restriction on the text, nor on the character types the user inputs, allowing the user to write text freely to the greatest possible extent; the overall flow is shown in Fig. 1.
Referring to Fig. 1, the invention comprises the following four processes: A, preprocessing of the online handwritten text; B, computation of the path integral features; C, training of the deep neural network model on samples of known writers; D, automatic identification of samples of undetermined writers. Specifically, the text lines of the online handwritten long text are first resampled so that the spacing of the sampling points becomes uniform, the resampled text lines are then split into a set of smaller stroke segments, and these stroke segments are combined into single characters by comparing width with height. The stroke segments of each character are then randomly removed to generate multiple pseudo-characters. Each pseudo-character is subsequently affine-transformed, a group of path integral feature maps is computed for it, and zeros are padded around them. The path integral feature maps of the pseudo-character samples of the training set are input into the deep neural network, the deep network model is trained until it is close to saturation, and the trained network parameters are saved. At test time, the online handwritten long texts of the test set are preprocessed as described above and input into the saved deep network model for computation; the candidate probability table of each pseudo-character sample is output, and the probability table of each character is then computed. The probability tables belonging to the same text fragment are then summed candidate by candidate to obtain the final probability table, and the candidate with the largest probability value is selected according to that table and determined as the writer. The labels of the test items of this system must appear in the training set.
Each key step of the invention is described in detail below:
Step A: data preprocessing
The purpose of the data preprocessing of step A is to split the online handwritten text-line data entered by the user into a usable format and to extract some features, helping the deep neural network learn and process features better, with a good auxiliary effect on both efficiency and identification accuracy. The samples are resampled by linear interpolation, and corner points are detected from the local curvature. The stroke segments obtained after splitting are combined into characters, and the stroke segments inside each character are then randomly removed, yielding a large number of pseudo-characters; these pseudo-characters are made more diverse through size normalization and affine transformation, producing the pseudo-character samples.
A1. Sample resampling
Resampling computes the number of trajectory points of the new sample from a preset trajectory-point density parameter and the original number of strokes; the total length of a stroke is computed from the original trajectory points and divided by the number of trajectory points of the new sample to obtain the point density, which in turn determines whether each original trajectory point is kept and how many points must be interpolated between each pair of points, and thus the coordinates of the trajectory points of the new sample.
Suppose a stroke has p sampling points of equal time spacing {(x1, y1), ..., (xp, yp)}. Because the writing speed varies, the Euclidean distances between these points differ. For an integer i with 1 ≤ i ≤ p, suppose the Euclidean distance between (xi, yi) and (xi+1, yi+1) is di, so that the path length from the starting point of the stroke to (xi, yi) is the sum of the preceding dj. Suppose the total number of sampling points of the pseudo-sample after interpolation is l, where l is an integral multiple of p. The first point of each stroke remains unchanged after interpolation; starting from the second point, the position coordinates of the i-th point are:
(xi×α+xi+1×(1-α), yi×α+yi+1×(1-α)),  (1)
where α ∈ [0, 1] is the interpolation coefficient given by formula (2).
The set of points after resampling still belongs to the original stroke.
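For illustration only, the following Python sketch resamples one stroke to l points by linear interpolation at equal arc-length intervals, in the spirit of step A1; the function name, the equal-spacing choice and the example values are assumptions, not part of the patent disclosure.

```python
import numpy as np

def resample_stroke(points, l):
    """Resample one stroke (sequence of (x, y) points) to l points by
    linear interpolation at equal arc-length intervals (a sketch of the
    resampling described in step A1; the details are assumptions)."""
    pts = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # d_i between consecutive points
    cum = np.concatenate([[0.0], np.cumsum(seg)])        # path length up to each point
    total = cum[-1]
    if total == 0:                                       # degenerate stroke (a dot)
        return np.repeat(pts[:1], l, axis=0)
    targets = np.linspace(0.0, total, l)                 # equally spaced arc lengths
    new_x = np.interp(targets, cum, pts[:, 0])
    new_y = np.interp(targets, cum, pts[:, 1])
    return np.stack([new_x, new_y], axis=1)

# Example: resample a 4-point stroke to 12 points
stroke = [(0, 0), (1, 0), (1, 2), (3, 2)]
print(resample_stroke(stroke, 12).shape)   # (12, 2)
```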
A2. Stroke-segment splitting
Stroke splitting first performs corner detection and then cuts the stroke at the corner points, producing new, shorter stroke segments. To decide whether a point is a corner point, its curvature is computed from the coordinates of the points before and after it, and the points of locally maximal curvature are regarded as corner points. Suppose (xi, yi) is the i-th trajectory coordinate point after interpolation; its curvature is computed from the coordinate values of the k-th points before and after it, (xi-k, yi-k) and (xi+k, yi+k).
Stroke-segment splitting first identifies the stroke endpoints from the stored data: the first coordinate of each continuously written stroke in the online handwritten long-text file is taken by default as the starting point of a stroke, i.e. an endpoint. In corner detection, the criterion for a corner point is a local maximum of curvature, computed from the coordinates of the points before and after each point: with (xi, yi) a trajectory coordinate point after interpolation and (xi-k, yi-k), (xi+k, yi+k) the k-th points before and after it, the curvature is defined as:
β = max(|xi+k+xi-k-2xi|, |yi+k+yi-k-2yi|)/2k,  (3)
The stroke is then cut at the corner points, producing new, shorter stroke segments. For the training data, if the height of each character is ymax-ymin, the average character height yaver of a document is estimated and used for character segmentation. For the test data, every character of the text is traversed, the maximum and minimum of the local vertical coordinate, ymax and ymin, are obtained, and the average character height yaver of the text is estimated from them.
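A minimal Python sketch of corner detection with the curvature of formula (3), and of cutting a stroke at the detected corners, is given below; the neighborhood size k, the threshold and the local-maximum rule are assumptions chosen for illustration.

```python
import numpy as np

def corner_indices(points, k=2, beta_thresh=0.5):
    """Corner detection sketch for step A2: the curvature of formula (3),
    beta = max(|x[i+k]+x[i-k]-2x[i]|, |y[i+k]+y[i-k]-2y[i]|) / (2k),
    is computed for every interior point, and local maxima above a
    threshold are reported as corners (k and beta_thresh are assumed)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    beta = np.zeros(n)
    for i in range(k, n - k):
        dx = abs(pts[i + k, 0] + pts[i - k, 0] - 2 * pts[i, 0])
        dy = abs(pts[i + k, 1] + pts[i - k, 1] - 2 * pts[i, 1])
        beta[i] = max(dx, dy) / (2 * k)
    return [i for i in range(k, n - k)
            if beta[i] >= beta_thresh
            and beta[i] == beta[i - 1:i + 2].max()]       # local maximum

def split_at_corners(points, corners):
    """Cut one stroke into shorter stroke segments at the corner points."""
    pts = np.asarray(points, dtype=float)
    cut = [0] + list(corners) + [len(pts) - 1]
    return [pts[cut[j]:cut[j + 1] + 1] for j in range(len(cut) - 1)]
```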
A3. Character segmentation and pseudo-character generation
Character segmentation serves to obtain characters of a fixed aspect ratio. Stroke segments are taken out one by one in order, and the maximum xmax and minimum xmin of their horizontal coordinates are recorded. As soon as the width of the character assembled from the extracted stroke segments exceeds the average character height yaver, the last stroke segment is treated as the beginning of a new character; the average character height is computed when the stroke segments are cut.
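The greedy grouping described above can be sketched as follows; the function name and the exact bookkeeping are assumptions made for illustration.

```python
def segment_characters(stroke_segments, y_aver):
    """Greedy character segmentation sketch for step A3: stroke segments are
    taken in writing order; once the accumulated width exceeds the average
    character height y_aver, the last segment starts a new pseudo-character."""
    characters, current = [], []
    x_min = x_max = None
    for seg in stroke_segments:
        xs = [p[0] for p in seg]
        new_min = min(xs) if x_min is None else min(x_min, min(xs))
        new_max = max(xs) if x_max is None else max(x_max, max(xs))
        if current and (new_max - new_min) > y_aver:
            characters.append(current)                   # close the current character
            current, x_min, x_max = [seg], min(xs), max(xs)
        else:
            current.append(seg)
            x_min, x_max = new_min, new_max
    if current:
        characters.append(current)
    return characters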
A4. Random stroke-segment removal
Suppose a character has m strokes in total and the i-th stroke consists of si stroke segments; the number of pseudo-character samples obtained by random stroke-segment removal is then a function of the si and of m. If di stroke segments (0 ≤ di < si) are removed at random and the remaining stroke segments are reassembled in their original order into pseudo-character samples, the total number of pseudo-character samples obtained is given by formula (4).
A schematic diagram is shown in Fig. 2.
A5. Size normalization
Normalization scales the figure obtained by mapping each pseudo-character onto the two-dimensional plane: whichever of the width and height is longer, the longest side is transformed to a fixed target length and, keeping the aspect ratio unchanged, the short side is multiplied by the corresponding scaling factor.
Size normalization first traverses the coordinates (xi, yi) of the path points of a character and finds the width w = xmax-xmin and the height h = ymax-ymin; the long side max(w, h) is then stretched to the fixed value Q and the short side is scaled by the corresponding factor, yielding the normalized path point coordinates given by formula (5).
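A small sketch of the size normalization of step A5 follows, assuming a target length q for the longest side (the patent's fixed value Q is not specified here).

```python
import numpy as np

def normalize_size(points, q=52.0):
    """Size normalization sketch for step A5: the longest side of the
    character's bounding box is scaled to the fixed length q while the
    aspect ratio is preserved; q is an assumed value."""
    pts = np.asarray(points, dtype=float)
    mins = pts.min(axis=0)
    span = pts.max(axis=0) - mins
    scale = q / max(span.max(), 1e-6)   # one factor for both axes keeps the aspect ratio
    return (pts - mins) * scale
```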
A6. Affine transformation
The affine transformations include rotation, stretching and slanting of the entire character path. The rotation angle depends on a rotation factor w of random magnitude within a certain interval; the coordinates of a rotated point are:
(xi×cos(w)+yi×sin(w), -xi×sin(w)+yi×cos(w)),  (6)
Stretching applies a linear transformation to the horizontal or vertical coordinates of the path points; the stretching coefficients are set to α and β, with (α, β) ∈ [-1, 1], and the coordinate (xi, yi) after the stretching transformation is:
(xi×(1+α), yi×(1+β)),  (7)
The slant variation includes slanting in the horizontal direction and in the vertical direction. The coordinate (xi, yi) after slanting in the horizontal direction is:
(xi×(1+αx), yi),  (8)
The coordinate (xi, yi) after slanting in the vertical direction is:
(xi, yi×(1+αy)),  (9)
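The transformations of formulas (6) to (9) can be sketched in Python as follows; the concrete factor values are illustrative assumptions.

```python
import numpy as np

def augment(points, w=0.1, alpha=0.05, beta=-0.05, ax=0.05, ay=0.05):
    """Affine augmentation sketch for step A6 (rotation, stretching and
    slanting of the whole character path, formulas (6)-(9)); the factor
    values are assumptions chosen for illustration."""
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # rotation by the factor w, formula (6)
    x, y = x * np.cos(w) + y * np.sin(w), -x * np.sin(w) + y * np.cos(w)
    # stretching with coefficients alpha, beta in [-1, 1], formula (7)
    x, y = x * (1 + alpha), y * (1 + beta)
    # horizontal and vertical slant variations, formulas (8) and (9)
    x = x * (1 + ax)
    y = y * (1 + ay)
    return np.stack([x, y], axis=1)
```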
Step B: computing the path integral feature maps
B1. Computing the path integral features
The path integral features are computed with the path integral (path signature) method. Suppose a finite-length stroke segment P is a path in two-dimensional space, P: [0, T] → R², that τi denotes the i-th intermediate time point with the positive integer i satisfying 1 ≤ i ≤ k, and that the trajectory times satisfy 0 < τ1 < ... < τk < T; the k-th order path integral feature of P is then the iterated integral
P^(k)_{0,T} = ∫_{0<τ1<...<τk<T} dP_{τ1} ⊗ ... ⊗ dP_{τk}.
When P is a straight line, with Δ_{0,T} denoting the path displacement, this reduces to P^(k)_{0,T} = (Δ_{0,T})^{⊗k}/k!, and a general path is computed piecewise over its straight segments.
Computing the n-th order path integral feature means truncating the path integral features at order n; the resulting feature set is expressed as S_n(P) = (1, P^(1)_{0,T}, P^(2)_{0,T}, ..., P^(n)_{0,T}).
The dimension of the obtained path integral feature, including the path itself, is 2^(n+1).
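For illustration, a self-contained sketch of computing the truncated path integral (signature) feature of a piecewise-linear 2-D path is given below, using the straight-segment rule Δ^{⊗k}/k! and Chen's identity to combine segments; the implementation choices are assumptions, not the patent's code.

```python
import numpy as np

def segment_signature(delta, n):
    """Signature terms of one straight segment with displacement `delta`:
    level k is delta^{tensor k} / k! (the straight-line rule above)."""
    levels, term = [], np.array(1.0)
    for k in range(1, n + 1):
        term = np.tensordot(term, delta, axes=0) / k
        levels.append(term)
    return levels

def chen_product(a, b, n):
    """Chen's identity: combine the truncated signatures of two
    concatenated paths (level 0 is the implicit constant 1)."""
    out = []
    for k in range(1, n + 1):
        term = a[k - 1] + b[k - 1]
        for j in range(1, k):
            term = term + np.tensordot(a[j - 1], b[k - j - 1], axes=0)
        out.append(term)
    return out

def path_signature(points, n=3):
    """Sketch of the n-th order path integral (signature) feature of a
    piecewise-linear 2-D path, flattened to its 2^(n+1)-2 non-trivial
    components."""
    pts = np.asarray(points, dtype=float)
    sig = segment_signature(pts[1] - pts[0], n)
    for i in range(1, len(pts) - 1):
        sig = chen_product(sig, segment_signature(pts[i + 1] - pts[i], n), n)
    return np.concatenate([lvl.ravel() for lvl in sig])

print(path_signature([(0, 0), (1, 0), (1, 1)], n=3).shape)   # (14,) = 2 + 4 + 8
```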
B2. Generating the path integral feature maps
The multidimensional path integral features generated in step B1 are turned into images, each dimension corresponding to one path integral feature map. Excluding the image of the path itself, the number of feature maps obtained for each pseudo-character sample is 2^(n+1)-1. A schematic diagram of the generated path integral feature maps is shown in Fig. 3.
B3. Margin pixel padding
To keep the path integral feature information at the image margins from being lost through the convolution operations, the input-layer images are padded with blank pixels. Concretely, each path integral feature map is first converted to a size of 54 × 54 pixels and then surrounded by 21 layers of blank pixels, yielding a 96 × 96 image.
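The margin padding of step B3 amounts to centering the 54 × 54 map in a 96 × 96 blank image, for example:

```python
import numpy as np

def pad_feature_map(img, inner=54, outer=96):
    """Padding sketch for step B3: the 54x54 feature map is placed at the
    center of a 96x96 blank image (21 blank pixels on every side)."""
    canvas = np.zeros((outer, outer), dtype=img.dtype)
    off = (outer - inner) // 2
    canvas[off:off + inner, off:off + inner] = img
    return canvas

print(pad_feature_map(np.ones((54, 54))).shape)   # (96, 96)
```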
Step C: training the deep neural network
C1. Designing the deep neural network model
In the invention, the deep neural network consists of convolutional layers and max-pooling layers. Its structure comprises five convolutional layers, each followed by a max-pooling layer (MP); the kernel size of the first convolutional layer is 3 × 3 (denoted C3) and that of the following four layers is 2 × 2 (denoted C2); the pooling stride is 2. Finally there are two fully connected layers, with 480 and 512 neurons respectively. The whole network structure is written uniformly as:
M×96×96 Input-80C3-MP2-160C2-MP2-240C2-MP2-320C2-MP2-400C2-MP2-480FC-512FC-Output,
where M denotes the number of channels of the input layer, equal to the number of path integral feature maps of each pseudo-character sample.
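As a sketch of the architecture string above, a PyTorch version is given below; the patent names no framework, and the activation functions, padding and the flattened size are assumptions.

```python
import torch
import torch.nn as nn

class WriterNet(nn.Module):
    """PyTorch sketch of the architecture described in C1
    (M x 96 x 96 Input-80C3-MP2-160C2-MP2-240C2-MP2-320C2-MP2-400C2-MP2-480FC-512FC-Output).
    Activations and the exact flattened size are assumptions."""
    def __init__(self, in_channels, num_writers):
        super().__init__()
        chans = [in_channels, 80, 160, 240, 320, 400]
        layers = []
        for i in range(5):
            k = 3 if i == 0 else 2                        # C3 for the first layer, C2 afterwards
            layers += [nn.Conv2d(chans[i], chans[i + 1], kernel_size=k),
                       nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]                   # MP2 after every convolutional layer
        self.features = nn.Sequential(*layers)
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(400 * 2 * 2, 480), nn.ReLU(inplace=True),
            nn.Linear(480, 512), nn.ReLU(inplace=True),
            nn.Linear(512, num_writers))                  # one output per candidate writer

    def forward(self, x):
        return self.classifier(self.features(x))

# Example: M = 16 feature maps per pseudo-character, 187 candidate writers (assumed numbers)
net = WriterNet(in_channels=16, num_writers=187)
print(net(torch.randn(2, 16, 96, 96)).shape)    # torch.Size([2, 187])
```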
C2. Training the deep neural network
The data of the training set are used to train the deep neural network; training is carried out as a classification problem. Training the deep neural network consists of repeated iterations of two steps, forward and backward: the network error is first obtained with the feed-forward pass, the network parameters are then updated with the back-propagation algorithm, and the network parameters are optimized iteratively; the classification accuracy on the training data is tested after every optimization step.
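A possible training loop for step C2 is sketched below; the optimizer, learning rate and epoch count are assumptions, since the patent only specifies forward and backward iterations.

```python
import torch
import torch.nn as nn

def train(net, loader, epochs=30, lr=1e-3):
    """Training-loop sketch for step C2 (forward pass, back-propagation,
    repeated iteration); optimizer, learning rate and epoch count are assumed."""
    opt = torch.optim.SGD(net.parameters(), lr=lr, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        correct = total = 0
        for feature_maps, writer_ids in loader:
            opt.zero_grad()
            logits = net(feature_maps)            # forward pass
            loss = loss_fn(logits, writer_ids)    # network error
            loss.backward()                       # back-propagation
            opt.step()                            # parameter update
            correct += (logits.argmax(1) == writer_ids).sum().item()
            total += writer_ids.numel()
        print(f"epoch {epoch}: training accuracy {correct / total:.4f}")
```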
C3. Saving the deep neural network model parameters
The accuracy on the training data rises with oscillations. When the accuracy on the training data almost stops rising, training is considered close to saturation; the model parameter file is then saved for testing.
Step D: automatic writer identification
D1. Computing the candidate probabilities
Tens to thousands of pseudo-character samples can be generated from each long text, and the writer of the text is the label of every one of its pseudo-character samples. Each pseudo-character sample yields 2^(n+1) path integral feature maps, and 2^(n+1) is also the number of channels of the input layer. The input-layer images are fed into the deep neural network model saved in step C3 for a forward computation, giving the output of the deep neural network, i.e. the candidate probabilities of the ij-th pseudo-character sample generated from the i-th character of the long text.
D2. Probability averaging
Probability averaging includes averaging over the multiple characters of a text and averaging over the multiple pseudo-character samples of a character. Suppose there are η classes of writers of long texts, each long text can generate R characters, and each character can generate N pseudo-character samples; each pseudo-character sample yields a candidate-item probability column, and the pseudo-sample probability average of the i-th character is the mean of the candidate-item probability columns of its N pseudo-character samples.
The character-candidate average probability table of each text segment is the mean of the candidate-item average probability tables of its R characters, given by formula (14).
The candidate with the largest probability value in formula (14) is denoted λ.
D3. Writer determination
The candidate with the highest score in the candidate-item probability table is determined as the writer; following step D2, the classification result of the long text is the candidate class λ.
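Steps D1 to D3 reduce to averaging the candidate probability tables per character and then per text and taking the arg-max, for example:

```python
import numpy as np

def identify_writer(sample_probs_per_char):
    """Decision sketch for steps D1-D3: `sample_probs_per_char` holds one
    entry per character of the text, each an (N_i, n_writers) array of
    candidate probability tables of that character's pseudo-character
    samples.  Probabilities are averaged per character, then over all
    characters, and the highest-scoring candidate is returned."""
    char_tables = [np.mean(p, axis=0) for p in sample_probs_per_char]  # average over pseudo-samples
    text_table = np.mean(char_tables, axis=0)                          # average over characters
    return int(np.argmax(text_table)), text_table

# Example with 2 characters, 3 pseudo-samples each, 4 candidate writers (assumed sizes)
rng = np.random.default_rng(0)
probs = [rng.dirichlet(np.ones(4), size=3) for _ in range(2)]
writer, table = identify_writer(probs)
print(writer, table.round(3))
```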
The embodiments of the present invention are not limited to the example above; any other changes, modifications, substitutions, combinations and simplifications made without departing from the spirit and principle of the present invention are equivalent replacements and fall within the protection scope of the present invention.

Claims (9)

1. An end-to-end text-independent writer identification method based on deep learning, characterized by comprising the following steps:
A, preprocessing the online handwritten text and generating pseudo-character samples;
B, computing the path integral feature images of the pseudo-character samples;
C, training a deep neural network model on samples of known writers;
D, using the deep neural network model of step C to carry out automatic identification of samples whose writer is undetermined;
step A being specifically:
A1, resampling each stroke of the online handwritten text to obtain online handwritten text segments with uniform sampling-point density;
A2, performing stroke segmentation on the resampled online handwritten text to obtain a path set composed of shorter, finer stroke segments;
A3, performing character segmentation on the stroke-segmented text to generate pseudo-characters;
A4, randomly removing stroke segments from each segmented pseudo-character;
A5, normalizing the size of the pseudo-characters;
A6, applying affine transformations to generate pseudo-character samples.
2. The writer identification method according to claim 1, characterized in that:
in step A1, resampling computes the number of trajectory points of the new sample from a preset trajectory-point density parameter and the original number of strokes;
in step A2, stroke segmentation first performs corner detection and then cuts the stroke at the corner points, producing new, shorter stroke segments;
in step A3, character segmentation takes out stroke segments one by one in order; as soon as the width of the character assembled from the extracted stroke segments exceeds the average character height, the last stroke segment taken is treated as the beginning of a new character;
in step A4, the number of pseudo-characters obtained by randomly removing stroke segments from each character is a function of the total number of stroke segments si;
in step A5, normalization scales the figure obtained by mapping each pseudo-character onto a two-dimensional plane: whichever of the width and height is longer, the longest side is transformed to a fixed target length and, keeping the aspect ratio unchanged, the short side is multiplied by the corresponding scaling factor;
in step A6, the affine transformations include rotation, stretching and slanting of the entire character path.
3. The writer identification method according to claim 1, characterized in that step B is specifically:
B1, computing a group of path integral features for each pseudo-character sample;
B2, reassembling each group of path integral features into different path integral feature maps according to features of the same dimension;
B3, padding the margins of the path integral feature maps with blank pixels.
4. The writer identification method according to claim 3, characterized in that:
when step B1 computes the path integral features, it is assumed that a finite-length stroke segment P is a path in two-dimensional space, P: [0, T] → R², whose trajectory times satisfy 0 < τ1 < ... < τk < T, where τi denotes the i-th intermediate time point and the positive integer i satisfies 1 ≤ i ≤ k; the k-th order path integral feature of P over the time interval [0, T] is then P^(k)_{0,T} = ∫_{0<τ1<...<τk<T} dP_{τ1} ⊗ ... ⊗ dP_{τk}; when P is a straight line, with Δ_{0,T} denoting the path displacement, P^(k)_{0,T} = (Δ_{0,T})^{⊗k}/k! and a general path can be obtained by piecewise computation; computing the n-th order path integral feature means truncating the path integral features at order n, and the resulting feature set is S_n(P) = (1, P^(1)_{0,T}, ..., P^(n)_{0,T}), giving a path integral feature of 2^(n+1) dimensions;
in step B2, each dimension of the path integral feature is turned into a separate path integral feature map, so each pseudo-character sample has 2^(n+1) path integral feature maps, including the two-dimensional image of the path itself;
in step B3, each path integral feature map is first set to a size of z × z pixels and placed at the center of an image of size Z × Z, with z < Z ≤ 3z; these path integral feature maps are then input into the deep neural network model described in step C for training.
5. The writer identification method according to claim 1, characterized in that step C is specifically:
C1, designing the number of layers of the deep neural network, the number of convolution kernels of the convolutional layers and the number of neurons of the fully connected layers;
C2, extracting feature images from all samples of the training set as the input of the deep neural network model for training;
C3, when the accuracy of the network on the training set converges and no longer rises, stopping training and saving the deep neural network model parameters.
6. The writer identification method according to claim 5, characterized in that:
in step C1, the deep neural network comprises five convolutional layers, each followed by a max-pooling layer;
in step C2, training the deep neural network comprises repeated iterations of two steps, forward and backward: the network error is first obtained with the feed-forward pass, and the network parameters are then updated with the back-propagation algorithm, iteratively optimizing the network parameters.
7. The writer identification method according to claim 1, characterized in that step D is specifically:
D1, inputting the preprocessed path integral feature images into the deep neural network model and computing the candidate-item probability table corresponding to each pseudo-character sample obtained after random stroke removal;
D2, adding up the candidate-item probability tables of the multiple pseudo-character samples of each character and averaging them to obtain the candidate-item probability table of that character; adding up the candidate-item probability tables of all characters of the text and averaging them to obtain the candidate-item probability table of the text;
D3, selecting the candidate with the highest score according to the candidate-item probability table of the text and determining it as the writer.
8. The writer identification method according to claim 7, characterized in that:
in step D1, each text is segmented into multiple characters and each pseudo-character generates multiple pseudo-character samples; these pseudo-character samples carry the same label, and a candidate-item probability table is computed for each of them as an independent sample;
in step D2, the probability averaging includes averaging over the multiple characters of a text and averaging over the multiple pseudo-character samples of a character: each dimension of the candidate-item probability tables of all pseudo-character samples of a character is summed and averaged to obtain the candidate-item average probability table of that character, and each dimension of the candidate-item average probability tables of all characters of a text segment is summed and averaged to obtain the candidate-item probability table of that text;
in step D3, the candidate with the highest score is selected according to the averaged candidate-item probability table of the characters of the text and determined as the writer of the text.
9. The writer identification method according to claim 2, characterized in that, in the corner detection of step A2, whether a point is a corner point is decided by computing its curvature from the coordinates of the points before and after it, the points of locally maximal curvature being regarded as corner points; supposing (xi, yi) is a trajectory coordinate point after interpolation, the curvature is computed from the coordinate values of the k-th points before and after it, (xi-k, yi-k) and (xi+k, yi+k).
CN201610202734.7A 2016-03-31 2016-03-31 End-to-end text-independent writer identification method based on deep learning Active CN105893968B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610202734.7A CN105893968B (en) 2016-03-31 2016-03-31 End-to-end text-independent writer identification method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610202734.7A CN105893968B (en) 2016-03-31 2016-03-31 End-to-end text-independent writer identification method based on deep learning

Publications (2)

Publication Number Publication Date
CN105893968A CN105893968A (en) 2016-08-24
CN105893968B true CN105893968B (en) 2019-06-14

Family

ID=57013309

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610202734.7A Active CN105893968B (en) 2016-03-31 2016-03-31 End-to-end text-independent writer identification method based on deep learning

Country Status (1)

Country Link
CN (1) CN105893968B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106997473A (en) * 2016-09-08 2017-08-01 汪润春 A kind of image-recognizing method based on neutral net
CN106408038A (en) * 2016-09-09 2017-02-15 华南理工大学 Rotary Chinese character identifying method based on convolution neural network model
CN107292280A (en) * 2017-07-04 2017-10-24 盛世贞观(北京)科技有限公司 A kind of seal automatic font identification method and identifying device
CN108108746B (en) * 2017-09-13 2021-04-09 湖南理工学院 License plate character recognition method based on Caffe deep learning framework
CN107657230A (en) * 2017-09-27 2018-02-02 安徽硕威智能科技有限公司 A kind of bank self-help robot character recognition device
CN108154136B (en) * 2018-01-15 2022-04-05 众安信息技术服务有限公司 Method, apparatus and computer readable medium for recognizing handwriting
CN108596168B (en) * 2018-04-20 2020-11-20 京东数字科技控股有限公司 Method, apparatus and medium for recognizing characters in image
CN108665010B (en) * 2018-05-12 2022-01-04 新疆大学 Online handwriting Uygur language word data enhancement method
CN109740605A (en) * 2018-12-07 2019-05-10 天津大学 A kind of handwritten Chinese text recognition method based on CNN
CN109815809A (en) * 2018-12-19 2019-05-28 天津大学 A kind of English handwriting identification method based on CNN
CN109858488B (en) * 2018-12-28 2021-09-17 众安信息技术服务有限公司 Handwritten sample recognition method and system based on sample enhancement
US10846553B2 (en) * 2019-03-20 2020-11-24 Sap Se Recognizing typewritten and handwritten characters using end-to-end deep learning
CN111738167A (en) * 2020-06-24 2020-10-02 华南理工大学 Method for recognizing unconstrained handwritten text image
CN112989834B (en) * 2021-04-15 2021-08-20 杭州一知智能科技有限公司 Named entity identification method and system based on flat grid enhanced linear converter

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101730898A (en) * 2005-06-23 2010-06-09 微软公司 Adopt the handwriting recognition of neural network
CN104850837A (en) * 2015-05-18 2015-08-19 西南交通大学 Handwritten character recognition method
CN105320961A (en) * 2015-10-16 2016-02-10 重庆邮电大学 Handwriting numeral recognition method based on convolutional neural network and support vector machine
CN105354538A (en) * 2015-10-13 2016-02-24 广东小天才科技有限公司 Chinese character handwriting recognition method and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101730898A (en) * 2005-06-23 2010-06-09 微软公司 Adopt the handwriting recognition of neural network
CN104850837A (en) * 2015-05-18 2015-08-19 西南交通大学 Handwritten character recognition method
CN105354538A (en) * 2015-10-13 2016-02-24 广东小天才科技有限公司 Chinese character handwriting recognition method and system
CN105320961A (en) * 2015-10-16 2016-02-10 重庆邮电大学 Handwriting numeral recognition method based on convolutional neural network and support vector machine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Content-independent writer identification based on feed-forward neural networks; Zhou Linxia et al.; Journal of Nanchang Institute of Aeronautical Technology (Natural Science Edition); 2002-03-31; Vol. 16, No. 1; pp. 27-34

Also Published As

Publication number Publication date
CN105893968A (en) 2016-08-24

Similar Documents

Publication Publication Date Title
CN105893968B (en) End-to-end text-independent writer identification method based on deep learning
CN113378632B (en) Pseudo-label optimization-based unsupervised domain adaptive pedestrian re-identification method
CN110084239B (en) Method for reducing overfitting of network training during off-line handwritten mathematical formula recognition
Harouni et al. Online Persian/Arabic script classification without contextual information
CN107169485B (en) Mathematical formula identification method and device
CN107729865A (en) A kind of handwritten form mathematical formulae identified off-line method and system
CN107437100A (en) A kind of picture position Forecasting Methodology based on the association study of cross-module state
CN112651323B (en) Chinese handwriting recognition method and system based on text line detection
CN113920516B (en) Calligraphy character skeleton matching method and system based on twin neural network
Zarro et al. Recognition-based online Kurdish character recognition using hidden Markov model and harmony search
Luo et al. SLOGAN: handwriting style synthesis for arbitrary-length and out-of-vocabulary text
CN105023029B (en) A kind of on-line handwritten Tibetan language syllable recognition methods and device
CN111310820A (en) Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration
Saabni et al. Keyword searching for Arabic handwritten documents
US20190272447A1 (en) Machine learning artificial character generation
CN111144469B (en) End-to-end multi-sequence text recognition method based on multi-dimensional associated time sequence classification neural network
CN113903043B (en) Method for identifying printed Chinese character font based on twin metric model
CN115661123A (en) Industrial product surface defect position detection method based on weak supervision target detection
Nath et al. Improving various offline techniques used for handwritten character recognition: a review
Li et al. An annotation assistance system using an unsupervised codebook composed of handwritten graphical multi-stroke symbols
Ayeb et al. Deep Learning Architecture for Off-Line Recognition of Handwritten Math Symbols
Donmez et al. Concepture: a regular language based framework for recognizing gestures with varying and repetitive patterns
Jain Unconstrained Arabic & Urdu text recognition using deep CNN-RNN hybrid networks
Widiarti et al. Clustering Balinese Script Image in Palm Leaf Using Hierarchical K-Means Algorithm
Le et al. An Attention-Based Encoder–Decoder for Recognizing Japanese Historical Documents

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220706

Address after: 401120 no.17-1, building 13, No.106, west section of Jinkai Avenue, Yubei District, Chongqing

Patentee after: CHONGQING AOS ONLINE INFORMATION TECHNOLOGY CO.,LTD.

Address before: 510640 South China University of technology, 381 Wushan Road, Tianhe District, Guangzhou City, Guangdong Province

Patentee before: SOUTH CHINA University OF TECHNOLOGY

CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 401121 no.17-1, building 13, No.106, west section of Jinkai Avenue, Yubei District, Chongqing

Patentee after: Chongqing Sign Digital Technology Co.,Ltd.

Country or region after: China

Address before: 401120 no.17-1, building 13, No.106, west section of Jinkai Avenue, Yubei District, Chongqing

Patentee before: CHONGQING AOS ONLINE INFORMATION TECHNOLOGY CO.,LTD.

Country or region before: China