CN117152770A

CN117152770A - Handwriting input-oriented writing capability intelligent evaluation method and system

Info

Publication number: CN117152770A
Application number: CN202311081395.8A
Authority: CN
Inventors: 梁齐贺; 沈一; 刘川
Original assignee: Beijing Smart Spirit Technology Co ltd
Current assignee: Beijing Smart Spirit Technology Co ltd
Priority date: 2023-08-25
Filing date: 2023-08-25
Publication date: 2023-12-01

Abstract

The invention discloses an intelligent evaluation method and system for handwriting input-oriented writing ability. The method comprises the following steps: acquiring a handwritten text picture input by a user; performing text recognition on the handwritten text picture based on a preset text recognition model, so as to convert the handwritten text picture into a candidate text; performing text correction on the candidate text based on a preset text correction model, so as to correct the candidate text into a formal text; performing text analysis on the formal text based on a preset text analysis model, so as to output a text analysis result for the user; and intelligently evaluating the writing ability of the user according to the text analysis result. The method thus implements a complete evaluation flow covering both handwritten character recognition and text semantic analysis, tolerates problems such as careless writing and poor handwriting, and greatly improves the robustness and practicability of the whole evaluation method while maintaining accuracy.

Description

Handwriting input-oriented writing capability intelligent evaluation method and system

Technical Field

The invention relates to a handwriting input-oriented intelligent evaluation method for writing ability, and also relates to a corresponding intelligent evaluation system for writing ability, belonging to the technical field of data recognition.

Background

Writing ability assessment is an important item in the Mini-Mental State Examination (MMSE). The subject handwrites a Chinese sentence that must satisfy three conditions: (1) it has a subject; (2) it has a verb; (3) it is semantically fluent. If the patient can write a sentence that meets these three conditions, the patient is considered to have basic writing ability, which in turn is used to evaluate the patient's cognitive level. Evaluating a patient's writing ability therefore has important reference value for diagnosing the patient's condition.

Traditional writing ability assessment relies mainly on manual assessment by professionals, which takes a long time and is easily affected by subjective factors. To solve this problem, the Chinese patent application with publication number CN111651999A discloses an automatic text semantic analysis and evaluation system for writing ability detection, which mainly includes a module for inputting the corpus to be evaluated, a training corpus acquisition module, a corpus preprocessing module, a grammatical integrity judgment module, a semantic fluency analysis module and a database. The automatic evaluation system combines the completeness of grammatical components and the semantic fluency of the Chinese sentence written by the subject to judge whether the sentence is understandable, and thereby whether the subject has basic writing ability.

However, a complete handwriting evaluation scheme should include both handwriting recognition and semantic analysis. The technical solution of the above patent application is mainly focused on the latter, and completely ignores the handwriting recognition part, so that automatic intelligent evaluation of the writing ability of the user cannot be realized.

Disclosure of Invention

One object of the invention is to provide a handwriting input-oriented intelligent evaluation method for writing ability.

Another object of the invention is to provide a handwriting input-oriented intelligent evaluation system for writing ability.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

According to a first aspect of an embodiment of the present invention, there is provided a handwriting input-oriented intelligent evaluation method for writing ability, including the steps of:

acquiring a handwritten text picture input by a user;

performing text recognition on the handwritten text picture based on a preset text recognition model so as to convert the handwritten text picture into candidate texts;

performing text correction on the candidate text based on a preset text correction model so as to correct the candidate text into a formal text;

performing text analysis on the formal text based on a preset text analysis model to output a text analysis result aiming at the user;

and carrying out intelligent evaluation on the writing ability of the user according to the text analysis result.

Preferably, the text recognition process comprises:

performing image feature extraction on the handwritten text picture based on a preset text detection model, so as to extract text writing areas of variable size end to end;

performing direction recognition on the text writing area based on a preset text direction recognition model, so as to recognize the text direction of the text writing area; the text direction at least comprises up, down, left and right;

performing text recognition on the text writing area based on a preset text recognition model, and ordering the recognized text based on the recognized text direction, so as to output a correctly ordered candidate text;

the text detection model, the text direction recognition model and the text recognition model jointly form the overall text recognition model.

Preferably, performing image feature extraction on the handwritten text picture based on a preset text detection model, so as to extract text writing areas of variable size end to end, specifically includes:

normalizing the handwritten text picture through a first image transformation module;

performing, through a first network feature extraction module, feature extraction on the normalized handwritten text picture based on a convolutional recurrent neural network (CRNN), so as to extract a text writing area feature matrix;

performing feature enhancement on the text writing area feature matrix through a first feature enhancement module;

and outputting the feature-enhanced text writing area through a first network output module.

Preferably, performing direction recognition on the text writing area based on a preset text direction recognition model, so as to recognize the text direction of the text writing area, specifically includes:

normalizing the cropped picture of the text writing area through a second image transformation module;

performing, through a second network feature extraction module, feature extraction on the normalized cropped picture of the text writing area based on a convolutional recurrent neural network (CRNN), so as to extract a text writing direction feature matrix;

performing feature enhancement on the text writing direction feature matrix through a second feature enhancement module;

and outputting, through a second network output module, the text direction of the feature-enhanced cropped picture of the text writing area.

Preferably, performing text recognition on the text writing area based on a preset text recognition model, and ordering the recognized text based on the recognized text direction, so as to output a correctly ordered candidate text, specifically comprises:

normalizing, through a third image transformation module, the cropped picture whose text direction has been corrected;

performing, through a third network feature extraction module, feature extraction on the normalized, direction-corrected cropped picture based on a convolutional recurrent neural network (CRNN), so as to extract a variable-length character feature matrix;

performing feature enhancement on the extracted variable-length character feature matrix through a third feature enhancement module;

and outputting, through a third network output module, the characters correctly ordered according to the text direction, so as to form the correctly ordered candidate text.

Preferably, the third network output module selects the character with the highest probability as the recognition result, and uses a CTC loss function to optimize recognition of the text writing area, so as to output characters correctly ordered in the text direction.

Preferably, the text correction process specifically includes:

in the process of recognizing characters with the text recognition model, when an illegible character is encountered, giving a plurality of candidate characters according to confidence, so as to form a candidate character set;

performing, based on a pre-trained text correction model, cloze (masked-position) prediction on the illegible character in the candidate text, so as to predict a plurality of predicted characters at the masked position and form a predicted character set;

if the candidate character set and the predicted character set intersect, taking the intersection as the character recognition result;

if the candidate character set and the predicted character set do not intersect, selecting from the candidate character set a candidate character that makes the sentence fluent as the character recognition result; if no candidate character in the candidate character set makes the sentence fluent, selecting from the predicted character set a predicted character that makes the sentence fluent as the character recognition result;

and repeating the above process to correct all illegible characters in the candidate text, thereby correcting the candidate text into the formal text.

Preferably, the text analysis process specifically includes:

outputting the dependency relationship labels of the formal text according to a syntactic analysis model; the dependency relationship labels at least comprise the subject-verb relationship SBV, the verb-object relationship VOB, the preposition-object relationship POB, the head (core) relationship HED and the double-object relationship DOB;

outputting the part-of-speech tags of the formal text according to a part-of-speech tagging model; the part-of-speech tags at least comprise noun n, verb v, pronoun r, adjective a, adverb d and punctuation mark w;

judging, according to the dependency relationship labels, the part-of-speech tags and a preset rule, whether the formal text is a fluent sentence and meets the requirement of the scale item; if the judgment result is yes, the user scores 1 point for the corresponding item in the scale, and if the judgment result is no, the user scores 0 points for that item.

Preferably, the preset rule includes:

if the dependency relationship label of the formal text comprises any one combination of SBV+VOB, SBV+POB, SBV+DOB and SBV+HED, the part-of-speech label of SBV belongs to n or r, the part-of-speech label of VOB, POB, DOB belongs to n or r, and the part-of-speech label of HED belongs to n or r or v, the judgment result is yes; otherwise, the judgment result is no.

According to a second aspect of the embodiment of the present invention, there is provided a handwriting input-oriented writing ability intelligent evaluation system, including a processor and a memory, where the processor reads a computer program in the memory, and is configured to perform the following operations:

acquiring a handwritten text picture input by a user;

Compared with the prior art, the intelligent evaluation method and the intelligent evaluation system for the handwriting input-oriented writing capability provided by the invention not only comprise semantic analysis aiming at texts, but also comprise a handwriting character recognition technology. According to the invention, a plurality of deep learning models are introduced for evaluation, so that the problems of random writing, poor fonts and the like are solved, the compatibility is realized for various writing conditions, the accuracy is ensured, and the robustness and the practicability of the whole evaluation method are greatly improved.

Drawings

FIG. 1 is a flowchart of a handwriting input-oriented intelligent evaluation method for writing ability provided by an embodiment of the invention;

FIG. 2 is a schematic diagram of a deep learning model according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an ERNIE3.0 language model in accordance with an embodiment of the invention;

FIG. 4 is a block diagram of a handwriting input-oriented intelligent evaluation system for writing ability provided by an embodiment of the invention.

Detailed Description

The technical contents of the present invention will be described in detail with reference to the accompanying drawings and specific examples.

As shown in fig. 1, the method for intelligently evaluating writing ability for handwriting input provided by the embodiment of the invention specifically includes steps S1 to S5:

S1: Acquire the handwritten text picture input by the user on the electronic screen.

Specifically, when the user performs the writing ability evaluation, handwritten characters need to be input on the electronic screen through the stylus according to the scale requirements. After the input is completed, a handwritten text picture of the user is formed on the electronic screen for subsequent text recognition and analysis.

For example, in one embodiment of the invention, the user writes a sentence about going to the hospital to have an illness treated; when the input is completed, a handwritten text picture of that sentence is formed.

S2: Perform text recognition on the handwritten text picture based on a preset text recognition model, so as to convert the handwritten text picture into a candidate text.

After the user finishes inputting the handwritten characters, the handwritten text picture is recognized using the preset text recognition model. The overall text recognition model consists of a text detection model, a text direction recognition model and a text recognition model. The text detection model performs image feature extraction on the handwritten text picture, so as to extract text writing areas end to end; the text direction recognition model performs direction recognition on the text writing area, so as to recognize its text direction; the text recognition model recognizes the text in the text writing area and orders the recognized text based on the recognized text direction, so as to output a correctly ordered candidate text.

It will be appreciated that, in one embodiment of the invention, the text detection model, the text direction recognition model and the text recognition model are all built on the same deep learning model structure. As shown in FIG. 2, the deep learning model consists of four parts: an image transformation part, a network feature extraction part, a feature enhancement part and a network output part. That is, each of the three models consists of these four parts; the models differ only in the specific role that each part plays.

Specifically, the text recognition process includes steps S21 to S23:

S21: Text detection.

Specifically, this includes steps S211 to S214:

S211: Normalize the handwritten text picture through the first image transformation module, so as to eliminate interference from image resolution, text background color and the like.

S212: Perform, through the first network feature extraction module, feature extraction on the normalized handwritten text picture based on a convolutional recurrent neural network (CRNN), so as to extract a text writing area feature matrix. The first network feature extraction module adopts a convolutional recurrent neural network: the convolutional network extracts image features well and quickly, while the recurrent network is suited to sequence tasks such as natural language text processing. Combining the two works well for extracting text writing areas of variable length end to end.

S213: Perform feature enhancement on the text writing area feature matrix through the first feature enhancement module.

S214: Output the feature-enhanced text writing area through the first network output module.

S22: Text direction detection.

Specifically, this includes steps S221 to S224:

S221: Normalize the cropped picture of the text writing area through the second image transformation module.

S222: Perform, through the second network feature extraction module, feature extraction on the normalized cropped picture of the text writing area based on a convolutional recurrent neural network (CRNN), so as to extract a text writing direction feature matrix.

S223: Perform feature enhancement on the text writing direction feature matrix through the second feature enhancement module.

S224: Output, through the second network output module, the text direction of the feature-enhanced cropped picture of the text writing area. The second network output module outputs a direction classification: a fully connected layer reduces the features to 4 dimensions, which represent 4 classes, namely the four directions up, down, left and right.
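
The 4-way direction output described in step S224 can be illustrated with a minimal sketch. It assumes the second network output module yields four scores per cropped picture; the class order in `DIRECTIONS`, the meaning assigned to each direction, and the helper names `predict_direction`, `rotate_cw` and `to_upright` are hypothetical and are not specified by the source.

```python
from typing import List, Sequence

# Assumed class order for the 4-way output; the source does not specify one.
DIRECTIONS = ("up", "right", "down", "left")


def predict_direction(scores: Sequence[float]) -> str:
    """Return the direction whose score is highest (argmax over 4 classes)."""
    if len(scores) != len(DIRECTIONS):
        raise ValueError("expected exactly 4 scores")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return DIRECTIONS[best]


def rotate_cw(image: List[List[int]]) -> List[List[int]]:
    """Rotate a row-major 2-D image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def to_upright(image: List[List[int]], direction: str) -> List[List[int]]:
    """Rotate the crop so that its text reads upright.

    Assumption: `direction` names where the top of the text currently points,
    so "right" means the crop has been turned 90 degrees clockwise.
    """
    turns = {"up": 0, "right": 3, "down": 2, "left": 1}[direction]
    for _ in range(turns):
        image = rotate_cw(image)
    return image
```

Under these assumptions, a crop classified as "down" is turned by 180 degrees before it is passed on to character recognition.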

S23: Character recognition.

Specifically, this includes steps S231 to S234:

S231: Normalize, through the third image transformation module, the cropped picture whose text direction has been corrected.

S232: Perform, through the third network feature extraction module, feature extraction on the normalized, direction-corrected cropped picture based on a convolutional recurrent neural network (CRNN), so as to extract a variable-length character feature matrix.

S233: Perform feature enhancement on the extracted variable-length character feature matrix through the third feature enhancement module.

S234: Output, through the third network output module, the characters correctly ordered according to the text direction, so as to form the correctly ordered candidate text. In step S234, the third network output module uses a fully connected layer to raise the dimension from n×512 to n×5529, where n is the variable length of the character sequence and 5529 is the number of character classes (Chinese characters, punctuation marks and English characters), so that each position carries a predicted probability for every class. The character with the highest probability is selected as the recognition result, and a CTC (Connectionist Temporal Classification) loss function is used to optimize recognition of the text writing area, so as to output characters correctly ordered in the text direction.
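
Selecting the most probable character at each position, as step S234 describes, corresponds to what is commonly called greedy CTC decoding. The sketch below is illustrative only: the blank index `BLANK = 0`, the function name `ctc_greedy_decode` and the toy character set are assumptions, and merging repeated symbols and dropping blanks is the standard CTC convention rather than a detail stated in the source.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(probs: Sequence[Sequence[float]], charset: Sequence[str]) -> str:
    """Decode an n x C probability matrix into text.

    probs[t][k] is the predicted probability of class k at position t;
    charset[k] is the character of class k (charset[BLANK] is never emitted).
    """
    # 1. Pick the most probable class at every position.
    best_path = [max(range(len(row)), key=lambda k: row[k]) for row in probs]
    # 2. Merge consecutive repeats, then drop blanks.
    decoded: List[str] = []
    previous = None
    for k in best_path:
        if k != previous and k != BLANK:
            decoded.append(charset[k])
        previous = k
    return "".join(decoded)
```

A blank between two identical symbols keeps them as two separate characters, which is how a doubled character in the written sentence survives decoding.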

The CTC loss function optimizes recognition of the text writing area as follows.

Given the input $x$, the probability that the LSTM outputs the label sequence $l$ is:

$$p(l \mid x) = \sum_{\pi \in B^{-1}(l)} p(\pi \mid x)$$

where $\pi \in B^{-1}(l)$ denotes every path $\pi$ that becomes $l$ after the transformation $B$.

For any path $\pi$:

$$p(\pi \mid x) = \prod_{t=1}^{T} y^{t}_{\pi_t}$$

For a path $\pi_1$ with $T = 12$:

$$p(\pi_1 \mid x) = \prod_{t=1}^{12} y^{t}_{\pi_{1,t}}$$

The CTC loss function adjusts the LSTM parameters $w$ through the gradient so that, for an input sample, $p(l \mid x)$ summed over the paths $\pi \in B^{-1}(l)$ is maximized.

For one value $y^{t}_{k}$ of the matrix $y$ that is input to the CTC loss function, the associated probability is computed from the forward and backward terms $\alpha_t(l_k)$ and $\beta_t(l_k)$, which are constants obtained by recursion.
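
To make the path sum concrete, the following sketch computes p(l|x) by brute force: it enumerates every path, keeps those that the transformation B maps to the label l, and adds up their probabilities. It is a worked illustration for tiny inputs only, not the recursive forward-backward computation; the blank index and the names `collapse` and `label_probability` are assumptions.

```python
from itertools import product
from typing import List, Sequence, Tuple

BLANK = 0  # assumed index of the blank symbol


def collapse(path: Sequence[int]) -> Tuple[int, ...]:
    """The transformation B: merge consecutive repeats, then remove blanks."""
    out = []
    previous = None
    for k in path:
        if k != previous and k != BLANK:
            out.append(k)
        previous = k
    return tuple(out)


def label_probability(y: List[List[float]], label: Sequence[int]) -> float:
    """p(l|x) as the sum of p(pi|x) over all paths pi with B(pi) = l."""
    num_classes = len(y[0])
    total = 0.0
    for path in product(range(num_classes), repeat=len(y)):
        if collapse(path) == tuple(label):
            p = 1.0
            for t, k in enumerate(path):
                p *= y[t][k]  # p(pi|x) is the product over t of y[t][pi_t]
            total += p
    return total
```

With two time steps, two classes and uniform probabilities, three of the four paths collapse to the single-symbol label, so that label receives probability 0.75. The number of paths grows exponentially with T, which is why practical implementations use the recursive α and β terms instead.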

S3: Perform text correction on the candidate text based on a preset text correction model, so as to correct the candidate text into a formal text.

Specifically, this includes steps S31 to S34:

S31: Generate a candidate character set.

In the process of recognizing characters with the text recognition model, when an illegible character is encountered, a plurality of candidate characters are given according to confidence, forming a candidate character set.

S32: Generate a predicted character set.

Based on the pre-trained text correction model, cloze (masked-position) prediction is performed on the illegible character in the candidate text, so as to predict a plurality of predicted characters at the masked position and form a predicted character set.

As shown in FIG. 3, in one embodiment of the present invention, the pre-trained text correction model is an ERNIE 3.0 language model. In other embodiments, other language models may be substituted as needed.

S33: Output the character recognition result.

Specifically, if the candidate character set and the predicted character set intersect, the intersection is taken as the character recognition result.

If the candidate character set and the predicted character set do not intersect, a candidate character that makes the sentence fluent is selected from the candidate character set as the character recognition result; if no candidate character in the candidate character set makes the sentence fluent, a predicted character that makes the sentence fluent is selected from the predicted character set as the character recognition result.

S34: Repeat the above process to correct all illegible characters in the candidate text, thereby correcting the candidate text into the formal text.
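
The selection logic of steps S31 to S33 can be sketched as a small function. This is an illustration under stated assumptions: `resolve_character` and the `is_fluent` callback are hypothetical names, the source does not say how fluency is measured, and picking the most confident shared character when the intersection holds more than one character is an added assumption.

```python
from typing import Callable, List, Optional


def resolve_character(
    candidates: List[str],
    predictions: List[str],
    is_fluent: Callable[[str], bool],
) -> Optional[str]:
    """Choose one character for an illegible position.

    candidates  - recognizer outputs, ordered by descending confidence
    predictions - language-model outputs for the masked position
    is_fluent   - hypothetical check: does the sentence read fluently
                  with this character filled in?
    """
    # Case 1: the two sets intersect; take the most confident shared character.
    shared = [c for c in candidates if c in predictions]
    if shared:
        return shared[0]
    # Case 2: no intersection; prefer a recognizer candidate that reads fluently.
    for c in candidates:
        if is_fluent(c):
            return c
    # Case 3: fall back to a language-model prediction that reads fluently.
    for c in predictions:
        if is_fluent(c):
            return c
    return None  # not covered by the source; left to the caller
```

Step S34 would then call such a function once per illegible position and substitute the returned character into the candidate text.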

S4: Perform text analysis on the formal text based on a preset text analysis model, so as to output a text analysis result for the user.

Specifically, this includes steps S41 to S43:

S41: Obtain the dependency relationship labels.

Important information in the formal text, such as keywords, subject, predicate and object, is obtained according to the syntactic analysis model, and the dependency relationship labels of the formal text are output; the dependency relationship labels at least comprise the subject-verb relationship SBV, the verb-object relationship VOB, the preposition-object relationship POB, the head (core) relationship HED and the double-object relationship DOB.

S42: Obtain the part-of-speech tags.

Part-of-speech tagging is performed on the keywords in the formal text according to the part-of-speech tagging model, so as to output the part-of-speech tags of the formal text; the part-of-speech tags at least comprise noun n, verb v, pronoun r, adjective a, adverb d and punctuation mark w.

S43: Output the result.

Whether the formal text is a fluent sentence and meets the requirement of the scale item is judged according to the dependency relationship labels, the part-of-speech tags and a preset rule. If the judgment result is yes, the user scores 1 point for the corresponding item in the scale; if the judgment result is no, the user scores 0 points for that item. The preset rule is as follows:

if the dependency relationship label of the formal text comprises any one combination of SBV+VOB, SBV+POB, SBV+DOB and SBV+HED, the part-of-speech label of SBV belongs to n or r, the part-of-speech label of VOB, POB, DOB belongs to n or r, and the part-of-speech label of HED belongs to n or r or v, the judgment result is yes. Otherwise, the judgment result is no.
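
The preset rule can be read as a check over (dependency label, part-of-speech tag) pairs. The sketch below follows one possible reading, namely that the sentence passes when it contains a valid SBV token together with at least one valid VOB, POB, DOB or HED token; the names `meets_preset_rule` and `score_item` and the token representation are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (dependency label, part-of-speech tag)

NOMINAL = {"n", "r"}  # noun or pronoun
PARTNER_POS = {
    "VOB": NOMINAL,
    "POB": NOMINAL,
    "DOB": NOMINAL,
    "HED": NOMINAL | {"v"},
}


def meets_preset_rule(tokens: List[Token]) -> bool:
    """Return True when the sentence has a valid SBV plus a valid partner."""
    has_subject = any(dep == "SBV" and pos in NOMINAL for dep, pos in tokens)
    has_partner = any(
        dep in PARTNER_POS and pos in PARTNER_POS[dep] for dep, pos in tokens
    )
    return has_subject and has_partner


def score_item(tokens: List[Token]) -> int:
    """Scale score for the writing item: 1 if the rule holds, otherwise 0."""
    return 1 if meets_preset_rule(tokens) else 0
```

For instance, a parse consisting of a pronoun subject (SBV, r), a verb head (HED, v) and a noun object (VOB, n) scores 1, while a parse without any SBV token scores 0.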

S5: Intelligently evaluate the writing ability of the user according to the text analysis result.

The user's scale score for writing ability is obtained from the text analysis result, and the evaluation result of the user's writing ability is then obtained by comparing the score with the norm standard.
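
Steps S1 to S5 chain together as a simple pipeline. The sketch below only shows that composition: the three stages are passed in as callables because the source does not fix their interfaces, and the names `evaluate_writing`, `recognize`, `correct` and `analyze` are hypothetical.

```python
from typing import Callable


def evaluate_writing(
    picture: bytes,
    recognize: Callable[[bytes], str],
    correct: Callable[[str], str],
    analyze: Callable[[str], int],
) -> int:
    """Run S2 to S4 on the picture acquired in S1 and return the item score."""
    candidate_text = recognize(picture)    # S2: picture -> candidate text
    formal_text = correct(candidate_text)  # S3: candidate text -> formal text
    return analyze(formal_text)            # S4: formal text -> 0 or 1
```

The returned score is the input to S5, where it is compared with the norm standard.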

On the basis of the handwriting input-oriented writing capacity intelligent evaluating method, the invention further provides a handwriting input-oriented writing capacity intelligent evaluating system. As shown in fig. 4, the writing ability intelligent evaluating system includes one or more processors 21 and a memory 22. Wherein the memory 22 is coupled to the processor 21 for storing one or more programs that, when executed by the one or more processors 21, cause the one or more processors 21 to implement the handwriting input oriented writing capability intelligent assessment method as in the above embodiments.

The processor 21 is configured to control the overall operation of the intelligent writing capability evaluating system, so as to complete all or part of the steps of the intelligent writing capability evaluating method facing handwriting input. The processor 21 may be a Central Processing Unit (CPU), a Graphics Processor (GPU), a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processing (DSP) chip, or the like. The memory 22 is used to store various types of data to support the operation of the writing capability intelligent assessment system, which may include, for example, instructions for any application or method operating on the writing capability intelligent assessment system, as well as application-related data. The memory 22 may be implemented by any type or combination of volatile or non-volatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, etc.

In an exemplary embodiment, the evaluation system may be implemented by a computer chip or an entity, or by a product with a certain function, for executing the above-mentioned intelligent evaluation method for handwriting input-oriented writing capability, and achieving the technical effects consistent with the above-mentioned method. One exemplary embodiment is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a car-mounted human-machine interaction device, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

In another exemplary embodiment, the present invention also provides a computer readable storage medium including program instructions which, when executed by a processor, implement the steps of the handwriting input oriented writing ability intelligent assessment method in any of the above embodiments. For example, the computer readable storage medium may be the memory including the program instructions, where the program instructions may be executed by a processor of the evaluation system to perform the handwriting input-oriented intelligent evaluation method, and achieve technical effects consistent with the method.

In summary, the method and the system for intelligently evaluating writing ability facing handwriting input provided by the embodiment of the invention have the following beneficial effects:

1. the invention not only comprises semantic analysis for text, but also comprises a handwriting character recognition technology;

2. according to the invention, a plurality of deep learning models are introduced for evaluation, so that the problems of random writing, poor fonts and the like are solved, the compatibility is realized for various writing conditions, the accuracy is ensured, and the robustness and the practicability of the whole evaluation method are greatly improved.

The handwriting input-oriented intelligent evaluation method and system for writing ability provided by the invention have been described in detail above. Any obvious modification made to the invention by a person of ordinary skill in the art without departing from its essence will constitute an infringement of the patent right of the invention, and the person making it will bear the corresponding legal liability.

Claims

1. A handwriting input-oriented writing capability intelligent evaluation method is characterized by comprising the following steps:

acquiring a handwritten text picture input by a user;

2. The method for intelligently evaluating writing ability according to claim 1, wherein the text recognition process comprises the following sub-steps:

3. The intelligent evaluation method of writing ability according to claim 2, wherein the image feature extraction is performed on the handwritten text based on a preset text detection model to extract a text writing area with an end-to-end indefinite size, specifically comprising:

4. The method for intelligently evaluating writing ability according to claim 3, wherein the direction recognition is performed on the text writing area based on a preset text direction recognition model to recognize the text direction of the text writing area, and specifically comprises the following steps:

5. The method for intelligently evaluating writing ability according to claim 4, wherein the text writing area is subjected to text recognition based on a preset text recognition model, and recognized text is ranked based on the recognized text direction, so as to output correctly ranked candidate text, and the method specifically comprises the following steps:

6. The method for intelligently evaluating writing ability according to claim 5, wherein:

and the third network output module selects the character with the highest probability as a recognition result, and adopts a CTC loss function to perform function optimization on the text writing area so as to output the text with the correct sequence of the text direction.

7. The method for intelligently evaluating writing ability according to claim 1, wherein the text correction process specifically comprises the following steps:

8. The method for intelligently evaluating writing ability according to claim 1, wherein the text analysis process specifically comprises the following steps:

9. The method for intelligently evaluating writing ability according to claim 8, wherein the preset rule includes:

10. A handwriting input oriented intelligent writing ability evaluation system, characterized by comprising a processor and a memory, wherein the processor reads a computer program in the memory and is used for executing the intelligent writing ability evaluation method according to any one of claims 1-9.