CN109062404B - Interaction system and method applied to intelligent early education machine for children

Interaction system and method applied to intelligent early education machine for children

Info

Publication number
CN109062404B
Authority
CN
China
Prior art keywords
layer
children
pronunciation
child
early education
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810799639.9A
Other languages
Chinese (zh)
Other versions
CN109062404A (en)
Inventor
吴成东
刘鑫
丁鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China filed Critical Northeastern University China
Priority to CN201810799639.9A priority Critical patent/CN109062404B/en
Publication of CN109062404A publication Critical patent/CN109062404A/en
Application granted granted Critical
Publication of CN109062404B publication Critical patent/CN109062404B/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174: Facial expression recognition
    • G06V40/175: Static expression
    • G: PHYSICS
    • G09: EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B: EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00: Electrically-operated educational appliances
    • G09B5/08: Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations
    • G09B5/14: Electrically-operated educational appliances providing for individual presentation of information to a plurality of student stations with provision for individual teacher-student communication

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Multimedia (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention belongs to the technical field of early education of children, and provides an interactive system and method applied to an intelligent early education machine for children. The invention integrates expression recognition, handwritten character recognition, speech recognition, text processing and speech evaluation into the field of children's early education. It mainly comprises four aspects: first, a CNN realizes facial expression recognition to assist in teaching children; second, a CNN realizes handwritten character recognition to help children write characters with correct strokes and stroke order; third, speech recognition and SVM-based text processing realize voice interaction between the child and the early education machine; and fourth, speech evaluation teaches and corrects the child's pronunciation. Compared with existing early education machines, this machine changes the mode of early education, enriches its content and makes early education more engaging.

Description

Interaction system and method applied to intelligent early education machine for children
Technical Field
The invention belongs to the technical field of early education of children, and relates to an interactive system and method applied to an intelligent early education machine for children.
Background
An early education machine is an electronic educational product designed to promote children's interest in learning. Early education machines currently on the market are basically button-operated or point-and-read devices, and their human-machine interaction is traditional: the machine simply tells stories while the child passively listens. Such machines cannot capture the child's emotions and facial expressions; they lack warmth, cannot detect the child's current emotional state during teaching, and therefore cannot adapt the teaching to the situation. They also do not teach children to correctly write characters, strokes and foreign-language letters, and they cannot monitor or correct children's daily pronunciation.
Disclosure of Invention
Aiming at these defects of existing early education machines, the invention applies facial expression recognition, handwritten character recognition, voice interaction, text processing and related techniques to a children's early education machine, so as to solve the problem that such machines are not intelligent enough.
The technical scheme of the invention is as follows:
An interactive system applied to an intelligent early education machine for children comprises a facial expression recognition module; a recognition module for handwritten characters, strokes and foreign-language letters; a filtering module for children's inserted words (filler words such as "uh" and "um"); and a child pronunciation evaluation and correction module.
The facial expression recognition module realizes facial expression recognition through a CNN, judges the type of the expression, and selects the teaching action corresponding to that expression, so that the early education machine can assist in teaching. For example, as shown in fig. 2, when the child shows a confused expression during learning, the teaching content can be explained in further detail to help the child understand.
The recognition module for handwritten characters, strokes and foreign-language letters realizes their recognition through a CNN and detects whether a handwritten character or letter and its stroke order are correct, so that the early education machine can teach children to write characters and foreign-language letters correctly. As shown in fig. 3, when teaching a child to write a character or letter, the machine plays a writing animation of it and prompts the writing technique, and the child then writes. The machine detects whether the written character or letter and its stroke order are correct; if not, the machine plays the writing animation again, prompts the writing technique again, and the child writes again until the writing is correct.
The filtering module for children's inserted words judges a text to be a negative text when the input speech contains inserted words, and abandons the response to that text. This module improves the response efficiency of the early education machine during human-machine dialogue. As shown in fig. 4, when the child utters inserted words during dialogue with the machine, the machine judges the text to be negative and abandons the response to it.
The child pronunciation evaluation and correction module evaluates and corrects the child's daily pronunciation based on mispronounced or defectively pronounced words recorded during the child's daily conversations with the machine. As shown in fig. 5, the machine can correct the child's pronunciation: when the child enters the pronunciation correction module, the machine selects a problem word and, as shown in fig. 6, reads the word accurately, and the child reads it after the machine. If the child still mispronounces it, the machine reads the word again, prompts the pronunciation method and technique, and the child reads again until the pronunciation is correct.
The method for applying the interactive system is characterized by comprising the following steps:
Step 1, establishing databases: database A for facial expressions, handwritten characters, strokes and foreign-language letters; database B for children's daily conversation texts and inserted words; and database C, a problem-pronunciation database built by locating and collecting mispronounced or defective pronunciations from children's daily dialogue.
Step 1-1, collecting data: acquire a children's facial expression data set, handwritten pictures of simple characters and strokes, and handwritten pictures of foreign-language letters.
Step 1-2, classifying and labeling the data: classify the expression data into expressions such as anger, happiness, confusion, fear, disgust and sadness; label the handwritten characters and build the database; classify the handwritten character pictures into Chinese characters such as "mouth" (口), "hui" (回), "hand", "foot", "cat" and "dog"; label the handwritten stroke pictures as strokes such as dot, horizontal, vertical, left-falling and right-falling; and label the handwritten foreign-letter pictures as letters such as A, B, C, α, β and γ.
Step 2, specific implementation of the functional modules, comprising the facial expression recognition module, the recognition module for handwritten characters, strokes and foreign-language letters, the filtering module for children's inserted words, and the child pronunciation evaluation and correction module.
Step 2-1, realizing a facial expression recognition module:
Step 2-1-1, perform grayscale processing on the pictures and input them into the CNN for training; the specific steps are as follows:
Step 2-1-1-1, establish the CNN model and set the network structure. The CNN comprises an input layer, two hidden layers, a fully connected layer and an output layer; each hidden layer comprises a convolutional layer and a sub-sampling layer. The convolutional layers use the sigmoid activation function and provide local perception and parameter sharing; the sub-sampling layers reduce visual redundancy and the number of network parameters; the fully connected layer maps the learned distributed feature representation to the sample label space; the output layer, i.e. the classifier, uses softmax regression. The network structure, comprising input-layer nodes, hidden-layer nodes and output-layer nodes, is shown in figure 1.
Step 2-1-1-2, initialize the CNN parameters. Initializing the network weights means assigning initial values to all connection weights (including thresholds) in the network; if the initial weight vector lies in a relatively flat region of the error surface, network training may converge abnormally slowly. The connection weights and thresholds are initialized to random values in the interval [-0.30, +0.30]; the excitation function of the hidden layers is set, and the learning rate of the weights is set to a value in the interval [0, 1].
Step 2-1-1-3, at time k-1, propagate the grayed picture input data through the input-to-hidden-layer weights and the hidden-to-output-layer weights to obtain the output value of the output layer, and update these weights for time k.
Step 2-1-1-4, set a total-error threshold for stopping training, and judge whether the total error of the obtained predicted values is greater than this threshold. If so, adjust the hidden-to-output-layer weights and the input-to-hidden-layer weights according to the total error; otherwise, the CNN training is finished.
Step 2-1-2, use the trained CNN to predict the output values for children's expression pictures.
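For illustration, the following is a minimal Keras sketch of the expression-recognition CNN described in step 2-1-1: a grayscale input, two hidden layers (each a convolutional layer with sigmoid activation plus a sub-sampling layer), a fully connected layer, a softmax output layer, and connection weights initialized uniformly in [-0.30, +0.30]. The input size, filter counts, kernel sizes and the seven expression classes are illustrative assumptions, not values fixed by the patent.

```python
import tensorflow as tf

# uniform initialization of connection weights in [-0.30, +0.30] (step 2-1-1-2)
init = tf.keras.initializers.RandomUniform(minval=-0.30, maxval=0.30)

model = tf.keras.Sequential([
    # hidden layer 1: convolutional layer (sigmoid) + sub-sampling layer
    tf.keras.layers.Conv2D(16, (5, 5), activation="sigmoid",
                           kernel_initializer=init, bias_initializer=init,
                           input_shape=(48, 48, 1)),   # grayed input picture (assumed 48x48)
    tf.keras.layers.AveragePooling2D((2, 2)),
    # hidden layer 2: convolutional layer (sigmoid) + sub-sampling layer
    tf.keras.layers.Conv2D(32, (5, 5), activation="sigmoid",
                           kernel_initializer=init, bias_initializer=init),
    tf.keras.layers.AveragePooling2D((2, 2)),
    # fully connected layer: maps the learned features to the sample label space
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="sigmoid", kernel_initializer=init),
    # output layer: softmax classifier over the expression classes (assumed 7)
    tf.keras.layers.Dense(7, activation="softmax"),
])

# learning rate set to a value in [0, 1], as step 2-1-1-2 specifies
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```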
step 2-2, realizing a recognition module of handwritten characters, strokes and foreign letters;
Step 2-2-1, perform grayscale processing on the pictures and input them into the CNN for training; the specific steps are as follows:
Step 2-2-1-1, establish the CNN model and set the network structure. The CNN comprises an input layer, three hidden layers, a fully connected layer and an output layer; each hidden layer comprises a convolutional layer and a sub-sampling layer. The convolutional layers use the ReLU activation function and provide local perception and parameter sharing; the sub-sampling layers reduce visual redundancy and the number of network parameters; the fully connected layer maps the learned distributed feature representation to the sample label space; the output layer is a classifier using softmax regression.
Step 2-2-1-2, initialize the CNN parameters: the connection weights and thresholds are initialized to random values in the interval [-0.30, +0.30]; the excitation function of the hidden layers is set, and the learning rate of the weights is set to a value in the interval [0, 1].
Step 2-2-1-3, at time k-1, propagate the grayed picture input data through the input-to-hidden-layer weights and the hidden-to-output-layer weights to obtain the output value of the output layer, and update these weights for time k.
Step 2-2-1-4, set a total-error threshold for stopping training, and judge whether the total error of the obtained predicted values is greater than this threshold. If so, adjust the hidden-to-output-layer weights and the input-to-hidden-layer weights according to the total error; otherwise, the CNN training is finished.
Step 2-2-2, use the trained CNN to predict the output values for children's characters, strokes and foreign-language letters.
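As a companion sketch, the threshold-stopped training procedure of steps 2-1-1-3/2-2-1-3 and 2-1-1-4/2-2-1-4 might look as follows, reusing the `model` from the previous sketch (or its three-hidden-layer ReLU variant from step 2-2-1-1); the batch of grayed pictures, the labels and the 0.05 error threshold are placeholders.

```python
import numpy as np
import tensorflow as tf

x_train = np.random.rand(32, 48, 48, 1).astype("float32")   # placeholder grayed input pictures
y_train = np.random.randint(0, 7, size=32)                  # placeholder class labels

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)      # learning rate in [0, 1]
ERROR_THRESHOLD = 0.05                                      # preset total-error stopping threshold

total_error = float("inf")
while total_error > ERROR_THRESHOLD:                        # stop once the total error is small enough
    with tf.GradientTape() as tape:
        predictions = model(x_train, training=True)         # output value of the output layer
        loss = loss_fn(y_train, predictions)                # total error of the predicted values
    # adjust the input-to-hidden and hidden-to-output weights from the error
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    total_error = float(loss)
```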
step 2-3, realizing a filtering module of the child insert words;
Step 2-3-1, label the text data and divide it into positive and negative texts: positive samples are normal texts, and negative samples are inserted-word texts.
Step 2-3-2, monitor the sound in the environment. If there is no sound, continue monitoring; otherwise intercept the sound, using endpoint detection based on short-time energy and short-time zero-crossing rate, and perform speech recognition on the intercepted sound to obtain its corresponding text.
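A minimal sketch of the short-time-energy and zero-crossing-rate endpoint detection named in step 2-3-2 is shown below; the frame length, hop and thresholds are illustrative assumptions.

```python
import numpy as np

def endpoint_detect(signal, frame_len=400, hop=160,
                    energy_thresh=1e-3, zcr_thresh=0.25):
    """Flag each frame as speech or silence from its energy and zero-crossing rate."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = float(np.mean(frame ** 2))                        # short-time energy
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)  # short-time zero-crossing rate
        # voiced speech shows high energy and unvoiced speech a high
        # zero-crossing rate, so either test marks the frame as speech
        flags.append(energy > energy_thresh or zcr > zcr_thresh)
    return flags

audio = np.random.randn(16000).astype("float32")   # placeholder: 1 s of 16 kHz audio
speech_flags = endpoint_detect(audio)              # intercept the flagged span for recognition
```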
Step 2-3-3, build an SVM model for binary classification of the text data, specifically comprising the following steps:
Step 2-3-3-1, perform word segmentation on all training documents and represent the texts using the segmented words as the dimensions of vectors.
Step 2-3-3-2, count all words appearing in the documents of each class and their frequencies, then filter out stop words and single characters.
Step 2-3-3-3, count the total word frequency of the words appearing in each category, and take the highest-frequency words as that category's feature word set.
Step 2-3-3-4, remove words that appear in every category, and merge the feature word sets of all categories into a total feature word set; the resulting set is the feature set, which is used to screen features in the test set.
Step 2-3-3-5, train the SVM with the screened features to obtain the trained model.
Step 2-3-4, use the trained SVM to predict the output value for the child's utterance: if the SVM predicts a positive (normal) text, respond to it; otherwise, abandon the response.
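A minimal end-to-end sketch of this filtering module, using jieba for word segmentation and scikit-learn for the SVM; the toy texts, labels and feature-set size are placeholder assumptions (label 1 marks a normal positive text, label 0 an inserted-word negative text).

```python
import jieba                                           # Chinese word segmentation
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

texts = ["给我讲一个白雪公主的故事", "呃 嗯 啊", "我想练习写字", "呀 哦 呃"]  # placeholder corpus
labels = [1, 0, 1, 0]                                  # 1 = normal text, 0 = inserted-word text

def segment(text):
    return " ".join(jieba.cut(text))                   # step 2-3-3-1: word segmentation

# steps 2-3-3-2 to 2-3-3-4 (frequency counting, filtering, keeping the
# highest-frequency words as the feature set) are approximated here by
# CountVectorizer's vocabulary cap; the token pattern keeps single characters
vectorizer = CountVectorizer(max_features=1000, token_pattern=r"(?u)\b\w+\b")
features = vectorizer.fit_transform(segment(t) for t in texts)

svm = SVC(kernel="linear")
svm.fit(features, labels)                              # step 2-3-3-5: train the SVM

# step 2-3-4: respond only when the SVM predicts a positive (normal) text
utterance = vectorizer.transform([segment("呃 嗯")])
print("respond" if svm.predict(utterance)[0] == 1 else "abandon the response")
```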
step 2-4, implementing a child pronunciation evaluation and correction module;
Step 2-4-1, the early education machine selects a problem word, pronounces it correctly, and at the same time prompts the pronunciation method and technique for the word.
Step 2-4-2, the child reads the word after the early education machine, and the machine evaluates the child's pronunciation and judges whether it is correct.
Step 2-4-3, if the child's pronunciation is correct, the pronunciation teaching of this problem word ends; otherwise, repeat steps 2-4-1 and 2-4-2 until the pronunciation is correct.
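The correction loop of steps 2-4-1 to 2-4-3 can be sketched as follows; every helper here is a hypothetical stub standing in for the machine's text-to-speech, recording and pronunciation-evaluation components (none of these names come from the patent).

```python
def play_correct_pronunciation(word):     # stub for the machine's accurate reading (step 2-4-1)
    print(f"machine reads aloud: {word}")

def prompt_pronunciation_tips(word):      # stub prompting pronunciation method and technique
    print(f"machine prompts mouth shape and technique for: {word}")

def record_child_reading():               # stub recorder for the child's follow-up reading
    return b""                            # placeholder audio bytes

def evaluate_pronunciation(word, audio):  # stub evaluator judging the pronunciation (step 2-4-2)
    return True                           # placeholder verdict

def teach_problem_word(problem_word, max_rounds=5):
    """Steps 2-4-1 to 2-4-3: read, follow-read, evaluate, repeat until correct."""
    for _ in range(max_rounds):
        play_correct_pronunciation(problem_word)
        prompt_pronunciation_tips(problem_word)
        audio = record_child_reading()
        if evaluate_pronunciation(problem_word, audio):
            return True                   # step 2-4-3: pronunciation correct, teaching ends
    return False                          # still incorrect after max_rounds (bounded here)

teach_problem_word("狮子")                 # placeholder problem word
```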
The invention makes full use of mature image recognition, voice interaction and text processing methods from modern artificial intelligence, and turns the traditional early education machine into an intelligent one with functions such as facial expression recognition and human-machine voice interaction. Bringing artificial intelligence into the field of children's early education machines adds fun to the early education process and improves children's learning efficiency.
Face recognition and expression recognition from the field of artificial intelligence are applied to the intelligent early education machine: the machine can recognize different owners and capture the child's current emotion. When it captures an angry expression, it can tell a joke to cheer the child up; when it captures a confused expression, it can explain the current teaching content carefully and repeatedly until the child learns and understands. Handwritten character recognition likewise gives the machine the ability to teach children writing and stroke order.
Voice interaction and text processing methods are applied to the intelligent early education machine, so that its interaction mode is no longer the traditional button or point-and-read style but the most common and convenient mode: human-machine conversation. Text processing simply filters out the filler words of children at the babbling stage, i.e. inserted words such as "uh", "um", "oh" and "eh" uttered during conversation with the machine, reducing erroneous responses during question-and-answer between the machine and the child.
Speech evaluation technology is applied to the intelligent early education machine: intelligent speech technology automatically evaluates the child's pronunciation level and locates and analyzes pronunciation errors and defects. During conversations between the child and the machine, the machine evaluates the child's pronunciation level, monitors pronunciation errors, locates pronunciation defects, records the mispronounced words, and then analyzes the problems. A pronunciation practice part can be added to the teaching process: the machine reads the mispronounced words aloud to the child, and the child reads after it, thereby correcting the child's pronunciation.
A CNN realizes facial expression recognition and handwritten character recognition, an SVM realizes simple text processing, and speech technology realizes the voice interaction and speech evaluation functions.
Drawings
Fig. 1 is a schematic diagram of a convolutional neural network structure.
Fig. 2 is a schematic diagram of the expression recognition application of the intelligent early education machine.
Fig. 3 is a schematic diagram of the intelligent early education machine teaching children writing and writing order.
Fig. 4 is a flow chart of a voice interaction and a child insertion filtering process between a child and an intelligent early education machine.
Fig. 5 is a flow chart of the intelligent early education machine recording the mispronunciation words.
Fig. 6 is a flowchart of pronunciation correction and teaching of the intelligent early education machine.
Detailed Description
The following detailed description is given to specific embodiments with reference to the accompanying drawings.
As shown in fig. 1, when a child lights up the screen of the early education machine, the machine starts face recognition and matches the result against the faces in the database. If the match succeeds, the child is allowed into the system; otherwise, face registration can be performed, and the registered face is added to the face database so that the next match succeeds. Because the machine can identify the current user, it can call up different databases for different users and thus teach each child individually.
As shown in fig. 4, after the child enters the system, the voice instruction "tell me the story of Snow White" takes the machine into the story-telling submodule, where it searches for the keyword "Snow White". In this process, the machine screens the voice instruction and filters out children's inserted words such as "uh", "yah" and "oh", reducing the false-response rate to voice instructions. Meanwhile, as shown in fig. 5, the child's pronunciation level can be monitored and evaluated: unqualified pronunciations undergo error and defect localization and problem analysis and are recorded in the database, and the child's pronunciation is then trained in a targeted way in the pronunciation training submodule; the specific process is shown in fig. 6.
During teaching, the machine monitors the child's expression in real time. If the child looks confused while answering and the answering time runs out, the machine concludes that the child has hit a point of confusion or difficulty, and it analyzes and explains the question in detail; otherwise, the answering process ends automatically. The specific process is shown in fig. 2.
When the child enters the stroke and stroke-order practice submodule through the voice command "writing practice", the machine decomposes characters into strokes and stroke orders, and the child practices these separately. After the practice, the machine presents an arbitrary Chinese character for the child to write. If every stroke and the stroke order of the whole writing process are correct, the machine judges the writing correct; otherwise it points out the error and demonstrates the correct way to write the character, and the child writes repeatedly until correct. The specific process is shown in fig. 3.
SVM filtering of children's inserted words:
Children's daily conversations were recorded to obtain 1000 text samples (normal texts and meaningless inserted-word texts, 50% each). The 1000 human-machine dialog texts were numbered 1 to 1000; texts 1-800 serve as training texts and texts 801-1000 as test texts.
An inserted-text filtering step is built with the SVM model. Training and testing with an SVM implemented in Python yields a comparison of the true values of the human-machine dialog texts against the SVM decisions, where "1" denotes a normal child text and "0" denotes an inserted-word text, as shown in the following table:
[Table: true values of the human-machine dialog texts versus the SVM decisions; image not reproduced.]
As the table shows, the intelligent early education machine answers only the child utterances judged "1" by the SVM screening step. Experiments show that the machine, which originally did not screen inserted-word texts at all, now identifies them with 98.8% accuracy. In short, during dialogue with children the intelligent early education machine filters out meaningless request texts, reducing its error response rate.
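For reference, the 800/200 evaluation protocol described above could be run as sketched below, reusing `segment` from the filtering-module sketch; the corpus here is a toy stand-in, since the 1000 recorded dialog texts are not reproduced in the patent.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.svm import SVC

all_texts = ["给我讲一个白雪公主的故事", "呃 嗯 啊", "我想练习写字", "呀 哦 呃"] * 250  # toy stand-in corpus
all_labels = [1, 0, 1, 0] * 250                      # 50% normal, 50% inserted-word texts

vectorizer = CountVectorizer(max_features=1000, token_pattern=r"(?u)\b\w+\b")
X = vectorizer.fit_transform(segment(t) for t in all_texts)

X_train, y_train = X[:800], all_labels[:800]         # texts 1-800: training
X_test, y_test = X[800:], all_labels[800:]           # texts 801-1000: testing

svm = SVC(kernel="linear").fit(X_train, y_train)
print(f"screening accuracy on the test texts: {accuracy_score(y_test, svm.predict(X_test)):.1%}")
```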

Claims (1)

1. A method applied to an interactive system of an intelligent early education machine for children, characterized in that the system comprises a facial expression recognition module; a recognition module for handwritten characters, strokes and foreign-language letters; a filtering module for children's inserted words; and a child pronunciation evaluation and correction module;
the facial expression recognition module realizes facial expression recognition through a CNN, judges the type of the expression, and selects the teaching action corresponding to that expression, so that the early education machine can assist in teaching;
the recognition module for handwritten characters, strokes and foreign-language letters realizes their recognition through a CNN and detects whether a handwritten character or letter and its stroke order are correct, so that the early education machine can teach children to write characters and foreign-language letters correctly;
the filtering module for children's inserted words judges a text containing inserted words to be a negative text when the input speech contains inserted words, and abandons the response to that text;
the child pronunciation evaluation and correction module works on the mispronounced or defective words recorded during the child's daily conversations with the early education machine: the machine reads such a word correctly, and the child then reads it after the machine;
the method of the interactive system comprises the following steps:
step 1, establishing databases: database A for facial expressions, handwritten characters, strokes and foreign-language letters; database B for children's daily conversation texts and inserted words; and database C, a problem-pronunciation database built by locating and collecting mispronounced or defective pronunciations from children's daily dialogue;
step 1-1, collecting data: acquire a children's facial expression data set, handwritten pictures of simple characters and strokes, and handwritten pictures of foreign-language letters;
step 1-2, classify and label the expression data, label the handwritten characters and establish the database, label the handwritten stroke pictures, and label the handwritten foreign-letter pictures;
step 2, specific implementation of the functional modules, comprising the facial expression recognition module, the recognition module for handwritten characters, strokes and foreign-language letters, the filtering module for children's inserted words, and the child pronunciation evaluation and correction module;
step 2-1, realizing a facial expression recognition module:
step 2-1-1, perform grayscale processing on the pictures and input them into the CNN for training; the specific steps are as follows:
step 2-1-1-1, establish the CNN model and set the network structure: the CNN comprises an input layer, two hidden layers, a fully connected layer and an output layer; each hidden layer comprises a convolutional layer and a sub-sampling layer; the convolutional layers use the sigmoid activation function and provide local perception and parameter sharing; the sub-sampling layers reduce visual redundancy and the number of network parameters; the fully connected layer maps the learned distributed feature representation to the sample label space; the output layer, i.e. the classifier, uses softmax regression; the network comprises input-layer nodes, hidden-layer nodes and output-layer nodes;
step 2-1-1-2, initialize the CNN parameters: the connection weights and thresholds are initialized to random values in the interval [-0.30, +0.30]; the excitation function of the hidden layers is set, and the learning rate of the weights is set to a value in the interval [0, 1];
step 2-1-1-3, at time k-1, propagate the grayed picture input data through the input-to-hidden-layer weights and the hidden-to-output-layer weights to obtain the output value of the output layer, and update these weights for time k;
step 2-1-1-4, set a total-error threshold for stopping training, and judge whether the total error of the obtained predicted values is greater than this threshold; if so, adjust the hidden-to-output-layer weights and the input-to-hidden-layer weights according to the total error; otherwise, the CNN training is finished;
step 2-1-2, use the trained CNN to predict the output values for children's expression pictures;
step 2-2, realizing a recognition module of handwritten characters, strokes and foreign letters;
step 2-2-1, perform grayscale processing on the pictures and input them into the CNN for training; the specific steps are as follows:
step 2-2-1-1, establish the CNN model and set the network structure: the CNN comprises an input layer, three hidden layers, a fully connected layer and an output layer; each hidden layer comprises a convolutional layer and a sub-sampling layer; the convolutional layers use the ReLU activation function and provide local perception and parameter sharing; the sub-sampling layers reduce visual redundancy and the number of network parameters; the fully connected layer maps the learned distributed feature representation to the sample label space; the output layer is a classifier using softmax regression;
step 2-2-1-2, initialize the CNN parameters: the connection weights and thresholds are initialized to random values in the interval [-0.30, +0.30]; the excitation function of the hidden layers is set, and the learning rate of the weights is set to a value in the interval [0, 1];
step 2-2-1-3, at time k-1, propagate the grayed picture input data through the input-to-hidden-layer weights and the hidden-to-output-layer weights to obtain the output value of the output layer, and update these weights for time k;
step 2-2-1-4, set a total-error threshold for stopping training, and judge whether the total error of the obtained predicted values is greater than this threshold; if so, adjust the hidden-to-output-layer weights and the input-to-hidden-layer weights according to the total error; otherwise, the CNN training is finished;
step 2-2-2, use the trained CNN to predict the output values for children's characters, strokes and foreign-language letters;
step 2-3, realizing a filtering module of the child insert words;
step 2-3-1, label the text data and divide it into positive and negative texts: positive samples are normal texts, and negative samples are inserted-word texts;
step 2-3-2, monitor the sound in the environment: if there is no sound, continue monitoring; otherwise intercept the sound, using endpoint detection based on short-time energy and short-time zero-crossing rate, and perform speech recognition on the intercepted sound to obtain its corresponding text;
step 2-3-3, build an SVM model for binary classification of the text data, specifically comprising the following steps:
step 2-3-3-1, perform word segmentation on all training documents and represent the texts using the segmented words as the dimensions of vectors;
step 2-3-3-2, count all words appearing in the documents of each class and their frequencies, then filter out stop words and single characters;
step 2-3-3-3, count the total word frequency of the words appearing in each category, and take the highest-frequency words as that category's feature word set;
step 2-3-3-4, remove words that appear in every category, and merge the feature word sets of all categories into a total feature word set; the resulting set is the feature set, which is used to screen features in the test set;
step 2-3-3-5, train the SVM with the screened features to obtain the trained model;
step 2-3-4, use the trained SVM to predict the output value for the child's utterance: if the SVM predicts a positive (normal) text, respond to it; otherwise, abandon the response;
step 2-4, implementing a child pronunciation evaluation and correction module;
step 2-4-1, the early education machine selects a problem word, pronounces it correctly, and at the same time prompts the pronunciation method and technique for the word;
step 2-4-2, the child reads the word after the early education machine, and the machine evaluates the child's pronunciation and judges whether it is correct;
step 2-4-3, if the child's pronunciation is correct, the pronunciation teaching of this problem word ends; otherwise, repeat steps 2-4-1 and 2-4-2 until the pronunciation is correct.
CN201810799639.9A 2018-07-20 2018-07-20 Interaction system and method applied to intelligent early education machine for children Active CN109062404B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810799639.9A CN109062404B (en) 2018-07-20 2018-07-20 Interaction system and method applied to intelligent early education machine for children

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810799639.9A CN109062404B (en) 2018-07-20 2018-07-20 Interaction system and method applied to intelligent early education machine for children

Publications (2)

Publication Number Publication Date
CN109062404A CN109062404A (en) 2018-12-21
CN109062404B (en) 2020-03-24

Family

ID=64817601

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810799639.9A Active CN109062404B (en) 2018-07-20 2018-07-20 Interaction system and method applied to intelligent early education machine for children

Country Status (1)

Country Link
CN (1) CN109062404B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108847238B (en) * 2018-08-06 2022-09-16 东北大学 Service robot voice recognition method
CN109841122A (en) * 2019-03-19 2019-06-04 深圳市播闪科技有限公司 A kind of intelligent robot tutoring system and student's learning method
CN111078098B (en) * 2019-05-10 2021-11-05 广东小天才科技有限公司 Dictation control method and device
CN111078025A (en) * 2019-07-29 2020-04-28 广东小天才科技有限公司 Method and terminal equipment for determining correctness of input Chinese characters
CN111326030A (en) * 2019-09-10 2020-06-23 西安掌上盛唐网络信息有限公司 Reading, dictation and literacy integrated learning system, device and method
WO2021087752A1 (en) * 2019-11-05 2021-05-14 山东英才学院 Paperless early education machine for children based on wireless transmission technology
CN111652316A (en) * 2020-06-04 2020-09-11 上海仙剑文化传媒股份有限公司 AR Chinese character recognition system based on multimedia application scene
CN112215175B (en) * 2020-10-19 2024-01-30 北京乐学帮网络技术有限公司 Handwritten character recognition method, device, computer equipment and storage medium
CN112927566B (en) * 2021-01-27 2023-01-03 读书郎教育科技有限公司 System and method for student to rephrase story content
CN114245194A (en) * 2021-12-23 2022-03-25 深圳市优必选科技股份有限公司 Video teaching interaction method and device and electronic equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080166689A1 (en) * 2007-01-05 2008-07-10 Timothy Gerard Joiner Words
CN104778867B (en) * 2013-05-15 2016-05-25 张绪伟 Multifunctional children intercommunication early learning machine

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1975804A (en) * 2006-12-15 2007-06-06 华南理工大学 Education robot with character-learning and writing function and character recognizing method thereof
CN103377568A (en) * 2013-06-20 2013-10-30 浙江大学软件学院(宁波)管理中心(宁波软件教育中心) Multifunctional child somatic sensation educating system
CN106228982A (en) * 2016-07-27 2016-12-14 华南理工大学 A kind of interactive learning system based on education services robot and exchange method
CN107123418A (en) * 2017-05-09 2017-09-01 广东小天才科技有限公司 The processing method and mobile terminal of a kind of speech message
CN107392109A (en) * 2017-06-27 2017-11-24 南京邮电大学 A kind of neonatal pain expression recognition method based on deep neural network
CN107346629A (en) * 2017-08-22 2017-11-14 贵州大学 A kind of intelligent blind reading method and intelligent blind reader system

Also Published As

Publication number Publication date
CN109062404A (en) 2018-12-21

Similar Documents

Publication Publication Date Title
CN109062404B (en) Interaction system and method applied to intelligent early education machine for children
CN110110585B (en) Intelligent paper reading implementation method and system based on deep learning and computer program
CN107993665B (en) Method for determining role of speaker in multi-person conversation scene, intelligent conference method and system
CN105845134B (en) Spoken language evaluation method and system for freely reading question types
CN107221318B (en) English spoken language pronunciation scoring method and system
US20070055523A1 (en) Pronunciation training system
CN106782603B (en) Intelligent voice evaluation method and system
EP0549265A2 (en) Neural network-based speech token recognition system and method
CN110797010A (en) Question-answer scoring method, device, equipment and storage medium based on artificial intelligence
CN113657168B (en) Student learning emotion recognition method based on convolutional neural network
CN113592251B (en) Multi-mode integrated teaching state analysis system
CN109461441A (en) A kind of Activities for Teaching Intellisense method of adaptive, unsupervised formula
CN110675292A (en) Child language ability evaluation method based on artificial intelligence
CN111078010B (en) Man-machine interaction method and device, terminal equipment and readable storage medium
CN112560429A (en) Intelligent training detection method and system based on deep learning
Najeeb et al. Gamified smart mirror to leverage autistic education-aliza
CN115376547B (en) Pronunciation evaluation method, pronunciation evaluation device, computer equipment and storage medium
CN112464664B (en) Multi-model fusion Chinese vocabulary repeated description extraction method
CN112700796B (en) Voice emotion recognition method based on interactive attention model
WO2012152290A1 (en) A mobile device for literacy teaching
Stoianov et al. Modelling the phonotactic structure of natural language words with Simple Recurrent Networks
CN111899581A (en) Word spelling and reading exercise device and method for English teaching
CN111950472A (en) Teacher grinding evaluation method and system
CN116631452B (en) Management system is read in drawing book record broadcast based on artificial intelligence
TWI833328B (en) Reality oral interaction evaluation system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant