CN109559576B - Child accompanying learning robot and early education system self-learning method thereof - Google Patents

Info

Publication number
CN109559576B
Authority
CN
China
Prior art keywords
learning
new
neural network
layer
image
Prior art date
Legal status
Active
Application number
CN201811367002.9A
Other languages
Chinese (zh)
Other versions
CN109559576A (en)
Inventor
罗青
邹逸群
郭璠
唐琎
李凡
覃若彬
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN201811367002.9A
Publication of CN109559576A
Application granted
Publication of CN109559576B

Classifications

    • G - PHYSICS
    • G09 - EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B - EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00 - Electrically-operated educational appliances
    • G09B5/06 - Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065 - Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a child accompanying learning robot and a self-learning method for its early education system, wherein the self-learning method comprises the following steps: step A10, training a convolutional neural network; step A20, extracting feature vectors from an input image with the convolutional neural network; step A30, grouping and quantizing the feature vectors by product quantization; step A40, generating a reference alphabet from the Imagenet data set; step A50, acquiring the image and category of an unknown new object, extracting the feature vector of the new object image, grouping and quantizing it, and looking up the matching new-object character string in the reference alphabet, then matching and connecting the character string of the new object with the category in the associative memory model, so that the new object is learned into the early education system; and step A60, acquiring the image of an object to be recognized, whose category the early education system then recognizes. The invention can learn new knowledge together with children and compete with them in learning, thereby raising children's interest in learning.

Description

Child accompanying learning robot and early education system self-learning method thereof
Technical Field
The invention relates to intelligent equipment, and in particular to a child accompanying learning robot and a self-learning method for its early education system.
Background
Preschool children are not yet fully developed intellectually or physically and need long-term adult companionship and care. Moreover, childhood is the sensitive period in which motor, language, mathematical and other abilities develop fastest, and its importance is self-evident, so parents must invest considerable effort and cost in accompanying and educating their children. Most existing early education systems offer only audio and video playback; although some allow simple voice or touch interaction, the interactive content must already exist in the system database. In other words, existing early education systems mainly provide an "on demand" function over existing material. Because they have no learning ability, they cannot learn new knowledge encountered in practical use that the system database does not cover, cannot achieve basic functions such as learning objects and counting together with children through lively teaching, and cannot satisfy the need to grow up together with children.
Disclosure of Invention
To address the technical problem that existing children's early education systems lack a self-learning function and therefore cannot learn new knowledge not covered by the system, the invention provides a self-learning method for the early education system of a child accompanying learning robot.
In order to achieve the above technical purpose, the invention adopts the following technical scheme:

A self-learning method of an early education system of a child accompanying learning robot comprises the following steps:
step A10, training a convolutional neural network;
constructing a convolutional neural network model, taking all sample images of the Imagenet data set as input, taking the types of the sample images as labels, and training a convolutional neural network;
step A20, extracting image characteristic information;
performing feature extraction on the input image by adopting the convolutional neural network obtained by training in the step A10, and outputting a feature vector;
step A30, grouping and quantizing the feature vectors;
grouping and quantizing the feature vectors by adopting a product quantization technology to form m sub-feature vectors;
step A40, generating a reference alphabet;
processing all sample images of the Imagenet data set according to the steps A20 and A30 to obtain m sub-feature vectors of each sample image;
for all sample images, sub-feature vectors with the same sequence are taken to form 1 grouped data set, and m grouped data sets are counted in total;
calculating each grouped data set with the K-means algorithm to obtain k_s cluster centers; recording the k_s class centers of each grouped data set as 1 class set, the m class sets together forming a reference alphabet;
presetting a reference alphabet in an input layer of an associative memory model;
step A50, learning new things;
acquiring images and categories of unknown new things from the outside of the early education system;
processing the acquired new object image according to the step A20 and the step A30 to obtain m sub-feature vectors of the new object; traversing m sub-feature vectors of the new object, and searching matched letters in a class set of a reference alphabet, which is the same as the current sub-feature vector sequence of the new object, to obtain a character string of the new object with the length of m;
activating nodes of each letter of a character string of the new object in an input layer of an associative memory model, wherein the activated nodes of the input layer are connected with nodes of an output layer representing the category of the new object in a matching manner by the associative memory model; the associative memory model is a binary neural network comprising an input layer and an output layer, and the nodes of the output layer are preset with object types;
step A60, identifying an object to be identified;
acquiring an image of an object to be identified from the outside of the early education system;
processing the acquired image of the object to be identified according to the step A20 and the step A30 to obtain m sub-feature vectors of the object to be identified; traversing m sub-feature vectors of the object to be recognized, and searching matched letters in a class set of a reference alphabet, which is the same as the current sub-feature vector sequence of the object to be recognized, so as to obtain an object to be recognized character string with the length of m;
and activating nodes of all letters of the character string of the object to be recognized in an input layer of the associative memory model, searching nodes of an output layer matched and connected with the activated nodes of the input layer by the associative memory model, outputting object types corresponding to the nodes of the output layer, and recognizing to obtain the types of the object to be recognized.
For new objects not yet in the early education system, provided an image of the new object is acquired and other children or a teacher tell the system its category, this scheme matches and connects the image of the new object with the category through the self-learning method, updating the matching relationship between object images and categories stored in the associative memory model, so that the child accompanying learning robot achieves incremental learning of objects. Thus, under the teaching of children or other people, new objects and new characters are learned and recognized, and new knowledge is learned together with children in friendly competition, thereby raising children's interest in learning.
Further, the output layer of the associative memory model allocates at least 2 nodes for each object category; when learning a new thing, the output layer node that is connected in match with the active input layer node is the first node under the current thing category that is not connected to the input layer node.
Providing several nodes for each thing category in the output layer of the associative memory model allows things of the same category to be learned multiple times, recording the connection relationships between several instances and the category and thereby improving recognition accuracy.
Further, the associative memory model is a two-layer neural network.
Further, the method for finding, in the m class sets of the reference alphabet, the letters matching the m sub-feature vectors of an object is as follows: compute the distances between the current sub-feature vector and the k_s class centers in the corresponding class set, take the 1 closest class center as 1 letter corresponding to the object, and obtain m letters from the m sub-feature vectors of the object, thereby obtaining the character string of length m corresponding to the object.
Further, the category of a thing and the real name of the thing are stored as key-value pairs in a txt-format file; when a new thing is learned, the real name of the new thing is acquired from outside, and the category corresponding to that real name is looked up in the key-value pair file; if no such category exists, a new key-value pair is inserted to represent the newly learned category, and the category of the thing is input into the associative memory model; when a thing to be recognized is recognized, the associative memory model outputs the category of the thing, and the early education system looks up the real name corresponding to that category in the key-value pair file and outputs it.
Storing key-value pairs in this way saves storage space.
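As an illustration, a minimal Python sketch of such a key-value store follows; the tab-separated "category / real name" layout and the file name are assumptions, since the patent only specifies a txt-format key-value file.

```python
# A minimal sketch of the txt-format key-value pair store described above.
# The tab-separated layout and the file name "categories.txt" are assumptions.

def load_pairs(path="categories.txt"):
    """Load category -> real-name pairs from the txt file."""
    pairs = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            category, name = line.rstrip("\n").split("\t", 1)
            pairs[int(category)] = name
    return pairs

def add_pair(path, category, name):
    """Insert a new key-value pair for a newly learned category."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{category}\t{name}\n")
```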
Further, the convolutional neural network model is trained by using the cross entropy as a loss function, and the weight matrix of each layer of the convolutional neural network is updated according to the calculated value L of the loss function;
wherein the loss function is as shown in equation 1:
$$L = -\frac{1}{N}\sum_{i=1}^{N} y_i \log y'_i \qquad (1)$$
where N denotes the number of sample images input to the convolutional neural network at a time, y_i is the true category label of the i-th sample image, and y'_i is the convolutional neural network's prediction for the i-th sample image.
Further, the convolutional neural network comprises an input layer, convolutional layers, skip connections, pooling layers, a fully connected layer and a classification layer; when training the convolutional neural network, the loss value is calculated from the predicted value output by the classification layer; when the convolutional neural network is used to extract image feature information, the features output by the fully connected layer form the feature vector of the image.
Corresponding to the above self-learning method of the early education system, the invention also provides a child accompanying learning robot, which comprises:
the function selection module is used for starting a new object learning function or an object identification function in the early education system function;
the storage module is used for storing the key-value pair file of thing categories and real names, and for storing the reference alphabet;
the information input module is used for acquiring the image and the real name of the new object when the learning function of the new object is started, or acquiring the image of the object to be identified when the identification function of the object is started;
the information processing module is used, when the new-thing learning function is started, for looking up the category corresponding to the acquired real name of the new thing in the key-value pair file, and then learning the new thing from its image and category according to the above method; when the thing recognition function is started, it recognizes the thing to be recognized from the acquired image according to the above method, and then looks up the real name corresponding to the recognized category in the key-value pair file;
and the information output module is used for outputting the real name of the object to be identified when the object identification function is started.
Further, the information input module comprises a camera and a voice input unit.
Further, the information output module comprises a display screen and a voice output unit.
Advantageous effects
The invention provides a child accompanying learning robot and a self-learning method for its early education system. For a new object that does not exist in the early education system, provided its image is acquired and other children or an instructor tell the system its real name or category, the scheme activates input-layer nodes of the associative memory model according to the character string obtained from the new object image after convolutional neural network feature extraction and product quantization, and connects them with the output-layer nodes of the corresponding category; the matching relationship between object images and categories stored in the associative memory model is updated, and the child accompanying learning robot achieves incremental learning of objects. Thus, under the teaching of children or other people, new objects, characters, arithmetic expressions and the like are learned and recognized, and new knowledge is learned together with children in friendly competition, thereby raising children's interest in learning.
Drawings
A specific embodiment of the invention will be described in detail hereinafter by way of example and not by way of limitation with reference to the accompanying drawings. The attached drawings are as follows:
fig. 1 is a schematic configuration diagram of a child companion robot according to an embodiment of the present invention;
fig. 2 is a block diagram showing the block configuration of a child companionship robot according to an embodiment of the present invention;
FIG. 3 is a schematic view of a child companion robot according to an embodiment of the present invention for recognizing objects, characters, and equations;
FIG. 4 is a schematic diagram of a learning framework of the learning module;
FIG. 5 is an exemplary diagram of alphabet generation in the learning module;
FIG. 6 is an exemplary diagram of the results of feature product quantization in the learning module;
FIG. 7 is a schematic diagram of a learning or recognition process of a companion robot;
FIG. 8 is a schematic diagram of a convolutional neural network model, in which (a) is a schematic diagram of a residual block, (b) is a schematic diagram of a convolutional layer, (c) is a schematic diagram of a max-pooling layer, and (d) is a schematic diagram of a fully connected layer;
FIG. 9 is a convolutional neural network training flow diagram;
FIG. 10 is an exemplary simplified neural network model of the present invention.
Reference numerals: 1-camera, 2-interactive function selection area, 3-voice receiving and output area and 4-display screen.
Detailed Description
The invention will be further described with reference to the accompanying drawings and examples.
The invention discloses a child companion robot whose structure is shown schematically in Fig. 1, comprising a camera 1, an interactive function selection area 2, a voice receiving and output area 3 and a display screen 4. The camera 1 is located at the top of the companion robot and is mainly used for acquiring image information for object recognition from the surrounding environment. The interactive function selection area 2 is located at the front of the robot's body; the user selects the robot's various functions interactively by clicking, dragging, sliding and so on in this area. The voice receiving and output area 3 is provided at the back of the robot's head and is mainly used for recognizing and receiving the user's voice from the surrounding environment, outputting reply voice in response to voice messages, and giving feedback sounds in response to system operations. The display screen 4 is positioned at the front of the robot's head and is mainly used for displaying the robot's response information and expressions. In this embodiment a touch screen serves as both the interactive function selection area and the display screen. For example, if the user answers a question correctly, a smiling face, or similar emotional expressions, may be displayed on the companion robot's display screen.
According to the structure of the child accompanying learning robot, a functional block diagram of an early education system is shown in fig. 2. The early education system of the child companion robot comprises: the device comprises a function selection module, a storage module, an information input module, an information processing module and an information output module.
The early education system of the child companion robot has multiple functions: learning objects, learning characters, counting, singing and so on. The user selects the desired function in the interactive function selection area, and the function selection module of the early education system starts it. The interactive interface of the function selection module can take the form of a touch screen, and functions can also conveniently be selected by voice command.
The early education system has a memory serving as the storage module, used for storing the parameters of the convolutional neural network, the reference alphabet learned from the Imagenet data set, the connection pattern of the binary associative memory model, the categories corresponding to the output nodes of the binary associative memory model (i.e., the category information of things, stored as key-value pairs), and so on.
And the information input module is used for inputting voice information of a user and image information for system identification, such as objects, texts, equations and the like extracted from the surrounding environment of the accompanying robot.
The information processing module mainly processes the voice and image information acquired by the information input module to complete the learning and recognition of target things in the input image; through its learning mechanism, the learning module enables the companion robot to acquire information about objective entities outside the system. Processing of voice information includes speech recognition and understanding; processing of image information includes image preprocessing, image feature extraction and image recognition. Two cases arise:

First, simple recognition: the system directly calls a model trained on the existing database to recognize the image and returns the recognition result; it can recognize common objects and characters and perform simple four-operation arithmetic.

Second, learning new things: for objects, characters and the like not in the system database, provided other children or instructors tell it the category or text information of the new thing, the accompanying learning robot updates the old model with the new data through the learning algorithm of the learning module, achieving incremental learning. After several rounds of learning the new things can be recognized, so that the robot learns and progresses together with the child user, raising children's interest in learning by accompanying their study.
The information output module comprises a voice output module and a display output module. The voice output module outputs the robot's response voice or the system's feedback voice, and plays music in the singing function. The display output module outputs images matching the voice information and response voice, the companion robot's expressions, warning information sent to the guardian, and so on.
The information processing module comprises a learning module and a warning module.
The warning module monitors the time the child spends interacting with the accompanying learning robot; when the child has not interacted with the robot for a certain period, it judges that the child may have slipped away for a long time, and promptly sends warning information to the guardian by voice prompt or on the display screen.
The learning module learns, in a self-learning manner, new things that do not exist in the early education system database, where new things means objects or text information. Using the learning module, the early education system of the invention can learn to recognize new objects, new characters and the like under others' teaching, learning new knowledge together with children, competing and progressing with them, and thereby raising children's interest in learning.
The early education system realizes its learning function through the learning module; it can serve as a child's playmate and, through education in entertainment, learn basic skills such as object recognition, characters and arithmetic together with the child, growing up alongside the child. The method therefore targets new objects or new characters for which no data existed when the recognition model was trained. A learning framework is constructed in the learning module of the system's information processing module, as shown in Fig. 4. The learning module combines a convolutional neural network (CNN) with a binary associative memory model: first a pre-trained convolutional neural network extracts features from a new sample; then product quantization maps the extracted features into a finite alphabet; finally the binary associative memory model stores the new sample, realizing the learning of new samples. During recognition and prediction, the binary associative memory model performs a nearest-neighbour search, realizing the recognition of new categories. The robot can thus learn new objects and text information in a self-learning manner, studying and growing together with children. As Fig. 4 shows, the learning framework mainly comprises the convolutional neural network (CNN) and the binary associative memory model. Convolutional neural networks have strong learning and expressive abilities; features extracted by a convolutional network pre-trained on the millions of images of the Imagenet data set represent an image well.
The invention discloses a self-learning method for the early education system of a child accompanying learning robot, comprising the following steps:

Step A10, training the convolutional neural network:

First, a convolutional neural network model is constructed. The model used in this embodiment is adapted from ResNet and uses 50 residual blocks in total. The structure of a residual block is shown in Fig. 8(a): each residual block comprises three convolutional layers and a skip connection; the convolutional layer is shown schematically in Fig. 8(b), and the skip connection effectively alleviates the vanishing-gradient problem during training. Every 10 residual blocks are followed by a max-pooling layer, shown in Fig. 8(c), which reduces the feature size and extracts the most salient features. Finally come the fully connected layers, shown in Fig. 8(d): the network used here has three fully connected layers in total, the first with 1024 nodes, the second with 2048 nodes and the last with 1000 nodes (the 1000 classes of the Imagenet data set); the result of the last fully connected layer passes through softmax to give the probabilities of the 1000 Imagenet classes. When the network is subsequently used to extract features, the output of the second fully connected layer is used as the feature.
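For illustration, the following PyTorch sketch shows a network of this shape. The three-convolution residual block with a skip connection, the 50-block layout with pooling every 10 blocks, and the 1024/2048/1000 fully connected sizes follow the description above; the channel width, kernel sizes and 224x224 input are assumptions, not the patent's exact configuration.

```python
# A hedged PyTorch sketch of the network shape described above, not the
# patent's exact model. Channel width, kernel sizes and the 224x224 input
# are assumptions; the block layout and fully connected sizes follow the text.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """Three convolutional layers plus a skip connection (Fig. 8(a))."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 1),
        )

    def forward(self, x):
        # The skip connection adds the input to the block output,
        # alleviating vanishing gradients during training.
        return torch.relu(self.body(x) + x)

class EarlyEduCNN(nn.Module):
    def __init__(self, ch=64, num_classes=1000):
        super().__init__()
        layers = [nn.Conv2d(3, ch, 3, padding=1)]
        for _ in range(5):                        # 5 x 10 = 50 residual blocks
            layers += [ResidualBlock(ch) for _ in range(10)]
            layers.append(nn.MaxPool2d(2))        # max pooling after every 10 blocks
        self.backbone = nn.Sequential(*layers)
        self.fc1 = nn.Linear(ch * 7 * 7, 1024)    # 224 -> 7 after five 2x poolings
        self.fc2 = nn.Linear(1024, 2048)          # its output is the feature vector
        self.fc3 = nn.Linear(2048, num_classes)   # classification layer (1000 classes)

    def forward(self, x, return_features=False):
        h = self.backbone(x).flatten(1)
        feat = torch.relu(self.fc2(torch.relu(self.fc1(h))))
        if return_features:
            return feat               # d = 2048 feature used in steps A20-A30
        return self.fc3(feat)         # logits; softmax is applied in the loss
```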
The convolutional neural network model is then trained using cross entropy as the loss function, shown in equation 1, where y_i is the true category label of the i-th sample image and y'_i is the convolutional neural network's prediction for the i-th sample image. Training compares the predicted value with the true label and updates the weight matrix of each layer of the convolutional neural network according to the difference between them; in this embodiment, the cross entropy of equation 1 measures this difference, and the weight matrices of the model's layers are then updated according to the loss value.
$$L = -\frac{1}{N}\sum_{i=1}^{N} y_i \log y'_i \qquad (1)$$
Where N is the batch size, i.e., the number of sample images fed into the training at one time.
Specifically, training the convolutional neural network comprises three parts: forward propagation, loss computation and back-propagation. Forward propagation passes the input image through the convolutional, pooling and fully connected layers to obtain the predicted value y'. Since the true label y of a sample is known, the loss value L can be computed from y and y' according to equation 1; gradient descent then back-propagates the loss and updates the weight matrices of the convolutional neural network.
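A minimal training step under these definitions might look as follows; the optimizer choice and learning rate are illustrative assumptions.

```python
# A minimal sketch of one training step: forward propagation, the
# cross-entropy loss of equation 1, back-propagation and a gradient-descent
# weight update. Optimizer choice and learning rate are assumptions.
import torch
import torch.nn as nn

model = EarlyEduCNN()                      # from the sketch above
criterion = nn.CrossEntropyLoss()          # softmax + cross entropy (equation 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

def train_step(images, labels):
    optimizer.zero_grad()
    logits = model(images)                 # forward propagation
    loss = criterion(logits, labels)       # loss value L
    loss.backward()                        # back-propagation (chain rule)
    optimizer.step()                       # gradient-descent weight update
    return loss.item()
```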
FIG. 10 shows a neural network consisting of two fully connected layers, where i is the input, h the middle layer, o the output layer and w the weights. Forward propagation is computed as follows:
$$h_1 = w_1 i_1 + w_3 i_2 + w_5 i_3 \qquad (2)$$
$$h_2 = w_2 i_1 + w_4 i_2 + w_6 i_3 \qquad (3)$$
$$o_1 = w_7 h_1 + w_9 h_2 \qquad (4)$$
$$o_2 = w_8 h_1 + w_{10} h_2 \qquad (5)$$
can be known as o ═ o1,o2]Obtaining a predicted value of y ═ y 'after the activation of the softmax function'1,y'2]Specifically, the following formula is calculated:
Figure GDA0002522685670000072
Figure GDA0002522685670000081
suppose the true label of the sample is y ═ y1,y2]The loss values are as follows:
Figure GDA0002522685670000082
the above is the process of forward propagation and calculating the loss value, and the process of backward propagation and updating the weight parameter is described in detail below.
By the chain rule, the gradient of the loss value L with respect to the parameter w_1 is:

$$\frac{\partial L}{\partial w_1} = \left(\frac{\partial L}{\partial o_1}\frac{\partial o_1}{\partial h_1} + \frac{\partial L}{\partial o_2}\frac{\partial o_2}{\partial h_1}\right)\frac{\partial h_1}{\partial w_1} \qquad (9)$$

The gradients of the other weight parameters are computed similarly, and the parameters are updated by:

$$w_i \leftarrow w_i - \alpha\,\frac{\partial L}{\partial w_i} \qquad (10)$$
Here α is the learning rate. The above completes one round of learning of the neural network's weight parameters. With continued learning, the loss value L decreases steadily until it converges to some value or training ends; this completes the training of the convolutional neural network and yields the optimized weight matrices. The convolutional neural network with these weight matrices is the trained convolutional neural network; when it is subsequently used to output feature vectors, the feature vector is output by the fully connected layer preceding the classification layer.
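The derivation above can be checked numerically; the following NumPy sketch implements the toy network of Fig. 10 with assumed random weights and an assumed one-hot label.

```python
# A runnable NumPy check of equations (2)-(10) for the toy two-layer
# network of Fig. 10 (3 inputs, 2 hidden units, 2 outputs, no biases).
# The random weights and the one-hot label are assumptions.
import numpy as np

rng = np.random.default_rng(0)
i = rng.normal(size=3)                  # input [i1, i2, i3]
W1 = rng.normal(size=(2, 3))            # w1..w6 as a 2x3 matrix
W2 = rng.normal(size=(2, 2))            # w7..w10 as a 2x2 matrix
y = np.array([1.0, 0.0])                # assumed true label [y1, y2]
alpha = 0.1                             # learning rate

h = W1 @ i                              # equations (2)-(3)
o = W2 @ h                              # equations (4)-(5)
y_pred = np.exp(o) / np.exp(o).sum()    # softmax, equations (6)-(7)
L = -(y * np.log(y_pred)).sum()         # cross-entropy loss, equation (8)

# Back-propagation: for softmax with cross entropy, dL/do = y_pred - y;
# the chain rule of equation (9) carries this back to every weight.
d_o = y_pred - y
d_W2 = np.outer(d_o, h)                 # dL/dW2
d_h = W2.T @ d_o
d_W1 = np.outer(d_h, i)                 # dL/dW1

W1 -= alpha * d_W1                      # parameter update, equation (10)
W2 -= alpha * d_W2
```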
Step A20, extracting image feature information: the trained convolutional neural network performs feature extraction on the input image, forming a feature vector of dimension d. The extracted feature vector is the output of the fully connected layer preceding the classification layer of the convolutional neural network.
Step A30, grouping and quantizing the feature vectors: the d-dimensional feature vector is grouped and quantized by product quantization into m sub-feature vectors, each of dimension d/m.
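The grouping itself is a simple split of the feature vector, as in the following sketch (d is assumed divisible by m):

```python
# A one-line sketch of step A30: split a d-dimensional feature vector into
# m sub-feature vectors of dimension d/m (d is assumed divisible by m).
import numpy as np

def split_feature(x, m):
    """Return the m sub-feature vectors of feature vector x."""
    return np.split(np.asarray(x), m)

# e.g. a d = 2048 CNN feature split into m = 4 sub-vectors of dimension 512
subs = split_feature(np.zeros(2048), m=4)
```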
Step A40, generating a reference alphabet:

Acquire the n sample images of the Imagenet data set and process each one according to steps A20 and A30, so that each of the n sample images yields m sub-feature vectors.

For all sample images, take the sub-feature vectors with the same sequence number: the same-numbered sub-feature vectors of the n sample images form 1 grouped data set, and since each sample has m sub-feature vectors there are m grouped data sets in total. "Same sequence number" means that the m sub-feature vectors of each sample image are numbered in order, and the sub-feature vectors bearing a given number are taken from every sample image to form the grouped data set with that number. For example, the 1st sub-feature vector of each sample image is taken, and the 1st sub-feature vectors of all sample images form the 1st grouped data set; then the 2nd sub-feature vectors of all sample images form the 2nd grouped data set; and so on, until the m grouped data sets are obtained.
The K-means algorithm computes k_s class centers for each grouped data set; the k_s class centers of a grouped data set are recorded as 1 class set, and the m class sets together form the reference alphabet.
In the example shown in Fig. 5, assume there are n samples in total, the extracted feature dimension is d = 12, the number of groups is m = 4, and the number of class centers per group is k_s = 4. The quantization result is shown in Fig. 6: the resulting alphabet comprises 4 letter sets (i.e., the m class sets, j = 1, 2, 3, 4), each containing 4 letters (i.e., the k_s class centers, denoted C_ij for the j-th class center of the i-th group of sub-feature vectors). The alphabet obtained from the Imagenet data set is used as the reference alphabet; Imagenet contains thousands of classes and millions of pictures, so the letters learned from this data effectively summarize image features, improving the recognition rate of things.
The reference alphabet is then preset in the input layer of the associative memory model.
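A hedged sketch of step A40 follows, using scikit-learn's KMeans for illustration; the patent does not prescribe a particular K-means implementation.

```python
# A sketch of step A40: run K-means independently on each of the m grouped
# data sets and collect the k_s class centers of each group into a class
# set; the m class sets form the reference alphabet.
import numpy as np
from sklearn.cluster import KMeans

def build_reference_alphabet(features, m, k_s):
    """features: (n, d) array of CNN features of all sample images.
    Returns m arrays of shape (k_s, d/m): one class set per group."""
    groups = np.split(features, m, axis=1)   # i-th sub-vectors of all samples
    return [KMeans(n_clusters=k_s, n_init=10).fit(g).cluster_centers_
            for g in groups]

# e.g. with d = 12, m = 4, k_s = 4 as in the Fig. 5 example
alphabet = build_reference_alphabet(np.random.rand(1000, 12), m=4, k_s=4)
```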
Step A50, learning new things:

When the early education system starts the new-thing learning function, as shown in Fig. 4, the companion robot acquires through its camera an image of a new object unknown to the system, and another child or an instructor tells the robot system the real name of the unknown new object by voice.

The acquired new object image is processed according to step A20 to obtain the feature vector x_d of the new object, and then according to step A30 to obtain the m sub-feature vectors x_d^1, x_d^2, ..., x_d^m, where the superscript i in x_d^i denotes the sequence number of the sub-feature vector.

Among the k_s class centers of the i-th class set of the reference alphabet, the letter best matching the i-th sub-feature vector of the new object is sought, until all m sub-feature vectors of the new object have found their best-matching letters, yielding a new object character string of length m. Since each class set is obtained from the sample sub-feature vectors with the same sequence number, the matched letter is looked up in the class set with the same sequence number when learning a new thing.
The best-matching letter is found as follows: compute the distances between the i-th sub-feature vector and the k_s class centers of the i-th class set, and take the class center at minimum distance as the 1 letter corresponding to the new object; the m sub-feature vectors of the new object thus yield m letters, giving the character string of length m that corresponds to the new object and is output to the associative memory model. For example, in Fig. 7 the 1st sub-feature vector finds C_11 as the closest of its 4 class centers, the 2nd sub-feature vector finds C_23, the 3rd finds C_34 and the 4th finds C_42, so the character string obtained by product quantization of the new object is C11 C23 C34 C42.
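This nearest-class-center lookup can be sketched as follows; representing a letter C_ij by the index pair (i, j) is an implementation assumption.

```python
# A sketch of the string lookup: each sub-feature vector is replaced by
# its nearest class center in the class set of the same sequence number.
# Representing a letter C_ij as the index pair (i, j) is an assumption.
import numpy as np

def encode_string(x, alphabet):
    """x: d-dim feature vector; alphabet: list of m (k_s, d/m) arrays.
    Returns the length-m string as (group, letter) index pairs."""
    m = len(alphabet)
    string = []
    for i, sub in enumerate(np.split(np.asarray(x), m)):
        dists = np.linalg.norm(alphabet[i] - sub, axis=1)  # distance to each center
        string.append((i, int(np.argmin(dists))))          # nearest letter
    return string

# e.g. [(0, 0), (1, 2), (2, 3), (3, 1)] corresponds (with 0-based indices)
# to the string C11 C23 C34 C42 of the Fig. 7 example
```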
The category corresponding to the real name of the new object is looked up in the txt-format key-value pair file stored in the storage module, and the node corresponding to that category is found in the output layer of the associative memory model. If the category is not found in the key-value pair file, the thing has not been learned before, and a new key-value pair is inserted into the file.
Each letter of the new object character string activates a corresponding node in an input layer of the associative memory model, and the associative memory model connects the input layer node activated by the new object character string with an output layer node representing a new object category in a matching manner.
In this embodiment, the associative memory model is a binary neural network including an input layer and an output layer, and the nodes of the output layer are preset with object types.
The output layer of the associative memory model allocates at least 2 nodes for each object category. When a new object is learned, the output-layer node matched and connected with the activated input-layer nodes is the first node under the current category that is not yet connected to input-layer nodes; once all the nodes of a category have been allocated, objects of that category have been fully learned into the system. By learning objects of the same category several times and recording the connection relationships between several instances and the category, recognition accuracy is improved.
The associative memory model in this embodiment is implemented as a two-layer neural network: the input layer consists of m groups of nodes, each group containing k_s nodes, and stores the k_s class centers of each of the m class sets of the reference alphabet; the output layer comprises R·C nodes, where C is the total number of new-object categories that can be learned and R is the number of nodes the associative memory model allocates to each category (C = 9 and R = 2 in Fig. 7). In practice R and C can take very large values, so incremental learning over very many categories and large data volumes is achieved with only a small storage requirement.
For example, in Fig. 7 the letters C11, C23, C34, C42 obtained by product quantization of the new thing to be learned are input into the associative memory model and activate the corresponding nodes of the input layer (since the reference alphabet is preset in the input-layer nodes, "corresponding nodes" means the input-layer nodes that store the letters C11, C23, C34 and C42 respectively). The associative memory model assigns the new-object category (assume category "6") nodes at the output layer and activates the first unconnected node of category 6, namely η_61; then the input-layer nodes C11, C23, C34, C42 are each matched and connected with the new-object category node η_61. Since the associative memory model in the invention is a binary memory model, there are no connection weights; a connection either exists or it does not. When all R nodes of a category have been allocated, objects of that category have been fully learned into the system, thereby realizing the learning of new samples.
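The binary associative memory can be sketched as a boolean connection matrix; the node numbering and the per-category bookkeeping below are illustrative assumptions.

```python
# A hedged sketch of the binary associative memory model: a boolean matrix
# over m*k_s input nodes and R*C output nodes. Node numbering and the
# per-category bookkeeping are assumptions for illustration.
import numpy as np

class BinaryAssociativeMemory:
    def __init__(self, m, k_s, num_categories, nodes_per_category):
        self.k_s = k_s
        self.R, self.C = nodes_per_category, num_categories
        # conn[p, q] is True iff input node p is connected to output node q;
        # binary connections only, no weights
        self.conn = np.zeros((m * k_s, self.R * self.C), dtype=bool)
        self.used = np.zeros(self.C, dtype=int)  # output nodes allocated per category

    def _input_nodes(self, string):
        # letter (i, j) occupies input node i*k_s + j (group i holds k_s nodes)
        return [i * self.k_s + j for i, j in string]

    def learn(self, string, category):
        """Connect the activated input nodes to the first unconnected
        output node of the given category (step A50)."""
        if self.used[category] >= self.R:
            return                               # category fully learned
        out = category * self.R + self.used[category]
        self.conn[self._input_nodes(string), out] = True
        self.used[category] += 1

    def recognize(self, string):
        """Nearest-neighbour search: return the category whose output node
        matches the most activated input nodes (step A60)."""
        votes = self.conn[self._input_nodes(string)].sum(axis=0)
        return int(np.argmax(votes)) // self.R
```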
The companion robot's learning of new things resembles a child's own learning process: with repeated learning, its recognition accuracy gradually improves, creating a situation of learning and progressing together with the child. Accompanying the child's study in the role of a classmate, rather than teaching in the role of a teacher, dispels children's fear and weariness of learning and raises their interest in it.
Existing systems store content such as objects and characters internally in various forms and call it up when needed, so they lack physical objects. The things recognized by the accompanying learning robot are external objective entities: their images are acquired by the camera using computer vision, objects, characters and arithmetic expressions outside the system are recognized through self-learning, the names of things are spoken aloud, and the results of characters and expressions are read out, which improves on the interest and realism of existing systems.
Step A60, recognizing the object to be recognized:

When the object recognition function of the early education system is started, the companion robot acquires an image of the object to be recognized from outside the system through the camera.

The acquired image of the object to be recognized is processed according to step A20 to obtain its feature vector x_d, and then according to step A30 to obtain its m sub-feature vectors x_d^1, x_d^2, ..., x_d^m, where the superscript i denotes the sequence number of the sub-feature vector.

Among the k_s class centers of the i-th class set of the reference alphabet, the letter best matching the i-th sub-feature vector of the object to be recognized is sought, until all m sub-feature vectors have found their best-matching letters, yielding a character string of length m for the object to be recognized. The best-matching letter is found as follows: compute the distances between the i-th sub-feature vector and the k_s class centers, take the class center at minimum distance as the 1 letter corresponding to the object to be recognized, and obtain m letters from its m sub-feature vectors, giving the length-m character string that is output to the associative memory model.
Each letter of the character string of the object to be recognized activates the corresponding node of the associative memory model's input layer; the associative memory model searches for the output-layer nodes matched and connected with the activated input-layer nodes and outputs the object category corresponding to those output-layer nodes.
The real name corresponding to the category of the object to be recognized is looked up in the txt-format key-value pair file stored in the storage module, and the child companion robot outputs the real name through the display screen or by voice.
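Putting the pieces together, recognition can be sketched end to end as follows; every name here is an assumption carried over from the earlier sketches.

```python
# A hedged end-to-end sketch of step A60 composed from the sketches above.
# `extract_feature`, `alphabet`, `memory` and `names` (the category-to-name
# key-value mapping) are assumed to have been built as described earlier.

def recognize_thing(image):
    feat = extract_feature(image)             # step A20: CNN feature vector
    string = encode_string(feat, alphabet)    # step A30 + alphabet lookup
    category = memory.recognize(string)       # associative-memory search
    return names.get(category, "unknown")     # real name from key-value file
```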
In the present invention, things may take various forms such as objects, characters and arithmetic expressions, and the category of a thing refers to the code assigned to each distinct thing. When the thing is an object, its real name is the object's Chinese name, pinyin and/or English name; when the thing is a character, its real name is the character's pinyin, definition and so on; when the thing is an arithmetic expression, its real name is the result of the calculation.
Fig. 3 illustrates the child companion robot of the invention recognizing objects, characters and arithmetic expressions. As shown in Fig. 3, when the companion robot sees a thing through the overhead camera, it automatically recognizes it and tells the user its name. For example, with the early education system in the object recognition function, when the companion robot sees an apple placed in front of it, it automatically recognizes that the object is an apple and the display screen actively shows the object's name; alternatively, it gives the answer when the user asks what it is. The real name can be output by voice, and the two characters for "apple", their pinyin, the English word, or both can be displayed on the screen. Similarly, in the character recognition function, when the companion robot sees a Chinese character placed in front of it (such as the character for "chess"), it automatically recognizes it and actively outputs its pinyin and definition, or answers when asked. In addition to recognizing objects and characters, the child companion robot can perform the four arithmetic operations of addition, subtraction, multiplication and division. For example, in the arithmetic function, when the companion robot sees a card reading "6 + 4" placed in front of it, or hears the user ask "what is 6 + 4?", it tells the user by voice that 6 + 4 = 10 while displaying "6 + 4 = 10" on the screen.
If the target of recognition is an object category already present in the Imagenet data set pictures or one already learned as a new thing, the companion robot recognizes it immediately, telling the user the thing's real name and related information as soon as it is recognized. Object categories present in the image data set are pre-learned into the associative memory model by the companion robot system, using the same learning method as the incremental learning of step A50. If the target has not been learned by the system, the companion robot can start the early education system's new-thing learning function; once the user tells it the new thing's category, it learns that category, so that it correctly recognizes the thing the next time it sees it.
For example, if the pictures of the original Imagenet data set contain data for pears, the child companion robot can immediately recognize a pear and tell the user that it is a pear. For an apple not in the initial training database, a parent or other instructor can tell the companion robot that it is an apple when its camera sees one, and the robot achieves incremental learning through the new-thing learning function, so that it recognizes the apple together with the child user the next time.
The companion robot also has a supervision function: it monitors the time since the child last interacted with it, and if the child has not interacted for a certain period it judges that the child may have slipped away for a long time, promptly sending warning information to the guardian by voice prompt or on the display screen to prevent unsafe situations; at the same time, the voice prompts and screen display attract children's interest, encouraging more interaction with the robot.
The child accompanying learning robot and child early education system apply robotics, image processing and pattern recognition to the companionship of preschool children, responding intelligently to the user's actions and states and giving a good user experience. Compared with existing early education robot systems, the system of the invention has the following advantages: 1) it has a learning function and can learn to recognize new objects and new characters under the guidance of relevant persons, achieving the goal of learning and growing together with children and saving parents time and energy in early education; 2) it positions itself as the child's classmate rather than the child's teacher, dispelling children's fear and weariness of learning; 3) it recognizes external objective entities from images acquired by the camera, giving better interest and realism; 4) it has a supervision function and can send warning information to the guardian when the child's behaviour is abnormal, for example if the child has been absent for a long time.
Accordingly, those skilled in the art should understand that, although exemplary embodiments of the invention have been illustrated and described in detail herein, many other variations and modifications consistent with the principles of the invention may be made from this disclosure without departing from its spirit and scope. The scope of the invention should therefore be understood to cover all such variations and modifications.

Claims (10)

1. A self-learning method of an early education system of a child accompanying learning robot, characterized by comprising the following steps:
step A10, training a convolutional neural network;
constructing a convolutional neural network model, taking all sample images of the Imagenet data set as input, taking the types of the sample images as labels, and training a convolutional neural network;
step A20, extracting image characteristic information;
performing feature extraction on the input image by adopting the convolutional neural network obtained by training in the step A10, and outputting a feature vector;
step A30, grouping and quantizing the feature vectors;
grouping and quantizing the feature vectors by adopting a product quantization technology to form m sub-feature vectors;
step A40, generating a reference alphabet;
processing all sample images of the Imagenet data set according to the steps A20 and A30 to obtain m sub-feature vectors of each sample image;
for all sample images, sub-feature vectors with the same sequence are taken to form 1 grouped data set, and m grouped data sets are counted in total;
calculating each grouped data set with the K-means algorithm to obtain k_s cluster centers; recording the k_s class centers of each grouped data set as 1 class set, the m class sets together forming a reference alphabet;
presetting a reference alphabet in an input layer of an associative memory model;
step A50, learning new things;
acquiring images and categories of unknown new things from the outside of the early education system;
processing the acquired new object image according to the step A20 and the step A30 to obtain m sub-feature vectors of the new object; traversing m sub-feature vectors of the new object, and searching matched letters in a class set of a reference alphabet, which is the same as the current sub-feature vector sequence of the new object, to obtain a character string of the new object with the length of m;
activating nodes of each letter of a character string of the new object in an input layer of an associative memory model, wherein the activated nodes of the input layer are connected with nodes of an output layer representing the category of the new object in a matching manner by the associative memory model; the associative memory model is a binary neural network comprising an input layer and an output layer, and the nodes of the output layer are preset with object types;
step A60, identifying an object to be identified;
acquiring an image of an object to be identified from the outside of the early education system;
processing the acquired image of the object to be identified according to the step A20 and the step A30 to obtain m sub-feature vectors of the object to be identified; traversing m sub-feature vectors of the object to be recognized, and searching matched letters in a class set of a reference alphabet, which is the same as the current sub-feature vector sequence of the object to be recognized, so as to obtain an object to be recognized character string with the length of m;
and activating nodes of all letters of the character string of the object to be recognized in an input layer of the associative memory model, searching nodes of an output layer matched and connected with the activated nodes of the input layer by the associative memory model, outputting object types corresponding to the nodes of the output layer, and recognizing to obtain the types of the object to be recognized.
2. The method of claim 1, wherein the output layer of the associative memory model assigns at least 2 nodes for each transaction category; when learning a new thing, the output layer node that is connected in match with the active input layer node is the first node under the current thing category that is not connected to the input layer node.
3. The method of claim 1, wherein the associative memory model is a two-layer neural network.
4. The method of claim 1, wherein the method of finding the letters matching the m sub-feature vectors of the thing in the m class sets of the reference alphabet comprises: computing the distances between the current sub-feature vector and the k_s class centers in the corresponding class set, taking the 1 closest class center as 1 letter corresponding to the thing, and obtaining m letters from the m sub-feature vectors of the thing, thereby obtaining the character string of length m corresponding to the thing.
5. The method of claim 1, wherein the category of the thing and the true name of the thing are stored in a txt format key-value pair file; when a new object is learned, acquiring the real name of the new object from the outside, and then searching the category corresponding to the real name in the key value pair file; if the category information does not exist, inserting a new key value pair to represent the newly learned category, and inputting the category of the matter into the associative memory model; when the object to be recognized is recognized, the associative memory model outputs the class of the object to be recognized, and the early education system outputs the real name of the object to be recognized by searching the real name corresponding to the class in the key value pair file.
6. The method of claim 1, wherein the convolutional neural network model is trained using cross entropy as a loss function, and the weight matrices for the layers of the convolutional neural network are updated according to the calculated values L of the loss function;
wherein the loss function is as shown in equation 1:
$$L = -\frac{1}{N}\sum_{i=1}^{N} y_i \log y'_i \qquad (1)$$
where N denotes the number of sample images input to the convolutional neural network at a time, y_i is the true category label of the i-th sample image, and y'_i is the convolutional neural network's prediction for the i-th sample image.
7. The method of claim 6, wherein the convolutional neural network comprises an input layer, a convolutional layer, a skip layer connection, a pooling layer, a full connection layer, and a classification layer; when the convolutional neural network is trained, calculating a loss function value according to a predicted value output by the classification layer; when the convolutional neural network is used for extracting the image feature information, the feature vector of the image is formed by the features output by the full connection layer.
8. A child accompanying learning robot, comprising:
the function selection module, used to start either the new-object learning function or the object recognition function of the early education system;
the storage module, used to store the key-value pair file of object categories and real names, and to store the reference alphabet;
the information input module, used to acquire the image and real name of a new object when the new-object learning function is started, or to acquire the image of the object to be recognized when the object recognition function is started;
the information processing module, used, when the new-object learning function is started, to look up in the key-value pair file the category corresponding to the acquired real name of the new object and then to learn the new object from its image and category according to the method of claim 1; and, when the object recognition function is started, to recognize the object to be recognized from its acquired image according to the method of claim 1 and then to look up in the key-value pair file the real name corresponding to the recognized category;
and the information output module, used to output the real name of the recognized object when the object recognition function is started.
9. The robot of claim 8, wherein the information input module comprises a camera and a voice input unit.
10. The robot of claim 8, wherein the information output module comprises a display screen and a voice output unit.
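To close the picture, here is a speculative sketch of how the modules of claims 8-10 might fit together at the top level: the function selection module routes to either new-object learning or object recognition, with the camera/voice units behind the input and output modules. Every class, method, and parameter name here is an assumption made for illustration, not the patent's implementation.

```python
# A hedged top-level sketch of the claims 8-10 module flow.
class EarlyEducationRobot:
    def __init__(self, learner, recognizer, kv_pairs):
        self.learner = learner          # learn(image, category), per claim 1
        self.recognizer = recognizer    # recognize(image) -> category, per claim 1
        self.kv_pairs = kv_pairs        # category -> real name (claim 5 file)

    def run(self, function, image, real_name=None):
        if function == "learn":         # new-object learning function
            # Look up (or create) the category for the real name, as in claim 5.
            category = next((c for c, n in self.kv_pairs.items()
                             if n == real_name), None)
            if category is None:
                category = f"C{len(self.kv_pairs):04d}"   # assumed id scheme
                self.kv_pairs[category] = real_name
            self.learner(image, category)
            return f"Learned a new object: {real_name}"
        if function == "recognize":     # object recognition function
            category = self.recognizer(image)
            name = self.kv_pairs.get(category, "something I don't know yet")
            return f"I think this is a {name}"  # sent to screen / voice output
        raise ValueError(f"unknown function {function!r}")

robot = EarlyEducationRobot(
    learner=lambda image, category: None,    # stand-in for claim 1 learning
    recognizer=lambda image: "C0000",        # stand-in for claim 1 recognition
    kv_pairs={"C0000": "giraffe"},
)
print(robot.run("recognize", image=None))    # -> I think this is a giraffe
```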
CN201811367002.9A 2018-11-16 2018-11-16 Child accompanying learning robot and early education system self-learning method thereof Active CN109559576B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811367002.9A CN109559576B (en) 2018-11-16 2018-11-16 Child accompanying learning robot and early education system self-learning method thereof

Publications (2)

Publication Number Publication Date
CN109559576A (en) 2019-04-02
CN109559576B (en) 2020-07-28

Family

ID=65866326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811367002.9A Active CN109559576B (en) 2018-11-16 2018-11-16 Child accompanying learning robot and early education system self-learning method thereof

Country Status (1)

Country Link
CN (1) CN109559576B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112036208B (en) * 2019-05-17 2021-10-08 深圳市希科普股份有限公司 Artificial intelligence sprite based on interactive learning system
CN110781861A (en) * 2019-11-06 2020-02-11 上海谛闲工业设计有限公司 Electronic equipment and method for universal object recognition
CN111078008B (en) * 2019-12-04 2021-08-03 东北大学 Control method of early education robot
CN113673795A (en) * 2020-05-13 2021-11-19 百度在线网络技术(北京)有限公司 Method and device for acquiring online teaching material content and intelligent screen equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160140438A1 (en) * 2014-11-13 2016-05-19 Nec Laboratories America, Inc. Hyper-class Augmented and Regularized Deep Learning for Fine-grained Image Classification
US10423874B2 (en) * 2015-10-02 2019-09-24 Baidu Usa Llc Intelligent image captioning

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729459A (en) * 2014-01-10 2014-04-16 北京邮电大学 Method for establishing sentiment classification model
CN104866855A (en) * 2015-05-07 2015-08-26 华为技术有限公司 Image feature extraction method and apparatus
CN105512685A (en) * 2015-12-10 2016-04-20 小米科技有限责任公司 Object identification method and apparatus
CN105913087A (en) * 2016-04-11 2016-08-31 天津大学 Object identification method based on optimal pooled convolutional neural network
CN106409290A (en) * 2016-09-29 2017-02-15 深圳市唯特视科技有限公司 Infant intelligent voice education method based on image analysis
CN107918782A (en) * 2016-12-29 2018-04-17 中国科学院计算技术研究所 A kind of method and system for the natural language for generating description picture material
CN108304846A (en) * 2017-09-11 2018-07-20 腾讯科技(深圳)有限公司 Image-recognizing method, device and storage medium
CN108460399A (en) * 2017-12-29 2018-08-28 华南师范大学 A kind of child building block builds householder method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Online associative memory learning with self-organizing decision trees*** et al.; 《模式识别与人工智能》 (Pattern Recognition and Artificial Intelligence); 2017-01-31; Vol. 30, No. 1; pp. 21-30 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant