CN117747101A - Cognitive rehabilitation robot system and control method thereof - Google Patents

Cognitive rehabilitation robot system and control method thereof

Info

Publication number
CN117747101A
Authority
CN
China
Prior art keywords
user
server
mixed reality
training
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311716557.0A
Other languages
Chinese (zh)
Inventor
吴剑煌
王浩宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Huaquejing Medical Technology Co ltd
Original Assignee
Shenzhen Huaquejing Medical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Huaquejing Medical Technology Co ltd
Priority to CN202311716557.0A
Publication of CN117747101A

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The invention provides a cognitive rehabilitation robot system and a control method thereof. The system comprises terminal equipment, mixed reality equipment and a server, wherein the terminal equipment and the mixed reality equipment are both in communication connection with the server. The terminal equipment is used for receiving capability assessment content information of a user when the preliminary test of the user passes and sending the capability assessment content information to the mixed reality equipment. The mixed reality equipment is used for displaying a corresponding assessment scene based on the capability assessment content information, collecting real-time images of the user during the assessment process and sending the real-time images to the server. The server is used for receiving the real-time images sent by the mixed reality equipment, determining the current state of the user based on the real-time images, carrying out capability assessment based on the current states of the user corresponding to the received continuous real-time images, and sending the capability assessment result to the terminal equipment. The invention realizes automatic capability assessment of the user based on the detected current state of the user, and improves assessment efficiency and accuracy.

Description

Cognitive rehabilitation robot system and control method thereof
Technical Field
The invention relates to the technical field of rehabilitation robots, in particular to a cognitive rehabilitation robot system and a control method thereof.
Background
Existing cognitive rehabilitation robots usually evaluate a user's capability with assessment scales, which require manual participation in the interaction and a large investment of manpower and material resources. This evaluation mode is inefficient, the capability evaluation result is relatively subjective, and the evaluation accuracy is low.
Disclosure of Invention
Accordingly, the invention aims to provide a cognitive rehabilitation robot system and a control method thereof, which realize automatic capability assessment of a user based on the detected gaze point position, hand position or gesture of the user, provide a convenient assessment mode, and improve assessment efficiency and accuracy.
In order to achieve the above object, the technical scheme adopted by the embodiment of the invention is as follows:
in a first aspect, an embodiment of the present invention provides a cognitive rehabilitation robot system, including: the system comprises terminal equipment, mixed reality equipment and a server, wherein the terminal equipment and the mixed reality equipment are both in communication connection with the server;
the terminal equipment is used for receiving the capability assessment content information of the user when the preliminary test of the user passes and sending the capability assessment content information to the mixed reality equipment;
the mixed reality equipment is used for displaying a corresponding evaluation scene based on the capability evaluation content information, collecting a real-time image of a user in an evaluation process, and sending the real-time image to the server so as to enable the server to perform capability evaluation;
The server is used for receiving the real-time image sent by the mixed reality equipment, determining the current state of the user based on the real-time image, performing capability assessment based on the received current state of the user corresponding to the continuous multiple real-time images, and sending the capability assessment result to the terminal equipment; wherein the user current state includes any one or more of gaze point location, hand location, and gestures.
Further, the present embodiment provides a first possible implementation manner of the first aspect, wherein the current state of the user includes a gaze point position;
the server is further configured to determine a current gaze point position of a user based on the real-time image, send out a ray based on the current gaze point position as an origin, and perform collision detection on the ray and a target model displayed by the mixed reality device, so as to determine whether the user gazes at the target model, determine a total gaze time length of the user gazing at the target model and a longest continuous gaze time length of the user gazing at the target model based on the real-time image continuously sent by the mixed reality device, and evaluate attention capability based on the longest continuous gaze time length and the total gaze time length.
Further, the present embodiment provides a second possible implementation manner of the first aspect, wherein the current state of the user includes a gesture and a hand position, and the real-time image includes an RGB image and a depth image;
the mixed reality equipment is also used for displaying the target model and sending out an action instruction;
the server is further configured to identify a gesture of a user in the real-time image based on a gesture identification model obtained by training in advance when the action instruction is a grasping target model, calculate a center point position corresponding to each fingertip position when the gesture is the grasping gesture, record the center point position as an average position, determine that the gesture is correct when the average position is located inside a bounding box of the target model, record the current time as gesture completion time, and evaluate hand-eye coordination capability based on a start time when the action instruction is issued and the gesture completion time.
Further, the present embodiment provides a third possible implementation manner of the first aspect, wherein the current state of the user includes a hand position;
the server is further configured to identify, in a plurality of continuous real-time images, the joint position of the user's extended finger when the action instruction is to touch the target model, determine that the touch action is correct when the joint position moves from outside the bounding box of the target model to inside the bounding box, record the current time as the touch action completion time, and evaluate the hand-eye coordination capability based on the start time at which the action instruction was issued and the touch action completion time.
Further, the embodiment of the present invention provides a fourth possible implementation manner of the first aspect, where the server is further configured to generate a corresponding training task based on the capability assessment result, and send the training task to the mixed reality device;
the mixed reality equipment is used for displaying a training scene corresponding to the training task, sending a training instruction, collecting training images of a user in the training process, and sending the training images to the server; wherein the training image comprises an RGB image and a depth image;
the server is further configured to determine whether each training action of the user is correct based on the training image and the training instruction, determine a training action correct rate after the training task is completed, and send the training action correct rate to the terminal device, so that the terminal device displays the training action correct rate.
Further, the present embodiment provides a fifth possible implementation manner of the first aspect, wherein the server is further configured to send an eye movement calibration instruction to the mixed reality device when a new user uses the mixed reality device;
the mixed reality device is further used for displaying a calibration picture when receiving the eye movement calibration command, collecting eye images of a user and sending the eye images to the server;
The server is configured to receive the eye image and determine an amount of identification deviation of the gaze point location based on the eye image.
Further, the embodiment of the present invention provides a sixth possible implementation manner of the first aspect, wherein the calibration screen includes a plurality of calibration points;
the server is further used for identifying the center point coordinates of the pupil area based on the eye images, obtaining the fixation point position, calculating the distance between the fixation point position and the calibration point, determining that the user is looking at the calibration point when the distance is smaller than a preset distance threshold, recording the fixation point coordinates when the user looks at the calibration point, and identifying the deviation amount of the fixation point position based on the fixation point coordinates and the set coordinates of the calibration point.
Further, the embodiment of the present invention provides a seventh possible implementation manner of the first aspect, where the mixed reality device is further configured to display a test picture and send a voice test instruction when receiving the test instruction, collect a test image of a user, and send the test image to the server;
the server is also used for judging whether the user action is correct or not based on the test image, and determining that the preliminary test of the user passes when the user action is correct.
Further, the embodiment of the present invention provides an eighth possible implementation manner of the first aspect, wherein the mixed reality device includes a head-mounted mixed reality device, and the terminal device includes a computer and/or a mobile terminal.
In a second aspect, an embodiment of the present invention further provides a control method of a cognitive rehabilitation robot system, which is applied to the cognitive rehabilitation robot system in any one of the first aspect, where the control method of the cognitive rehabilitation robot system includes:
when the preliminary test of the user passes, receiving the capacity evaluation content information of the user based on the terminal equipment;
transmitting the capability assessment content information to the mixed reality equipment so that the mixed reality equipment displays a corresponding assessment scene based on the capability assessment content information, and acquiring a real-time image of a user in an assessment process;
the real-time images are sent to the server, so that the server determines the current state of a user based on the real-time images, performs capability assessment based on the received current state of the user corresponding to a plurality of continuous real-time images, and sends the capability assessment result to the terminal equipment; wherein the user current state includes any one or more of gaze point location, hand location, and gestures.
The embodiment of the invention provides a cognitive rehabilitation robot system and a control method thereof. The cognitive rehabilitation robot system comprises terminal equipment, mixed reality equipment and a server, wherein the terminal equipment and the mixed reality equipment are both in communication connection with the server. The terminal equipment is used for receiving the capability assessment content information of the user when the preliminary test of the user passes and sending the capability assessment content information to the mixed reality equipment. The mixed reality equipment is used for displaying a corresponding assessment scene based on the capability assessment content information, collecting real-time images of the user in the assessment process, and sending the real-time images to the server so that the server carries out capability assessment. The server is used for receiving the real-time images sent by the mixed reality equipment, determining the current state of the user based on the real-time images, carrying out capability assessment based on the current states of the user corresponding to the received continuous real-time images, and sending the capability assessment result to the terminal equipment; the current state of the user includes any one or more of the gaze point position, the hand position and gestures. After the preliminary test of the user passes, the invention displays the assessment scene on the mixed reality equipment, collects real-time images of the user and evaluates the user's capability according to the state changes of the user reflected in those images. Automatic capability assessment of the user based on the detected gaze point position, hand position or gesture is thereby realized; the assessment mode is convenient, and the assessment efficiency and accuracy are improved.
Additional features and advantages of embodiments of the invention will be set forth in the description which follows, or in part will be obvious from the description, or may be learned by practice of the embodiments of the invention.
In order to make the above objects, features and advantages of the present invention more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 shows a schematic structural diagram of a cognitive rehabilitation robot system according to an embodiment of the present invention;
FIG. 2a is a diagram illustrating a calibration screen according to an embodiment of the present invention;
FIG. 2b is a diagram illustrating another calibration screen according to an embodiment of the present invention;
FIG. 3a is a schematic diagram illustrating a gesture interaction test according to an embodiment of the present invention;
FIG. 3b illustrates a schematic diagram of an orientation test based on ambient sound identification provided by an embodiment of the present invention;
fig. 4 shows a flowchart of a control method of a cognitive rehabilitation robot system according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the present invention will be described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments.
At present, the development of computer technology, especially mixed reality technology, motion capture technology and gaze capture technology, has brought more possibilities for cognitive rehabilitation assessment and training. Current cognitive rehabilitation training systems developed with computer technology mainly construct a cognitive rehabilitation training scene with mixed reality technology, capture data related to the user's attention and limb actions with the aid of peripheral devices, and fuse multiple kinds of data for comprehensive assessment and training. Existing rehabilitation robot devices typically suffer from the following drawbacks:
1. Large volume, large footprint, poor portability and high price make the devices difficult to use for home training.
2. Poor integration: usually only one or a few of eye movement, gestures and special peripherals are applied in a single training system, and the interfaces of different devices have to be considered and integrated separately, so all-round cognitive assessment and training cannot be performed.
3. Poor mobility and the lack of a spatial stereoscopic environment, so spatial perception cannot be assessed or trained.
In order to improve the above problems, the embodiment of the invention provides a cognitive rehabilitation robot system and a control method thereof, and the following describes the embodiment of the invention in detail.
This embodiment provides a cognitive rehabilitation robot system. Referring to the structural schematic diagram of the cognitive rehabilitation robot system shown in Fig. 1, the cognitive rehabilitation robot system includes: a terminal device 11, a mixed reality device 12 and a server 13, wherein the terminal device 11 and the mixed reality device 12 are both in communication connection with the server 13, and the terminal device 11 is also in communication connection with the mixed reality device 12;
the terminal device 11 is configured to receive capability assessment content information of a user and send the capability assessment content information to a mixed reality device when a preliminary test of the user passes;
The preliminary test can be a basic test of the user's hearing, cognition and executive ability. When the user's hearing, cognition or executive ability is impaired, capability assessment and training cannot be carried out and the preliminary test fails; when the user's hearing, cognition and executive abilities all pass the test, the preliminary test is determined to have passed.
The capability assessment content information can be input by the user, or can be personalized assessment content automatically generated from the user's basic information and historical assessment or training information. The capability assessment content information includes the capability assessment type, such as scale assessment, attention capability assessment or hand-eye coordination capability assessment, and/or assessment scenario information.
The mixed reality device 12 is used for displaying a corresponding assessment scene based on the capability assessment content information, collecting a real-time image of a user in the assessment process, and sending the real-time image to the server so as to enable the server to perform capability assessment;
After the terminal device sends the capability assessment content information to the mixed reality device, the mixed reality device displays a scene picture corresponding to the assessment content according to the assessment type or assessment scene included in the capability assessment content information, and the interactive assessment starts. The mixed reality device is provided with an image sensor; during the assessment process, real-time images of the user, including images of the user's eyes or hands, are acquired by the image sensor and continuously sent to the server.
The server 13 is configured to receive a real-time image sent by the mixed reality device, determine a current state of a user based on the real-time image, perform capability assessment based on the current state of the user corresponding to the received continuous multiple real-time images, and send a capability assessment result to the terminal device; wherein the user current state includes any one or more of gaze point location, hand location, and gestures.
After receiving the real-time images, the server identifies the current state of the user in each real-time image based on a neural network model obtained by pre-training. By identifying a plurality of continuous real-time images, the state change information of the user can be obtained, and the capability assessment is completed by judging how similar this state change information is to the state change expected for the capability assessment content: the higher the degree to which the user completes the assessment content, the higher the capability assessment value. After the server finishes the capability assessment, the capability assessment result is sent to the terminal device, so that the terminal device displays the user's capability assessment result.
With the cognitive rehabilitation robot system provided by this embodiment, the assessment scene is displayed on the mixed reality device after the user's preliminary test passes, real-time images of the user are collected, and the user's capability is assessed according to the state changes of the user reflected in those images. Automatic capability assessment of the user based on the detected gaze point position, hand position or gesture is thereby realized; the assessment mode is convenient, and the assessment efficiency and accuracy are improved.
In one embodiment, the user current state includes a gaze point location; the server is further used for determining the current gazing point position of the user based on the real-time image, emitting rays based on the current gazing point position as an origin, performing collision detection on the rays and a target model displayed by the mixed reality equipment to judge whether the user is gazing at the target model, determining the total gazing time length of the user gazing at the target model and the longest continuous gazing time length of the continuous gazing at the target model based on the real-time image continuously transmitted by the mixed reality equipment, and evaluating the attention capability based on the longest continuous gazing time length and the total gazing time length.
The above-mentioned target model is a virtual model displayed by the mixed reality device and, for accurate evaluation of the user's attention, may be in a moving state. When the mixed reality device starts displaying the moving target model, the user is prompted to gaze at it. The real-time image includes an eye image, and the current gaze point position of the user in the real-time image is identified by a gaze point identification model obtained by training in advance. Let the two-dimensional coordinate of the current gaze point position be T(x, y); this coordinate is converted into a ray R emitted from the virtual camera, represented by an origin R_p and a ray direction R_d. Let the position of the target model be P_t. Because the target model may be an irregular model, a collision sphere with a set radius r is constructed around the geometric center of the target model, and the ray R is used for collision detection with this collision sphere. In a specific embodiment, the distance L between the position P_t of the target model at the current moment and the ray can be calculated as:

L = || P_t - R_p - ((P_t - R_p) · R_d) R_d ||

If L ≤ r, the ray R collides with the collision sphere of the target model, i.e. the user is gazing at the target model; otherwise the user is not gazing at the target model. From all the real-time images acquired during the evaluation process, the total gaze duration T_f for which the user gazes at the target model and the longest continuous gaze duration T_m for which the user continuously gazes at the target model can be obtained. Let the total time consumed by the evaluation process be T_total, and calculate T_f / T_total. The ratio T_f / T_total can be used as an index for evaluating the user's attention capability, and T_m can be used as an index for evaluating the user's sustained attention capability. The higher T_f / T_total, the better the user's attention; the higher T_m, the better the user's sustained attention.
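A minimal sketch of the ray and collision-sphere test and the two attention indices described above is given below; Python/NumPy, the 30 Hz sampling rate and all function names are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def gaze_hits_target(ray_origin, ray_dir, target_pos, radius):
    """Distance from the target's collision-sphere centre P_t to the gaze ray
    (origin R_p, direction R_d): L = ||P_t - R_p - ((P_t - R_p) . R_d) R_d||.
    The user is gazing at the target when L <= r."""
    ray_dir = ray_dir / np.linalg.norm(ray_dir)
    v = target_pos - ray_origin
    perpendicular = v - np.dot(v, ray_dir) * ray_dir
    return np.linalg.norm(perpendicular) <= radius

def attention_indices(hits, frame_interval):
    """hits: one boolean per received real-time image (gazing or not).
    Returns (T_f / T_total, T_m): the overall attention ratio and the longest
    continuous gaze duration in seconds."""
    t_total = len(hits) * frame_interval
    t_f = sum(hits) * frame_interval
    longest_run = run = 0
    for h in hits:
        run = run + 1 if h else 0
        longest_run = max(longest_run, run)
    return t_f / t_total, longest_run * frame_interval

# Example: assumed 30 Hz image stream, target at (0, 0, 2) m, 0.15 m collision sphere.
hits = [gaze_hits_target(np.zeros(3), np.array([0.0, 0.0, 1.0]),
                         np.array([0.0, 0.0, 2.0]), 0.15)] * 10
print(attention_indices(hits, 1 / 30))
```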
In one embodiment, the user's current state includes gestures and hand positions, and the real-time image includes an RGB image and a depth image;
The mixed reality equipment is also used for displaying the target model and sending out an action instruction. The mixed reality device can acquire a physical image of the required scene and transmit it to the server, so that the server performs three-dimensional modeling based on the physical image to obtain a virtual model of the required scene; the virtual model is transmitted back to the mixed reality device, which displays the virtual scene corresponding to the virtual model and issues the corresponding action instruction for that scene. The target model can be a three-dimensional model of any shape in the virtual scene.
The server is further configured to identify a gesture of the user in the real-time image, based on a gesture identification model obtained by training in advance, when the action instruction is to grasp the target model; calculate the center point position corresponding to the fingertip positions when the gesture is a grasping gesture and record it as an average position; determine that the gesture is correct when the average position is located inside the bounding box of the target model and record the current time as the gesture completion time; and evaluate the hand-eye coordination capability based on the start time at which the action instruction was issued and the gesture completion time.
The real-time image includes a hand image of the user. The RGB (i.e. color) image and the depth image are input into a gesture recognition model trained in advance for gesture recognition, obtaining the gesture type and the three-dimensional coordinate positions of each fingertip and each joint. A virtual bounding box capable of interaction is set with the geometric center of the target model as its center. When the gesture is a grip gesture, the center point position of the fingertips of the 5 fingers is calculated and recorded as the average position P_avg. If the average position P_avg is inside the bounding box of the target model, it indicates that the user is gripping the target model, and the gesture of the user is determined to be correct. Let the moment when the mixed reality device issues the action instruction be T_1, and the moment when the gesture of the user is detected to be correct be T_2. The time difference T_2 - T_1 between the start time when the action instruction is issued and the gesture completion time can be used as an evaluation index for the user's hand-eye coordination ability and reaction ability: the smaller the time difference, the better the user's hand-eye coordination ability and reaction ability.
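The grasp check and the reaction-time index described above can be sketched as follows; the axis-aligned bounding-box representation and the function names are assumptions of this example.

```python
import numpy as np

def grasp_is_correct(fingertips, box_center, box_extents):
    """fingertips: (5, 3) fingertip positions returned by the gesture recognizer.
    P_avg is the mean fingertip position; the grasp counts as correct once P_avg
    lies inside the target model's axis-aligned bounding box."""
    p_avg = np.mean(np.asarray(fingertips, dtype=float), axis=0)
    return bool(np.all(np.abs(p_avg - np.asarray(box_center)) <= np.asarray(box_extents)))

def hand_eye_index(t1_instruction_issued, t2_grasp_detected):
    """T_2 - T_1: the smaller the difference, the better the user's hand-eye
    coordination and reaction ability."""
    return t2_grasp_detected - t1_instruction_issued
```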
In one embodiment, the user's current state includes a hand position; the server is further configured to identify, in a plurality of continuous real-time images, the joint position of the user's extended finger when the action instruction is to touch the target model, determine that the touch action is correct when the joint position moves from outside the bounding box of the target model to inside the bounding box, record the current time as the touch action completion time, and evaluate the hand-eye coordination capability based on the start time at which the action instruction was issued and the touch action completion time.
The RGB (i.e. color) image and the depth image are input into the gesture recognition model trained in advance for gesture recognition, obtaining the gesture type and the three-dimensional coordinate positions of each fingertip and each joint. When the user's gesture is an extended finger, the joint position P_index of the user's extended finger is acquired. If, in two consecutive real-time images, the joint position P_index of the extended finger moves from the outside of the bounding box of the target model to the inside of the bounding box, it is determined that the user has touched the target model, and the moment at which the joint position of the extended finger is located inside the bounding box is recorded as the touch action completion time. The time difference between the start time when the action instruction is issued and the touch action completion time can be used as an evaluation index for the user's hand-eye coordination ability and reaction ability: the smaller the time difference, the better the user's hand-eye coordination ability and reaction ability.
The bounding box of the target model can be represented by a center point coordinate C(x, y, z) and extension vectors E(x, y, z) along the three coordinate axes. The method for judging whether a certain vertex P(x, y, z) is inside or outside the bounding box is as follows:

Z = P - C

If |Z_x| > E_x, or |Z_y| > E_y, or |Z_z| > E_z, the vertex P is outside the bounding box; otherwise the vertex P is inside the bounding box.
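A sketch of this inside/outside test and of the touch-detection rule built on it (outside in the previous frame, inside in the current frame) is given below; the helper names are assumptions of this example.

```python
import numpy as np

def is_inside(point, center, extents):
    """Z = P - C; the vertex is outside as soon as any |Z| component exceeds
    the corresponding extent E, otherwise it is inside."""
    z = np.asarray(point, dtype=float) - np.asarray(center, dtype=float)
    return bool(np.all(np.abs(z) <= np.asarray(extents, dtype=float)))

def touch_is_correct(prev_joint_pos, curr_joint_pos, center, extents):
    """The touch action is correct when the extended finger's joint position
    P_index moves from outside the target's bounding box to inside it."""
    return (not is_inside(prev_joint_pos, center, extents)
            and is_inside(curr_joint_pos, center, extents))
```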
In one embodiment, the capability assessment for the user may also include an interactive scale content assessment. The contents of a traditional cognitive assessment scale can be reconstructed electronically, converting the traditional question-and-form-filling assessment into a real-time interactive assessment for the user. All of the assessment items in the scale may be decomposed, categorized and combined. For example, the pictures in the scale can be displayed as three-dimensional models, the scene scenarios described in the scale can be displayed as three-dimensional animations, and the interactions required by the scale (such as drawing, selecting, grabbing and dragging) can be completed through gestures. Taking the interactive conversion of the Montreal Cognitive Assessment (MoCA) scale as an example, the line-connecting test in the traditional MoCA scale can be converted into a connecting test in space: the nodes in the traditional scale become interactive objects in space, the user connects them in the air by grabbing and dragging, and the system scores automatically.
After the capability assessment of the user is finished, a normalized capability value can be generated according to various capability assessment results of the user, so that a personalized training task is generated for the user according to the normalized capability value.
In one embodiment, the server is further configured to generate a corresponding training task based on the capability assessment result, and send the training task to the mixed reality device; the server can also receive the historical capability evaluation result of the user sent by the terminal equipment, and generate a corresponding training task according to the historical capability evaluation result.
The mixed reality equipment is used for displaying training scenes corresponding to training tasks, sending training instructions, collecting training images of users in the training process, and sending the training images to the server; wherein the training image comprises an RGB image and a depth image;
the server is also used for judging whether each training action of the user is correct based on the training images and the training instructions, determining the training action correct rate after the training task is completed, and sending the training action correct rate to the terminal equipment so that the terminal equipment displays the training action correct rate.
When a corresponding training task is generated according to the user's capability assessment result, more training content can be added for the abilities with lower assessment results so as to improve that type of ability, for example adding more coordination training content when the user's hand-eye coordination is low, and adding more attention training content to the training task when the user's attention is low.
Training game content is dynamically generated from granulated game elements, the user's preliminary test and the capability assessment results. The granulated game elements can be flying birds, running cars, crawling insects, geometric bodies of different colors, three-dimensional addition/subtraction/multiplication/division calculation questions, and the like. The system may assign each game element at least one label that associates it with a particular training scenario; for example, the flying bird label is attention, calculation questions are calculation ability, and geometric bodies are basic cognitive ability. When training game content is generated, the system generates it according to the user's evaluation results, i.e. the normalized score of each ability. Suppose the user has performed evaluation tests of N abilities in total, the i-th ability is named A_i and its normalized evaluation score is S_i with 0 ≤ S_i ≤ 1. If the user is designated to train only ability A_i, the system randomly selects and combines game elements labelled A_i to form a training task. If the user is designated to train several cognitive functions A_i, A_j, A_k and so on, the reciprocal of each function's score can be taken as a weight to select different numbers of labelled game elements for combination into a training task. Assuming the total number of game elements to be selected is M, the number of game elements with label A_i is proportional to 1/S_i, i.e. M · (1/S_i) / Σ_j (1/S_j).
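One possible reading of this reciprocal-weighted element selection is sketched below; the sampling strategy, the example elements and the rounding of the per-label counts are assumptions of this illustration.

```python
import random
from collections import Counter

def pick_training_elements(elements_by_label, scores, total_count):
    """elements_by_label: {label: [game elements]}; scores: {label: S_i, 0 < S_i <= 1}.
    The reciprocal of each normalized ability score is used as the sampling weight,
    so weaker abilities receive proportionally more training content."""
    labels = list(scores)
    weights = [1.0 / max(scores[label], 1e-6) for label in labels]
    total_weight = sum(weights)
    task = []
    for label, weight in zip(labels, weights):
        n_i = round(total_count * weight / total_weight)   # M_i = M * (1/S_i) / sum_j(1/S_j)
        task += random.choices(elements_by_label[label], k=n_i)
    return task

# Example: a user weak in attention (0.3) receives more "flying bird" elements than
# calculation (0.8) or basic-cognition (0.9) elements in a 12-element task.
task = pick_training_elements(
    {"attention": ["flying bird"], "calculation": ["3 + 4 x 2"], "cognition": ["blue cube"]},
    {"attention": 0.3, "calculation": 0.8, "cognition": 0.9},
    total_count=12,
)
print(Counter(task))
```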
When the cognitive training is carried out, the video stream information acquired by the depth sensor and the RGB sensor is transmitted to the server through the network. The pre-trained object recognition neural network model takes the received RGB video stream as input to recognize objects in the training image, such as a desktop, a book and the like; on the server, the depth information captured by the depth sensor is used for point-cloud-based three-dimensional reconstruction to construct a surface model of the real environment. The server encodes and compresses the modeling and recognition results and sends them to the mixed reality device, so that the mixed reality device obtains the surface model of the real environment within the field of view and the three-dimensional position information of recognizable objects. Interaction between virtual elements and the surface model can then be achieved, for example placing a virtual three-dimensional sphere on the desktop. In addition, the system can use the recognized object information to give the user operation instructions that guide the user to complete specified actions, such as placing a red pellet on the book on the desktop. Because the server knows the three-dimensional coordinate positions of the desktop and the book in advance, whether the user's training action is correct can be judged automatically.
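A sketch of how the server could verify a placement instruction such as "place the red pellet on the book" from the known object positions is given below; the world-coordinate convention (z up), the tolerance and the names are assumptions of this example.

```python
import numpy as np

def placement_is_correct(virtual_obj_pos, target_center, target_half_size, tol=0.05):
    """The virtual object must lie within the recognized real object's horizontal
    footprint and rest on its top surface. Positions are in server/world
    coordinates; target_half_size = (half_width, half_depth, half_height)."""
    p = np.asarray(virtual_obj_pos, dtype=float)
    c = np.asarray(target_center, dtype=float)
    half_w, half_d, half_h = target_half_size
    over_footprint = abs(p[0] - c[0]) <= half_w and abs(p[1] - c[1]) <= half_d
    on_top_surface = abs(p[2] - (c[2] + half_h)) <= tol
    return over_footprint and on_top_surface
```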
The server can calculate the training action accuracy by judging whether each training action of the user is correct, and send the training action accuracy to the terminal device so that the terminal device displays it. Meanwhile, when the user's accuracy for a certain ability is higher, the corresponding ability score is increased accordingly, so the training content generated later for that ability becomes less and less, while the training content corresponding to training actions with lower accuracy is continuously increased, thereby concentrating practice on the user's weak items.
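The disclosure only states that the ability score rises as the corresponding accuracy rises; one simple update rule that behaves this way (the rule itself and the rate parameter are assumptions of this example) is:

```python
def update_ability_score(score, accuracy, rate=0.1):
    """Moves the normalized ability score toward the latest training accuracy.
    Higher scores then receive less content in later tasks (see the reciprocal
    weighting above), while low-accuracy abilities keep receiving more content,
    concentrating practice on the user's weak items."""
    return min(1.0, max(0.0, score + rate * (accuracy - score)))
```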
In one embodiment, the mixed reality device provided in this embodiment includes a head-mounted mixed reality device, and the terminal device includes a computer and/or a mobile terminal.
In one embodiment, the server is further configured to send eye movement calibration instructions to the mixed reality device when the new user is using the device;
the mixed reality device is also used for displaying a calibration picture when receiving an eye movement calibration command, collecting eye images of a user and sending the eye images to the server;
the server is configured to receive an eye image and determine an amount of identification deviation of a gaze point location based on the eye image.
In one embodiment, the calibration screen includes a plurality of calibration points; the server is also used for identifying the center point coordinates of the pupil area based on the eye image, obtaining the fixation point position, calculating the distance between the fixation point position and the calibration point, determining that the user is gazing at the calibration point when the distance is smaller than a preset distance threshold, recording the fixation point coordinates when the user gazes at the calibration point, and identifying the deviation amount of the fixation point position based on the fixation point coordinates and the set coordinates of the calibration point.
Because each user's interpupillary distance, eye height and visual range differ when wearing the head-mounted display, eye movement calibration needs to be performed when a user wears the head-mounted display for the first time, so that the system can accurately detect the position of the user's gaze point. The calibration screen comprises a plurality of calibration points, which can be spheres or other shapes, evenly distributed in the calibration screen.
In a specific embodiment, referring to the calibration screen schematic diagrams shown in Fig. 2a and Fig. 2b, the calibration screen may include 4 light-emitting spheres distributed on the border of the screen, as shown in Fig. 2a. The mixed reality device may prompt the user by voice to adjust the wearing position of the head-mounted mixed reality device so that the user can see all 4 spheres simultaneously, and then prompt the user by voice to gaze at the 4 spheres in turn until each sphere disappears; a sphere disappears after the user has gazed at it for 1 second. Two cameras facing the wearer's eyes are installed in the head-mounted mixed reality device to acquire local pictures of the user's eyes. The area where the user's pupil is located is detected by a machine learning method based on a convolutional neural network (CNN), the coordinate position of the center of the pupil area in the picture is calculated, and this position is mapped into screen coordinates; the result is the user's gaze point coordinate, so the user's gaze point position can be represented by the two-dimensional coordinate values (G_x, G_y). Whether the user is gazing at the target sphere is judged by calculating the distance between the user's gaze point position and the two-dimensional coordinates of the target sphere. Assuming the two-dimensional coordinate of the target sphere is (s_x, s_y), the distance d between the user's gaze point position and the sphere is:

d = sqrt((G_x - s_x)^2 + (G_y - s_y)^2)

If d < ε, the user is gazing at the target sphere; otherwise the user is not gazing at the target sphere, where ε is a preset judgment threshold. Each time the server receives an image, it detects the user's gaze point position and the calibration point position once. If the previous detection was not gazing but the current detection is gazing, the time at this moment is recorded as t_enter; if the previous detection was gazing and the current detection is not gazing, the time at this moment is recorded as t_leave, and the gaze duration on the sphere this time is t_focus = t_leave - t_enter. If t_focus ≥ 1.0, the calibration of the sphere at that position ends; otherwise t_focus must be timed again.
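The per-sphere dwell bookkeeping (t_enter / t_leave with the 1-second threshold) can be sketched as follows; the pixel threshold value and the sample format are assumptions of this example.

```python
import math

DWELL_REQUIRED = 1.0   # seconds the user must keep gazing at a calibration sphere
EPSILON = 40.0         # assumed gaze-distance threshold in screen pixels

def calibration_point_done(gaze_samples, target_xy):
    """gaze_samples: iterable of (timestamp, G_x, G_y) from the pupil-centre
    detector. Returns True once the user has gazed at the target sphere for
    one uninterrupted second (the t_enter / t_leave bookkeeping above)."""
    t_enter = None
    for t, gx, gy in gaze_samples:
        d = math.hypot(gx - target_xy[0], gy - target_xy[1])
        if d < EPSILON:
            if t_enter is None:
                t_enter = t                    # gaze entered the threshold circle
            elif t - t_enter >= DWELL_REQUIRED:
                return True                    # calibration of this sphere is done
        else:
            t_enter = None                     # gaze left the sphere: restart timing
    return False
```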
Two calibration screens may be used. As shown in Fig. 2b, the second calibration screen comprises three luminous spheres distributed in an equilateral triangle: sphere 1 is located at the center of the screen, and sphere 2 and sphere 3 are located at the lower left and lower right of the screen respectively. The mixed reality device prompts the user by voice to gaze at the three spheres in turn until they disappear.
In the above calibration process for the 7 calibration points, the coordinate positions of the calibration points are predetermined, and the gaze point coordinate values (G_x, G_y) detected while the user gazes at each calibration point deviate somewhat from the actual coordinate position of that calibration point; this deviation can be represented by a two-dimensional vector (d_x, d_y). When the gazing process for the 7 calibration points has been completed, the deviation vectors corresponding to the 7 calibration points can be stored as the user's calibration result data (i.e. the identification deviation amount of the gaze point position). In actual use, when the system detects that the user is gazing at a certain coordinate position P, the offsets of the 7 calibration points stored during calibration are weighted and interpolated by a linear interpolation algorithm to obtain the offset D corresponding to position P, and the final user gaze position is T = P + D.
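The disclosure does not fix the exact form of the weighted interpolation; the sketch below uses inverse-distance weighting over the seven stored deviation vectors as one plausible reading of T = P + D.

```python
import numpy as np

def corrected_gaze(raw_gaze, calib_points, calib_offsets, eps=1e-6):
    """raw_gaze: detected gaze position P; calib_points: (7, 2) calibration-point
    coordinates; calib_offsets: (7, 2) deviation vectors (d_x, d_y) stored during
    calibration. Returns the corrected gaze position T = P + D."""
    p = np.asarray(raw_gaze, dtype=float)
    distances = np.linalg.norm(np.asarray(calib_points, dtype=float) - p, axis=1)
    weights = 1.0 / (distances + eps)
    d = (weights[:, None] * np.asarray(calib_offsets, dtype=float)).sum(axis=0) / weights.sum()
    return p + d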
In one embodiment, the mixed reality device is further configured to display a test screen and send a voice test command when receiving the test command, collect a test image of a user, and send the test image to the server;
the server is also used for judging whether the user action is correct or not based on the test image, and determining that the preliminary test of the user passes when the user action is correct.
Before the capability assessment and training are performed, the user needs to be familiar with the basic operation of the system equipment, so that the user correctly understands operation instructions when hearing them and can execute them with the correct operation method. The user can therefore be given a preliminary test before the capability assessment, ensuring that the user is fully familiar with the basic operation of the training equipment before cognitive assessment and training, and eliminating deviations in the assessment and training results caused by unfamiliarity with the equipment or the operation.
The preliminary test may include understanding and execution of voice instructions, gesture interaction, and recognition of an orientation from ambient sounds.
In the voice instruction understanding test, the user can be instructed by voice to raise and look at the left hand and the right hand; if the user performs the actions correctly, the test passes, otherwise the test fails.
When testing the recognition of orientation from ambient sounds, referring to the schematic diagram shown in Fig. 3b, luminous spheres can be placed randomly in the virtual space on the left and right sides of the user, sound is emitted from the positions of the spheres, and the user is instructed to find the luminous bodies from the direction of the sound. If the user can immediately find the corresponding sphere along the correct direction, the test passes; otherwise the test fails.
When performing the gesture interaction test, referring to the gesture interaction test schematic diagram shown in Fig. 3a, two virtual three-dimensional balls, one red and one green, may be placed in front of the user's field of view, and a voice prompt requires the user to hold the red ball with the left hand and the green ball with the right hand. If the user can operate correctly, the test passes; otherwise the test fails.
With the cognitive rehabilitation robot system provided by this embodiment, assessment and training of the user's cognitive ability and hand-eye coordination ability can be realized within one training system, giving a high level of integration. By adopting a head-mounted mixed reality device, the system equipment is small, highly mobile and portable. More assessment and training items can be provided for the user, including gaze point tracking, gesture recognition and stereo environmental sound; compared with traditional training equipment, the assessment dimensions are higher and the modes more diversified. The training content can be dynamically generated and adjusted based on the user's capability assessment results and training accuracy, which is more intelligent and more timely and efficient than traditional training methods.
The present embodiment provides a control method of a cognitive rehabilitation robot system, which is applied to the cognitive rehabilitation robot system provided in the foregoing embodiment, referring to a control method flowchart of the cognitive rehabilitation robot system shown in fig. 4, and the method includes the following steps:
step S402, when the preliminary test of the user passes, receiving the capability assessment content information of the user based on the terminal equipment;
step S404, transmitting the capability evaluation content information to the mixed reality equipment so that the mixed reality equipment displays a corresponding evaluation scene based on the capability evaluation content information and acquires a real-time image of a user in the evaluation process;
Step S406, the real-time images are sent to a server, so that the server determines the current state of the user based on the real-time images, performs capability assessment based on the received current state of the user corresponding to the continuous multiple real-time images, and sends the capability assessment result to a terminal device;
wherein the user current state includes any one or more of gaze point location, hand location, and gestures.
After the software program implementing the control method of the cognitive rehabilitation robot system is started, a link is established with the server through the TCP protocol for subsequent data exchange. At the same time, the program searches for online mixed reality devices; if the link with the server is established successfully, it queries the mixed reality devices connected to the server and combines the two results to obtain the set of controllable system terminals.
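A minimal sketch of this start-up handshake is given below; the server address, the JSON line protocol and the "list_mr_devices" request are assumptions of this example, not part of the disclosure.

```python
import json
import socket

SERVER_ADDR = ("192.168.1.10", 9000)   # assumed server address and port

def discover_terminals():
    """Establishes the TCP link to the server and asks it which mixed reality
    devices are already connected; the reply is merged into one controllable
    terminal set."""
    with socket.create_connection(SERVER_ADDR, timeout=5) as conn:
        conn.sendall(json.dumps({"cmd": "list_mr_devices"}).encode() + b"\n")
        reply = json.loads(conn.makefile().readline())
    return {"server": SERVER_ADDR, "mr_devices": reply.get("devices", [])}
```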
With the control method of the cognitive rehabilitation robot system provided by this embodiment, the assessment scene is displayed on the mixed reality device after the user's preliminary test passes, real-time images of the user are collected, and the user's capability is assessed according to the state changes of the user reflected in those images. Automatic capability assessment of the user based on the detected gaze point position, hand position or gesture is thereby realized; the assessment mode is convenient, and the assessment efficiency and accuracy are improved.
The method provided in this embodiment has the same implementation principle and technical effects as the foregoing embodiment; for brevity, for parts not mentioned in this method embodiment, reference may be made to the corresponding content in the foregoing system embodiment.
The embodiment of the invention provides electronic equipment, which comprises a processor and a memory, wherein the memory stores a computer program capable of running on the processor, and the processor realizes the steps of the method provided by the embodiment when executing the computer program.
Embodiments of the present invention provide a computer readable medium storing computer executable instructions that, when invoked and executed by a processor, cause the processor to implement the methods described in the above embodiments.
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working process of the system described above may refer to the corresponding process in the foregoing embodiment, which is not described in detail herein.
The computer program product of the cognitive rehabilitation robot system and the control method thereof provided by the embodiment of the invention comprise a computer readable storage medium storing program codes, wherein the instructions included in the program codes can be used for executing the method described in the previous method embodiment, and specific implementation can be seen in the method embodiment and will not be repeated here.
In addition, in the description of embodiments of the present invention, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above examples are only specific embodiments of the present invention, and are not intended to limit the scope of the present invention, but it should be understood by those skilled in the art that the present invention is not limited thereto, and that the present invention is described in detail with reference to the foregoing examples: any person skilled in the art may modify or easily conceive of the technical solution described in the foregoing embodiments, or perform equivalent substitution of some of the technical features, while remaining within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A cognitive rehabilitation robot system, comprising: the system comprises terminal equipment, mixed reality equipment and a server, wherein the terminal equipment and the mixed reality equipment are both in communication connection with the server;
the terminal equipment is used for receiving the capability assessment content information of the user when the preliminary test of the user passes and sending the capability assessment content information to the mixed reality equipment;
the mixed reality equipment is used for displaying a corresponding evaluation scene based on the capability evaluation content information, collecting a real-time image of a user in an evaluation process, and sending the real-time image to the server so as to enable the server to perform capability evaluation;
the server is used for receiving the real-time image sent by the mixed reality equipment, determining the current state of the user based on the real-time image, performing capability assessment based on the received current state of the user corresponding to the continuous multiple real-time images, and sending the capability assessment result to the terminal equipment; wherein the user current state includes any one or more of gaze point location, hand location, and gestures.
2. The cognitive rehabilitation robot system according to claim 1, characterized in that the user current status comprises gaze point position;
The server is further configured to determine a current gaze point position of a user based on the real-time image, send out a ray based on the current gaze point position as an origin, and perform collision detection on the ray and a target model displayed by the mixed reality device, so as to determine whether the user gazes at the target model, determine a total gaze time length of the user gazing at the target model and a longest continuous gaze time length of the user gazing at the target model based on the real-time image continuously sent by the mixed reality device, and evaluate attention capability based on the longest continuous gaze time length and the total gaze time length.
3. The cognitive rehabilitation robot system according to claim 1, characterized in that the user current state comprises gestures and hand positions, the real-time image comprises an RGB image and a depth image;
the mixed reality equipment is also used for displaying the target model and sending out an action instruction;
the server is further configured to identify a gesture of a user in the real-time image based on a gesture identification model obtained by training in advance when the action instruction is a grasping target model, calculate a center point position corresponding to each fingertip position when the gesture is the grasping gesture, record the center point position as an average position, determine that the gesture is correct when the average position is located inside a bounding box of the target model, record the current time as gesture completion time, and evaluate hand-eye coordination capability based on a start time when the action instruction is issued and the gesture completion time.
4. The cognitive rehabilitation robot system according to claim 3, characterized in that the current state of the user comprises a hand position;
the server is further configured to, when the action instruction is to touch the target model, recognize the joint position of the user's extended finger in the plurality of consecutive real-time images; determine that the touch action is correct when the joint position moves from outside the bounding box of the target model to inside the bounding box of the target model, and record the current time as a touch action completion time; and evaluate the hand-eye coordination capability based on the start time at which the action instruction is issued and the touch action completion time.
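(Illustrative note, not part of the claims: the touch check of claim 4 is an outside-to-inside transition of the tracked finger joint across consecutive frames. The sketch below reuses the hypothetical point_in_aabb helper from the previous sketch and is only an assumed illustration.)

```python
def touch_action_correct(joint_positions, box_min, box_max):
    """joint_positions: sequence of 3D finger-joint positions over consecutive frames.
    Returns True once the joint is seen outside the target model's bounding box
    and subsequently inside it (assumed point_in_aabb helper as defined above)."""
    was_outside = False
    for p in joint_positions:
        inside = point_in_aabb(p, box_min, box_max)
        if not inside:
            was_outside = True
        elif was_outside:
            return True  # moved from outside the box to inside it
    return False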
5. The cognitive rehabilitation robot system according to claim 1, wherein the server is further configured to generate a corresponding training task based on the capability assessment result and send the training task to the mixed reality device;
the mixed reality device is configured to display a training scene corresponding to the training task, issue a training instruction, collect training images of the user during training, and send the training images to the server; wherein the training images comprise an RGB image and a depth image;
the server is further configured to determine, based on the training images and the training instruction, whether each training action of the user is correct, determine a training action correct rate after the training task is completed, and send the training action correct rate to the terminal device so that the terminal device displays the training action correct rate.
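(Illustrative note, not part of the claims: the training action correct rate in claim 5 is simply the fraction of training actions judged correct; a minimal sketch with assumed names follows.)

```python
def training_correct_rate(action_results):
    """action_results: list of booleans, one per training action in the completed task."""
    return sum(action_results) / len(action_results) if action_results else 0.0
```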
6. The cognitive rehabilitation robot system according to claim 1, characterized in that the server is further configured to send an eye movement calibration instruction to the mixed reality device when a new user uses the system;
the mixed reality device is further configured to display a calibration screen upon receiving the eye movement calibration instruction, collect eye images of the user, and send the eye images to the server;
the server is configured to receive the eye images and determine a recognition deviation amount of the gaze point position based on the eye images.
7. The cognitive rehabilitation robot system according to claim 6, characterized in that the calibration screen comprises a plurality of calibration points;
the server is further configured to recognize the center point coordinates of the pupil region based on the eye images to obtain a gaze point position, calculate the distance between the gaze point position and a calibration point, determine that the user is gazing at the calibration point when the distance is smaller than a preset distance threshold, record the gaze point coordinates while the user gazes at the calibration point, and determine the recognition deviation amount of the gaze point position based on the recorded gaze point coordinates and the set coordinates of the calibration point.
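(Illustrative note, not part of the claims: the calibration step of claim 7 can be sketched as accepting gaze samples within a distance threshold of a calibration point and taking the offset between the accepted samples and the point's set coordinates as the recognition deviation. Threshold value and names below are assumptions, not the patented implementation.)

```python
import numpy as np

def calibration_deviation(gaze_points, calibration_point, distance_threshold=0.05):
    """gaze_points: gaze-point coordinates recognized while the calibration point is shown;
    calibration_point: its set (known) coordinates.
    Returns the mean deviation vector, or None if no sample was close enough."""
    target = np.asarray(calibration_point, dtype=float)
    accepted = [np.asarray(g, dtype=float) for g in gaze_points
                if np.linalg.norm(np.asarray(g, dtype=float) - target) < distance_threshold]
    if not accepted:
        return None
    return np.mean(accepted, axis=0) - target  # recognition deviation amount
```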
8. The cognitive rehabilitation robot system according to claim 1, characterized in that the mixed reality device is further configured to, upon receiving a test instruction, display a test screen and issue a voice test instruction, collect a test image of the user, and send the test image to the server;
the server is further configured to judge whether the user's action is correct based on the test image, and to determine that the preliminary test of the user passes when the user's action is correct.
9. The cognitive rehabilitation robot system according to claim 1, characterized in that the mixed reality device comprises a head-mounted mixed reality device, and the terminal device comprises a computer and/or a mobile terminal.
10. A control method of a cognitive rehabilitation robot system, characterized by being applied to the cognitive rehabilitation robot system according to any one of claims 1 to 9, the method comprising:
when the preliminary test of a user passes, receiving capability assessment content information of the user via the terminal device;
sending the capability assessment content information to the mixed reality device, so that the mixed reality device displays a corresponding assessment scene based on the capability assessment content information and collects a real-time image of the user during the assessment;
sending the real-time image to the server, so that the server determines the current state of the user based on the real-time image, performs capability assessment based on the current states of the user corresponding to a plurality of consecutive received real-time images, and sends the capability assessment result to the terminal device; wherein the current state of the user comprises any one or more of a gaze point position, a hand position, and a gesture.
CN202311716557.0A, filed 2023-12-13 (priority 2023-12-13), "Cognitive rehabilitation robot system and control method thereof", status: Pending, published as CN117747101A

Priority Applications (1)

Application Number: CN202311716557.0A
Priority Date: 2023-12-13
Filing Date: 2023-12-13
Title: Cognitive rehabilitation robot system and control method thereof

Publications (1)

Publication Number: CN117747101A
Publication Date: 2024-03-22

Family

ID=90255617

Family Applications (1)

Application Number: CN202311716557.0A
Title: Cognitive rehabilitation robot system and control method thereof
Priority Date: 2023-12-13
Filing Date: 2023-12-13
Status: Pending (published as CN117747101A)

Country Status (1)

Country Link
CN (1) CN117747101A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination