CN117746505A - Learning accompanying method and device combined with abnormal sitting posture dynamic detection and robot


Info

Publication number: CN117746505A
Application number: CN202311775638.8A
Authority: CN
Legal status: Pending
Prior art keywords: depth, sitting posture, learning, key point, coordinate position
Inventors: 陈辉, 熊章, 张智, 杜沛力, 雷奇文
Original and current assignee: Wuhan Xingxun Intelligent Technology Co., Ltd.
Other languages: Chinese (zh)

Landscapes

  • Image Analysis (AREA)
Abstract

The invention relates to the technical field of learning accompanying, solves the problems of inaccurate abnormal sitting posture detection and high cost in student learning accompanying scenes in the prior art, and provides a learning accompanying method and device, and a robot, combined with dynamic detection of abnormal sitting postures. The method comprises the following steps: acquiring multi-frame real-time images in a student learning accompanying scene, inputting them into a pre-trained target detection model, and extracting student learning images according to the detected position information of preset learning articles and of the hands; performing feature extraction and monocular depth detection on the student learning images, and outputting key point information and depth information; analyzing the students' real-time sitting postures, and outputting real-time sitting posture analysis results; identifying abnormal sitting postures in combination with preset abnormal sitting posture parameters; and sending the user reminding information corresponding to the sitting posture detection result. The invention achieves accurate detection of abnormal sitting postures without affecting the learning effect, and reduces detection cost.

Description

Learning accompanying method and device combined with abnormal sitting posture dynamic detection and robot
Technical Field
The invention relates to the technical field of learning accompanying, and in particular to a learning accompanying method and device, and a robot, combined with dynamic detection of abnormal sitting postures.
Background
During long periods of study, particularly in the classroom or at home, students usually need to hold a relatively fixed sitting posture for a long time. A poor sitting posture can cause discomfort in various parts of the body, such as the cervical and lumbar vertebrae. By detecting students' sitting postures during study, schools and parents can discover bad habits in time and take corrective measures, protecting the students' physical health. A proper sitting posture also helps students stay focused and improves learning outcomes: a correct posture promotes blood circulation and improves the oxygen supply to the brain, thereby raising learning efficiency, whereas a poor posture leads to fatigue and distraction and harms academic performance. Poor sitting postures during study can further cause eye strain and discomfort and increase the risk of vision problems. Paying attention to students' sitting postures therefore improves the quality of the learning environment and creates an atmosphere conducive to students' physical and mental health and development. In the prior art, sensors (such as gyroscopes and accelerometers) are used to monitor students' sitting postures; the precision and stability of these sensors are critical to accuracy, yet some portable or inexpensive sensors are limited in precision and stability, leading to inaccurate posture detection. Some techniques also require students to wear equipment or hold a particular posture for long periods during study, which can impair the students' comfort and thus the learning effect.
Chinese patent CN107169456A discloses a sitting posture detection method based on sitting posture depth images. In particular, it discloses solving for key points from a segmented sitting posture image, the key points including the head vertex, head center point, left shoulder point, right shoulder point, shoulder center point and body center point, and judging the sitting posture from these key points. A depth image of the human body is acquired by a depth sensor, and the posture is classified as follows: if the angle between the line connecting the head center point and the shoulder center point and the horizontal direction is greater than 105 degrees, while the angle between the line connecting the shoulder center point and the body center point and the horizontal direction is 80-100 degrees, the head is tilted left; if the former angle is less than 75 degrees while the latter remains 80-100 degrees, the head is tilted right; if the depth value of the head vertex is more than 20 mm greater than that of the head center point, the head is raised; if it is more than 20 mm smaller, the head is lowered; if the angle between the line connecting the shoulder center point and the body center point and the horizontal direction is less than 80 degrees, the body leans left; if it is greater than 100 degrees, the body leans right; a bounding rectangle is obtained from the human body contour, and if its aspect ratio is smaller than 0.6, the subject is lying down; and if the depth values from the shoulder center point to the body center point show a decreasing trend, the body leans backward. The above patent also judges sitting posture from depth, but acquiring a high-quality depth image generally requires a specially designed depth sensor. Such sensors measure the object-to-sensor distance with technologies such as structured light, time of flight or stereo vision, and their design and manufacture require highly specialized technical and engineering knowledge, which raises costs. Moreover, the performance of a depth sensor is often affected by environmental conditions such as illumination and occlusion, and additional hardware or a complicated calibration system may be needed to ensure stable operation across environments, which further increases costs.
Therefore, how to accurately detect abnormal sitting postures in a student learning accompanying scene without affecting the learning effect, while reducing detection cost, is a problem to be solved urgently.
Disclosure of Invention
In view of the above, the invention provides a learning accompanying method and device, and a robot, combined with dynamic detection of abnormal sitting postures, to solve the problems of inaccurate abnormal sitting posture detection and high cost in student learning accompanying scenes in the prior art.
The technical scheme adopted by the invention is as follows:
In a first aspect, the present invention provides a learning accompanying method combined with dynamic detection of abnormal sitting postures, the method comprising:
S1: acquiring real-time video data of a student in a learning accompanying scene, and decomposing the real-time video data into multi-frame real-time images;
S2: inputting each real-time image into a pre-trained target detection model, and extracting a student learning image when the learning articles intersect with the hands, according to the detected position information of preset learning articles and of the hands;
S3: performing feature extraction and monocular depth detection on the student learning images, and outputting key point information and depth information related to the student's learning;
S4: analyzing the student's real-time sitting posture according to the key point information and the depth information, and outputting a real-time sitting posture analysis result;
S5: identifying an abnormal sitting posture according to the real-time sitting posture analysis result combined with preset abnormal sitting posture parameters;
S6: when an abnormal sitting posture is identified, sending the user reminding information corresponding to the sitting posture detection result, reminding the user to adjust the student's sitting posture.
Preferably, the S3 includes:
S31: inputting continuous multi-frame student learning images into a human body key point detection model, and identifying the key point information, wherein the human body key points at least comprise: left eye, right eye, nose, left shoulder and right shoulder;
S32: performing monocular depth detection on the student learning images, and obtaining relative depth maps corresponding to the student learning images;
S33: performing filtering processing on the relative depth maps corresponding to the continuous multi-frame student images, and outputting the filtered depth maps as the depth information.
Preferably, the S4 includes:
S41: determining the coordinate positions of key points in continuous multi-frame student learning images according to the key point information;
S42: acquiring the depth values corresponding to the key point coordinate positions according to the key point coordinate positions and the depth information;
S43: determining a first target key point coordinate position and a corresponding first depth value, and a second target key point coordinate position and a corresponding second depth value, according to the key point coordinate positions and the corresponding depth values, wherein the first target key points comprise the left eye, the right eye and the nose, and the second target key points comprise the left shoulder and the right shoulder;
S44: judging the degree of head deviation of the student according to the first target key point coordinate position and the corresponding first depth value, and outputting a head deviation judgment result;
S45: judging the degree of body deviation of the student according to the second target key point coordinate position and the corresponding second depth value, and outputting a body deviation judgment result;
S46: outputting the real-time sitting posture analysis result according to the head deviation judgment result and the body deviation judgment result.
Preferably, the S44 includes:
S441: acquiring a left eye coordinate position, a right eye coordinate position, a left eye depth value and a right eye depth value according to the first target key point coordinate position and the first depth value;
S442: comparing the left eye coordinate position and the right eye coordinate position in each frame of learning image, and judging whether the head is shifted leftwards or rightwards;
S443: taking the left eye depth value and the right eye depth value as the eye depth values, comparing the eye depth values in adjacent frame depth maps, and judging whether the head is shifted forwards or backwards.
Preferably, the S45 includes:
S451: acquiring a left shoulder coordinate position, a right shoulder coordinate position, a left shoulder depth value and a right shoulder depth value according to the second target key point coordinate position and the second depth value;
S452: comparing the shoulder coordinate positions in adjacent frame learning images according to the left shoulder coordinate position and the right shoulder coordinate position, and judging whether the body is shifted leftwards or rightwards;
S453: comparing the shoulder depth values in adjacent frame depth maps according to the left shoulder depth value and the right shoulder depth value, and judging whether the body is shifted forwards or backwards.
Preferably, the S5 includes:
S51: if the head is judged to be shifted leftwards or rightwards, connecting the left eye key point and the nose key point to obtain a first straight line according to the first target key point coordinate position, and connecting the right eye key point and the nose key point as a second straight line;
S52: acquiring a first included angle between the first straight line and the vertical direction and a second included angle between the second straight line and the vertical direction;
S53: comparing the angle difference between the first included angle and the second included angle with a preset angle difference threshold, and identifying a first abnormal sitting posture when the angle difference in continuous multi-frame images is larger than the angle difference threshold;
S54: if the head is judged to be shifted forwards or backwards, calculating the absolute value of the eye depth change in adjacent frame depth maps, and identifying a second abnormal sitting posture when the absolute value of the eye depth change calculated over continuous multi-frame images is larger than a preset depth threshold.
Preferably, the S5 further includes:
S55: if the body is judged to be shifted leftwards or rightwards, acquiring a left shoulder coordinate position and a right shoulder coordinate position according to the second target key point coordinate position;
S56: calculating the difference of the longitudinal distances between the left shoulder and the right shoulder according to the left shoulder coordinate position and the right shoulder coordinate position;
S57: identifying a third abnormal sitting posture when the distance difference in continuous multi-frame images is larger than a preset distance threshold;
S58: if the body is judged to be shifted forwards or backwards, calculating the absolute value of the shoulder depth change in adjacent frame depth maps, and identifying a fourth abnormal sitting posture when the absolute value of the shoulder depth change calculated over continuous multi-frame images is larger than a preset depth threshold.
In a second aspect, the present invention provides a learning companion device incorporating dynamic detection of abnormal sitting postures, the device comprising:
the image acquisition module is used for acquiring real-time video data of a student under a study accompanying scene and decomposing the real-time video data into multi-frame real-time images;
the learning image extraction module is used for inputting each real-time image into a pre-trained target detection model, and extracting a student learning image when the learning articles intersect with the hands according to the detected preset learning article position information and hand position information;
The information extraction module is used for carrying out feature extraction and monocular depth detection on the student learning image and outputting key point information and depth information related to student learning;
the real-time sitting posture analysis module is used for analyzing the real-time sitting postures of the students according to the key point information and the depth information and outputting real-time sitting posture analysis results;
the abnormal sitting posture identification module is used for identifying abnormal sitting postures according to the real-time sitting posture analysis result and by combining preset abnormal sitting posture parameters;
and the user reminding module is used for sending reminding information corresponding to the sitting posture detection result to a user when the abnormal sitting posture is identified, and reminding the user to adjust the sitting posture of the student.
In a third aspect, an embodiment of the present invention further provides a learning accompanying robot, including: at least one processor, at least one memory and computer program instructions stored in the memory, which when executed by the processor, implement the method as in the first aspect of the embodiments described above.
In a fourth aspect, embodiments of the present invention also provide a storage medium having stored thereon computer program instructions which, when executed by a processor, implement a method as in the first aspect of the embodiments described above.
In summary, the beneficial effects of the invention are as follows:
The invention provides a learning accompanying method and device, and a robot, combined with dynamic detection of abnormal sitting postures, wherein the method comprises: acquiring real-time video data of a student in a learning accompanying scene and decomposing it into multi-frame real-time images; inputting each real-time image into a pre-trained target detection model and extracting a student learning image when the learning articles intersect with the hands, according to the detected position information of the preset learning articles and of the hands; performing feature extraction and monocular depth detection on the student learning images and outputting key point information and depth information related to the student's learning; analyzing the student's real-time sitting posture according to the key point information and depth information and outputting a real-time sitting posture analysis result; identifying abnormal sitting postures according to the real-time sitting posture analysis result combined with preset abnormal sitting posture parameters; and, when an abnormal sitting posture is identified, sending the user reminding information corresponding to the sitting posture detection result, reminding the user to adjust the student's sitting posture.
By adopting a pre-trained target detection model, the invention identifies learning articles and hand positions in the student learning scene more accurately, and hence extracts student learning images precisely. By comprehensively using key point information and monocular depth detection, the student's sitting posture can be understood holistically, and the fusion of multi-level information improves the accuracy of abnormal sitting posture detection. By adopting a pre-trained target detection model and deep learning techniques, the strengths of computer vision in image analysis are fully exploited; compared with traditional hardware equipment, such a software-based solution is generally lower in cost. Meanwhile, the invention provides instant feedback and reminders on real-time video data, so that students and guardians can act in time, potential health problems are prevented, and the efficiency of responding to abnormal sitting postures is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments are briefly described below; for a person of ordinary skill in the art, other drawings can be obtained from these drawings without inventive effort, and such drawings remain within the scope of the present invention.
Fig. 1 is a flowchart of the overall operation of the learning accompanying method combined with dynamic detection of abnormal sitting postures according to embodiment 1 of the present invention;
Fig. 2 is a schematic flowchart of extracting student learning images in embodiment 1 of the present invention;
Fig. 3 is a flowchart of analyzing students' real-time sitting postures in embodiment 1 of the present invention;
Fig. 4 is a flowchart of judging the degree of head deviation of a student in embodiment 1 of the present invention;
Fig. 5 is a flowchart of judging the degree of body deviation of a student in embodiment 1 of the present invention;
Fig. 6 is a flowchart of identifying the first and second abnormal sitting postures in embodiment 1 of the present invention;
Fig. 7 is a flowchart of identifying the third and fourth abnormal sitting postures in embodiment 1 of the present invention;
Fig. 8 is a block diagram of the learning accompanying device combined with dynamic detection of abnormal sitting postures in embodiment 2 of the present invention;
Fig. 9 is a schematic structural diagram of the learning accompanying robot in embodiment 3 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments will be described clearly and completely below with reference to the accompanying drawings. It is noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual such relationship or order between those entities or actions. In the description of the present invention, terms such as "center," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings, merely to facilitate and simplify the description; they do not indicate or imply that the devices or elements referred to must have a specific orientation or be configured and operated in a specific orientation, and should not be construed as limiting the present invention. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to it. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article or apparatus that comprises it. Provided they do not conflict, the embodiments of the present invention and the features of the embodiments may be combined with each other, and all such combinations are within the protection scope of the present invention.
Example 1
Referring to fig. 1, embodiment 1 of the invention discloses a learning accompanying method combined with dynamic detection of abnormal sitting postures, which comprises the following steps:
s1: acquiring real-time video data of a student under a learning accompanying scene, and decomposing the real-time video data into multi-frame real-time images;
Specifically, a camera or other video acquisition device on the accompanying robot acquires video data of the student learning accompanying scene in real time, and the continuous video is decomposed into a series of static image frames using video processing techniques, such as the frame extraction functions of a computer vision library. Each frame is then further processed, including image enhancement, denoising and face detection, which helps improve the accuracy of the subsequent analysis. Key features are extracted from each image and applied to the analysis of the learning accompanying scene, to obtain information about the student's learning state and needs that can be used to adjust teaching strategies, provide personalized suggestions, and so on.
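To make the frame decomposition in S1 concrete, the following is a minimal sketch assuming OpenCV; the function name, the stream source and the sampling stride are illustrative assumptions, not details fixed by this embodiment.

    import cv2

    def video_to_frames(stream_source=0, frame_stride=5, max_frames=200):
        """Decompose real-time video into a list of static image frames.

        stream_source, frame_stride and max_frames are illustrative: the
        embodiment does not fix a capture device or a sampling rate.
        """
        cap = cv2.VideoCapture(stream_source)
        frames = []
        index = 0
        while cap.isOpened() and len(frames) < max_frames:
            ok, frame = cap.read()
            if not ok:
                break
            if index % frame_stride == 0:  # keep every frame_stride-th frame
                frames.append(frame)
            index += 1
        cap.release()
        return frames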
S2: inputting each real-time image into a pre-trained target detection model, and extracting a student learning image when the learning articles intersect with the hands according to the detected preset learning article position information and hand position information;
Specifically, the real-time images are input into a pre-trained target detection model, which includes a deep learning model based on the YoloV8s architecture, to detect the books on the desktop and the student's hands in the learning scene. The position information of the preset learning articles (such as books) and of the student's hand regions is extracted from the output of the target detection model, and the coordinates of each detected region are obtained by parsing that output.
For each frame, whether the extracted learning article region and hand region intersect is judged using geometric computation, such as an overlap test between rectangular boxes. Taking every N frames as an analysis unit, the number of intersecting frames is counted to decide whether the student is studying: if the intersection of the learning article and the hand is detected in more than N/2 frames, the student is considered to be studying. When the student is judged to be studying, the corresponding student learning images are extracted, including the student's posture during study and the usage of the learning articles, and these images are further used to analyze the student's learning state or other relevant information.
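A minimal sketch of the intersection test and the N-frame majority vote just described, assuming axis-aligned boxes in (x1, y1, x2, y2) form; the helper names and the default window size N are illustrative assumptions.

    def boxes_intersect(a, b):
        """Overlap test for axis-aligned boxes given as (x1, y1, x2, y2)."""
        return not (a[2] <= b[0] or b[2] <= a[0] or
                    a[3] <= b[1] or b[3] <= a[1])

    def is_student_learning(detections, n=16):
        """Majority vote over the last N frames (S2).

        detections: per-frame (article_box, hand_box) pairs, or None when
        either object was not detected in that frame.
        """
        window = detections[-n:]
        hits = sum(1 for d in window
                   if d is not None and boxes_intersect(d[0], d[1]))
        return hits > n // 2  # learning if intersection in more than N/2 frames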
S3: extracting features of the student learning images and detecting monocular depth, and outputting key point information and depth information related to student learning;
Specifically, when the student is judged to be studying, the student learning images are analyzed in depth: features are extracted from the images to capture key point information such as the student's posture and gestures, and monocular depth detection is applied to obtain the depth of each key point, so that the student's posture and learning environment can be reconstructed in three-dimensional space. This depth analysis is accomplished with modern computer vision techniques and deep learning models. The key point information provides the positions of the body posture key points, while the depth information provides the distance between each key point and the camera. Analyzing these together helps to understand the student's behavior and environment during study more comprehensively, and supports more refined learning analysis, such as information on attention concentration and movement coordination during study, giving educators and researchers more data for optimizing teaching methods, personalizing the learning experience and understanding the student's learning process.
In one embodiment, referring to fig. 2, the step S3 includes:
S31: inputting continuous multi-frame learning images of the students into a human body key point detection model, and identifying the key point information, wherein the human body key points at least comprise: left eye, right eye, nose, left shoulder and right shoulder;
Specifically, a MediaPipe pose estimation model is used to detect human body key points in the continuous multi-frame student learning images, wherein the human body key points at least comprise: left eye, right eye, nose, left shoulder and right shoulder. Detecting these key points provides detailed information about the body posture and facial configuration, and lays the foundation for the subsequent depth detection.
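A sketch of the key point extraction under the assumption that the model is MediaPipe Pose; the landmark indices and API follow the mediapipe Python package, but both the model choice and the helper function are illustrative, not mandated by the embodiment.

    import cv2
    import mediapipe as mp

    mp_pose = mp.solutions.pose
    KEYPOINTS = {
        "nose": mp_pose.PoseLandmark.NOSE,
        "left_eye": mp_pose.PoseLandmark.LEFT_EYE,
        "right_eye": mp_pose.PoseLandmark.RIGHT_EYE,
        "left_shoulder": mp_pose.PoseLandmark.LEFT_SHOULDER,
        "right_shoulder": mp_pose.PoseLandmark.RIGHT_SHOULDER,
    }

    def detect_keypoints(frame_bgr, pose):
        """Return pixel coordinates of the five key points, or None."""
        h, w = frame_bgr.shape[:2]
        result = pose.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks is None:
            return None
        lm = result.pose_landmarks.landmark
        return {name: (lm[idx].x * w, lm[idx].y * h)
                for name, idx in KEYPOINTS.items()}

    # Usage: pose = mp_pose.Pose(static_image_mode=False)
    #        kps = detect_keypoints(frame, pose)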
S32: monocular depth detection is carried out on the student learning image, and a relative depth map corresponding to the student learning image is obtained;
Specifically, monocular depth detection is performed on the continuous multi-frame student learning images using a MobileDepth model, which generates a depth field of relative values representing the depth of each point in the image with respect to the camera; this field provides relative information about the distance between objects in the image and the camera. Because it extracts depth from ordinary images, such a model makes full use of computer vision for image analysis and is generally less costly than dedicated depth-sensing hardware.
S33: and carrying out filtering processing on the relative depth map corresponding to the continuous multi-frame student images, and outputting the depth map after the filtering processing as the depth information.
Specifically, for N continuous frames, the depth map of each frame is computed. To obtain a complete depth map, each frame's relative depth map is first median filtered, which handles outliers and makes the single-frame depth map more complete and accurate; then a smoothing average filter is applied across the N consecutive relative depth maps. The purpose of this step is to obtain a continuous and stable depth map: averaging multi-frame depth information reduces the noise and fluctuations that a single-frame depth map may contain, making the final depth map smoother and more reliable. The whole process aims to acquire accurate, continuous depth information through human body key point detection and monocular depth detection combined with depth map filtering, providing high-quality data support for the subsequent analysis of the student's learning behavior.
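The filtering in S33 can be sketched as follows, assuming NumPy and SciPy; the window sizes are illustrative, and the causal moving average is one plausible reading of "smooth average filtering" across the N frames.

    import numpy as np
    from scipy.ndimage import median_filter

    def filter_depth_maps(depth_maps, spatial_size=5, temporal_window=3):
        """S33: per-frame spatial median filter, then a temporal moving
        average over neighbouring frames.

        depth_maps: N relative depth maps (HxW arrays) for consecutive
        frames. Returns N smoothed maps.
        """
        cleaned = np.stack([median_filter(d, size=spatial_size)
                            for d in depth_maps])
        smoothed = []
        for i in range(len(cleaned)):
            lo = max(0, i - temporal_window + 1)
            smoothed.append(cleaned[lo:i + 1].mean(axis=0))  # causal mean
        return smoothed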
S4: analyzing the real-time sitting postures of the students according to the key point information and the depth information, and outputting real-time sitting posture analysis results;
specifically, based on the extracted human body key point information and depth information, analysis of the student's real-time sitting posture is performed, which includes comprehensive analysis of the positions and relative depths of the detected key points (e.g., eyes, nose, shoulders, etc.). Through this analysis process, the system can output detailed results regarding the student's current sitting posture, such as head tilt, shoulder position, body posture, etc.
In one embodiment, referring to fig. 3, the step S4 includes:
S41: determining the coordinate positions of key points in continuous multi-frame student learning images according to the key point information;
S42: acquiring the depth values corresponding to the key point coordinate positions according to the key point coordinate positions and the depth information;
Specifically, for each frame's depth map, the human key point set K mentioned earlier (comprising left eye p1, right eye p2, nose p3, left shoulder p4 and right shoulder p5) is used to look up the corresponding depth values on the depth map, which represent the depth of each key point relative to the camera. For N frames of images, the key point depth values of each frame are arranged in time order into a time-series of depth values D: D1(p1, p2, …, p5) denotes the key point depths in the first frame, D2(p1, p2, …, p5) those in the second frame, and so on, up to DN(p1, p2, …, p5) for the N-th frame.
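A sketch of assembling the time series D1…DN, assuming the depth maps and per-frame key point positions produced by the previous steps; the helper name is an assumption.

    import numpy as np

    def keypoint_depth_series(depth_maps, keypoints_per_frame):
        """S42: look up a depth value for each key point in each frame.

        keypoints_per_frame: one dict per frame mapping a key point name
        (p1..p5) to its (x, y) pixel position. Returns one dict per frame
        mapping the same names to depth values, i.e. D1..DN.
        """
        series = []
        for depth, kps in zip(depth_maps, keypoints_per_frame):
            h, w = depth.shape
            frame_depths = {}
            for name, (x, y) in kps.items():
                col = int(np.clip(x, 0, w - 1))  # clamp before indexing
                row = int(np.clip(y, 0, h - 1))
                frame_depths[name] = float(depth[row, col])
            series.append(frame_depths)
        return series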
S43: determining a first target key point coordinate position and a corresponding first depth value according to the key point coordinate position and the corresponding depth value, and determining a second target key point coordinate position and a corresponding second depth value, wherein the first target key point comprises the left eye, the right eye and the nose, and the second target key point comprises the left shoulder and the right shoulder;
S44: judging the head deviation degree of the student according to the coordinate position of the first target key point and the corresponding first depth value, and outputting a head deviation judging result;
in one embodiment, referring to fig. 4, the step S44 includes:
S441: acquiring a left eye coordinate position, a right eye coordinate position, a left eye depth value and a right eye depth value according to the first target key point coordinate position and the first depth value;
S442: comparing the left eye coordinate position and the right eye coordinate position in each frame of learning image, and judging whether the head is shifted leftwards or rightwards;
Specifically, for each frame of the learning image, the coordinate positions of the left eye (p1) and right eye (p2) are extracted and their y-axis coordinates are compared. Because the camera captures mirrored data, if the y-axis coordinate of the left eye is continuously larger than that of the right eye p2 over consecutive frames, the head is considered to be tilted to the left; conversely, if the y-axis coordinate sequence of the left eye p1 stays smaller than that of the right eye p2 over consecutive frames, the head is considered to be tilted to the right.
S443: taking the left eye depth value and the right eye depth value as the eye depth values, comparing the eye depth values in adjacent frame depth maps, and judging whether the head is shifted forwards or backwards.
Specifically, the left eye depth value and the right eye depth value are combined into a single eye depth value by taking their average. For continuous depth map frames, the eye depth values of adjacent frames are compared: if the eye depth value keeps increasing over a period of time, the head is considered to be shifting forward; conversely, if it keeps decreasing, the head is considered to be shifting backward.
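The two head-offset judgments (S442 and S443) can be sketched as follows; requiring a strictly monotone trend over the window is one illustrative reading of "continuously larger/increasing," and the function names are assumptions.

    def head_lr_offset(eyes_y_per_frame):
        """S442: left/right tilt from eye y-coordinates over N frames.

        eyes_y_per_frame: (left_eye_y, right_eye_y) per frame. With
        mirrored camera data, a persistently larger left-eye y means a
        tilt to the left, and vice versa.
        """
        if all(ly > ry for ly, ry in eyes_y_per_frame):
            return "left"
        if all(ly < ry for ly, ry in eyes_y_per_frame):
            return "right"
        return None

    def head_fb_offset(eye_depths_per_frame):
        """S443: forward/backward shift from the mean eye depth trend."""
        d = [(l + r) / 2.0 for l, r in eye_depths_per_frame]
        deltas = [b - a for a, b in zip(d, d[1:])]
        if all(x > 0 for x in deltas):
            return "forward"   # eye depth keeps increasing
        if all(x < 0 for x in deltas):
            return "backward"  # eye depth keeps decreasing
        return None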
S45: judging the body deviation degree of the student according to the coordinate position of the second target key point and the corresponding second depth value, and outputting a body deviation judging result;
in one embodiment, referring to fig. 5, the step S45 includes:
s451: acquiring a left shoulder coordinate position, a right shoulder coordinate position, a left shoulder depth value and a right shoulder depth value according to the second target key point coordinate position and the second depth value;
s452: comparing the shoulder coordinate positions in the adjacent frame learning images according to the left shoulder coordinate position and the right shoulder coordinate position, and judging that the body is shifted leftwards or rightwards;
Specifically, for each frame of the learning image, the coordinate positions of the left and right shoulders are extracted, and for successive frames the shoulder coordinates in adjacent images are compared, specifically the changes of the left and right shoulder coordinates along the x-axis. The direction of the body is determined from this comparison: because the camera captures mirrored data, if the x-axis coordinate of the left shoulder keeps increasing over a period of time while that of the right shoulder stays relatively small, the body is considered to be shifting to the right; whereas if the x-axis coordinates of both the left shoulder and the right shoulder keep increasing, the body is considered to be shifting to the left.
S453: comparing the shoulder depth values in adjacent frame depth maps according to the left shoulder depth value and the right shoulder depth value, and judging whether the body is shifted forwards or backwards.
Specifically, for each frame's depth map, the depth values of the left and right shoulders are extracted, and for successive depth map frames the shoulder depth values of adjacent frames are compared. By observing the changes of the left and right shoulder depth values, the system determines the direction of the body's lean: for example, if the left shoulder depth value keeps decreasing over a period of time while the right shoulder depth value remains relatively large, the body is considered to be leaning forward; conversely, if the left shoulder depth value keeps increasing while the right shoulder depth value remains relatively small, the body is considered to be leaning backward.
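Similarly, the body-offset judgments (S452 and S453) might look like the sketch below; the trend tests are an illustrative paraphrase of the comparisons described above, not a literal rule from the embodiment.

    def body_lr_offset(shoulders_x_per_frame):
        """S452: left/right shift from shoulder x-coordinates.

        shoulders_x_per_frame: (left_x, right_x) per frame, from the
        mirrored camera image.
        """
        lx = [l for l, _ in shoulders_x_per_frame]
        rx = [r for _, r in shoulders_x_per_frame]
        left_rising = all(b > a for a, b in zip(lx, lx[1:]))
        right_rising = all(b > a for a, b in zip(rx, rx[1:]))
        if left_rising and right_rising:
            return "left"            # both shoulders drift one way
        if left_rising and not right_rising:
            return "right"
        return None

    def body_fb_offset(shoulder_depths_per_frame):
        """S453: forward/backward lean from the left-shoulder depth trend."""
        ld = [l for l, _ in shoulder_depths_per_frame]
        if all(b < a for a, b in zip(ld, ld[1:])):
            return "forward"   # depth keeps decreasing
        if all(b > a for a, b in zip(ld, ld[1:])):
            return "backward"  # depth keeps increasing
        return None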
S46: and outputting the real-time sitting posture analysis result according to the head deviation judgment result and the body deviation judgment result.
Specifically, from the previous steps the offset direction of the head is obtained, i.e., whether the head is tilted left or right, or shifted forward or backward; likewise, the offset direction of the body is obtained, i.e., whether the body is shifted left, right, forward or backward. The head deviation judgment result and the body deviation judgment result are combined into a comprehensive real-time sitting posture analysis result, which is output to indicate the student's sitting posture.
S5: identifying an abnormal sitting posture according to the real-time sitting posture analysis result and by combining preset abnormal sitting posture parameters;
in one embodiment, referring to fig. 6, the step S5 includes:
S51: if the head is judged to be shifted leftwards or rightwards, connecting the left eye key point and the nose key point to obtain a first straight line according to the first target key point coordinate position, and connecting the right eye key point and the nose key point as a second straight line;
Specifically, if the head is judged to be tilted left or right, assume that in a certain frame of the learning image the left eye key point is at (x1, y1), the right eye key point at (x2, y2) and the nose key point at (x3, y3). From these coordinates, the first straight line is the line from the left eye to the nose, denoted L1, and the second straight line is the line from the right eye to the nose, denoted L2. Connecting the left and right eyes to the nose forms a reference axis for the head, which facilitates the subsequent analysis of its angular offset; through these lines, the change of head posture can be described more precisely, providing a basis for the following angle computation.
S52: acquiring a first included angle between a first straight line and the vertical direction and a second included angle between a second straight line and the vertical direction;
Specifically, assume that the included angle between the straight line L1 and the vertical direction is θ1, and the included angle between the straight line L2 and the vertical direction is θ2. Acquiring information of the angle between the left and right eyes and the nose helps to quantify the degree of offset of the head. This angle information can be used to determine if the head is offset left and right, providing more detailed head pose analysis.
S53: comparing the angle difference between the first included angle and the second included angle with a preset angle difference threshold, and identifying the first abnormal sitting posture when the angle difference in the continuous multi-frame images is larger than the angle difference threshold;
Specifically, the angle difference between the first included angle and the second included angle is compared with a preset angle difference threshold: if, over consecutive frames, the difference between θ1 and θ2 stays larger than the preset threshold, the first abnormal sitting posture, a left-right head tilt, is identified. Such a judgment helps to discover a student's poor learning posture in time, provide real-time feedback and promote a correct sitting posture.
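The geometry of S51-S53 reduces to two angles against the vertical axis and a thresholded difference; a sketch follows, with the threshold value as an illustrative assumption rather than a parameter from the embodiment.

    import math

    def angle_to_vertical(p_from, p_to):
        """Unsigned angle in degrees between segment p_from->p_to and the
        vertical image axis (0 when the segment is vertical)."""
        dx = p_to[0] - p_from[0]
        dy = p_to[1] - p_from[1]
        return math.degrees(math.atan2(abs(dx), abs(dy)))

    def first_abnormal_posture(frames, angle_diff_threshold=15.0):
        """S53: flag a left-right head tilt when |theta1 - theta2| exceeds
        the threshold in every one of the consecutive frames.

        frames: (left_eye, right_eye, nose) pixel coordinates per frame.
        """
        for left_eye, right_eye, nose in frames:
            theta1 = angle_to_vertical(left_eye, nose)   # line L1
            theta2 = angle_to_vertical(right_eye, nose)  # line L2
            if abs(theta1 - theta2) <= angle_diff_threshold:
                return False
        return True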
S54: if the head is judged to be shifted forwards or backwards, calculating the absolute value of the eye depth change in the depth map of the adjacent frames, and identifying the head as a second abnormal sitting posture when the absolute value of the eye depth change calculated in the continuous multi-frame images is larger than a preset depth threshold.
Specifically, if the head is judged to be shifted forward or backward, the absolute value of the eye depth change between adjacent frames is computed from the depth maps; when this absolute change exceeds the preset depth threshold over consecutive frames, the second abnormal sitting posture, a forward-backward head shift, is identified. Such a judgment helps to monitor changes in the student's body posture, provide timely feedback and promote a correct sitting posture.
In an embodiment, referring to fig. 7, the step S5 further includes:
S55: if the body is judged to be shifted leftwards or rightwards, acquiring a left shoulder coordinate position and a right shoulder coordinate position according to the coordinate position of the second target key point;
Specifically, if the body is judged to be shifted left or right, the coordinate position of the left shoulder key point, (x4, y4), and that of the right shoulder key point, (x5, y5), are obtained from the second target key point coordinate positions. Acquiring the left and right shoulder coordinates captures the position of the shoulders accurately and provides the basis for the subsequent analysis.
S56: calculating the difference of the longitudinal distances between the left shoulder and the right shoulder according to the left shoulder coordinate position and the right shoulder coordinate position;
Specifically, the difference in longitudinal distance, Δy = |y4 − y5|, is calculated from the coordinate positions of the left and right shoulders. Computing the shoulder distance difference along the y-axis reveals whether the shoulders are offset in the vertical direction, helping to determine whether the body is bending to the left or right.
S57: when the difference of the distances in the continuous multi-frame images is larger than a preset distance threshold value, recognizing that the sitting posture is a third abnormal sitting posture;
Specifically, if over consecutive frames the shoulder longitudinal distance difference Δy keeps increasing and stays larger than the preset distance threshold, the third abnormal sitting posture, a left-right bending of the body, is identified. Judging the change of the shoulder longitudinal distance difference thus recognizes left-right body bending, which helps to discover in time the poor sitting postures a student may adopt while studying and to provide timely feedback.
S58: if the body is judged to be shifted forwards or backwards, calculating the absolute value of the change of the shoulder depth in the depth map of the adjacent frames, and identifying the fourth abnormal sitting posture when the absolute value of the change of the shoulder depth calculated in the continuous multi-frame images is larger than a preset depth threshold value.
Specifically, when the body shifts forward or backward, the absolute value of the shoulder depth change between adjacent depth maps is computed; if this change stays larger than the preset depth threshold over consecutive frames, the fourth abnormal sitting posture, a forward-backward body shift, is identified. Observing the change of the shoulder depth values thus determines whether the body is shifting forward or backward, monitoring the student's posture changes during study and providing real-time feedback.
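The third and fourth abnormal postures (S55-S58) come down to two more thresholded checks; in this sketch both threshold values are illustrative, and averaging the two shoulder depths is one plausible reading of "shoulder depth."

    def third_abnormal_posture(shoulders_y_per_frame, dist_threshold=20.0):
        """S57: left-right body bending when the vertical shoulder gap
        delta_y = |y4 - y5| exceeds the threshold in every frame
        (threshold in pixels)."""
        return all(abs(ly - ry) > dist_threshold
                   for ly, ry in shoulders_y_per_frame)

    def fourth_abnormal_posture(shoulder_depths_per_frame, depth_threshold=0.1):
        """S58: forward-backward body shift when the absolute shoulder
        depth change between adjacent frames exceeds the threshold for
        every consecutive frame pair."""
        mean_d = [(l + r) / 2.0 for l, r in shoulder_depths_per_frame]
        return all(abs(b - a) > depth_threshold
                   for a, b in zip(mean_d, mean_d[1:]))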
S6: when the sitting posture is identified as abnormal, sending reminding information corresponding to the sitting posture detection result to the user, and reminding the user to adjust the sitting posture of the student.
Specifically, if the student's head is detected to be tilted to the left, reminding information is sent to the user, such as: "Note that the student's head is tilted to the left; please consider adjusting the sitting posture." This reminder helps correct a poor sitting posture caused by a habitual head position and avoids the discomfort of holding the same head deviation for a long time. If a forward head shift is detected, a reminder may be sent such as: "Note that the student's head is leaning forward; adjusting the sitting posture is suggested to maintain a correct learning posture." Such reminders keep the student from staying in an improper head position for long periods and help maintain neck and spinal health. If the student's body is detected to be bending to the left, a reminder may be sent such as: "The student's body is bending to the left; adjustment is recommended to maintain an even sitting posture." This helps correct improper body bending caused by habitual posture and prevents potential health problems. If the system detects that the student's body is shifting forward, a reminder is sent such as: "Please note that the student's body is leaning forward; adjustment is recommended to maintain a correct sitting posture." Such a reminder keeps the student from leaning forward improperly for long periods, helps protect the spine and reduces potential discomfort. By sending specific reminding information to the user, timely and personalized feedback is provided, helping the user correct poor learning postures, helping the student maintain healthy habits, and improving the learning effect and comfort.
Example 2
Referring to fig. 8, embodiment 2 of the present invention further provides a learning accompanying device combined with dynamic detection of abnormal sitting postures, the device comprising:
the image acquisition module is used for acquiring real-time video data of a student under a study accompanying scene and decomposing the real-time video data into multi-frame real-time images;
the learning image extraction module is used for inputting each real-time image into a pre-trained target detection model, and extracting a student learning image when the learning articles intersect with the hands according to the detected preset learning article position information and hand position information;
the information extraction module is used for carrying out feature extraction and monocular depth detection on the student learning image and outputting key point information and depth information related to student learning;
in an embodiment, the information extraction module includes:
the key point information extraction submodule is used for inputting continuous multi-frame student learning images into a human body key point detection model to identify the key point information, wherein the human body key points at least comprise: left eye, right eye, nose, left shoulder and right shoulder;
the monocular depth detection sub-module is used for monocular depth detection of the student learning image and obtaining a relative depth map corresponding to the student learning image;
And the filtering processing sub-module is used for carrying out filtering processing on the relative depth map corresponding to the continuous multi-frame student images and outputting the depth map after the filtering processing as the depth information.
The real-time sitting posture analysis module is used for analyzing the real-time sitting postures of the students according to the key point information and the depth information and outputting real-time sitting posture analysis results;
preferably, the real-time sitting posture analysis module comprises:
the key point coordinate position extraction sub-module is used for determining the key point coordinate positions in the continuous multi-frame student learning images according to the key point information;
the depth value obtaining sub-module is used for obtaining a depth value corresponding to the coordinate position of the key point according to the coordinate position of the key point and the depth information;
a depth value determining submodule, configured to determine a first target key point coordinate position and a corresponding first depth value according to the key point coordinate position and the corresponding depth value, and determine a second target key point coordinate position and a corresponding second depth value, where the first target key point includes the left eye, the right eye, and the nose, and the second target key point includes the left shoulder and the right shoulder;
the deviation degree judging submodule is used for judging the head deviation degree of the student according to the coordinate position of the first target key point and the corresponding first depth value and outputting a head deviation judging result;
In one embodiment, the offset degree determination submodule includes:
the eye coordinate position and depth value acquisition unit is used for acquiring a left eye coordinate position, a right eye coordinate position, a left eye depth value and a right eye depth value according to the first target key point coordinate position and the first depth value;
a head left-right shift determination unit configured to compare the left-eye coordinate position and the right-eye coordinate position in each frame of the learning image, and determine whether the head shifts left or right;
and the head forward and backward offset judging unit is used for comparing the eye depth values in the adjacent frame depth map by taking the left eye depth value and the right eye depth value as eye depth values to judge whether the head is offset forward or backward.
The body deviation judging sub-module is used for judging the body deviation degree of the student according to the coordinate position of the second target key point and the corresponding second depth value, and outputting a body deviation judging result;
preferably, the body shift determination submodule includes:
the left-right shoulder coordinate position and depth value acquisition unit is used for acquiring a left-shoulder coordinate position, a right-shoulder coordinate position, a left-shoulder depth value and a right-shoulder depth value according to the second target key point coordinate position and the second depth value;
The body left-right offset judging unit is used for comparing the shoulder coordinate positions in the adjacent frame learning images according to the left shoulder coordinate position and the right shoulder coordinate position to judge that the body is offset left or right;
and the body forward and backward offset judging unit is used for comparing the shoulder depth values in the adjacent frame depth map according to the left shoulder depth value and the right shoulder depth value to judge whether the body is offset forward or backward.
And the sitting posture analysis result output unit is used for outputting the real-time sitting posture analysis result according to the head deviation judgment result and the body deviation judgment result.
The abnormal sitting posture identification module is used for identifying abnormal sitting postures according to the real-time sitting posture analysis result and by combining preset abnormal sitting posture parameters;
preferably, the abnormal sitting posture recognition module includes:
the straight line acquisition sub-module is used for connecting the left eye key point and the nose key point to obtain a first straight line according to the coordinate position of the first target key point if the head is judged to be offset leftwards or rightwards, and connecting the right eye key point and the nose key point to serve as a second straight line;
the included angle acquisition submodule is used for acquiring a first included angle between the first straight line and the vertical direction and a second included angle between the second straight line and the vertical direction;
The first abnormal sitting posture identification submodule is used for comparing the angle difference between the first included angle and the second included angle with a preset angle difference threshold value, and identifying the first abnormal sitting posture when the angle difference in the continuous multi-frame images is larger than the angle difference threshold value;
and the second abnormal sitting posture identification sub-module is used for calculating the absolute value of the eye depth change in the depth map of the adjacent frames if the head is judged to be shifted forwards or backwards, and identifying the abnormal sitting posture as the second abnormal sitting posture when the absolute value of the eye depth change calculated in the continuous multi-frame images is larger than a preset depth threshold.
Preferably, the abnormal sitting posture recognition module further comprises:
the coordinate position acquisition sub-module is used for acquiring a left shoulder coordinate position and a right shoulder coordinate position according to the coordinate position of the second target key point if the body is judged to be shifted leftwards or rightwards;
the left shoulder and right shoulder longitudinal distance difference calculating submodule is used for calculating the left shoulder and right shoulder longitudinal distance difference according to the left shoulder coordinate position and the right shoulder coordinate position;
the third abnormal sitting posture identification submodule is used for identifying the third abnormal sitting posture when the difference of the distances in the continuous multi-frame images is larger than a preset distance threshold value;
And the fourth abnormal sitting posture identification sub-module is used for calculating the absolute value of the shoulder depth change in the depth map of the adjacent frames if the body is judged to be shifted forwards or backwards, and identifying the abnormal sitting posture as the fourth abnormal sitting posture when the absolute value of the shoulder depth change calculated in the continuous multi-frame images is larger than the preset depth threshold.
The user reminding module is used for sending reminding information corresponding to the sitting posture detection result to a user when an abnormal sitting posture is identified, so as to remind the user to adjust the sitting posture of the student.
Specifically, the learning accompanying device combined with dynamic detection of abnormal sitting postures provided in embodiment 2 of the present invention includes: the image acquisition module, used for acquiring real-time video data of a student in a learning accompanying scene and decomposing the real-time video data into multi-frame real-time images; the learning image extraction module, used for inputting each real-time image into a pre-trained target detection model and, according to the detected preset learning article position information and hand position information, extracting a student learning image when a learning article intersects with the hands; the information extraction module, used for performing feature extraction and monocular depth detection on the student learning image and outputting key point information and depth information related to the student's learning; the real-time sitting posture analysis module, used for analyzing the student's real-time sitting posture according to the key point information and the depth information and outputting a real-time sitting posture analysis result; the abnormal sitting posture identification module, used for identifying abnormal sitting postures according to the real-time sitting posture analysis result in combination with preset abnormal sitting posture parameters; and the user reminding module, used for sending reminding information corresponding to the sitting posture detection result to a user when an abnormal sitting posture is identified, reminding the user to adjust the sitting posture of the student.

Because the device adopts a pre-trained target detection model, it can identify learning supplies and hand positions in a student learning scene more accurately, and can therefore extract student learning images precisely. Key point information and monocular depth detection are used together, so the student's sitting posture is understood comprehensively, and this multi-level information fusion improves the accuracy of abnormal sitting posture detection. By relying on a pre-trained target detection model and deep learning techniques, the device exploits the strengths of computer vision in image analysis; such a software-based solution is generally cheaper than traditional hardware equipment. At the same time, the device provides instant feedback and reminders on real-time video data, so that students and guardians can act in time, potential health problems can be prevented, and abnormal sitting postures are handled more efficiently.
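For orientation, the modules above could be wired together as a single processing loop over the video stream. The sketch below is a minimal illustration assuming OpenCV for video capture; every model, helper and notifier callable is an injected placeholder, not a component disclosed by the embodiment.

```python
import cv2

def accompany_loop(video_source, detect_objects, intersects, keypoints,
                   estimate_depth, analyze, notify, window=10):
    """Illustrative wiring of the device's modules as one processing loop.

    All callables are injected: detect_objects (target detection model),
    intersects (learning article / hand intersection test), keypoints and
    estimate_depth (information extraction), analyze (posture analysis and
    abnormal-posture identification over a frame window, returning a
    message or None), notify (user reminding channel).
    """
    cap = cv2.VideoCapture(video_source)
    history = []  # rolling window of (keypoints, depth_map) pairs
    try:
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            # Learning image extraction: keep only frames where a learning
            # article intersects the detected hand region.
            if not intersects(detect_objects(frame)):
                continue
            # Information extraction: keypoints plus relative depth map.
            history.append((keypoints(frame), estimate_depth(frame)))
            history = history[-window:]
            # Real-time analysis and abnormal sitting posture identification.
            message = analyze(history)
            if message is not None:
                notify(message)  # remind the user to correct the posture
    finally:
        cap.release()
```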
Example 3
In addition, the learning accompanying method combined with dynamic detection of abnormal sitting postures of embodiment 1 of the present invention, described in connection with Fig. 1, may be implemented by a learning accompanying robot. Fig. 9 shows a schematic hardware structure of a learning accompanying robot according to embodiment 3 of the present invention.
The learning accompanying robot may include a processor and a memory storing computer program instructions.
In particular, the processor may comprise a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present invention.
The memory may include mass storage for data or instructions. By way of example, and not limitation, the memory may comprise a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is a non-volatile solid-state memory. In a particular embodiment, the memory includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor reads and executes the computer program instructions stored in the memory to implement the learning accompanying method combined with dynamic detection of abnormal sitting postures of any of the above embodiments.
In one example, the learning accompanying robot may further include a communication interface and a bus. The processor, the memory, and the communication interface are connected by the bus and communicate with one another, as shown in Fig. 9.
The communication interface is mainly used for realizing communication among the modules, the devices, the units and/or the equipment in the embodiment of the invention.
The bus includes hardware, software, or both, coupling the components of the learning accompanying robot to each other. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. The bus may include one or more buses, where appropriate. Although embodiments of the invention have been described and illustrated with respect to a particular bus, the invention contemplates any suitable bus or interconnect.
Example 4
In addition, corresponding to the learning accompanying method combined with dynamic detection of abnormal sitting postures in the above embodiment 1, embodiment 4 of the present invention provides a computer-readable storage medium. The computer-readable storage medium has computer program instructions stored thereon; when executed by a processor, the computer program instructions implement the learning accompanying method combined with dynamic detection of abnormal sitting postures of any of the above embodiments.
In summary, embodiments of the invention provide a learning accompanying method and device combined with dynamic detection of abnormal sitting postures, and a learning accompanying robot.
It should be understood that the invention is not limited to the particular arrangements and instrumentalities described above and shown in the drawings. For the sake of brevity, a detailed description of known methods is omitted here. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and shown; those skilled in the art can make various changes, modifications and additions, or change the order between steps, after appreciating the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application-Specific Integrated Circuit (ASIC), suitable firmware, a plug-in, a function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The programs or code segments may be stored in a machine-readable medium or transmitted over transmission media or communication links by a data signal carried in a carrier wave. A "machine-readable medium" may include any medium that can store or transfer information. Examples of machine-readable media include electronic circuitry, semiconductor memory devices, ROM, flash memory, erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber-optic media, Radio Frequency (RF) links, and the like. The code segments may be downloaded via computer networks such as the Internet, intranets, etc.
It should also be noted that the exemplary embodiments mentioned in this disclosure describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, or may be performed in a different order from the order in the embodiments, or several steps may be performed simultaneously.
In the foregoing, only the specific embodiments of the present invention are described, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the systems, modules and units described above may refer to the corresponding processes in the foregoing method embodiments, which are not repeated herein. It should be understood that the scope of the present invention is not limited thereto, and any equivalent modifications or substitutions can be easily made by those skilled in the art within the technical scope of the present invention, and they should be included in the scope of the present invention.

Claims (10)

1. A learning accompanying method combined with dynamic detection of abnormal sitting postures, characterized in that it comprises:
S1: Acquiring real-time video data of a student under a learning accompanying scene, and decomposing the real-time video data into multi-frame real-time images;
S2: inputting each real-time image into a pre-trained target detection model, and extracting a student learning image when the learning articles intersect with the hands according to the detected preset learning article position information and hand position information;
S3: Extracting features of the student learning images and detecting monocular depth, and outputting key point information and depth information related to student learning;
S4: Analyzing the real-time sitting postures of the students according to the key point information and the depth information, and outputting real-time sitting posture analysis results;
S5: Identifying an abnormal sitting posture according to the real-time sitting posture analysis result in combination with preset abnormal sitting posture parameters;
S6: When an abnormal sitting posture is identified, sending reminding information corresponding to the sitting posture detection result to the user, and reminding the user to adjust the sitting posture of the student.
2. A learning accompanying method in combination with dynamic detection of abnormal sitting posture according to claim 1, wherein S3 comprises:
S31: Inputting continuous multi-frame learning images of the students into a human body key point detection model, and identifying the key point information, wherein the human body key points at least comprise: left eye, right eye, nose, left shoulder and right shoulder;
S32: monocular depth detection is carried out on the student learning image, and a relative depth map corresponding to the student learning image is obtained;
S33: Carrying out filtering processing on the relative depth maps corresponding to the continuous multi-frame student learning images, and outputting the filtered depth maps as the depth information.
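Claim 2 leaves the filtering in S33 open; one plausible reading is temporal smoothing of the per-frame relative depth maps, sketched below with an exponential moving average (the choice of EMA and the smoothing factor are assumptions, not specified by the claim).

```python
import numpy as np

def temporal_filter_depth(depth_maps, alpha=0.6):
    """Exponential moving average over consecutive relative depth maps.

    depth_maps: iterable of 2D arrays with identical shapes; returns the
    smoothed map per frame, suppressing frame-to-frame jitter that is
    common in monocular depth estimates.
    """
    filtered, smoothed = [], None
    for d in depth_maps:
        d = np.asarray(d, dtype=np.float32)
        smoothed = d if smoothed is None else alpha * d + (1 - alpha) * smoothed
        filtered.append(smoothed)
    return filtered
```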
3. A learning accompanying method in combination with dynamic detection of abnormal sitting posture according to claim 2, wherein S4 comprises:
S41: Determining the coordinate positions of key points in continuous multi-frame student learning images according to the key point information;
S42: Acquiring a depth value corresponding to the coordinate position of the key point according to the coordinate position of the key point and the depth information;
S43: Determining a first target key point coordinate position and a corresponding first depth value, and a second target key point coordinate position and a corresponding second depth value, according to the key point coordinate positions and the corresponding depth values, wherein the first target key point comprises the left eye, the right eye and the nose, and the second target key point comprises the left shoulder and the right shoulder;
S44: Judging the head deviation degree of the student according to the coordinate position of the first target key point and the corresponding first depth value, and outputting a head deviation judgment result;
S45: judging the body deviation degree of the student according to the coordinate position of the second target key point and the corresponding second depth value, and outputting a body deviation judging result;
S46: Outputting the real-time sitting posture analysis result according to the head deviation judgment result and the body deviation judgment result.
4. A learning accompanying method in combination with dynamic detection of abnormal sitting posture according to claim 3, wherein S44 comprises:
S441: Acquiring a left eye coordinate position, a right eye coordinate position, a left eye depth value and a right eye depth value according to the first target key point coordinate position and the first depth value;
S442: Comparing the left eye coordinate position and the right eye coordinate position in each frame of learning image, and judging whether the head is shifted leftwards or rightwards;
S443: Comparing the eye depth values in adjacent-frame depth maps, taking the left eye depth value and the right eye depth value as the eye depth values, and judging whether the head is shifted forwards or backwards.
5. A learning accompanying method in combination with dynamic detection of abnormal sitting posture according to claim 3, wherein S45 comprises:
S451: Acquiring a left shoulder coordinate position, a right shoulder coordinate position, a left shoulder depth value and a right shoulder depth value according to the second target key point coordinate position and the second depth value;
S452: Comparing the shoulder coordinate positions in adjacent-frame learning images according to the left shoulder coordinate position and the right shoulder coordinate position, and judging whether the body is shifted leftwards or rightwards;
S453: Comparing the shoulder depth values in adjacent-frame depth maps according to the left shoulder depth value and the right shoulder depth value, and judging whether the body is shifted forwards or backwards.
6. The learning accompanying method in combination with dynamic detection of abnormal sitting posture according to claim 4, wherein S5 comprises:
S51: If the head is judged to be shifted leftwards or rightwards, connecting the left-eye key point and the nose key point to form a first straight line, and connecting the right-eye key point and the nose key point to form a second straight line, according to the coordinate position of the first target key point;
S52: Acquiring a first included angle between the first straight line and the vertical direction, and a second included angle between the second straight line and the vertical direction;
S53: Comparing the angle difference between the first included angle and the second included angle with a preset angle difference threshold, and identifying a first abnormal sitting posture when the angle difference in continuous multi-frame images is larger than the angle difference threshold;
S54: If the head is judged to be shifted forwards or backwards, calculating the absolute value of the eye depth change in adjacent-frame depth maps, and identifying a second abnormal sitting posture when the absolute value of the eye depth change calculated in continuous multi-frame images is larger than a preset depth threshold.
7. The learning accompanying method in combination with dynamic detection of abnormal sitting posture according to claim 5, wherein S5 further comprises:
S55: If the body is judged to be shifted leftwards or rightwards, acquiring a left shoulder coordinate position and a right shoulder coordinate position according to the coordinate position of the second target key point;
S56: Calculating the longitudinal distance difference between the left shoulder and the right shoulder according to the left shoulder coordinate position and the right shoulder coordinate position;
S57: Identifying a third abnormal sitting posture when the distance difference in continuous multi-frame images is larger than a preset distance threshold;
S58: If the body is judged to be shifted forwards or backwards, calculating the absolute value of the shoulder depth change in adjacent-frame depth maps, and identifying a fourth abnormal sitting posture when the absolute value of the shoulder depth change calculated in continuous multi-frame images is larger than a preset depth threshold.
8. A learning accompanying device incorporating dynamic detection of abnormal sitting postures, the device comprising:
the image acquisition module is used for acquiring real-time video data of a student under a study accompanying scene and decomposing the real-time video data into multi-frame real-time images;
the learning image extraction module is used for inputting each real-time image into a pre-trained target detection model, and extracting a student learning image when the learning articles intersect with the hands according to the detected preset learning article position information and hand position information;
The information extraction module is used for carrying out feature extraction and monocular depth detection on the student learning image and outputting key point information and depth information related to student learning;
the real-time sitting posture analysis module is used for analyzing the real-time sitting postures of the students according to the key point information and the depth information and outputting real-time sitting posture analysis results;
the abnormal sitting posture identification module is used for identifying abnormal sitting postures according to the real-time sitting posture analysis result in combination with preset abnormal sitting posture parameters;
and the user reminding module is used for sending reminding information corresponding to the sitting posture detection result to a user when the abnormal sitting posture is identified, and reminding the user to adjust the sitting posture of the student.
9. A learning accompanying robot, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory, which when executed by the processor, implement the method of any one of claims 1-7.
10. A storage medium having stored thereon computer program instructions, which when executed by a processor, implement the method of any of claims 1-7.
CN202311775638.8A 2023-12-21 2023-12-21 Learning accompanying method and device combined with abnormal sitting posture dynamic detection and robot Pending CN117746505A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311775638.8A CN117746505A (en) 2023-12-21 2023-12-21 Learning accompanying method and device combined with abnormal sitting posture dynamic detection and robot

Publications (1)

Publication Number Publication Date
CN117746505A true CN117746505A (en) 2024-03-22

Family

ID=90279134

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311775638.8A Pending CN117746505A (en) 2023-12-21 2023-12-21 Learning accompanying method and device combined with abnormal sitting posture dynamic detection and robot

Country Status (1)

Country Link
CN (1) CN117746505A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109872359A (en) * 2019-01-27 2019-06-11 武汉星巡智能科技有限公司 Sitting posture detecting method, device and computer readable storage medium
CN111079554A (en) * 2019-11-25 2020-04-28 恒安嘉新(北京)科技股份公司 Method, device, electronic equipment and storage medium for analyzing classroom performance of students
CN111127848A (en) * 2019-12-27 2020-05-08 深圳奥比中光科技有限公司 Human body sitting posture detection system and method
CN111601088A (en) * 2020-05-27 2020-08-28 大连成者科技有限公司 Sitting posture monitoring system based on monocular camera sitting posture identification technology
CN111626211A (en) * 2020-05-27 2020-09-04 大连成者云软件有限公司 Sitting posture identification method based on monocular video image sequence
CN111931869A (en) * 2020-09-25 2020-11-13 湖南大学 Method and system for detecting user attention through man-machine natural interaction
CN112364694A (en) * 2020-10-13 2021-02-12 宁波大学 Human body sitting posture identification method based on key point detection
CN116469174A (en) * 2023-05-23 2023-07-21 南京星辰微视光电技术有限公司 Deep learning sitting posture measurement and detection method based on monocular camera

Similar Documents

Publication Publication Date Title
US7853051B2 (en) Recognizing apparatus and method, recording media, and program
TWI398796B (en) Pupil tracking methods and systems, and correction methods and correction modules for pupil tracking
EP2338416B1 (en) Line-of-sight direction determination device and line-of-sight direction determination method
US11423699B2 (en) Action recognition method and apparatus and electronic equipment
CN109343700B (en) Eye movement control calibration data acquisition method and device
CN102812416A (en) Instruction input device, instruction input method, program, recording medium and integrated circuit
US8983147B2 (en) Eyelid detection device
US20200089315A1 (en) Systems and methods for capturing training data for a gaze estimation model
CN106713894A (en) Tracking stereo display method and device
CN114973423B (en) Warning method and system for sitting posture monitoring of child learning table
CN110148092A (en) The analysis method of teenager's sitting posture based on machine vision and emotional state
CN112232128A (en) Eye tracking based method for identifying care needs of old disabled people
JP2009064395A (en) Pointing device, program for making computer to correct error between operator's gaze position and cursor position, and computer-readable recording medium with the program recorded
CN106618479A (en) Pupil tracking system and method thereof
JP2018101212A (en) On-vehicle device and method for calculating degree of face directed to front side
CN117746505A (en) Learning accompanying method and device combined with abnormal sitting posture dynamic detection and robot
CN115862115B (en) Infant respiration detection area positioning method, device and equipment based on vision
JP4676375B2 (en) Method or apparatus for detecting direction of sight of vehicle driver
JP7468684B2 (en) Posture detection device, posture detection method, and sleeping phase determination method
JP6572841B2 (en) Information processing apparatus and program
CN114093007A (en) Binocular camera face image abnormity monitoring method and system
CN110781712B (en) Human head space positioning method based on human face detection and recognition
CN117037213A (en) Intelligent detection method, device, equipment and medium for abnormal sitting postures
JP2009169740A (en) Face image processing device
US20240242375A1 (en) Computer-readable recording medium storing estimation program, estimation method, and information processing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination