CN117079222B

CN117079222B - Teaching plan generation method and system based on classroom audio and video intelligent analysis

Info

Publication number: CN117079222B
Application number: CN202311339244.8A
Authority: CN
Inventors: 唐武雷; 廖劲光; 易宏银
Original assignee: Guangzhou Logansoft Technology Co ltd
Current assignee: Guangzhou Logansoft Technology Co ltd
Priority date: 2023-10-17
Filing date: 2023-10-17
Publication date: 2024-01-26
Anticipated expiration: 2043-10-17
Also published as: CN117079222A

Abstract

The application relates to a teaching plan generation method and system based on intelligent analysis of classroom audio and video. The method comprises: acquiring video data and audio data of a classroom in real time when a lesson instruction is received; identifying a current classroom scene based on the video data and the audio data; if the current classroom scene is a question-answer scene, sending the currently obtained audio data to an audio recognition model, which converts the audio data to text to obtain a question-answer text, performing semantic analysis on the question-answer text, and outputting a question-answer result; if the current classroom scene is a training scene, acquiring desktop images of all students based on the video data, acquiring exercise data of the students from the desktop images, and outputting answer results for the exercises; and screening out the teaching plan materials corresponding to the incorrectly answered question-answer text and the incorrectly answered exercise data, and sending the teaching plan materials to the user side of the teacher of the lesson. The application has the effect of improving teachers' efficiency in preparing course teaching plans.

Description

Teaching plan generation method and system based on classroom audio and video intelligent analysis

Technical Field

The application relates to the technical field of intelligent education, in particular to a teaching plan generation method and system based on classroom audio and video intelligent analysis.

Background

With the intelligent application of multimedia in education, teachers now generally prepare classroom lessons with an electronic course teaching plan: the teacher selects teaching plan materials for the current course from a teaching material library and typesets them according to a teaching plan template to obtain the course teaching plan.

When screening materials for a course teaching plan, a teacher may, based on how well the students mastered the knowledge in the previous lesson, select teaching plan materials that were difficult or important to master in that lesson and add them to the teaching plan of the current lesson to consolidate the students' knowledge. However, the teaching plan materials are numerous, and the teacher has to spend time reviewing the lesson and screening the materials, which consumes considerable manpower and leaves room for improvement.

Disclosure of Invention

In order to improve a teacher's efficiency in preparing a course teaching plan, the application provides a teaching plan generation method and system based on intelligent analysis of classroom audio and video.

The first object of the present invention is achieved by the following technical solutions:

a teaching plan generation method based on intelligent analysis of classroom audio and video comprises the following steps:

When a lesson instruction is received, a ceiling camera terminal positioned in a classroom acquires video data and audio data of the classroom in real time;

identifying a current class scene based on the video data and the audio data, wherein the class scene comprises a question-answer scene and a training scene;

if the current classroom scene is a question-answer scene, the current obtained audio data is sent to an audio recognition model, the audio recognition model performs text conversion on the audio data to obtain a question-answer text, semantic analysis is performed on the question-answer text, and a question-answer result is output, wherein the question-answer result comprises correct answer and incorrect answer;

if the current classroom scene is a training scene, acquiring desktop images of all students based on the video data, acquiring exercise problem data of the students from the desktop images, and outputting answer results of the exercise problems, wherein the answer results comprise correct answer and incorrect answer;

screening teaching plan materials corresponding to wrong question answering text and wrong practice problem answering data;

and sending the teaching plan materials to the user side of the teaching teacher of the lesson.

By adopting this technical scheme, when a teacher's lesson starts, the ceiling equipment in the classroom acquires the video pictures in the classroom and the audio of the teacher and students. The current classroom scene is first identified from the video data and the audio data. The classroom scenes include a dialogue scene in which a student stands up to answer a question and a scene in which the whole class does classroom exercises together; the corresponding scene can be determined by analyzing the actions in the video pictures and the speech in the audio. In a question-answer scene, the audio is converted by the audio recognition model into a question-answer text, and semantic analysis of the teacher's question and the student's answer determines whether the answer is correct or incorrect; that is, whether the student has grasped the knowledge points of the lesson is judged through audio recognition. In an exercise scene, the desktop picture of each student is framed in the video data while the student works on exercises at the desk, and the exercise text and the student's answers are extracted, so that the answer to each exercise is judged correct or incorrect; that is, whether the student has grasped the knowledge points of the lesson is judged through the video data.

Therefore, the video data and audio data of the lesson are acquired, identified and analyzed in those classroom activities that reveal whether students have mastered the knowledge points of the lesson, yielding how the students answered the related questions and exercises. The texts of incorrectly answered questions and the incorrectly answered exercises reflect the knowledge points that the students have not yet mastered. When the course teaching plan is generated, the teaching plan materials corresponding to those knowledge points can then be screened out automatically for review, which improves the teacher's efficiency in preparing the course teaching plan.

In a preferred example of the present application, the step of identifying the current class scene based on the video data and the audio data, wherein the class scene comprises a question-answer scene and a training scene, comprises the following steps:

acquiring a plurality of frames of classroom images from video data at intervals, and acquiring audio segments in the intervals;

transmitting a plurality of classroom images to a scene judgment model, identifying actions of students and teachers in the classroom images, and comparing the actions with pre-stored images in the scene judgment model to judge an image comparison result, wherein the image comparison result comprises question-answering scene images and exercise scene images;

Identifying whether the audio segment in the interval duration contains a keyword segment of a question-answer scene or a training scene;

if the audio segment contains the keyword segment of the question-answer scene, judging the current audio segment as the question-answer scene audio, and if the audio segment contains the keyword segment of the exercise scene, judging the current audio segment as the exercise scene audio;

if the classroom image is judged to be a question-answer scene image or the audio segment is judged to be a question-answer scene audio segment within the interval duration, the current classroom scene is a question-answer scene;

and in the interval time, if the classroom image is judged to be the exercise scene image or the audio segment is judged to be the exercise scene audio segment, the current classroom scene is the exercise scene.

By adopting this technical scheme, the audio and video within the interval duration are taken and image features are compared using the scene judgment model. For example, an image in which students raise their hands and the teacher stands at the podium is judged to be a question-answer scene image, and an image in which the whole class, heads lowered and pens in hand, works on exercises at their desks is judged to be an exercise scene image. Meanwhile, the keyword segments in the audio segment of the same interval duration are obtained through audio recognition. For example, an audio segment containing the teacher's question and a student's answer is judged to be a question-answer scene audio segment, and an audio segment in which the teacher announces the start of classroom exercises is judged to be an exercise scene audio segment.
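The keyword-based judgment of an audio segment can be illustrated with a minimal sketch. It assumes the audio segment has already been transcribed to text; the function name `classify_audio_segment` and the keyword phrases are hypothetical, since the source does not list the actual keyword segments.

```python
from typing import Optional

# Hypothetical keyword segments; the source does not specify the real phrases.
QA_KEYWORDS = ("who can answer", "please answer", "raise your hand")
EXERCISE_KEYWORDS = ("start the exercise", "work on the exercises")


def classify_audio_segment(transcript: str) -> Optional[str]:
    """Label a transcribed audio segment by the keyword segments it contains.

    Returns "question_answer", "exercise", or None when no keyword is found.
    """
    text = transcript.lower()
    if any(keyword in text for keyword in QA_KEYWORDS):
        return "question_answer"
    if any(keyword in text for keyword in EXERCISE_KEYWORDS):
        return "exercise"
    return None
```

Plain substring matching is used here only for brevity; a deployed system would more likely rely on the trained recognition model described in the text.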

Further, the two judgments are combined through a logical OR relationship: if the classroom image is a question-answer scene image or the audio segment is a question-answer scene audio segment, the current classroom scene is judged to be a question-answer scene; if the classroom image is an exercise scene image or the audio segment is an exercise scene audio segment, the current classroom scene is judged to be an exercise scene. This improves the judgment accuracy.
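The logical OR combination of the image judgment and the audio judgment can be sketched as follows. The names `Scene` and `combine_verdicts` are hypothetical. The source does not say what happens when the image and the audio point to different scenes, so checking the question-answer scene first is an assumption of this sketch.

```python
from enum import Enum
from typing import Optional


class Scene(Enum):
    QUESTION_ANSWER = "question_answer"
    EXERCISE = "exercise"


def combine_verdicts(image_verdict: Optional[Scene],
                     audio_verdict: Optional[Scene]) -> Optional[Scene]:
    """Logical OR: either the image or the audio alone is enough to flag a scene.

    Assumption: if the two verdicts conflict, the question-answer scene wins
    because it is checked first.
    """
    for scene in (Scene.QUESTION_ANSWER, Scene.EXERCISE):
        if image_verdict is scene or audio_verdict is scene:
            return scene
    return None  # neither modality indicates a scene of interest
```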

In a preferred example of the present application, the step of screening out the teaching plan materials corresponding to the incorrectly answered question-answer text and the incorrectly answered exercise data comprises the following steps:

identifying and extracting knowledge point keywords and chapter keywords of the error text, and knowledge point keywords and chapter keywords of the error problem;

associating the knowledge point keywords belonging to the same error text with the chapter keywords, and associating the knowledge point keywords belonging to the same error problem with the chapter keywords;

the related knowledge point keywords and chapter keywords are sent to a teaching plan material matching model, and the teaching plan material matching model stores a plurality of keyword sets related to different teaching plan materials;

And judging a keyword set to which the related knowledge point keywords and chapter keywords belong, screening out teaching plan materials related to the keyword set, and obtaining the teaching plan materials corresponding to the error text and the error exercises.

By adopting this technical scheme, the audio behind an error text contains the segment in which the teacher, when asking the question, names the chapter, so chapter keywords can be identified in the error text; for an error exercise, the chapter information recorded in the stem of the exercise serves as its chapter keywords. The knowledge point keywords and chapter keywords of the same error text are associated, as are those of the same error exercise, and the associated keywords are sent to the teaching plan material matching model, which judges the keyword set to which the input knowledge point keywords and chapter keywords belong. The knowledge point keywords and chapter keywords of different error texts and error exercises may belong to the same keyword set.

The teaching plan materials associated with the matched keyword sets are then acquired, giving the teaching plan materials corresponding to all error texts and error exercises of the lesson; the screened teaching plan materials are not repeated.
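Because several error texts and error exercises may map to the same keyword set, the screened materials have to be de-duplicated. A minimal sketch, assuming each keyword set is identified by a string and mapped to one material (the function name and the data layout are hypothetical):

```python
from typing import Dict, Iterable, List


def collect_materials(matched_set_ids: Iterable[str],
                      materials_by_set: Dict[str, str]) -> List[str]:
    """Return the material of each matched keyword set once, in first-seen order."""
    seen = set()
    materials = []
    for set_id in matched_set_ids:
        if set_id in seen:
            continue  # this keyword set was already matched by an earlier error
        seen.add(set_id)
        materials.append(materials_by_set[set_id])
    return materials
```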

In a preferred example of the present application, the keyword set contains chapter information and knowledge point information, and the step of judging the keyword set to which the associated knowledge point keywords and chapter keywords belong, screening out the teaching plan materials associated with the keyword set, and obtaining the teaching plan materials corresponding to the error text and the error exercises comprises the following steps:

comparing the chapter keywords with chapter information in all keyword sets, judging and screening out a keyword set to be judged to which the chapter keywords belong;

and comparing the knowledge point keywords associated with the chapter keywords with knowledge point information in the keyword set to be judged, judging a keyword set to which the knowledge point keywords and the chapter keywords are associated, and screening teaching plan materials associated with the keyword set.

By adopting this technical scheme, when judging which keyword set the knowledge point keywords and chapter keywords belong to, the chapter keywords of the error text or error exercise are first compared with the chapter information of the keyword sets, and the knowledge point keywords are then compared with the knowledge point information.
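The two-stage comparison (chapter first, then knowledge point) might look like the sketch below. The `KeywordSet` structure, its field names and exact string equality are assumptions made for illustration; the source does not describe how the matching model stores or compares keywords.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class KeywordSet:
    chapter: str                      # chapter information
    knowledge_points: FrozenSet[str]  # knowledge point information
    material_id: str                  # associated teaching plan material


def match_keyword_set(chapter_kw: str, point_kw: str,
                      keyword_sets: List[KeywordSet]) -> Optional[KeywordSet]:
    # Stage 1: keep only the keyword sets whose chapter matches
    # (the "keyword sets to be judged").
    candidates = [ks for ks in keyword_sets if ks.chapter == chapter_kw]
    # Stage 2: among those, find the set containing the knowledge point.
    for ks in candidates:
        if point_kw in ks.knowledge_points:
            return ks
    return None
```

Filtering by chapter first keeps the second comparison confined to a small candidate list.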

In a preferred example of the present application, if only knowledge point keywords are identified and extracted from any error text or error exercise, the step of screening out the teaching plan materials corresponding to the incorrectly answered question-answer text and the incorrectly answered exercise data comprises the following steps:

comparing the knowledge point keywords with knowledge point information in the keyword set;

if two or more keyword sets are screened out, identifying the chapter keywords of the remaining error texts or error exercises of the lesson as chapter keywords for auxiliary identification;

judging, based on the auxiliary chapter keywords and the knowledge point keywords, the keyword set corresponding to the error text or error exercise to which the knowledge point keywords belong.

By adopting this technical scheme, if no chapter keywords can be identified in an error text or error exercise, the keyword set is judged from the knowledge point keywords alone, because the chapter keyword condition is missing. If the knowledge point keywords are judged to belong to a single keyword set, the teaching plan materials associated with that keyword set are obtained. If they are judged to belong to two or more keyword sets, that is, the unique keyword set cannot be determined, then, because the knowledge points of one lesson belong to the same chapter or to adjacent chapters, the chapter keywords of the other error texts or error exercises of the lesson are used as auxiliary chapter keywords to help judge the keyword set of the current knowledge point.
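This fallback can be sketched as follows, assuming each keyword set is a dictionary with hypothetical `chapter`, `knowledge_points` and `material_id` fields. For simplicity the sketch only narrows candidates to chapters that appear among the auxiliary chapter keywords; handling adjacent chapters, which the text also mentions, is left out.

```python
from typing import Dict, Iterable, List, Optional


def resolve_by_auxiliary_chapters(point_kw: str,
                                  keyword_sets: List[Dict],
                                  auxiliary_chapters: Iterable[str]) -> Optional[Dict]:
    """Pick a keyword set when only a knowledge point keyword is available."""
    candidates = [ks for ks in keyword_sets
                  if point_kw in ks["knowledge_points"]]
    if len(candidates) == 1:
        return candidates[0]   # unique match, no auxiliary information needed
    if not candidates:
        return None
    # Ambiguous: use the chapters seen in the lesson's other errors.
    aux = set(auxiliary_chapters)
    narrowed = [ks for ks in candidates if ks["chapter"] in aux]
    return narrowed[0] if len(narrowed) == 1 else None
```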

In a preferred example of the present application, after the step of sending the teaching plan materials to the user side of the teacher of the lesson, the following steps are executed:

when a lesson instruction is received, classifying the error texts and error exercises of the lesson by the keyword set to which they belong and counting them;

if the total number of error texts and error exercises in the same keyword set is larger than a preset value, screening out the corresponding teaching plan materials and the exercise data associated with those materials;

if only error exercises appear in the same keyword set, screening out only the error exercises as the teaching plan materials.

By adopting this technical scheme, the number of error texts and error exercises for each knowledge point of the lesson is counted. If the number is larger than the preset value, a larger number of students have not mastered the knowledge point, so the teaching plan materials are screened out together with the associated exercise data, to strengthen the learning of the many students who have not mastered it. When only error exercises occur for a knowledge point, only those error exercises are screened out as teaching plan materials, so that the knowledge points that have not been mastered can be explained in a more targeted way.
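The counting rule can be sketched as below. The function name, the `(keyword_set_id, kind)` input format and the returned labels are hypothetical. The source does not state which rule applies when a keyword set has only error exercises and also exceeds the preset value, so giving priority to the "error exercises only" rule is an assumption of this sketch.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def select_for_review(errors: Iterable[Tuple[str, str]],
                      threshold: int) -> Dict[str, str]:
    """errors holds (keyword_set_id, kind) pairs, kind being "text" or "exercise"."""
    errors = list(errors)
    counts = Counter(set_id for set_id, _ in errors)
    kinds: Dict[str, set] = {}
    for set_id, kind in errors:
        kinds.setdefault(set_id, set()).add(kind)

    plan = {}
    for set_id, total in counts.items():
        if kinds[set_id] == {"exercise"}:
            # Assumed priority: only error exercises occurred for this set.
            plan[set_id] = "wrong_exercises_only"
        elif total > threshold:
            plan[set_id] = "materials_and_associated_exercises"
    return plan
```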

The second object of the present invention is achieved by the following technical solutions:

a teaching plan generation system based on intelligent analysis of classroom audio and video comprises:

the audio and video acquisition module is used for causing a ceiling camera terminal positioned in the classroom to acquire video data and audio data of the classroom in real time when a lesson instruction is received;

the scene judging module is used for identifying the current class scene based on the video data and the audio data, wherein the class scene comprises a question-answer scene and a training scene;

the question-answering judging module is used for sending the currently obtained audio data to the audio recognition model if the current classroom scene is a question-answering scene, performing text conversion on the audio data by the audio recognition model to obtain a question-answering text, performing semantic analysis on the question-answering text, and outputting a question-answering result, wherein the question-answering result comprises correct answer and incorrect answer;

the problem judging module is used for acquiring desktop images of all students based on video data if the current classroom scene is a training scene, acquiring the problem practice data of the students from the desktop images, and outputting the answering results of the problem practice, wherein the answering results comprise answering correctness and answering errors;

the teaching plan material screening module is used for screening teaching plan materials corresponding to the question-answering text with wrong answer and the practice problem data with wrong answer;

And the teaching plan sending module is used for sending the teaching plan materials to the user side of the teaching teacher of the lesson.

Optionally, the scene judging module includes:

the acquisition sub-module is used for acquiring a plurality of frames of classroom images from video data at intervals and acquiring audio segments in the intervals;

the image feature comparison sub-module is used for sending a plurality of classroom images to the scene judgment model, identifying actions of students and teachers in the classroom images, and comparing the actions with pre-stored images in the scene judgment model to judge an image comparison result, wherein the image comparison result comprises question-answering scene images and exercise scene images;

The audio segment identification sub-module is used for identifying whether the audio segment in the interval duration contains a keyword segment of a question-answer scene or a training scene;

the keyword segment judging submodule is used for judging the current audio segment to be the question-answer scene audio if the audio segment contains the keyword segment of the question-answer scene, and judging the current audio segment to be the exercise scene audio if the audio segment contains the keyword segment of the exercise scene;

the question-answer scene judging sub-module is used for judging that the current classroom scene is a question-answer scene if the classroom image is judged to be a question-answer scene image or the audio segment is judged to be a question-answer scene audio segment in the interval duration;

and the exercise scene judging sub-module is used for judging that the current classroom scene is the exercise scene if the classroom image is judged to be the exercise scene image or the audio segment is judged to be the exercise scene audio segment in the interval duration.

The third object of the present application is achieved by the following technical solutions:

a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of a teaching plan generating method based on classroom audio/video intelligent analysis when executing the computer program.

The fourth object of the present application is achieved by the following technical solutions:

a computer readable storage medium storing a computer program which when executed by a processor implements the steps of a teaching plan generating method based on classroom audio/video intelligent analysis.

In summary, the present application includes at least one of the following beneficial technical effects:

the video data and audio data of the lesson are acquired, identified and analyzed in the classroom activities that reveal whether students have mastered the knowledge points of the lesson, yielding how the students answered the questions and exercises; the incorrectly answered question texts and exercises reflect the knowledge points the students have not yet mastered, so that, when the course teaching plan is generated, the teaching plan materials corresponding to those knowledge points can be screened out automatically for review, improving the teacher's efficiency in preparing the course teaching plan;

through a logical OR relationship, if the classroom image is a question-answer scene image or the audio segment is a question-answer scene audio segment, the current classroom scene is judged to be a question-answer scene; if the classroom image is an exercise scene image or the audio segment is an exercise scene audio segment, the current classroom scene is judged to be an exercise scene, which improves the judgment accuracy;

in an error exercise, the chapter information recorded in the stem of the exercise serves as the chapter keywords to be identified; the knowledge point keywords and chapter keywords of the same error text are associated, as are those of the same error exercise, and they are sent to the teaching plan material matching model, which judges the keyword set to which the input knowledge point keywords and chapter keywords belong, wherein the keywords of different error texts and error exercises may belong to the same keyword set;

when judging which keyword set the knowledge point keywords and chapter keywords belong to, the chapter keywords of the error text or error exercise are first compared with the chapter information, and the knowledge point keywords are then compared with the knowledge point information.

Drawings

FIG. 1 is a flow chart of an embodiment of a method for generating a teaching plan based on intelligent analysis of audio and video in a classroom according to the present application;

FIG. 2 is a schematic diagram of a desktop image of a student in a teaching plan generating method based on intelligent analysis of audio and video in a classroom;

FIG. 3 is a flowchart showing an implementation of step S20 in a teaching plan generating method based on intelligent analysis of audio and video in a classroom;

FIG. 4 is a flowchart showing an implementation of step S50 in a teaching plan generating method based on intelligent analysis of audio and video in a classroom;

FIG. 5 is a flowchart of another implementation of step S50 in a teaching plan generating method based on intelligent analysis of classroom audio and video according to the present application;

FIG. 6 is a flowchart of an implementation of the teaching plan generating method based on intelligent analysis of audio and video in a classroom according to the present application after step S50;

FIG. 7 is a schematic block diagram of a computer device of the present application.

Detailed Description

The present application is described in further detail below in conjunction with figures 1-7.

In an embodiment, as shown in fig. 1, the application discloses a teaching plan generating method based on intelligent analysis of audio and video in a classroom, which specifically includes the following steps:

S10: when a lesson instruction is received, a ceiling camera terminal positioned in a classroom acquires video data and audio data of the classroom in real time;

In this embodiment, the lesson instruction is triggered by the teacher, either through the multimedia equipment in the classroom or by a voice command recognized through speech recognition, and triggers the acquisition of audio data and video data. The ceiling camera terminal in the classroom is a wide-angle camera device with functions such as video recording, photography, and data transmission and reception. Depending on the classroom area, several ceiling camera terminals may be installed so as to cover the classroom and capture a panoramic picture of it.

The video data refers to panoramic video of the classroom. Referring to fig. 2, the video data also includes the desktop image of each student in the class, in which the exercise book on the desktop can be captured. The audio data is acquired by sound pickup equipment in the classroom and refers to the speech of the students and the teacher in the classroom.

Specifically, when a lesson instruction representing the start of class is received, the ceiling camera terminal and the sound pickup equipment in the classroom start to acquire the video pictures in the classroom and the audio of the teacher and students.

S20: identifying a current class scene based on the video data and the audio data, wherein the class scene comprises a question-answer scene and a training scene;

In this embodiment, classroom scenes are preset to distinguish the different teaching segments of a lesson; the current classroom scene can be determined by performing image recognition on the video pictures of the classroom and semantic analysis on the audio of the teacher and students.

The question-answer scene refers to a scene in which the teacher asks questions about knowledge related to the lesson, and students raise their hands and stand up to answer.

The exercise scene, also referred to herein as the training scene, refers to a scene in which the students of the class work on exercises; the exercises include but are not limited to true-or-false questions, multiple-choice questions and short-answer questions. Classroom scenes also include the teacher lecturing, the students listening, and students discussing in groups.

Specifically, in the course of class, the current class scene is determined by identifying real-time video data and audio data, and whether the class scene is a question-answering scene or a training scene is determined.

S30: if the current classroom scene is a question-answer scene, the current obtained audio data is sent to an audio recognition model, the audio recognition model performs text conversion on the audio data to obtain a question-answer text, semantic analysis is performed on the question-answer text, and a question-answer result is output, wherein the question-answer result comprises correct answer and incorrect answer;

In this embodiment, the audio recognition model is a trained model that converts audio into text, namely the question-answer text, and performs contextual semantic analysis on that text. The model stores the knowledge points of each grade, and through semantic analysis it can determine whether the student correctly answered the question posed by the teacher in the question-answer scene. A question-answer result is then output: a correct answer means that the student answered the teacher's question correctly, and an incorrect answer means that the student answered incorrectly or did not answer the question.

Specifically, if the current classroom scene is a question-answer scene, the currently obtained audio data is sent to the audio recognition model, which converts it into a question-answer text; whether the student correctly answered the teacher's question is judged from the question-answer text in context.

S40: if the current classroom scene is a training scene, acquiring desktop images of all students based on the video data, acquiring exercise problem data of the students from the desktop images, and outputting answer results of the exercise problems, wherein the answer results comprise correct answer and incorrect answer;

In this embodiment, an image of the exercise placed on the student's desktop can be obtained from the desktop image, and the exercise data, that is, the text of the exercise, is extracted from it. The exercise data includes the student's answers; by comparing them with the correct answers stored in a preset database, the student's answer result can be judged correct or incorrect.

Specifically, if the current classroom scene is determined to be a training scene, the desktop image of each student is acquired from the video data to obtain the text of the exercises on the student's desktop; the answered exercise text is compared with the pre-stored correct answers, and whether the answer to each exercise is correct is determined.

Furthermore, because students need time to work through the exercises, their exercise data is acquired at intervals while they work; only exercises that have been completed are judged, and an exercise whose answer result has already been output is not judged again.
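The interval-based judging described here, in which only completed exercises are judged and none is judged twice, can be sketched as follows. The function name and the dictionary format are hypothetical, and exact string comparison is an assumption that suits true-or-false or multiple-choice answers; short answers would need the semantic comparison that the source leaves to the model.

```python
from typing import Dict, Optional, Set


def grade_new_answers(observed: Dict[str, Optional[str]],
                      answer_key: Dict[str, str],
                      already_graded: Set[str]) -> Dict[str, bool]:
    """Judge only exercises that are answered and not yet judged.

    observed maps exercise id to the answer read from the desktop image,
    or None while the student has not finished that exercise.
    already_graded is updated in place so later polls skip judged exercises.
    """
    results = {}
    for exercise_id, answer in observed.items():
        if answer is None or exercise_id in already_graded:
            continue
        if exercise_id not in answer_key:
            continue  # no stored correct answer to compare against
        correct = answer_key[exercise_id]
        results[exercise_id] = answer.strip().upper() == correct.strip().upper()
        already_graded.add(exercise_id)
    return results
```

Calling the function once per polling interval with the same `already_graded` set yields each exercise's result exactly once.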

S50: screening teaching plan materials corresponding to wrong question answering text and wrong practice problem answering data;

in this embodiment, the teaching plan material refers to the knowledge point text of the relevant chapter of the relevant subject of the lesson.

Specifically, the lesson-related chapter knowledge point text corresponding to the question text with wrong answer and the exercise text with wrong answer is screened out.

S60: and sending the teaching plan materials to the user side of the teaching teacher of the lesson.

In this embodiment, the teaching plan materials are sent to the teacher's user side in the form of text with images; the user side includes a PC, a portable PC, or a mobile terminal.

In one embodiment, referring to fig. 3, step S20 includes the steps of:

S21: acquiring a plurality of frames of classroom images from the video data at each interval duration, and acquiring the audio segment within that interval duration;

S22: sending the plurality of classroom images to a scene judgment model, identifying the actions of the students and the teacher in the classroom images, and comparing them with the images pre-stored in the scene judgment model to determine an image comparison result, wherein the image comparison result includes question-answer scene image and exercise scene image;

S23: identifying whether the audio segment within the interval duration contains a keyword segment of the question-answer scene or of the exercise scene;

S24: if the audio segment contains a keyword segment of the question-answer scene, judging the current audio segment to be question-answer scene audio; if the audio segment contains a keyword segment of the exercise scene, judging the current audio segment to be exercise scene audio;

S25: within the interval duration, if the classroom image is judged to be a question-answer scene image or the audio segment is judged to be a question-answer scene audio segment, the current classroom scene is the question-answer scene;

S26: within the interval duration, if the classroom image is judged to be an exercise scene image or the audio segment is judged to be an exercise scene audio segment, the current classroom scene is the exercise scene.

In the present embodiment, the interval duration is generally set to 2 to 5 seconds, and one classroom image frame is acquired every 0.5 seconds.
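To make the sampling arithmetic concrete, the following minimal sketch lists the capture times of the classroom frames within one interval. The function name `frame_timestamps` and the choice to capture the first frame at the start of the interval are assumptions made for illustration.

```python
from typing import List


def frame_timestamps(
    window_start_s: float,
    interval_s: float,
    frame_period_s: float = 0.5,
) -> List[float]:
    """Return the capture times (in seconds) of the frames in one interval.

    Assumes the first frame is taken at the start of the interval and one
    further frame every `frame_period_s` seconds after that.
    """
    frame_count = int(interval_s / frame_period_s)
    return [window_start_s + i * frame_period_s for i in range(frame_count)]
```

Under these assumptions a 2-second interval yields 4 frames and a 5-second interval yields 10 frames.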

The scene judgment model stores image databases that represent the question-answer scene and the exercise scene, and each image database stores a plurality of images representing its scene. For example, the image database of the question-answer scene pre-stores images of teachers at various positions with one or more students standing up or raising their hands, and the image database of the exercise scene stores images of students with their heads lowered doing classroom exercises.

Keyword segments representing the question-answer scene and the exercise scene are also stored in the scene judgment model. For example, the keyword segments of the question-answer scene include, but are not limited to, 'pick/have a classmate to answer the question', 'answer the teacher's question', and 'answer the question'. The keyword segments of the exercise scene include 'start answering', 'start classroom training', and the like.

The classroom scene is judged using a logical OR relation between the two recognition conditions, the classroom image and the keyword segment: the current classroom scene can be determined as soon as either condition identifies a classroom scene.

Specifically, at every preset interval duration, a plurality of classroom image frames extracted from the video data and the audio segment of that interval duration are obtained. The classroom images are sent to the scene judgment model, where their image features are compared with the image databases in the model, and the classroom scene to which the classroom images belong is output. At the same time, the audio segment is converted into text to identify whether it contains a keyword segment representing a classroom scene.

Further, the logical OR relation is applied to whether the classroom image within the interval duration is a question-answer scene image or an exercise scene image, and to whether the audio segment within the interval duration is a question-answer scene audio segment or an exercise scene audio segment.

If the classroom image is judged to be a question-answer scene image or the audio segment is judged to be a question-answer scene audio segment, the current classroom scene is a question-answer scene; if the classroom image is judged to be the exercise scene image or the audio segment is judged to be the exercise scene audio segment, the current classroom scene is the exercise scene.
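The logical OR judgment of steps S23 to S26 can be sketched as follows. This is a simplified illustration, not the scene judgment model itself: the per-frame image labels are assumed to have been produced already, the keyword lists contain only a few of the segments quoted above, and checking the question-answer scene first when both scenes are indicated is an assumption, since the application does not say how such a conflict is resolved.

```python
from typing import Optional, Sequence

# A few keyword segments quoted in the description; a real list would be longer.
QA_KEYWORDS = ("answer the question",)
EXERCISE_KEYWORDS = ("start answering", "start classroom training")


def audio_scene(transcript: str) -> Optional[str]:
    """Classify the transcribed audio segment of one interval by keyword segment."""
    text = transcript.lower()
    if any(keyword in text for keyword in QA_KEYWORDS):
        return "question-answer"
    if any(keyword in text for keyword in EXERCISE_KEYWORDS):
        return "exercise"
    return None


def classroom_scene(
    image_labels: Sequence[Optional[str]], transcript: str
) -> Optional[str]:
    """Logical OR: a hit from the images or from the audio decides the scene."""
    audio = audio_scene(transcript)
    # Assumption: the question-answer scene is checked before the exercise scene.
    for scene in ("question-answer", "exercise"):
        if scene in image_labels or audio == scene:
            return scene
    return None
```

For example, `classroom_scene(["exercise", None], "")` returns `"exercise"` because one frame alone is enough to determine the scene.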

In one embodiment, the incorrectly answered question-answer text is marked as an error text and the incorrectly answered exercise problem data is marked as an error problem. Referring to fig. 4, step S50 includes the following steps:

S51: identifying and extracting the knowledge point keywords and chapter keywords of the error text, and the knowledge point keywords and chapter keywords of the error problem;

S52: associating the knowledge point keywords and chapter keywords belonging to the same error text, and associating the knowledge point keywords and chapter keywords belonging to the same error problem;

S53: sending the associated knowledge point keywords and chapter keywords to a teaching plan material matching model, wherein the teaching plan material matching model stores a plurality of keyword sets associated with different teaching plan materials;

S54: determining the keyword set to which the associated knowledge point keywords and chapter keywords belong, and screening out the teaching plan materials associated with that keyword set to obtain the teaching plan materials corresponding to the error text and the error problem.

In this embodiment, a knowledge point keyword is text in the error text or error problem that indicates a knowledge point of the relevant teaching plan, and a chapter keyword is text in the error text or error problem that indicates the teaching plan chapter to which the question belongs. When asking a question in the question-answer session, the teacher needs to state first the chapter to which the question belongs; for an error problem, the chapter keyword appears in the question stem.

The teaching plan material matching model is a trained model that semantically analyzes text and identifies matches; it is used to match an error text or error problem to the keyword set of the corresponding teaching plan material. Each keyword set stores a plurality of knowledge point keywords and chapter keywords of its associated teaching plan material.

Specifically, the knowledge point keywords and chapter keywords of each error text and of each error problem are identified and extracted. The knowledge point keywords and chapter keywords of the same error text are associated with each other, as are those of the same error problem. Each set of associated knowledge point keywords and chapter keywords forms one keyword group, and the keyword groups generated during the lesson are sent to the teaching plan material matching model one by one.

The teaching plan material matching model determines the keyword set to which each keyword group belongs and then screens out the teaching plan materials associated with that keyword set, yielding the teaching plan materials corresponding to the error text and the error problem.

Further, if an error text and an error problem correspond to the same teaching plan material, that teaching plan material is screened out only once.
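The flow of steps S51 to S54, including the rule that the same teaching plan material is screened out only once, might look like the sketch below. It is illustrative only: the description calls for a trained semantic matching model, whereas this sketch substitutes a plain dictionary lookup, and `KeywordGroup`, `KEYWORD_SETS` and the material names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Set, Tuple


class KeywordGroup(NamedTuple):
    """Associated keywords extracted from one error text or error problem."""
    chapter: str
    knowledge_point: str


# Hypothetical keyword sets: material -> (chapter keywords, knowledge point keywords).
KEYWORD_SETS: Dict[str, Tuple[Set[str], Set[str]]] = {
    "material-A": ({"3.1"}, {"quadratic formula"}),
    "material-B": ({"3.2"}, {"discriminant"}),
}


def match_material(group: KeywordGroup) -> Optional[str]:
    """Exact-match stand-in for the trained teaching plan material matching model."""
    for material, (chapters, points) in KEYWORD_SETS.items():
        if group.chapter in chapters and group.knowledge_point in points:
            return material
    return None


def screen_materials(groups: List[KeywordGroup]) -> List[str]:
    """Send keyword groups one by one; screen out each material only once."""
    selected: List[str] = []
    for group in groups:
        material = match_material(group)
        if material is not None and material not in selected:
            selected.append(material)
    return selected
```

With two keyword groups that both point to `material-A` and one that points to `material-B`, `screen_materials` returns each material a single time, in the order first encountered.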

In one embodiment, the keyword set includes chapter information and knowledge point information, and step S54 includes the steps of:

S541: comparing the chapter keywords with the chapter information in all keyword sets, and screening out the keyword sets to be determined, to which the chapter keywords belong;

S542: comparing the knowledge point keywords associated with the chapter keywords with the knowledge point information in the keyword sets to be determined, determining the keyword set to which the associated knowledge point keywords and chapter keywords belong, and screening out the teaching plan materials associated with that keyword set.

In this embodiment, a chapter keyword is a number denoting a chapter of the teaching material, or a number denoting a section within a chapter, and each keyword set includes one or more chapter keywords.

Specifically, the chapter keywords are compared with the chapter information contained in all keyword sets, and the keyword sets to be determined, to which the chapter keywords belong, are screened out; there may be one such keyword set, or two or more.

Further, the knowledge point keyword associated with the chapter keyword is acquired and compared with the knowledge point information in the screened keyword sets to be determined, so as to determine the unique keyword set to which the knowledge point keyword and the chapter keyword belong, and the teaching plan materials associated with that keyword set are screened out.
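The two-stage comparison of steps S541 and S542 can be illustrated with the sketch below. The storage format of `KEYWORD_SETS` and the function names are assumptions, and exact set membership is used in place of whatever comparison the matching model performs.

```python
from typing import Dict, List, Optional, Set

# Hypothetical keyword sets; the application does not specify their storage format.
KEYWORD_SETS: Dict[str, Dict[str, Set[str]]] = {
    "set-1": {"chapters": {"2", "2.1"}, "points": {"linear equation"}},
    "set-2": {"chapters": {"2", "2.2"}, "points": {"inequality"}},
    "set-3": {"chapters": {"3"}, "points": {"linear equation"}},
}


def sets_to_be_determined(chapter_keyword: str) -> List[str]:
    """S541: keep the keyword sets whose chapter information contains the keyword."""
    return [
        name for name, ks in KEYWORD_SETS.items()
        if chapter_keyword in ks["chapters"]
    ]


def unique_set(chapter_keyword: str, point_keyword: str) -> Optional[str]:
    """S542: among the candidates, find the one set containing the knowledge point."""
    hits = [
        name for name in sets_to_be_determined(chapter_keyword)
        if point_keyword in KEYWORD_SETS[name]["points"]
    ]
    return hits[0] if len(hits) == 1 else None
```

Filtering by chapter first narrows the candidates, so a knowledge point keyword that appears in several chapters (here "linear equation") still resolves to a single keyword set.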

In one embodiment, if only knowledge point keywords are identified and extracted from any error text or error problem, then, referring to fig. 5, step S50 includes the following steps:

S51A: comparing the knowledge point keywords with knowledge point information in the keyword set;

S52A: if it is determined that two or more keyword sets have been screened out, identifying the chapter keywords of the other error texts or error problems in the lesson and using them as auxiliary chapter keywords;

S53A: based on the auxiliary chapter keywords and the knowledge point keywords, determining the keyword set corresponding to the error text or error problem for which only knowledge point keywords were identified.

In this embodiment, 'only knowledge point keywords are identified and extracted' refers to the case where no related chapter keyword can be extracted from the error text or error problem.

Since a knowledge point keyword is a short phrase and is unlikely to be unique, it may belong to two or more keyword sets.

Specifically, when keyword recognition of an error text or error problem yields only knowledge point keywords, the knowledge point keywords are compared with the knowledge point information in the keyword sets. If exactly one keyword set is screened out, the teaching plan materials associated with that keyword set are screened out as the teaching plan materials of the error text or error problem.

If two or more keyword sets are screened out, the chapter keywords of the other error texts or error problems in the same lesson are identified and used as auxiliary chapter keywords for the error text or error problem for which only knowledge point keywords were identified; that is, the keyword set corresponding to the auxiliary chapter keywords is checked for whether it contains that knowledge point keyword.

Alternatively, among the several keyword sets to which the knowledge point keyword belongs, the one whose chapter information is close to or consistent with the auxiliary chapter keywords is identified, thereby determining the keyword set corresponding to the error text or error problem for which only knowledge point keywords were identified.
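A minimal sketch of the auxiliary identification in steps S51A to S53A is given below. The data layout and the name `resolve_without_chapter` are hypothetical, and only the "consistent with" case is implemented; how "close to" is measured between chapters is not specified in the description and is therefore left out.

```python
from typing import Dict, Iterable, Optional, Set

# Hypothetical keyword sets; the application does not specify their storage format.
KEYWORD_SETS: Dict[str, Dict[str, Set[str]]] = {
    "set-1": {"chapters": {"2", "2.1"}, "points": {"linear equation"}},
    "set-2": {"chapters": {"2", "2.2"}, "points": {"inequality"}},
    "set-3": {"chapters": {"3"}, "points": {"linear equation"}},
}


def resolve_without_chapter(
    point_keyword: str, other_chapter_keywords: Iterable[str]
) -> Optional[str]:
    """Resolve a keyword set when only a knowledge point keyword was extracted."""
    # S51A: compare the knowledge point keyword with every keyword set.
    candidates = [
        name for name, ks in KEYWORD_SETS.items() if point_keyword in ks["points"]
    ]
    if len(candidates) == 1:
        return candidates[0]
    # S52A: borrow chapter keywords from the lesson's other errors.
    auxiliary = set(other_chapter_keywords)
    # S53A: keep candidates whose chapter information matches an auxiliary keyword.
    narrowed = [
        name for name in candidates if KEYWORD_SETS[name]["chapters"] & auxiliary
    ]
    return narrowed[0] if len(narrowed) == 1 else None
```

The function returns `None` when the ambiguity cannot be resolved, for instance when no other error in the lesson supplies a chapter keyword.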

In one embodiment, referring to fig. 6, after step S60, the following steps are further performed:

S61: when a lesson instruction is received, classifying and counting the error texts and error problems of the lesson that belong to the same keyword set;

S62: if the combined number of error texts and error problems in the same keyword set is greater than a preset value, screening out the corresponding teaching plan materials and the exercise problem data associated with those teaching plan materials;

S63: if only error problems appear in the same keyword set, screening out only the error problems as the teaching plan material.

In this embodiment, the lesson instruction is also issued by the teacher in the classroom through multimedia or voice recognition. The preset value is used to gauge how many students have not mastered a knowledge point: when that number is large, the exercise problems associated with the teaching plan materials are selected together with the corresponding teaching plan materials during screening, to consolidate the students' learning of the knowledge point.

Specifically, when receiving a class-giving instruction from a classroom,

if a keyword set contains only error problems and no error text, only the error problems are screened out as teaching plan materials.
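The counting in steps S61 to S63 can be sketched as follows. The function name and the result labels are hypothetical. Two points are assumptions not stated in the description: the "error problems only" rule of S63 is checked before the threshold rule of S62, and a keyword set below the threshold falls back to the teaching plan material alone.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def plan_per_keyword_set(
    errors: Iterable[Tuple[str, str]], preset_value: int
) -> Dict[str, str]:
    """Decide what to screen out per keyword set.

    `errors` holds (keyword_set, kind) pairs, where kind is "text" for an
    error text and "problem" for an error problem.
    """
    text_counts: Counter = Counter()
    problem_counts: Counter = Counter()
    for keyword_set, kind in errors:  # S61: classify and count
        if kind == "text":
            text_counts[keyword_set] += 1
        else:
            problem_counts[keyword_set] += 1

    plan: Dict[str, str] = {}
    for keyword_set in sorted(set(text_counts) | set(problem_counts)):
        total = text_counts[keyword_set] + problem_counts[keyword_set]
        if text_counts[keyword_set] == 0:
            plan[keyword_set] = "error problems only"  # S63
        elif total > preset_value:
            plan[keyword_set] = "material and associated exercises"  # S62
        else:
            plan[keyword_set] = "material only"  # assumed fallback
    return plan
```

With a preset value of 2, a keyword set holding one error text and two error problems exceeds the threshold and receives both the material and its associated exercises.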

It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present application in any way.

In an embodiment, a teaching plan generating system based on intelligent analysis of classroom audio and video is provided, which corresponds to the teaching plan generating method based on intelligent analysis of classroom audio and video in the above embodiments. The teaching plan generating system based on intelligent analysis of classroom audio and video comprises:

Optionally, the scene judging module includes:

Optionally, the answer text for answering the error is marked as an error text, the practice problem data for answering the error is marked as an error problem, and the teaching plan material screening module comprises:

the keyword recognition sub-module is used for recognizing and extracting knowledge point keywords and chapter keywords of the error text, knowledge point keywords and chapter keywords of the error exercises;

the keyword association sub-module is used for associating the knowledge point keywords and the chapter keywords belonging to the same error text, and associating the knowledge point keywords and the chapter keywords belonging to the same error problem;

the keyword set judging sub-module is used for sending the associated knowledge point keywords and chapter keywords to a teaching plan material matching model, wherein the teaching plan material matching model stores a plurality of keyword sets associated with different teaching plan materials;

and the teaching plan material screening sub-module is used for judging a keyword set to which the associated knowledge point keywords and chapter keywords belong, screening the teaching plan materials associated with the keyword set, and obtaining the teaching plan materials corresponding to the error text and the error problem.

Optionally, the keyword set includes chapter information and knowledge point information, and the teaching plan material screening submodule includes:

The chapter keyword comparison unit is used for comparing chapter keywords with chapter information in all keyword sets, judging and screening out keyword sets to be judged to which the chapter keywords belong;

and the knowledge point keyword comparison unit is used for comparing the knowledge point keywords related to the chapter keywords with knowledge point information in the keyword set to be judged, judging the keyword set to which the knowledge point keywords and the chapter keywords which are related to each other belong, and screening the teaching plan materials related to the keyword set.

Optionally, if only knowledge point keywords are identified and extracted from any error text or error problem, the teaching plan material screening module includes:

the knowledge point keyword comparison sub-module is used for comparing the knowledge point keywords with knowledge point information in the keyword set;

the chapter auxiliary comparison sub-module is used for identifying chapter keywords of other error texts or error problems on the lessons if judging that the screened keyword sets comprise two or more than two, and taking the chapter keywords as chapter keywords for auxiliary identification;

and the auxiliary judging sub-module is used for judging the keyword set corresponding to the error text or error problem of which only the knowledge point keywords are identified based on the chapter keywords and the knowledge point keywords which are identified in an auxiliary mode.

Optionally, the system further comprises:

the statistics module is used for classifying and counting, when a lesson instruction is received, the error texts and error problems of the lesson that belong to the same keyword set;

the error quantity comparison module is used for screening out the corresponding teaching plan materials and the exercise problem data associated with those teaching plan materials if the combined number of error texts and error problems in the same keyword set is greater than a preset value;

and the error problem screening module is used for screening out only the error problems as the teaching plan material if only error problems appear in the same keyword set.

For the specific limitations of the teaching plan generating system based on intelligent analysis of classroom audio and video, reference may be made to the limitations of the teaching plan generating method based on intelligent analysis of classroom audio and video above, which are not repeated here. The modules in the system may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, a processor in the computer device in hardware form, or may be stored in a memory of the computer device in software form, so that the processor can call them and execute the operations corresponding to each module.

In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database.

The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing video data, audio data, teaching plan materials, an audio recognition model, a scene judgment model and a material matching model. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to realize a teaching plan generation method based on intelligent analysis of classroom audios and videos.

In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements a teaching plan generating method based on classroom audio/video intelligent analysis when executing the computer program.

In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements a teaching plan generation method based on classroom audio/video intelligent analysis.

Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium; when executed, the program may carry out the steps of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.

It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.

The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims

1. A teaching plan generation method based on intelligent analysis of classroom audio and video, characterized in that the method comprises the following steps:

transmitting the teaching plan materials to a user side of the teaching teacher of the lesson section;

the step of screening out the teaching plan materials corresponding to the wrong answer text and wrong answer practice problem data comprises the steps of:

judging a keyword set to which the related knowledge point keywords and chapter keywords belong, screening out teaching plan materials related to the keyword set, and obtaining teaching plan materials corresponding to the error text and the error exercises respectively;

the keyword set comprises chapter information and knowledge point information; the chapter keyword refers to text in the error text or error problem that represents the teaching plan chapter to which the question belongs, wherein, when asking a question in the question-answer session, the teacher needs to state first the chapter to which the question belongs, and the chapter keyword of the error problem appears in the question stem; the step of judging the keyword set to which the associated knowledge point keywords and chapter keywords belong, screening out the teaching plan materials associated with the keyword set, and obtaining the teaching plan materials corresponding to the error text and the error problem comprises the following steps:

comparing the knowledge point keywords associated with the chapter keywords with knowledge point information in the keyword set to be judged, judging a keyword set to which the knowledge point keywords and the chapter keywords which are associated with each other belong, and screening teaching plan materials associated with the keyword set;

if any wrong text or wrong problem is identified, only knowledge point keywords are identified and extracted, the step of screening the teaching plan materials corresponding to the wrong question-answering text and wrong-answering practice problem data comprises the following steps:

2. The teaching plan generating method based on intelligent analysis of classroom audio and video according to claim 1, wherein the method comprises the following steps: the step of identifying the current class scene based on the video data and the audio data, wherein the class scene comprises a question-answer scene and a training scene comprises the following steps:

3. The teaching plan generating method based on intelligent analysis of classroom audio and video according to claim 1, wherein the method comprises the following steps: after the step of sending the teaching plan material to the user side of the teaching teacher of the lesson, the following steps are executed:

4. Teaching plan generation system based on classroom audio and video intelligent analysis, which is characterized by comprising:

the teaching plan sending module is used for sending teaching plan materials to the user side of the teaching teacher of the lesson section;

the answer text of the answer error is marked as an error text, the exercise problem data of the answer error is marked as an error exercise problem, and the teaching plan material screening module comprises:

The teaching plan material screening submodule is used for judging a keyword set to which the related knowledge point keywords and chapter keywords belong, screening the teaching plan materials related to the keyword set, and obtaining teaching plan materials corresponding to the error text and the error exercises;

the keyword set comprises chapter information and knowledge point information; the chapter keyword refers to text in the error text or error problem that represents the teaching plan chapter to which the question belongs, wherein, when asking a question in the question-answer session, the teacher needs to state first the chapter to which the question belongs, and the chapter keyword of the error problem appears in the question stem; the teaching plan material screening submodule comprises:

the knowledge point keyword comparison unit is used for comparing the knowledge point keywords related to the chapter keywords with knowledge point information in the keyword set to be judged, judging a keyword set to which the knowledge point keywords and the chapter keywords which are related to each other belong, and screening out teaching plan materials related to the keyword set;

If any error text or error problem is identified, only knowledge point keywords are identified and extracted, and the teaching plan material screening module comprises:

5. The teaching plan generating system based on intelligent analysis of classroom audio and video according to claim 4, wherein the scene judging module comprises:

6. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, carries out the steps of a teaching plan generation method based on intelligent analysis of classroom audio and video according to any of claims 1-4.

7. A computer-readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of a teaching plan generating method based on intelligent analysis of classroom audio and video according to any of claims 1-4.