CN110879987A - Method for identifying answer content of test questions


Info

Publication number
CN110879987A
CN110879987A
Authority
CN
China
Prior art keywords
answering
picture
question
test question
photographed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911149719.0A
Other languages
Chinese (zh)
Other versions
CN110879987B (en)
Inventor
王红接
刘林
刘恒鲁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHENGDU DONGFANG WENDAO SCIENCE AND TECHNOLOGY DEVELOPMENT Co Ltd
Original Assignee
CHENGDU DONGFANG WENDAO SCIENCE AND TECHNOLOGY DEVELOPMENT Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CHENGDU DONGFANG WENDAO SCIENCE AND TECHNOLOGY DEVELOPMENT Co Ltd filed Critical CHENGDU DONGFANG WENDAO SCIENCE AND TECHNOLOGY DEVELOPMENT Co Ltd
Priority to CN201911149719.0A priority Critical patent/CN110879987B/en
Publication of CN110879987A publication Critical patent/CN110879987A/en
Application granted granted Critical
Publication of CN110879987B publication Critical patent/CN110879987B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to the technical field of intelligent identification and discloses a method for identifying the answer content of test questions. The invention provides a new method for identifying the answering area identification position and the answering content of test questions without manual intervention, comprising a test question preparation stage and a test question answering and photographing stage. The preparation stage first determines a plurality of anchor points of the test question master and the positions of the labelled answering areas. In the answering and photographing stage, the accurate positions of the anchor points in the photographed picture are determined by image feature matching, a perspective transformation matrix is then obtained from the positions of the anchor points in the scanned picture and in the photographed picture, and the answering area identification position and the answering content are finally obtained from the labelled answering area positions and the perspective transformation matrix. A user can thus obtain the answering content directly after answering and photographing, without manual intervention. In addition, the method is suitable for test questions of all subjects and all grades, which facilitates practical application and popularization.

Description

Method for identifying answer content of test questions
Technical Field
The invention belongs to the technical field of intelligent identification, and particularly relates to a method for identifying answer contents of test questions.
Background
In recent years, with the continuous upgrading and development of information technology, mobile terminal devices have become increasingly widespread, and more convenient, faster and more efficient ways of working and learning have become popular; the traditional education field is gradually exploring a new generation of education informatization. At the current stage of basic education in China, the main means of assessing students' learning is still examinations of various kinds, ranging from large-scale examinations such as the college and senior high school entrance examinations down to classroom-level assessments such as teachers' daily homework and unit tests, together with end-of-term examinations, interviews, joint examinations, modular examinations and the like held at various times. Under such conditions, teachers bear a heavy workload in correcting homework and grading test papers. Therefore, various auxiliary examination methods are gradually being used in examination scenes; for example, answering content is collected by photographing answered pages, for purposes such as remote guidance, remote marking or automatic marking.
Current photograph-based answering methods fall mainly into the following two categories.
(1) The photographed page has a corresponding electronic page: the service provider prepares an electronic page in advance and enters information such as the area of each question, the answering area, and the standard answer content on the page; the user first specifies the page in the APP and then takes a photograph, so that the APP knows which electronic page corresponds to the current photograph; after the photograph is taken, the APP locates the page in the photograph by edge detection and then applies a perspective transformation to rectify the page into a standard rectangle; question numbers are searched for in the transformed picture, and the positions of the questions and the answering areas are determined by combining them with the previously entered question information; each answering area is then cut out and recognized, converting it into electronic answering content. However, this method has two disadvantages: first, it can only process mathematics questions, so its subject coverage is narrow; second, the edge detection is not accurate enough, and the user must adjust it manually to supply the correct edge information.
(2) The photographed page has no corresponding electronic page: the service provider enters the question text; the user takes a photograph in the APP, which then locates horizontal and vertical arithmetic expressions in the picture and performs simple formula recognition; and/or the APP recognizes the answering text by OCR, retrieves the corresponding question from the pages in the background, and judges the answer. The drawback of this method is likewise that it can only handle primary-school mathematics questions, so its subject coverage is narrow.
Disclosure of Invention
The invention aims to solve the problems in the current collection of test question answering content that edge detection is not accurate enough and manual intervention is required, and provides a new method for identifying the answer content of test questions.
The technical scheme adopted by the invention is as follows:
a method for identifying the answer content of test questions comprises a test question preparation stage and a test question answering and photographing stage;
the test question preparation stage comprises the following steps S101 to S103:
s101, obtaining a scanned picture of a test question page, and dividing the scanned picture into n × n sub-scanned pictures, wherein n is a natural number not less than 3;
s102, for each outermost sub-scanned picture, scanning with a square sliding window for the area with the most image feature points, taking that area as the anchor point of the corresponding sub-scanned picture, and finally obtaining a test question master containing all the anchor points, wherein the side length of the square sliding window is smaller than the width of the corresponding sub-scanned picture;
s103, acquiring question information manually labelled on the scanned picture, wherein the question information comprises an answering area position;
the test question answering and photographing stage comprises the following steps S201 to S207:
s201, acquiring a photographed picture of an answered test question page, and acquiring the test question master and the question information corresponding to the answered test question page;
s202, selecting the m anchor points in the test question master that are closest to the edge lines of the scanned picture, estimating the preliminary positions of the m selected anchor points in the photographed picture, and finally cutting out m sub-photographed pictures, each containing a different selected anchor point, according to the preliminary positions, wherein m is a natural number not less than 4 and not more than 4(n-1);
s203, sending the m sub-photographed pictures and the test question master to a matching server, and sending the photographed picture to a handwriting recognition server;
s204, receiving a matching result from the matching server, wherein the matching result is the accurate position of each selected anchor point in the corresponding sub-photographed picture;
s205, determining the accurate positions of all the selected anchor points in the photographed picture according to their accurate positions in the corresponding sub-photographed pictures, then selecting from the m selected anchor points the 4 anchor points closest to the edge lines of the photographed picture, and finally calculating the perspective transformation matrix of the photographed picture according to the accurate positions of these 4 selected anchor points in the photographed picture and their positions in the scanned picture;
s206, obtaining the answering area identification position in the photographed picture according to the perspective transformation matrix and the answering area position in the question information;
and S207, obtaining the answering content corresponding to the answering area identification position according to the answering area identification position and the handwriting recognition result from the handwriting recognition server.
Preferably, when the question information further includes a question stem position, the following step is further included before step S207:
recognizing the question stem content in the handwriting recognition result by character comparison, then determining the corresponding coordinate position of the question stem content in the photographed picture, and finally correcting the answering area identification position according to the mapping relation between the labelled question stem position and that coordinate position, so as to obtain a more accurate answering area identification position.
Preferably, the question information further comprises standard answer content.
Preferably, in step S102, the image feature value of each anchor point is also calculated, and the test question master contains the image feature value of each anchor point.
Preferably, the method further comprises a server matching processing stage: performing image feature matching between each sub-photographed picture and the corresponding selected anchor point in the test question master to obtain the accurate position of each selected anchor point in the corresponding sub-photographed picture.
Specifically, an OpenCV-based SURF algorithm, SIFT algorithm, ORB algorithm or FAST algorithm is adopted for the image feature matching.
Preferably, the method further comprises a server recognition processing stage: recognizing each character on the photographed picture by a handwriting recognition model trained in advance through deep learning, and acquiring a handwriting recognition result containing the question stem content and the answering content.
Specifically, a YOLO target detection network model is adopted for the deep learning to obtain the handwriting recognition model.
Specifically, when n is 3, m is 6.
Specifically, the side length of the square sliding window is 1/30 to 1/10 of the width of the scanned picture or of the sub-scanned picture.
The invention has the beneficial effects that:
(1) the invention provides a new method for identifying the answering area identification position and the answering content of test questions without manual intervention. The method comprises a test question preparation stage and a test question answering and photographing stage: the preparation stage first determines a plurality of anchor points of the test question master and the labelled answering area positions; the answering and photographing stage then determines the accurate positions of the anchor points in the photographed picture by image feature matching, obtains the perspective transformation matrix from the anchor point positions in the scanned picture and in the photographed picture, and finally obtains the answering area identification position and the answering content from the labelled answering area positions and the perspective transformation matrix. A user can therefore obtain the answering content directly after answering and photographing, without any manual intervention, which greatly improves the user experience;
(2) because the characters on the photographed picture are recognized by a handwriting recognition model trained in advance through deep learning, various characters, symbols, mathematical formulas and the like can all be recognized, and the model generalizes well, so the method is suitable for test questions of all grades and all subjects and is convenient for practical application and popularization.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for identifying answer contents of test questions provided by the present invention.
Detailed Description
The invention is further described below with reference to the figures and specific embodiments. It should be noted that the description of the embodiments is provided to help understanding of the present invention, but the invention is not limited thereto. Specific structural and functional details disclosed herein are merely illustrative of example embodiments of the invention; the invention may, however, be embodied in many alternative forms and should not be construed as limited to the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element could be termed a second element, and, similarly, a second element could be termed a first element, without departing from the scope of example embodiments of the present invention.
It should be understood that the term "and/or", where it appears herein, merely describes an association between objects and means that three relationships may exist; e.g., A and/or B may mean: A exists alone, B exists alone, or both A and B exist. The term "/and", where it appears herein, describes another association and means that two relationships may exist; e.g., A/and B may mean: A exists alone, or both A and B exist. In addition, the character "/" herein generally means that the associated objects before and after it are in an "or" relationship.
It will be understood that when an element is referred to herein as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present; conversely, if an element is referred to as being "directly connected" or "directly coupled" to another element, no intervening elements are present. Other words used to describe relationships between elements should be interpreted in a similar manner (e.g., "between" versus "directly between", "adjacent" versus "directly adjacent", etc.).
It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit example embodiments of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises", "comprising", "includes" and/or "including", when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be noted that, in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
It should be understood that specific details are provided in the following description to facilitate a thorough understanding of example embodiments. However, it will be understood by those of ordinary skill in the art that the example embodiments may be practiced without these specific details. For example, systems may be shown in block diagrams in order not to obscure the examples in unnecessary detail. In other instances, well-known processes, structures and techniques may be shown without unnecessary detail in order to avoid obscuring example embodiments.
Example one
As shown in FIG. 1, the method for identifying the answer content of test questions provided by this embodiment comprises a test question preparation stage and a test question answering and photographing stage.
The test question preparation stage is performed on the test question library side, and may include, but is not limited to, the following steps S101 to S103.
S101, obtaining a scanned picture of a test question page, and dividing the scanned picture into n × n sub-scanned pictures, wherein n is a natural number not less than 3.
In step S101, the scanned picture is imported by the service provider. Dividing a picture equally is conventional processing, for example dividing the scanned picture equally into 9 sub-scanned pictures.
S102, for each outermost sub-scanned picture, scanning with a square sliding window for the area with the most image feature points, taking that area as the anchor point of the corresponding sub-scanned picture, and finally obtaining a test question master containing all the anchor points, wherein the side length of the square sliding window is smaller than the width of the corresponding sub-scanned picture.
In step S102, if the scanned picture is divided into 9 sub-scanned pictures, there are 8 outermost sub-scanned pictures (i.e., all but the central one), and scanning for the number of feature points with a square sliding window is a conventional technique; the side length of the square sliding window is preferably 1/30 to 1/10 of the width of the scanned picture or of the sub-scanned picture, e.g., 1/20. In addition, in order to reduce the data volume of the test question master and thus save storage space and transmission traffic, the image feature value of each anchor point is also calculated in step S102, and the test question master contains these image feature values; the image feature values are calculated in an existing conventional manner. An illustrative sketch of the window scan is given below.
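The patent gives no code for the window scan, so the following is a minimal sketch in Python with OpenCV. The feature detector (ORB), the window stride, and all names are illustrative assumptions; the text only requires counting image feature points inside a square sliding window and keeping the window with the most.

```python
import cv2
import numpy as np

def find_anchor(sub_img, side, stride=8):
    """Slide a square window of the given side length over one outermost
    sub-scanned picture and return (x, y, side) of the window containing
    the most image feature points."""
    gray = sub_img if sub_img.ndim == 2 else cv2.cvtColor(sub_img, cv2.COLOR_BGR2GRAY)
    detector = cv2.ORB_create(nfeatures=2000)      # assumed feature detector
    keypoints = detector.detect(gray, None)
    if not keypoints:
        return 0, 0, side                          # degenerate: no features found
    pts = np.array([kp.pt for kp in keypoints])    # (N, 2) keypoint coordinates
    h, w = gray.shape
    best_xy, best_count = (0, 0), -1
    for y in range(0, h - side + 1, stride):
        for x in range(0, w - side + 1, stride):
            # count keypoints falling inside the current window
            inside = ((pts[:, 0] >= x) & (pts[:, 0] < x + side) &
                      (pts[:, 1] >= y) & (pts[:, 1] < y + side))
            count = int(inside.sum())
            if count > best_count:
                best_xy, best_count = (x, y), count
    return best_xy[0], best_xy[1], side

# Example: window side set to 1/20 of the page width, inside the 1/30-1/10
# range the text prefers (page_width and sub_scan are assumed inputs).
# x, y, s = find_anchor(sub_scan, side=page_width // 20)
```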
S103, acquiring question information manually labelled on the scanned picture, wherein the question information comprises an answering area position.
In step S103, the question information is labelled manually by the service provider on a human-computer interaction interface. Specifically, the question information may further include a question stem position, standard answer content, and the like.
The test question answering and photographing stage is performed on the user side, and may include, but is not limited to, the following steps S201 to S207.
S201, acquiring a photographed picture of an answered test question page, and acquiring the test question master and the question information corresponding to the answered test question page.
In step S201, the photographed picture is obtained by the user photographing the answered test question page, for example with a mobile phone APP. In addition, the test question master and the question information can be obtained by accessing the test question library and querying by the test question page number.
S202, selecting the m anchor points in the test question master that are closest to the edge lines of the scanned picture, estimating the preliminary positions of the m selected anchor points in the photographed picture, and finally cutting out m sub-photographed pictures, each containing a different selected anchor point, according to the preliminary positions, wherein m is a natural number not less than 4 and not more than 4(n-1).
In step S202, the preliminary position of a single selected anchor point may be estimated by, but is not limited to, taking the position of the selected anchor point in the scanned picture as its preliminary position in the photographed picture. Once the preliminary positions are determined, sub-photographed pictures corresponding one-to-one to the selected anchor points can be cut out in a conventional manner; the sub-photographed pictures may have the same or different sizes. As a specific example, when n is 3, m is 6, i.e., the 6 anchor points closest to the edge lines of the scanned picture are selected from the test question master. An illustrative cropping sketch is given below.
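Below is a hedged sketch of the step S202 estimate-and-crop logic. Using the scanned-picture position directly as the estimate is the option the text names; the padding factor and all names are assumptions.

```python
def crop_sub_photos(photo, anchors, pad=2.0):
    """photo: photographed page as a NumPy image array (H x W x C).
    anchors: list of (x, y, side) anchor windows in scanned-picture
    coordinates, assumed rescaled to the photo's resolution.
    Returns one (origin, patch) pair per selected anchor."""
    h, w = photo.shape[:2]
    patches = []
    for (x, y, side) in anchors:
        half = int(side * pad)                  # enlarge to tolerate misalignment
        cx, cy = x + side // 2, y + side // 2   # window centre as the estimate
        x0, y0 = max(cx - half, 0), max(cy - half, 0)
        x1, y1 = min(cx + half, w), min(cy + half, h)
        patches.append(((x0, y0), photo[y0:y1, x0:x1]))
    return patches
```

Keeping each patch's origin (x0, y0) lets the precise anchor position later found inside a patch be mapped back into full-photo coordinates.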
S203, sending the obtained m sub-photographed pictures and the test question master to the matching server, and sending the photographed picture to the handwriting recognition server.
In step S203, the matching server performs image feature matching between each sub-photographed picture and the corresponding selected anchor point (i.e., the anchor point region or its calculated image feature value), so as to obtain the accurate position of each selected anchor point in the corresponding sub-photographed picture. That is, the method for identifying the answer content of test questions further comprises the following server matching processing stage: performing image feature matching between each sub-photographed picture and the corresponding selected anchor point in the test question master to obtain the accurate position of each selected anchor point in the corresponding sub-photographed picture. Specifically, the image feature matching may be performed by, but is not limited to, existing OpenCV-based algorithms such as SURF (Speeded-Up Robust Features, a well-known scale-invariant feature detection method), SIFT (Scale-Invariant Feature Transform, another well-known scale-invariant feature detection method), ORB (Oriented FAST and Rotated BRIEF, an improved BRIEF-based descriptor roughly 100 times faster than SIFT and 10 times faster than SURF), or FAST (Features from Accelerated Segment Test, which detects interest points quickly and can judge whether a point is a key point by comparing only a few pixels). An illustrative matching sketch is given below.
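A minimal sketch of the server-side matching, using ORB (one of the options the text lists) with OpenCV's brute-force Hamming matcher and a RANSAC homography. The thresholds and function names are assumptions, not part of the patent.

```python
import cv2
import numpy as np

def _gray(img):
    """Convert to grayscale if needed (the ORB detector expects one channel)."""
    return img if img.ndim == 2 else cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def locate_anchor(anchor_patch, sub_photo, min_matches=10):
    """Return the anchor's accurate (x, y) position inside sub_photo,
    or None if there is not enough matching evidence."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(_gray(anchor_patch), None)
    kp2, des2 = orb.detectAndCompute(_gray(sub_photo), None)
    if des1 is None or des2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    if len(matches) < min_matches:
        return None
    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    if H is None:
        return None
    h, w = anchor_patch.shape[:2]
    # map the anchor patch centre into the sub-photo: the "accurate position"
    centre = cv2.perspectiveTransform(np.float32([[[w / 2, h / 2]]]), H)
    return float(centre[0, 0, 0]), float(centre[0, 0, 1])
```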
In step S203, the handwriting recognition server performs handwriting content recognition on the photographed picture and acquires information such as the answering content and the question stem content. That is, the method for identifying the answer content of test questions further comprises the following server recognition processing stage: recognizing each character on the photographed picture by a handwriting recognition model trained in advance through deep learning, and acquiring a handwriting recognition result containing the question stem content and the answering content. Specifically, a YOLO target detection network model is preferably used for the deep learning to obtain the handwriting recognition model. Since YOLO is an end-to-end target detection algorithm, it does not need region proposals to be extracted in advance and directly outputs, through the network, the class, the confidence and the coordinate position of each detection; it therefore detects quickly, which helps in rapidly detecting and learning a large number of characters. In addition, to speed up picture transmission, the photographed picture may be compressed before being sent to the handwriting recognition server. An illustrative inference sketch is given below.
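The following is a hedged sketch of what the recognition server's inference step could look like, assuming a YOLO detector trained to localize and classify handwritten characters and run through OpenCV's DNN module. The cfg/weights file names, input size, and thresholds are placeholders, and mapping class IDs back to characters is omitted.

```python
import cv2
import numpy as np

# placeholder model files for the assumed character-level YOLO detector
net = cv2.dnn.readNetFromDarknet("handwriting_yolo.cfg", "handwriting_yolo.weights")

def detect_characters(photo, conf_thresh=0.5, nms_thresh=0.4):
    """Return (class_id, confidence, box) per detected character, i.e. the
    class, confidence and coordinate position the text attributes to YOLO."""
    h, w = photo.shape[:2]
    blob = cv2.dnn.blobFromImage(photo, 1 / 255.0, (608, 608),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    outputs = net.forward(net.getUnconnectedOutLayersNames())
    boxes, confidences, class_ids = [], [], []
    for out in outputs:
        for det in out:                  # det = [cx, cy, bw, bh, obj, classes...]
            scores = det[5:]
            cls = int(np.argmax(scores))
            conf = float(det[4] * scores[cls])
            if conf < conf_thresh:
                continue
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            confidences.append(conf)
            class_ids.append(cls)
    # non-maximum suppression keeps one box per character
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_thresh, nms_thresh)
    idxs = np.array(keep).reshape(-1).astype(int) if len(keep) > 0 else []
    return [(class_ids[i], confidences[i], boxes[i]) for i in idxs]
```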
And S204, receiving a matching result from the matching server, wherein the matching result is the accurate position of each selected anchor point in the corresponding sub-photographed picture.
S205, determining the accurate positions of the selected anchor points in the photographed picture according to their accurate positions in the corresponding sub-photographed pictures, then selecting from the m selected anchor points the 4 anchor points closest to the edge lines of the photographed picture, and finally calculating the perspective transformation matrix of the photographed picture according to the accurate positions of these 4 selected anchor points in the photographed picture and their positions in the scanned picture.
In step S205, the perspective transformation matrix is calculated in the conventional manner.
S206, obtaining the answering area identification position in the photographed picture according to the perspective transformation matrix and the answering area position in the question information.
In step S206, specifically, the answering area position in the question information is mapped through the perspective transformation matrix, so as to obtain the answering area identification position in the photographed picture. An illustrative mapping sketch is given below.
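A minimal sketch of steps S205-S206 with OpenCV: compute the perspective transform from the 4 anchor correspondences and map the labelled answering-area corners from scanned-picture coordinates into the photographed picture. Variable names are illustrative.

```python
import cv2
import numpy as np

def map_answer_region(scan_pts, photo_pts, region_corners):
    """scan_pts / photo_pts: the 4 selected anchors' positions (4 x 2) in the
    scanned and photographed pictures; region_corners: the labelled answering
    area's corners (K x 2) on the scanned picture."""
    M = cv2.getPerspectiveTransform(np.float32(scan_pts), np.float32(photo_pts))
    src = np.float32(region_corners).reshape(-1, 1, 2)
    dst = cv2.perspectiveTransform(src, M)   # corners in photo coordinates
    return dst.reshape(-1, 2)

# Example: a rectangular answering area labelled as (x0, y0)-(x1, y1):
# corners = [(x0, y0), (x1, y0), (x1, y1), (x0, y1)]
# photo_corners = map_answer_region(scan_anchors, photo_anchors, corners)
```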
S207, obtaining the answering content corresponding to the answering area identification position according to the answering area identification position and the handwriting recognition result from the handwriting recognition server.
In step S207, since the question stem content and the answering content in the handwriting recognition result each have a corresponding coordinate position in the photographed picture, the answering content corresponding to the answering area identification position can be found by position matching. Preferably, the question stem content is recognized directly by character comparison, so that its corresponding coordinate position in the photographed picture can be determined in advance. Therefore, in order to locate the answering area identification position more precisely, when the question information further includes a question stem position, the following step is further included before step S207: recognizing the question stem content in the handwriting recognition result by character comparison, then determining its corresponding coordinate position in the photographed picture, and finally correcting the answering area identification position according to the mapping relation between the labelled question stem position and that coordinate position, so as to obtain a more accurate answering area identification position. An illustrative correction sketch is given below.
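Below is a hedged sketch of this optional correction. The patent only specifies "a mapping relation" between the labelled stem position and the recognized stem coordinates; the pure-translation model used here, like all the names, is an assumption.

```python
import numpy as np

def correct_region(region_pts, stem_labelled_pts, stem_recognized_pts):
    """All arguments are point arrays (N x 2) already expressed in
    photographed-picture coordinates. Shift the answering area by the offset
    between the recognized stem and the transformed labelled stem."""
    offset = (np.mean(np.asarray(stem_recognized_pts, dtype=float), axis=0) -
              np.mean(np.asarray(stem_labelled_pts, dtype=float), axis=0))
    return np.asarray(region_pts, dtype=float) + offset  # corrected region
```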
The application process of the foregoing steps S101 to S207 is specifically, but not limited to, the following: the service provider first scans the test question pages and then uses steps S101 to S103 to produce an electronic book comprising a plurality of test question pages; after a teacher assigns homework through the APP and a student answers on the paper test question page, the student photographs the page with the APP, the answering content is then obtained directly through steps S201 to S207, and finally the teacher can continue to use the APP to review the answering content, thereby achieving remote marking and similar purposes.
In summary, the method for identifying the answer content of the test questions provided by the embodiment has the following technical effects:
(1) this embodiment provides a new method for identifying the answering area identification position and the answering content of test questions without manual intervention. The method comprises a test question preparation stage and a test question answering and photographing stage: the preparation stage first determines a plurality of anchor points of the test question master and the labelled answering area positions; the answering and photographing stage then determines the accurate positions of the anchor points in the photographed picture by image feature matching, obtains the perspective transformation matrix from the anchor point positions in the scanned picture and in the photographed picture, and finally obtains the answering area identification position and the answering content from the labelled answering area positions and the perspective transformation matrix. A user can therefore obtain the answering content directly after answering and photographing, without any manual intervention, which greatly improves the user experience;
(2) because the characters on the photographed picture are recognized by a handwriting recognition model trained in advance through deep learning, various characters, symbols, mathematical formulas and the like can all be recognized, and the model generalizes well, so the method is suitable for test questions of all grades and all subjects and is convenient for practical application and popularization.
The units described above as separate components may or may not be physically separate, and a component shown as a unit may or may not be a physical unit; it may be located in one place or distributed over a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement this without inventive effort.
The above examples are only intended to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the embodiments may still be modified, or some of their technical features may be replaced by equivalents, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Finally, it should be noted that the present invention is not limited to the above alternative embodiments, and various other forms of products can be derived from it by anyone. The above detailed description should not be construed as limiting the scope of the invention, which is defined by the claims, and the description may be used to interpret the claims.

Claims (10)

1. A method for identifying the answer content of test questions, characterized by comprising a test question preparation stage and a test question answering and photographing stage;
the test question preparation stage comprises the following steps S101 to S103:
s101, obtaining a scanned picture of a test question page, and dividing the scanned picture into n × n sub-scanned pictures, wherein n is a natural number not less than 3;
s102, for each outermost sub-scanned picture, scanning with a square sliding window for the area with the most image feature points, taking that area as the anchor point of the corresponding sub-scanned picture, and finally obtaining a test question master containing all the anchor points, wherein the side length of the square sliding window is smaller than the width of the corresponding sub-scanned picture;
s103, acquiring question information manually labelled on the scanned picture, wherein the question information comprises an answering area position;
the test question answering and photographing stage comprises the following steps S201 to S207:
s201, acquiring a photographed picture of an answered test question page, and acquiring the test question master and the question information corresponding to the answered test question page;
s202, selecting the m anchor points in the test question master that are closest to the edge lines of the scanned picture, estimating the preliminary positions of the m selected anchor points in the photographed picture, and finally cutting out m sub-photographed pictures, each containing a different selected anchor point, according to the preliminary positions, wherein m is a natural number not less than 4 and not more than 4(n-1);
s203, sending the m sub-photographed pictures and the test question master to a matching server, and sending the photographed picture to a handwriting recognition server;
s204, receiving a matching result from the matching server, wherein the matching result is the accurate position of each selected anchor point in the corresponding sub-photographed picture;
s205, determining the accurate positions of all the selected anchor points in the photographed picture according to their accurate positions in the corresponding sub-photographed pictures, then selecting from the m selected anchor points the 4 anchor points closest to the edge lines of the photographed picture, and finally calculating the perspective transformation matrix of the photographed picture according to the accurate positions of these 4 selected anchor points in the photographed picture and their positions in the scanned picture;
s206, obtaining the answering area identification position in the photographed picture according to the perspective transformation matrix and the answering area position in the question information;
and S207, obtaining the answering content corresponding to the answering area identification position according to the answering area identification position and the handwriting recognition result from the handwriting recognition server.
2. The method for identifying the answer content of test questions according to claim 1, wherein, when the question information further includes a question stem position, the following step is further included before step S207:
recognizing the question stem content in the handwriting recognition result by character comparison, then determining the corresponding coordinate position of the question stem content in the photographed picture, and finally correcting the answering area identification position according to the mapping relation between the labelled question stem position and that coordinate position, so as to obtain a more accurate answering area identification position.
3. The method for identifying the answer content of test questions according to claim 1, wherein the question information further includes standard answer content.
4. The method according to claim 1, wherein, in step S102, the image feature value of each anchor point is also calculated, and the test question master contains the image feature value of each anchor point.
5. The method for identifying the answer content of test questions according to claim 1, further comprising a server matching processing stage: performing image feature matching between each sub-photographed picture and the corresponding selected anchor point in the test question master to obtain the accurate position of each selected anchor point in the corresponding sub-photographed picture.
6. The method according to claim 5, wherein the image feature matching is performed using an OpenCV-based SURF algorithm, SIFT algorithm, ORB algorithm or FAST algorithm.
7. The method for identifying the answer content of test questions according to claim 1, further comprising a server recognition processing stage: recognizing each character on the photographed picture by a handwriting recognition model trained in advance through deep learning, and acquiring a handwriting recognition result containing the question stem content and the answering content.
8. The method according to claim 7, wherein a YOLO target detection network model is adopted for the deep learning to obtain the handwriting recognition model.
9. The method of claim 1, wherein when n is 3, m is 6.
10. The method according to claim 1, wherein the side length of the square sliding window is 1/30 to 1/10 of the width of the scanned picture or of the sub-scanned picture.
CN201911149719.0A 2019-11-21 2019-11-21 Method for identifying answer content of test questions Active CN110879987B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911149719.0A CN110879987B (en) 2019-11-21 2019-11-21 Method for identifying answer content of test questions

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911149719.0A CN110879987B (en) 2019-11-21 2019-11-21 Method for identifying answer content of test questions

Publications (2)

Publication Number Publication Date
CN110879987A (en) 2020-03-13
CN110879987B CN110879987B (en) 2023-06-09

Family

ID=69729099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911149719.0A Active CN110879987B (en) 2019-11-21 2019-11-21 Method for identifying answer content of test questions

Country Status (1)

Country Link
CN (1) CN110879987B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102763123A (en) * 2009-12-02 2012-10-31 高通股份有限公司 Improving performance of image recognition algorithms by pruning features, image scaling, and spatially constrained feature matching
US20150227557A1 (en) * 2014-02-10 2015-08-13 Geenee Ug Systems and methods for image-feature-based recognition
CN108845746A (en) * 2018-07-13 2018-11-20 成都东方闻道科技发展有限公司 Answering component
CN109257582A (en) * 2018-09-26 2019-01-22 上海顺久电子科技有限公司 Correction method and device for a projection device
CN109409374A (en) * 2018-10-11 2019-03-01 东莞市七宝树教育科技有限公司 Combination-based method for cutting answer regions of same-batch test papers
CN110008933A (en) * 2019-04-18 2019-07-12 江苏曲速教育科技有限公司 Universal intelligent marking system and method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
王宪保 (Wang Xianbao) et al., "A fast weighted vector image matching method" (一种快速的加权矢量图像匹配方法), Computer Engineering and Applications (《计算机工程与应用》), no. 16, 20 May 2015 *
邓凯 (Deng Kai), "An automatic marking *** based on image recognition of handwritten characters" (一种图像识别手写字符的自动阅卷***), China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库 信息科技辑》), no. 09, 1 October 2015 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111709375A (en) * 2020-06-18 2020-09-25 武汉唯理科技有限公司 Method for correcting homework test questions in batches
CN112966105A (en) * 2021-03-04 2021-06-15 南京审计大学 Method for automatically generating audit test questions by using violation problem analysis
CN112966105B (en) * 2021-03-04 2021-09-10 南京审计大学 Method for automatically generating audit test questions by using violation problem analysis

Also Published As

Publication number Publication date
CN110879987B (en) 2023-06-09

Similar Documents

Publication Publication Date Title
US11790641B2 (en) Answer evaluation method, answer evaluation system, electronic device, and medium
CN109522900B (en) Natural scene character recognition method and device
CN110956138B (en) Auxiliary learning method based on home education equipment and home education equipment
CN108764352B (en) Method and device for detecting repeated page content
CN108052687B (en) Education information search system based on Internet
CN108830267A (en) A kind of method and system goed over examination papers based on image recognition
CN109886257B (en) Method for correcting invoice image segmentation result by adopting deep learning in OCR system
CN110879987A (en) Method for identifying answer content of test question
CN105678301B (en) method, system and device for automatically identifying and segmenting text image
CN106295514A (en) A kind of method and device of image recognition exercise question display answer
CN111126486A (en) Test statistical method, device, equipment and storage medium
CN106815814B (en) Image pollution processing method applied to paper marking system
CN111881900B (en) Corpus generation method, corpus translation model training method, corpus translation model translation method, corpus translation device, corpus translation equipment and corpus translation medium
CN110996128B (en) Online teaching video push management system
CN113297986A (en) Handwritten character recognition method, device, medium and electronic equipment
CN110309754B (en) Problem acquisition method and system
CN108133205B (en) Method and device for copying text content in image
CN107798297A (en) A kind of method that stabilizer frame is automatically extracted based on inter-frame difference
CN115168534A (en) Intelligent retrieval method and device
CN106803238B (en) Answer sheet image noise reduction processing method
CN111428623A (en) Chinese blackboard-writing style analysis system based on big data and computer vision
CN106845468B (en) Processing method for improving answer sheet image identification accuracy
CN113033400B (en) Method and device for identifying mathematical formulas, storage medium and electronic equipment
CN114708606A (en) Intelligent automatic marking service method, system and device
CN106503634A (en) A kind of image alignment method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant