CN114708543B - Examination student positioning method in examination room monitoring video image - Google Patents
- Publication number: CN114708543B (application CN202210629393.7A)
- Authority
- CN
- China
- Prior art keywords
- image data
- monitoring video
- examination room
- video image
- examinee
- Prior art date
- Legal status: Active (assumed; not a legal conclusion)
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to the field of image processing, and in particular to a method for positioning examinees in examination room monitoring video images. The method first performs frame selection marking based on the examinees' head hair areas on a large amount of examination room monitoring video image data covering different examination scenes and different examinees, according to the visibility of the examinees' ears, and establishes an examinee head hair area data set. On this basis, a preliminary screening based on high-false-alarm-rate target detection is performed; finally, a model based on SSD deep learning target detection is established to position the examinees' head hair areas and thereby locate the examinees.
Description
Technical Field
The invention belongs to the field of image processing, and particularly relates to a method for positioning an examinee in an examination room monitoring video image.
Background
Examinations are widely used around the world as an important means of assessment and selection, because they can guarantee fairness and impartiality to some extent. However, various cheating methods are employed in attempts to pass examinations, so video monitoring systems have been widely deployed in all kinds of examinations to uphold the principles of fairness and impartiality. The mere presence of a video monitoring system in the examination room, however, does not mean that the cheating problem is well solved.
Although video monitoring can record examination room activity completely, determining whether cheating occurred still requires the relevant departments to invest considerable manpower in post-hoc review of the video data. A large proportion of the videos contain no cheating at all, yet every segment must be carefully inspected, which creates an enormous workload. This gives rise to the need for automatic recognition of examinee behavior in examination room monitoring video, and the key problem to be solved first is how to locate the examinees in the examination room monitoring video.
Existing examination room monitoring video detection and positioning methods can be roughly divided into background-difference-based methods, template-matching-based methods, and image-feature-based methods; these methods suffer from problems such as limited detection range and strong dependence on examination room layout.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a method for positioning an examinee in an examination room monitoring video image, which comprises the following steps:
step 1: performing frame selection marking based on the head hair area of the examinee on a large amount of examination room monitoring video image data containing different examination scenes and different examinees, and then establishing an examinee head hair area data set of the examination room monitoring video image data;
step 2: establishing a target detection deep learning model for positioning a hair area at the top of an examinee in examination room monitoring video image data, firstly screening pixel points which are possibly hairs in the examination room monitoring video image data to obtain preprocessed image data, and then carrying out deep learning target detection based on SSD on the preprocessed image data;
Step 3: dividing the established examinee head hair area data set of the examination room monitoring video image data in proportion to respectively generate a training data set and a testing data set, and training and testing the established target detection deep learning model for positioning the examinee's head-top hair region in the examination room monitoring video image data to obtain a final target detection model M_ssd;
Step 4: inputting the initial examination room monitoring video image data into the final target detection model M_ssd to obtain the examinee positioning result in the examination room monitoring video image data.
Further, step 1, performing frame selection marking based on the examinee's head hair area on a large amount of examination room monitoring video image data containing different examination scenes and different examinees, specifically comprises: marking the examinee's head-top hair area in the examination room monitoring video image data with a box, based on the examinee's ear exposure condition and on an approximate image.
Further, the head and top hair area of the examinee in the examination room monitoring video image data is subjected to frame marking based on the examinee ear exposure condition, and the ear exposure condition is divided into: two ears are exposed, one ear is exposed, and no ear is exposed.
Furthermore, the head top hair area of the examinee in the examination room monitoring video image data is marked by a box based on an approximate image, and particularly, the horizontal and vertical edges of the generated frame are parallel to the edges of the image data.
Furthermore, if both ears of the examinee in the video image data are exposed, the frame selection area is: taking the lowest point of the hair-forehead boundary obtained by edge detection as the bottom of the frame selection area, α₁ times the distance from the bottom of the frame selection area to the topmost point of the hair obtained by edge detection as the frame selection height, and β₁ times the longest distance between the left and right hair-background boundaries as the frame selection width, forming the frame selection area; the variables α₁ and β₁ are weighting coefficients.
Further, if one ear of the examinee in the video image data is exposed, the frame selection area is: taking the midpoint between the highest point and the lowest point of the hair-forehead junction as the bottom of the frame selection area, α₂ times the distance from the bottom of the frame selection area to the topmost point of the hair as the frame selection height, and β₂ times the longest distance between the exposed ear's ear-hair boundary and the other side's hair-background boundary as the frame selection width, forming the frame selection area; the variables α₂ and β₂ are weighting coefficients.
Furthermore, if the examinee's ears are not exposed in the video image data, the highest point of the hair-forehead junction is taken as the bottom of the frame selection area, the distance from the bottom of the frame selection area to the topmost point of the hair as the frame selection height, and the horizontal width of the forehead shown in the image data as the frame selection width, forming the frame selection area.
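To make the geometry concrete, the "two ears exposed" rule above can be sketched as follows. This is an illustrative sketch only: the function name, the landmark inputs, and the default coefficient values are assumptions not stated in the description, and image coordinates are taken with y increasing downward.

```python
# Illustrative sketch of the "two ears exposed" framing rule (hypothetical
# helper; alpha1 and beta1 are the weighting coefficients of the description,
# default values assumed).

def hair_box_two_ears(forehead_lowest_y, hair_top_y,
                      hair_left_x, hair_right_x,
                      alpha1=1.0, beta1=1.0):
    """Return an axis-aligned box (x, y, width, height).

    forehead_lowest_y:    lowest point of the hair/forehead boundary (box bottom)
    hair_top_y:           topmost hair point from edge detection
    hair_left_x/_right_x: widest left/right hair-background boundary points
    """
    height = alpha1 * (forehead_lowest_y - hair_top_y)  # alpha1 x bottom-to-top distance
    width = beta1 * (hair_right_x - hair_left_x)        # beta1 x widest hair extent
    x = hair_left_x
    y = forehead_lowest_y - height                      # top edge of the box
    return (x, y, width, height)
```

With α₁ = β₁ = 1 the box exactly spans the detected hair extent; coefficients above 1 would add margin around it.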
Further, step 2, screening pixel points which may be hair in the examination room monitoring video image data to obtain preprocessed image data, is specifically: first, the image data is grayed to obtain grayscale image data I_g; then the pixel value of each pixel point in the grayscale image data is inverted according to I_c(i,j) = 255 − I_g(i,j) to obtain grayscale-inverted image data I_c, where I_g(i,j) and I_c(i,j) are the gray values of the pixel point with abscissa i and ordinate j in image data I_g and I_c respectively; CFAR target detection with a high false alarm rate is then performed on the grayscale-inverted image data I_c to obtain screened image data, and a threshold th is set to binarize the screened image data, obtaining preprocessed image data I_t.
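A minimal sketch of this preprocessing chain is given below. The exact CFAR detector is not specified in the text, so the screening stage here is approximated by a simple global mean test with a deliberately low bar (hence the high false-alarm rate); the function name and parameter defaults are assumptions.

```python
# Sketch of step-2 preprocessing: gray -> invert -> permissive screen -> binarize.
# The CFAR stage is approximated by a global mean test (an assumption, not the
# patent's exact detector).
import numpy as np

def preprocess(rgb, th=128, screen_factor=0.5):
    gray = rgb.mean(axis=2)      # grayscale image I_g
    inverted = 255.0 - gray      # I_c(i, j) = 255 - I_g(i, j): dark hair becomes bright
    # Permissive screen: keep any pixel above a fraction of the mean
    # (low bar => high false-alarm rate, so hair pixels are rarely missed)
    screened = np.where(inverted > screen_factor * inverted.mean(), inverted, 0.0)
    return (screened > th).astype(np.uint8)  # binarized index image I_t
```

The permissive first pass intentionally over-detects; the SSD model in the next step is what prunes the false alarms.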
Further, step 2, performing SSD-based deep learning target detection on the preprocessed image data, is specifically: the binarized detection result image data I_t is used as an index image; the coordinates of the pixel points with non-zero values in the index image are mapped to the corresponding examination room monitoring video image data, and the mapped pixel points are taken as anchor box center points to establish, based on the SSD target detection framework, a target detection model for positioning the hair region in the examination room monitoring video image data.
Further, step 4, inputting the initial examination room monitoring video image data into the final target detection model M_ssd to obtain the examinee positioning result in the examination room monitoring video image data, is specifically: the initial examination room monitoring video image data is input into the final target detection model M_ssd to obtain the hair region framing result; each framed area is then expanded downward by Q times its own range, yielding an updated region framing result, which is determined as the examinee positioning result.
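The step-4 post-processing reduces to a simple box transform. A sketch, under the assumption that boxes are (x, y, width, height) tuples with y increasing downward and with an illustrative value for the expansion factor Q (the description does not fix its value):

```python
# Sketch of the step-4 post-processing: each detected hair box is grown
# downward by Q times its own height so it covers the examinee's body
# (Q value and function name assumed for illustration).

def expand_down(box, q=3.0):
    """box = (x, y, w, h); returns the box extended downward by q * h."""
    x, y, w, h = box
    return (x, y, w, h + q * h)  # keep the top edge, extend the bottom
```

The top edge is kept fixed because the hair box sits at the top of the examinee; only the bottom edge moves down.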
The invention solves the following technical problems:
1. A frame selection marking method based on the examinee's hair area, driven by the visibility of the examinee's ears in the examination room monitoring video image data, is provided, improving the accuracy and reliability of the examinee hair area data set.
2. A preliminary screening based on high-false-alarm-rate target detection is performed on pixels that may be hair in the examination room monitoring video image data, effectively improving the accuracy of examinee hair region detection.
3. The binarized detection result of the examination room monitoring video image data is used as an index image and anchor boxes are selected on this basis, which improves the accuracy of hair region detection while reducing the complexity of the target detection model.
Drawings
Fig. 1 is a flow chart of a method for positioning examinees in an examination room monitoring video image.
Detailed Description
The technical solution in the embodiments of the present invention will now be described clearly and completely with reference to the drawings; a flow chart of the method is shown in Fig. 1. A method for positioning examinees in an examination room monitoring video image comprises the following steps:
step 1: performing frame selection marking based on the head hair area of the examinee on a large amount of examination room monitoring video image data containing different examination scenes and different examinees, and then establishing an examinee head hair area data set of the examination room monitoring video image data;
and 2, step: establishing a target detection deep learning model for positioning a hair area at the top of an examinee in examination room monitoring video image data, firstly screening pixel points which are possibly hairs in the examination room monitoring video image data to obtain preprocessed image data, and then carrying out deep learning target detection based on SSD on the preprocessed image data;
and step 3: dividing an examinee's head top hair region data set of the established examination room monitoring video image data in proportion to generate a training data set and a testing data set respectively, training and testing the established target detection deep learning model for positioning the examinee's head top hair region in the examination room monitoring video image data to obtain a final target detection model;
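The proportional split in step 3 can be sketched as follows; the 80/20 ratio, the fixed seed, and the function name are assumptions, since the description does not state a split ratio.

```python
# Sketch of the step-3 proportional split into training and testing sets
# (80/20 ratio assumed; any ratio stated elsewhere would replace it).
import random

def split_dataset(samples, train_ratio=0.8, seed=0):
    shuffled = samples[:]                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```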
Step 4: inputting the initial examination room monitoring video image data into the final target detection model M_ssd to obtain the examinee positioning result in the examination room monitoring video image data.
Further, step 1: the method comprises the following steps of performing frame selection marking based on the head hair area of an examinee on a large amount of examination room monitoring video image data containing different examination scenes and different examinees, and specifically comprises the following steps: and carrying out frame marking on the head and top hair area of the examinee in the examination room monitoring video image data based on the ear exposure condition of the examinee and based on the approximate image.
Further, the head and top hair area of the examinee in the examination room monitoring video image data is subjected to frame marking based on the examinee ear exposure condition, and the ear exposure condition is divided into: two ears are exposed, one ear is exposed, and no ear is exposed.
Furthermore, the head top hair area of the examinee in the examination room monitoring video image data is marked by a box based on an approximate image, and particularly, the horizontal and vertical edges of the generated frame are parallel to the edges of the image data.
Furthermore, if both ears of the examinee in the video image data are exposed, the frame selection area is: taking the lowest point of the hair-forehead boundary obtained by edge detection as the bottom of the frame selection area, α₁ times the distance from the bottom of the frame selection area to the topmost point of the hair obtained by edge detection as the frame selection height, and β₁ times the longest distance between the left and right hair-background boundaries as the frame selection width, forming the frame selection area; the variables α₁ and β₁ are weighting coefficients.
Further, if one ear of the examinee in the video image data is exposed, the frame selection area is: taking the midpoint between the highest point and the lowest point of the hair-forehead junction as the bottom of the frame selection area, α₂ times the distance from the bottom of the frame selection area to the topmost point of the hair as the frame selection height, and β₂ times the longest distance between the exposed ear's ear-hair boundary and the other side's hair-background boundary as the frame selection width, forming the frame selection area; the variables α₂ and β₂ are weighting coefficients.
Furthermore, if the examinee's ears are not exposed in the video image data, the highest point of the hair-forehead junction is taken as the bottom of the frame selection area, the distance from the bottom of the frame selection area to the topmost point of the hair as the frame selection height, and the horizontal width of the forehead shown in the image data as the frame selection width, forming the frame selection area.
Further, step 2, screening pixel points which may be hair in the examination room monitoring video image data to obtain preprocessed image data, is specifically: first, the image data is grayed to obtain grayscale image data I_g; then the pixel value of each pixel point in the grayscale image data is inverted according to I_c(i,j) = 255 − I_g(i,j) to obtain grayscale-inverted image data I_c, where I_g(i,j) and I_c(i,j) are the gray values of the pixel point with abscissa i and ordinate j in image data I_g and I_c respectively; CFAR target detection with a high false alarm rate is then performed on the grayscale-inverted image data I_c to obtain screened image data, and a threshold th is set to binarize the screened image data, obtaining preprocessed image data I_t.
Further, step 2, performing SSD-based deep learning target detection on the preprocessed image data, is specifically: the binarized detection result image data I_t is used as an index image; the coordinates of the pixel points with non-zero values in the index image are mapped to the corresponding examination room monitoring video image data, and the mapped pixel points are taken as anchor box center points to establish, based on the SSD target detection framework, a target detection model for positioning the hair region in the examination room monitoring video image data.
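A sketch of how the index image I_t can drive anchor placement: the coordinates of non-zero pixels are collected and would then be mapped onto the original frame as SSD anchor-box centers. The SSD network itself is omitted here, and the function name is an assumption.

```python
# Sketch: extract anchor-center candidates from the binary index image I_t.
# Each non-zero pixel's (row, col) coordinate becomes one candidate center.
import numpy as np

def anchor_centers(index_image):
    """Return (row, col) coordinates of all non-zero pixels in I_t."""
    rows, cols = np.nonzero(index_image)
    return list(zip(rows.tolist(), cols.tolist()))
```

Restricting anchors to these candidates is what lets the model skip most of the image, which is the stated source of the complexity reduction.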
Further, step 4, inputting the initial examination room monitoring video image data into the final target detection model M_ssd to obtain the examinee positioning result in the examination room monitoring video image data, is specifically: the initial examination room monitoring video image data is input into the final target detection model M_ssd to obtain the hair region framing result; each framed area is then expanded downward by Q times its own range, yielding an updated region framing result, which is determined as the examinee positioning result.
It should be apparent that the described embodiments are only some embodiments of the present invention, and not all embodiments. Other embodiments, which can be derived by one of ordinary skill in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Claims (9)
1. A method for positioning examinees in an examination room monitoring video image is characterized by comprising the following steps:
step 1: the method comprises the following steps of performing frame selection marking based on the head hair area of an examinee on a large number of examinee monitoring video image data containing different examination scenes and different examinees, and specifically comprises the following steps: carrying out box marking based on the ear exposure condition of the examinee and based on the approximate image on the head and hair area of the examinee in the monitoring video image data of the examination room; then establishing a data set of the head hair area of the examinee of the examination room monitoring video image data;
step 2: establishing a target detection deep learning model for positioning a hair area at the top of an examinee in examination room monitoring video image data, firstly screening pixel points which are possibly hairs in the examination room monitoring video image data to obtain preprocessed image data, and then carrying out deep learning target detection based on SSD on the preprocessed image data;
Step 3: dividing the established examinee head-top hair area data set of the examination room monitoring video image data in proportion to respectively generate a training data set and a test data set, and training and testing the established target detection deep learning model for positioning the examinee's head-top hair area in the examination room monitoring video image data to obtain a final target detection model M_ssd;
Step 4: inputting the initial examination room monitoring video image data into the final target detection model M_ssd to obtain the examinee positioning result in the examination room monitoring video image data.
2. The method according to claim 1, wherein the examinee's head-top hair area in the examination room monitoring video image data is box-marked based on the examinee's ear exposure condition, the ear exposure condition being divided into: two ears exposed, one ear exposed, and no ears exposed.
3. The method according to claim 1, wherein the top hair area of the head of the examinee in the examination room monitoring video image data is marked by a box based on the approximate image, and the horizontal and vertical edges of the generated frame are parallel to the edges of the image data.
4. The method according to claim 2, wherein if both ears of the examinee are exposed in the video image data, the frame selection area is: taking the lowest point of the hair-forehead boundary obtained by edge detection as the bottom of the frame selection area, α₁ times the distance from the bottom of the frame selection area to the topmost point of the hair obtained by edge detection as the frame selection height, and β₁ times the longest distance between the left and right hair-background boundaries as the frame selection width, forming the frame selection area; the variables α₁ and β₁ are weighting coefficients.
5. The method as claimed in claim 2, wherein if one ear of the examinee is exposed in the video image data, the frame selection area is: taking the midpoint between the highest point and the lowest point of the hair-forehead junction as the bottom of the frame selection area, α₂ times the distance from the bottom of the frame selection area to the topmost point of the hair as the frame selection height, and β₂ times the longest distance between the exposed ear's ear-hair boundary and the other side's hair-background boundary as the frame selection width, forming the frame selection area; the variables α₂ and β₂ are weighting coefficients.
6. The method as claimed in claim 2, wherein the examinee in the video image data has no exposed ear, the highest point of the junction between the hair and the forehead is the bottom of the frame selection area, the distance from the bottom of the frame selection area to the topmost point of the hair is the frame selection height, and the horizontal width of the forehead displayed in the image data is the frame selection width, thereby forming the frame selection area.
7. The method for locating the examinee in the examination room monitoring video image according to claim 1, wherein step 2, screening pixel points which may be hair in the examination room monitoring video image data to obtain preprocessed image data, is specifically: first, the image data is grayed to obtain grayscale image data I_g; then the pixel value of each pixel point in the grayscale image data is inverted according to I_c(i,j) = 255 − I_g(i,j) to obtain grayscale-inverted image data I_c, where I_g(i,j) and I_c(i,j) are the gray values of the pixel point with abscissa i and ordinate j in image data I_g and I_c respectively; CFAR target detection with a high false alarm rate is performed on the grayscale-inverted image data I_c to obtain screened image data, a threshold th is set, and the screened image data is binarized to obtain preprocessed image data I_t.
8. The method for locating the examinee in the examination room monitoring video image according to claim 1, wherein step 2, performing SSD-based deep learning target detection on the preprocessed image data, is specifically: the binarized detection result image data I_t is used as an index image; the coordinates of the pixel points with non-zero values in the index image are mapped to the corresponding examination room monitoring video image data, and the mapped pixel points are taken as anchor box center points to establish, based on the SSD target detection framework, a target detection model for positioning the hair region in the examination room monitoring video image data.
9. The method for locating the examinee in the examination room monitoring video image according to claim 1, wherein step 4, inputting the initial examination room monitoring video image data into the final target detection model M_ssd to obtain the examinee positioning result in the examination room monitoring video image data, is specifically: the initial examination room monitoring video image data is input into the final target detection model M_ssd to obtain the hair region framing result; each framed area is expanded downward by Q times its own range to obtain an updated region framing result, which is determined as the examinee positioning result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210629393.7A CN114708543B (en) | 2022-06-06 | 2022-06-06 | Examination student positioning method in examination room monitoring video image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210629393.7A CN114708543B (en) | 2022-06-06 | 2022-06-06 | Examination student positioning method in examination room monitoring video image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114708543A CN114708543A (en) | 2022-07-05 |
CN114708543B true CN114708543B (en) | 2022-08-30 |
Family
ID=82177605
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210629393.7A Active CN114708543B (en) | 2022-06-06 | 2022-06-06 | Examination student positioning method in examination room monitoring video image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114708543B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116310806B (en) * | 2023-02-28 | 2023-08-29 | 北京理工大学珠海学院 | Intelligent agriculture integrated management system and method based on image recognition |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5430809A (en) * | 1992-07-10 | 1995-07-04 | Sony Corporation | Human face tracking system |
CN105678213A (en) * | 2015-12-20 | 2016-06-15 | 华南理工大学 | Dual-mode masked man event automatic detection method based on video characteristic statistics |
CN106991360A (en) * | 2016-01-20 | 2017-07-28 | 腾讯科技(深圳)有限公司 | Face identification method and face identification system |
CN107451555A (en) * | 2017-07-27 | 2017-12-08 | 安徽慧视金瞳科技有限公司 | A kind of hair based on gradient direction divides to determination methods |
CN108260918A (en) * | 2018-02-09 | 2018-07-10 | 武汉技兴科技有限公司 | A kind of human hair information collecting method, device and intelligent clipping device |
CN109711377A (en) * | 2018-12-30 | 2019-05-03 | 陕西师范大学 | Standardize examinee's positioning and method of counting in the single-frame images of examination hall monitoring |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112686965A (en) * | 2020-12-25 | 2021-04-20 | 百果园技术(新加坡)有限公司 | Skin color detection method, device, mobile terminal and storage medium |
- 2022-06-06: Application CN202210629393.7A filed in China; patent CN114708543B granted (status: active)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5430809A (en) * | 1992-07-10 | 1995-07-04 | Sony Corporation | Human face tracking system |
CN105678213A (en) * | 2015-12-20 | 2016-06-15 | 华南理工大学 | Dual-mode masked man event automatic detection method based on video characteristic statistics |
CN106991360A (en) * | 2016-01-20 | 2017-07-28 | 腾讯科技(深圳)有限公司 | Face identification method and face identification system |
CN107451555A (en) * | 2017-07-27 | 2017-12-08 | 安徽慧视金瞳科技有限公司 | A kind of hair based on gradient direction divides to determination methods |
CN108260918A (en) * | 2018-02-09 | 2018-07-10 | 武汉技兴科技有限公司 | A kind of human hair information collecting method, device and intelligent clipping device |
CN109711377A (en) * | 2018-12-30 | 2019-05-03 | 陕西师范大学 | Standardize examinee's positioning and method of counting in the single-frame images of examination hall monitoring |
Non-Patent Citations (2)
Title |
---|
Teaching Assistant and Class Attendance Analysis Using Surveillance Camera;Xiaofei Peng等;《IFTC 2018: Digital TV and Multimedia Communication》;20190511;413–422 * |
Research on Video-Based Human Detection and Counting Technology (基于视频的人体检测与计数技术研究); Wang Hongmei; China Masters' Theses Full-text Database, Information Science and Technology; 2012-05-15 (No. 05); I138-1385 *
Also Published As
Publication number | Publication date |
---|---|
CN114708543A (en) | 2022-07-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11551433B2 (en) | Apparatus, method and computer program for analyzing image | |
US9977966B2 (en) | System and method for identifying, analyzing, and reporting on players in a game from video | |
CN107240047B (en) | Score evaluation method and device for teaching video | |
CN102622508B (en) | Image processing apparatus and image processing method | |
CN108229526A (en) | Network training, image processing method, device, storage medium and electronic equipment | |
CN109635875A (en) | A kind of end-to-end network interface detection method based on deep learning | |
CN104202547B (en) | Method, projection interactive approach and its system of target object are extracted in projected picture | |
WO2022001571A1 (en) | Computing method based on super-pixel image similarity | |
CN110837795A (en) | Teaching condition intelligent monitoring method, device and equipment based on classroom monitoring video | |
CN110689000B (en) | Vehicle license plate recognition method based on license plate sample generated in complex environment | |
CN108846828A (en) | A kind of pathological image target-region locating method and system based on deep learning | |
WO2021068781A1 (en) | Fatigue state identification method, apparatus and device | |
CN106611160A (en) | CNN (Convolutional Neural Network) based image hair identification method and device | |
CN108305253A (en) | A kind of pathology full slice diagnostic method based on more multiplying power deep learnings | |
CN114708543B (en) | Examination student positioning method in examination room monitoring video image | |
CN112215217B (en) | Digital image recognition method and device for simulating doctor to read film | |
CN111709914A (en) | Non-reference image quality evaluation method based on HVS characteristics | |
CN111339902A (en) | Liquid crystal display number identification method and device of digital display instrument | |
WO2023160666A1 (en) | Target detection method and apparatus, and target detection model training method and apparatus | |
CN113705349A (en) | Attention power analysis method and system based on sight estimation neural network | |
CN111062953A (en) | Method for identifying parathyroid hyperplasia in ultrasonic image | |
CN113782184A (en) | Cerebral apoplexy auxiliary evaluation system based on facial key point and feature pre-learning | |
Kovalev et al. | Biomedical image recognition in pulmonology and oncology with the use of deep learning | |
KR20100010973A (en) | Method for automatic classifier of lung diseases | |
CN113743378A (en) | Fire monitoring method and device based on video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||