WO2021039231A1 - Image analysis device, image analysis method, and program - Google Patents

Image analysis device, image analysis method, and program Download PDF

Info

Publication number
WO2021039231A1
WO2021039231A1 (PCT/JP2020/028671)
Authority
WO
WIPO (PCT)
Prior art keywords
region
face
detector
detected
captured image
Prior art date
Application number
PCT/JP2020/028671
Other languages
French (fr)
Japanese (ja)
Inventor
相澤 知禎
Original Assignee
オムロン株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by オムロン株式会社
Publication of WO2021039231A1 publication Critical patent/WO2021039231A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes

Definitions

  • the present disclosure relates to an image analysis device, an image analysis method, and a program.
  • Patent Document 1 discloses an image analysis device that accurately detects the face of a person shielded by a shield such as a mask.
  • the prior art, however, improves the detection accuracy of the object only when the object and the shield are in a predetermined positional relationship.
  • for example, the prior art enhances the detection accuracy of a normal masked face with the nose and mouth shielded.
  • in some cases, the mask is moved downward and the nose is not covered by the mask, for reasons such as relieving a feeling of suffocation.
  • the conventional technique cannot accurately detect the face in such a case.
  • An object of the present disclosure is to provide an image analysis technique capable of detecting an object according to various shielding modes as compared with the conventional technique when the object is shielded by a shield.
  • the image analysis apparatus includes: an image acquisition unit that acquires a captured image; a first detector that detects, in the captured image, a first region indicating an object partially shielded by a shield; a second detector that detects, in the captured image, a second region indicating the object that is not shielded by the shield; and an analysis unit that specifies an object region indicating a region in which the object appears in the captured image. The analysis unit specifies the first region as the object region when it determines that the first region has been detected by the first detector but that the second region has not been detected by the second detector, and specifies a region including the first region and the second region as the object region when it determines that the first region has been detected by the first detector and that the second region has been detected by the second detector.
  • the image analysis method includes: a step in which a control unit acquires a captured image; a first region detection step of detecting, in the captured image, a first region indicating an object partially shielded by a shield; a second region detection step of detecting, in the captured image, a second region indicating the object that is not shielded by the shield; and an analysis step of specifying an object region indicating a region in which the object appears in the captured image.
  • in the analysis step, the control unit specifies the first region as the object region when it determines that the first region has been detected in the first region detection step but that the second region has not been detected in the second region detection step, and
  • specifies a region including the first region and the second region as the object region when it determines that the first region has been detected in the first region detection step and that the second region has been detected in the second region detection step.
  • FIG. 1 is a diagram for explaining an application example of the face detection device according to the present disclosure. FIG. 2 is a block diagram showing an example of the hardware configuration of the face detection device of FIG. 1. FIG. 3 is a block diagram showing a functional configuration example of the control unit of the face detection device shown in FIG. 2. FIG. 4 is a flowchart showing an example of the face detection process executed by the control unit. FIG. 5 is a flowchart showing an example of the face candidate rectangle detection process of FIG. 4. FIG. 6 is a schematic diagram showing an example of the masked face detector of FIG. 3. FIG. 7 is a schematic diagram illustrating a captured image including the masked face candidate rectangles R1 and the unmasked face candidate rectangles R2. FIG. 8 is a schematic diagram illustrating the intermediate face rectangle A. FIG. 9 is a schematic diagram illustrating the merge target rectangle. FIG. 10 is a schematic diagram illustrating the final face rectangle B.
  • FIG. 1 schematically illustrates a face detection system 1 which is an example of an application scene of the face detection device 100.
  • the face detection device 100 is an example of the "image analysis device" of the present disclosure.
  • the face detection system 1 includes a face detection device 100.
  • the face detection system 1 may further include, for example, a camera 3, an eye opening / closing detection device 50, a line-of-sight detection device 60, and a face orientation detection device 70.
  • the face detection device 100 is an information processing device that acquires a captured image captured by the camera 3 and extracts a region (hereinafter, referred to as “face region”) B in which a human face is reflected in the captured image.
  • the human face is an example of the "object” of the present disclosure
  • the face area B is an example of the "object area” indicating the area in which the object is reflected in the captured image.
  • the face detection device 100 is used in, for example, a face detection system 1 that detects the face of a worker who assembles or packs a product in a factory.
  • detection processing by the subsequent eye opening / closing detection device 50, the line-of-sight detection device 60, the face orientation detection device 70, or the like may be executed.
  • the eye opening / closing detection device 50 for example, analyzes the face region B by an image, detects the positions of the eyes, upper eyelid, lower eyelid, etc., and measures the number of times, frequency, and the like of opening / closing.
  • the line-of-sight detection device 60 detects, for example, the position of the pupil by image-analyzing the face region B, thereby measuring the position of the pupil or the line of sight, the moving speed, and the like.
  • the face orientation detection device 70 analyzes the face region B by an image, and detects the direction in which the face is facing by, for example, a known template matching method.
  • the results of the eye opening / closing detection device 50 and the line-of-sight detection device 60 are used, for example, to detect the alertness of the worker. For example, it is known that when a person falls into a drowsy state with low alertness, the range of movement of the position of the pupil of the worker becomes narrow, or the speed of the movement or saccade becomes small. In addition, when a person falls into a drowsy state, for example, the distance between the upper eyelid and the lower eyelid of the operator becomes smaller. That is, the eyelids are about to close. In such a case, the face detection system 1 determines, for example, that the worker's alertness is low. Also, for example, if the worker's eyes remain closed, it may be determined that the worker is asleep.
  • the face orientation detection device 70 detects that the orientation of the worker's face changes frequently, the worker's attention may be distracted.
  • the face detection system 1 may control such as sending an announcement urging the operator to take a break from a speaker (not shown).
  • the face detection system 1 may include a control unit that controls a work line in a factory.
  • the face detection system 1 can prevent the occurrence of mistakes and accidents by stopping the work line of the factory when the alertness of the worker is lowered.
  • the face detection system 1 may notify the factory manager, collaborators, and medical workers such as industrial physicians and nurses when the arousal level of the worker decreases. As a result, these persons can take measures such as reviewing the work plan. In this way, it is possible to prevent the occurrence of accidents, mistakes, etc. due to a decrease in alertness.
  • the detection by the eye opening / closing detection device 50, the line-of-sight detection device 60, and the face orientation detection device 70 as described above is based on the premise that facial organs such as eyes are included in the face region.
  • the worker imaged by the camera 3 may be wearing a mask. That is, the worker's face may be covered by a mask.
  • the mask is an example of the "shield" of the present disclosure.
  • Accurately detecting the face region B from a captured image 2a including the face of a person not shielded by a shield such as a mask (hereinafter, an "unmasked face") can be performed by the conventional unmasked face detector 123. Detecting the face region B from a captured image 2b including the face of a person shielded by a shield such as a mask (hereinafter, a "masked face") with the nose and mouth shielded can also be performed by the conventional masked face detector 122.
  • in this way, detecting the face region B from the face of a person wearing a mask so as to cover the nose and mouth can be performed by the conventional masked face detector 122.
  • however, detecting the face region B from a captured image 2c including a masked face whose nose is not shielded (hereinafter, a "nose-out masked face") cannot be performed by either the conventional unmasked face detector 123 or the conventional masked face detector 122.
  • for example, the conventional masked face detector 122 has the problem of erroneously detecting a region not including the eyes as the face region B.
  • it is conceivable to construct a nose-out masked face detector by using a trained model built by having a model such as a convolutional neural network (CNN) learn a large number of images of nose-out masked faces. However, since there are various ways of wearing a mask, such as how much the nose is exposed, even for what is called a "nose-out masked face", it is difficult to grasp all of these modes. It is also difficult to obtain a large number of images of nose-out masked faces for each of these modes. Further, if a third, nose-out masked face detector is installed in addition to the unmasked face detector 123 and the masked face detector 122, the amount of arithmetic processing by the face detection device increases, and the load and processing time increase.
  • the present disclosure provides a face detection device 100 capable of detecting a face with a nose mask by using the detection results of the face detector 123 without a mask and the face detector 122 with a mask.
  • the face detection device 100 can detect a face with a nose mask while suppressing an increase in load and processing time.
  • FIG. 2 is a block diagram showing an example of the hardware configuration of the face detection device 100 according to the present embodiment.
  • the face detection device 100 includes an input unit 11, a control unit 12, a storage unit 13, and a communication interface (I / F) 14.
  • the input unit 11 is an interface circuit that connects the face detection device 100 and an external device such as a camera 3.
  • the control unit 12 is an information processing device that includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like, and controls each component of the face detection device 100 according to information processing.
  • the storage unit 13 is a medium that accumulates information such as programs by electrical, magnetic, optical, mechanical, or chemical action so that the information can be read by a computer or other device or machine.
  • the storage unit 13 is, for example, an auxiliary storage device such as a hard disk drive or a solid state drive, and stores an image processing program, a face detection program, or the like executed by the control unit 12.
  • the communication I / F 14 includes an interface circuit for enabling a communication connection between the face detection device 100 and an external device.
  • the communication I/F 14 communicates according to standards such as IEEE 802.3, IEEE 802.11 (Wi-Fi), LTE, 3G, 4G, and 5G.
  • the communication I / F 14 may be an interface circuit that communicates according to standards such as USB (Universal Serial Bus), HDMI (High Definition Multimedia Interface), IEEE 1394, and Bluetooth.
  • the camera 3 is connected to the face detection device 100 via the input unit 11.
  • the camera 3 is, for example, an infrared camera having sensitivity to infrared rays.
  • the face detection device 100 may be equipped with an infrared irradiator that irradiates infrared rays toward the imaging range of the camera 3.
  • the camera 3 may be a visible light camera having sensitivity to visible light.
  • the camera 3 is arranged in a place where the worker in the factory can be photographed, for example, when the face detection device 100 is used to detect the face of the worker in the factory.
  • the camera 3 may be a wearable camera such as a glasses-type camera or a head-mounted camera that is worn on the worker's head.
  • the camera 3 may be connected to the face detection device 100 via the network and the communication I / F14.
  • FIG. 3 is a block diagram showing a functional configuration example of the control unit 12 of the face detection device 100 shown in FIG.
  • the control unit 12 includes an image acquisition unit 121, a masked face detector 122, an unmasked face detector 123, a mask presence / absence determination unit 124, a first merge processing unit 125, a merge target search unit 126, and a second merge processing unit 127.
  • the image acquisition unit 121 acquires the captured image captured by the camera 3 via the input unit 11.
  • the masked face detector 122 detects the masked face in the acquired captured image.
  • the unmasked face detector 123 detects an unmasked face in the acquired captured image.
  • the mask presence / absence determination unit 124 determines whether or not the face in the captured image is a masked face based on the detection results of the masked face detector 122 and the unmasked face detector 123.
  • the masked face detector 122 is an example of the "first detector” of the present disclosure
  • the maskless face detector 123 is an example of the "second detector” of the present disclosure.
  • when the mask presence / absence determination unit 124 determines that the face in the captured image is a masked face, the first merge processing unit 125 combines (merges) the detected masked face candidate rectangles R1 (see FIG. 7) to generate the intermediate face rectangle A (see FIG. 8). The region surrounded by a masked face candidate rectangle R1 is an example of the "first region" of the present disclosure.
  • the merge process executed by the first merge processing unit 125 is referred to as a "first merge process".
  • the merge target search unit 126 searches for a merge target rectangle that satisfies the conditions described later from the unmasked face candidate rectangle R2 (see FIG. 7).
  • the region surrounded by the unmasked face candidate rectangle R2 is an example of the “second region” of the present disclosure.
  • the second merge processing unit 127 sets the intermediate face rectangle A (see FIG. 9) as the merged rectangle, and sets the rectangle including the merge target rectangle and the intermediate face rectangle A as the final face rectangle B. (See FIG. 10).
  • the area surrounded by the final face rectangle B is an example of the “object area” of the present disclosure.
  • the functional block including the mask presence / absence determination unit 124, the first merge processing unit 125, the merge target search unit 126, and the second merge processing unit 127 is an example of the “analysis unit” of the present disclosure.
  • the functional block including the first merge processing unit 125, the merge target search unit 126, and the second merge processing unit 127 is an example of the “area identification unit” of the present disclosure.
  • a detailed operation example of the control unit 12 will be described later.
  • each process by the image acquisition unit 121, the masked face detector 122, the unmasked face detector 123, the mask presence / absence determination unit 124, the first merge processing unit 125, the merge target search unit 126, and the second merge processing unit 127 may be executed by the control unit 12 executing a necessary program.
  • the program may be stored in the storage unit 13.
  • when executing a necessary program, the control unit 12 expands the program stored in the storage unit 13 into the RAM. The control unit 12 then interprets and executes the program expanded in the RAM by the CPU, and controls each component of the face detection device 100.
  • control unit 12 may be composed of various semiconductor integrated circuits such as a CPU, MPU, GPU, microcomputer, DSP, FPGA, and ASIC.
  • FIG. 4 is a flowchart showing an example of the face detection process executed by the control unit 12 of the face detection device 100.
  • the processing procedure described below is only an example, and the processing procedure and each process may be changed to the extent possible.
  • Step S101 the control unit 12 operates as the image acquisition unit 121 and acquires a captured image captured by the camera 3 via the input unit 11 (S101).
  • the camera 3 takes an image at a constant frame rate.
  • the control unit 12 may acquire a plurality of captured images.
  • the control unit 12 may acquire a moving image composed of a plurality of frames, or may acquire a plurality of still images.
  • Step S102 the control unit 12 executes a face candidate rectangle detection process for detecting the unmasked face candidate rectangle and the masked face candidate rectangle (S102).
  • the face candidate rectangle detection process in step S102 will be described in detail with reference to FIG.
  • FIG. 5 is a flowchart showing an example of the face candidate rectangle detection process in step S102.
  • the control unit 12 detects the masked face candidate rectangle with the masked face detector 122 under the conditions of the detected face size iSize, the detected face rotation angle iAngle, and the detected face position iPos (S102a), and then
  • detects the unmasked face candidate rectangle with the unmasked face detector 123 under the same conditions (S102b).
  • step S102a may be executed after step S102b.
  • in step S102a and step S102b, the control unit 12 cuts out a part of the captured image acquired in step S101 under the conditions of the detected face size iSize, the detected face rotation angle iAngle, and the detected face position iPos.
  • the detected face size iSize indicates the size of the image cut out from the captured image, and is specified by, for example, numbers of vertical and horizontal pixels. In the loop over the detected face size iSize in FIG. 5, the size of the image cut out from the captured image is changed on each iteration.
  • the detected face rotation angle iAngle indicates the angle of the image cut out from the captured image, and is represented by, for example, an angle of 0 ° or more and less than 360 °.
  • the detected face position iPos indicates the position of the image cut out from the captured image.
  • the control unit 12 determines whether or not the cut-out part of the captured image matches a template image stored in advance in the storage unit 13.
  • the control unit 12 determines that a part of the captured image and the template image match when the score indicating the reliability of a part of the cut out captured image is equal to or higher than a predetermined threshold value.
  • the score is an example of the "reliability" of the present disclosure.
  • the score is, for example, an index showing the degree of similarity between a part of the cut out captured image and the template image.
  • the score takes a value in the range of 0 to 1, for example, and the larger the value, the higher the degree of similarity between the part of the captured image cut out and the template image.
  • the control unit 12 stores the edge of the cut-out part of the captured image in the storage unit 13 as a masked face candidate rectangle or an unmasked face candidate rectangle, and stores the score corresponding to that face candidate rectangle in the storage unit 13.
  • the control unit 12 performs the detection processes of steps S102a and S102b under a large number of combinations of the detected face size iSize, the detected face rotation angle iAngle, and the detected face position iPos, changing i by incrementing or the like.
  • the method of detecting the face candidate rectangle is not limited to the above example.
  • the control unit 12 may detect the face candidate rectangle by a known template matching method.
  • the face candidate rectangle may also be detected by a machine learning algorithm such as AdaBoost, or by a detector constructed by machine learning of a model such as a convolutional neural network (CNN).
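As an informal illustration (not part of the patent disclosure), the search loop of steps S102a and S102b might be sketched in Python as follows. The use of OpenCV, normalized cross-correlation as the score, and the stride, sizes, angles, and threshold values are all assumptions made here; the embodiment leaves the concrete matching method open (template matching, AdaBoost, a CNN-based detector, etc.).

```python
import cv2
import numpy as np

def similarity_score(patch: np.ndarray, template: np.ndarray) -> float:
    """Example score ("reliability") in [0, 1]: normalized cross-correlation
    between the cut-out patch and the template, remapped from [-1, 1]."""
    result = cv2.matchTemplate(patch, template, cv2.TM_CCOEFF_NORMED)
    return float((result.max() + 1.0) / 2.0)

def detect_face_candidates(image, template, sizes, angles, stride=8, threshold=0.7):
    """Scan over the detected face size iSize, rotation angle iAngle, and
    position iPos, returning (x, y, w, h, score) candidate rectangles whose
    score is at or above the threshold."""
    h_img, w_img = image.shape[:2]
    candidates = []
    for size in sizes:                                        # iSize loop
        scaled = cv2.resize(template, (size, size))
        for angle in angles:                                  # iAngle loop
            m = cv2.getRotationMatrix2D((size / 2, size / 2), angle, 1.0)
            rotated = cv2.warpAffine(scaled, m, (size, size))
            for y in range(0, h_img - size + 1, stride):      # iPos loop (vertical)
                for x in range(0, w_img - size + 1, stride):  # iPos loop (horizontal)
                    patch = image[y:y + size, x:x + size]
                    score = similarity_score(patch, rotated)
                    if score >= threshold:                    # keep only reliable matches
                        candidates.append((x, y, size, size, score))
    return candidates
```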
  • FIG. 6 is a schematic view showing an example of the masked face detector 122 of FIG. 3 constructed in this way.
  • the masked face detector 122 is, for example, a detector using a known cascade method, and includes first to Nth classifiers 2-1 to 2-N.
  • Each classifier identifies whether or not the target image input to it is an image of a masked face, as follows.
  • the target image is a part of the captured image cut out under the conditions of the detected face size iSize, the detected face rotation angle iAngle, and the detected face position iPos as described above.
  • the first classifier 2-1 identifies whether or not the target image is a masked face.
  • if so, the first classifier 2-1 outputs the target image to the second classifier 2-2.
  • otherwise, the target image is deleted or discarded.
  • the second to Nth classifiers 2-2 to 2-N also discriminate whether or not the input target image is a masked face, in the same manner as the first classifier 2-1.
  • the Nth classifier 2-N in the final stage outputs the target image when it identifies that the input target image is a face with a mask.
  • the output target image is identified as a masked face by the masked face detector 122. That is, the target image is determined to show a masked face only when all of the first to Nth classifiers 2-1 to 2-N consistently identify it as a masked face.
  • the masked face detector 122 determines, for example, the edge of the target image output from the Nth classifier 2-N as a masked face candidate rectangle. In this way, the masked face candidate rectangle detection process in step S102a is completed.
  • Each of the first to Nth classifiers 2-1 to 2-N has an identification condition for discriminating whether or not the target image is a face with a mask.
  • the identification conditions of the first to Nth classifiers 2-1 to 2-N differ in strictness.
  • the identification condition of the first classifier 2-1 is the most lenient, and the identification condition of the second classifier 2-2 is the next most lenient.
  • each later-stage classifier has a stricter identification condition, and the Nth classifier 2-N in the final stage has the strictest identification condition. Since identification under a loose condition can be performed with a small number of features, its amount of calculation is small.
  • the unmasked face detector 123 in FIG. 3 also has the same configuration as the masked face detector 122, and detects the unmasked face candidate rectangle.
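The cascade of FIG. 6 can be illustrated with the following minimal Python sketch; it is a toy model under assumed interfaces, not the patent's implementation. Each stage is any callable that accepts or rejects the target image, with earlier stages intended to be cheap and lenient and later stages stricter.

```python
from typing import Callable, Sequence

class CascadeFaceDetector:
    """Cascade in the spirit of FIG. 6: the target image passes through the
    first to Nth classifiers in order and is identified as a (masked or
    unmasked) face only if every stage accepts it."""

    def __init__(self, stages: Sequence[Callable[[object], bool]]):
        self.stages = list(stages)          # classifiers 2-1 ... 2-N

    def is_face(self, target_image) -> bool:
        for stage in self.stages:
            if not stage(target_image):
                return False                # rejected: the target image is discarded
        return True                         # accepted by all stages
```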
  • Step S103 When neither the masked face candidate rectangle nor the unmasked face candidate rectangle is detected in step S102, the control unit 12 ends the process of FIG. 4 (S103). For example, when the captured image does not include a human face, neither the masked face candidate rectangle nor the unmasked face candidate rectangle is detected in step S102.
  • Step S104 the control unit 12 operates as the mask presence / absence determination unit 124 and determines, based on the masked face candidate rectangles and the unmasked face candidate rectangles detected in step S102, whether or not the face in the captured image is a masked face (S104).
  • an example of the masked face determination process in step S104 will be described with reference to FIG. 7.
  • FIG. 7 is a schematic diagram illustrating a captured image including the masked face candidate rectangles R1 and the unmasked face candidate rectangles R2 detected in step S102.
  • the masked face candidate rectangle R1 is indicated by a double line
  • the unmasked face candidate rectangle R2 is indicated by a broken line.
  • in step S104, whether or not the face in the captured image is a masked face is determined based on the one or more masked face candidate rectangles R1 and the one or more unmasked face candidate rectangles R2 detected in step S102.
  • three masked face candidate rectangles R1 and two unmasked face candidate rectangles R2 are shown.
  • each masked face candidate rectangle R1 and each unmasked face candidate rectangle R2 has a score.
  • the control unit 12 counts the weighted numbers of the masked face candidate rectangle R1 and the unmasked face candidate rectangle R2. For example, the control unit 12 calculates the weighted number by adding up the scores of the masked face candidate rectangle R1 and the unmasked face candidate rectangle R2, respectively.
  • the scores of the three masked face candidate rectangles R1 are, for example, 0.7, 0.8, and 0.75
  • the scores of the two unmasked face candidate rectangles R2 are, for example, 0.2 and 0.1
  • when the total score of the masked face candidate rectangles R1 is larger than the total score of the unmasked face candidate rectangles R2, the control unit 12 determines that the face in the captured image is a masked face.
  • otherwise, the control unit 12 determines that the face in the captured image is an unmasked face.
  • in the example of FIG. 7, the total score of the masked face candidate rectangles R1, 2.25, is larger than the total score of the unmasked face candidate rectangles R2, 0.3. Therefore, the control unit 12 determines that the face in the captured image is a masked face.
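A minimal sketch of this step S104 decision, assuming the weighted count is simply the sum of the scores as in the example above:

```python
def is_masked_face(masked_scores, unmasked_scores) -> bool:
    """Judge the face to be a masked face when the score total of the masked
    face candidate rectangles R1 exceeds that of the unmasked rectangles R2."""
    return sum(masked_scores) > sum(unmasked_scores)

# With the example values above: 2.25 > 0.3, so the face is judged to be masked.
print(is_masked_face([0.7, 0.8, 0.75], [0.2, 0.1]))  # True
```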
  • Step S105 When it is determined in step S104 that the face in the captured image is a face with a mask, the control unit 12 merges the face candidate rectangle R1 with a mask to generate an intermediate face rectangle A (S105).
  • the "analysis unit” of the present disclosure determines that the first region has been detected by the first detector. This is an example of "case”.
  • FIG. 8 is a schematic diagram illustrating the intermediate face rectangle A generated by the first merge process in step S105.
  • FIG. 8 schematically illustrates an intermediate face rectangle A generated by executing the first merge process on the three masked face candidate rectangles R1 of FIG. 7.
  • a known method may be used for the first merge process executed in step S105.
  • for example, the control unit 12 operating as the first merge processing unit 125 calculates the position of the center of gravity of the intermediate face rectangle A based on the positions of the centers of gravity of the three masked face candidate rectangles R1 in FIG. 7 and their respective scores, and defines, as the intermediate face rectangle A, a rectangle of a predetermined shape whose center of gravity is at the calculated position.
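A minimal sketch of the first merge process of step S105, assuming rectangles are (x, y, w, h, score) tuples in image coordinates; using the mean candidate size for the merged rectangle is an assumption, since the text only requires a rectangle of a predetermined shape centered on the score-weighted centroid.

```python
def first_merge(rects):
    """Merge candidate rectangles (x, y, w, h, score) into one intermediate
    face rectangle A centered on the score-weighted centroid of their centers."""
    total = sum(r[4] for r in rects)
    cx = sum((r[0] + r[2] / 2) * r[4] for r in rects) / total
    cy = sum((r[1] + r[3] / 2) * r[4] for r in rects) / total
    w = sum(r[2] for r in rects) / len(rects)   # assumed "predetermined shape"
    h = sum(r[3] for r in rects) / len(rects)
    return (cx - w / 2, cy - h / 2, w, h)
```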
  • Steps S106, S107 the control unit 12 operates as the merge target search unit 126 and searches the unmasked face candidate rectangles R2 (see FIG. 7) for a merge target rectangle satisfying a predetermined condition (S106). If a merge target rectangle is detected, the process proceeds to step S108; otherwise, the process proceeds to step S110 (S107).
  • the case where the process proceeds to Yes in step S107 is an example of the case where the "analysis unit" of the present disclosure "determines that the first region has been detected by the first detector and that the second region has been detected by the second detector". The case where no merge target rectangle is detected (No in S107) is an example of the case where the analysis unit "determines that the first region has been detected by the first detector while determining that the second region has not been detected by the second detector".
  • in step S107, for example, the process proceeds to Yes when the captured image contains a nose-out masked face, and proceeds to No when it contains a normal masked face with the nose and mouth shielded.
  • FIG. 9 is a schematic diagram illustrating a rectangle to be merged.
  • the control unit 12 selects, from the unmasked face candidate rectangles R2, a rectangle that satisfies predetermined conditions, including the following, and sets it as the merge target rectangle. In other words, the merge target rectangle satisfies all of these conditions.
  • Condition 1 The upper end of the unmasked face candidate rectangle R2 is located above the upper end of the intermediate face rectangle A.
  • Condition 2 The distance between the upper end of the unmasked face candidate rectangle R2 and the upper end of the intermediate face rectangle A is equal to or less than a predetermined threshold value H.
  • the score of the unmasked face candidate rectangle R2 indicates, for example, the probability that the image reflected in the unmasked face candidate rectangle R2 is an unmasked face.
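A minimal sketch of the merge target search of step S106, checking only the conditions spelled out in this text (top edge above the top edge of A, within the threshold H, and a sufficiently high score); the rectangle format and threshold parameters are assumptions, and the remaining conditions of the embodiment are not reproduced here. Note that image y coordinates grow downward, so "above" means a smaller y value.

```python
def find_merge_targets(intermediate, unmasked_rects, height_threshold, score_threshold):
    """Return the unmasked face candidate rectangles (x, y, w, h, score) that
    qualify as merge target rectangles relative to the intermediate face
    rectangle A given as (x, y, w, h)."""
    ax, ay, aw, ah = intermediate
    targets = []
    for (x, y, w, h, score) in unmasked_rects:
        above = y < ay                          # condition 1: top edge above A's top edge
        close = (ay - y) <= height_threshold    # condition 2: within threshold H
        reliable = score >= score_threshold     # score (reliability) high enough
        if above and close and reliable:
            targets.append((x, y, w, h, score))
    return targets
```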
  • Step S108 When the control unit 12 detects the merge target rectangle in step S106, the control unit 12 merges the merge target rectangle into the intermediate face rectangle A and determines the final face rectangle B (S108).
  • the merge process executed in step S108 is referred to as a "second merge process".
  • FIG. 10 is a schematic diagram illustrating the final face rectangle B generated by the second merge process in step S108.
  • the second merge process is a process of outputting a rectangle that encompasses the merge target rectangle and the merged rectangle.
  • the control unit 12 operating as the second merge processing unit 127 uses the intermediate face rectangle A (see FIG. 9) as the merged rectangle, and sets a rectangle that encompasses the merge target rectangle and the intermediate face rectangle A as the final face rectangle B (see FIG. 10).
  • in step S108, when the control unit 12 determines that an unmasked face candidate rectangle has been detected by the unmasked face detector 123 and that a masked face candidate rectangle has been detected by the masked face detector 122,
  • it specifies the final face rectangle B so as to combine the merge target rectangle and the intermediate face rectangle A.
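A minimal sketch of the second merge process of step S108: here the final face rectangle B is taken as the smallest axis-aligned rectangle that encompasses the merged rectangle (the intermediate face rectangle A) and the merge target rectangle; the text only requires a rectangle that includes both.

```python
def second_merge(merged_rect, target_rect):
    """Return the smallest rectangle (x, y, w, h) containing both the merged
    rectangle (intermediate face rectangle A) and the merge target rectangle."""
    x1 = min(merged_rect[0], target_rect[0])
    y1 = min(merged_rect[1], target_rect[1])
    x2 = max(merged_rect[0] + merged_rect[2], target_rect[0] + target_rect[2])
    y2 = max(merged_rect[1] + merged_rect[3], target_rect[1] + target_rect[3])
    return (x1, y1, x2 - x1, y2 - y1)
```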
  • Step S110 If the control unit 12 does not detect the merge target rectangle in step S106, the control unit 12 does not perform a second merge process or the like, and specifies the intermediate face rectangle A as the final face rectangle B (S110).
  • Step S109 When it is determined in step S104 that the face in the captured image is not a masked face, the control unit 12 executes the first merge process on all the unmasked face candidate rectangles R2
  • and determines the final face rectangle B (S109).
  • the face detection device 100, which is an example of the image analysis device, includes an image acquisition unit 121 that acquires a captured image, a masked face detector 122 that is an example of the first detector, an unmasked face detector 123 that is an example of the second detector, and a control unit 12 that operates as an analysis unit.
  • the masked face detector 122 detects, in the captured image, a first region showing a face (an example of an object) partially shielded by a mask (an example of a shield).
  • the unmasked face detector 123 detects a second region in the captured image that shows a face that is not masked by the mask.
  • the control unit 12 specifies a face region indicating a region in which the face is reflected in the captured image.
  • when the control unit 12 determines that the first region has been detected by the masked face detector 122 (Yes in S104) but that the second region has not been detected by the unmasked face detector 123 (No in S107), it specifies the first region as the face region (S110).
  • when the control unit 12 determines that the first region has been detected by the masked face detector 122 (Yes in S104) and that the second region has been detected by the unmasked face detector 123 (Yes in S107),
  • it specifies a region including the first region and the second region as the face region (S108).
  • Specifying the region including the first region and the second region as the face region is an example of the second merge process of the present disclosure.
  • in the former case, the first region, that is, the detection result of the masked face detector 122 alone, can be specified as the face region.
  • in the latter case, for example for a nose-out masked face, the detection results of both the masked face detector 122 and the unmasked face detector 123 are used, so that the region including the first region and the second region can be specified as the face region.
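Putting the pieces together, the decision logic of FIG. 4 might be sketched as follows, reusing the helper functions sketched earlier (is_masked_face, first_merge, find_merge_targets, second_merge); merging only the first merge target rectangle is a simplification made for illustration, and the threshold parameters are assumptions.

```python
def identify_face_region(masked_rects, unmasked_rects, height_threshold, score_threshold):
    """Return the final face rectangle B as (x, y, w, h), or None when no face
    candidate rectangle was detected at all (step S103)."""
    if not masked_rects and not unmasked_rects:
        return None                                           # S103: nothing detected
    masked_scores = [r[4] for r in masked_rects]
    unmasked_scores = [r[4] for r in unmasked_rects]
    if masked_rects and is_masked_face(masked_scores, unmasked_scores):
        intermediate = first_merge(masked_rects)              # S105: first merge
        targets = find_merge_targets(intermediate, unmasked_rects,
                                     height_threshold, score_threshold)
        if not targets:
            return intermediate                               # S110: A is the face region
        return second_merge(intermediate, targets[0])         # S108: second merge
    return first_merge(unmasked_rects)                        # S109: unmasked face
```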
  • there are various ways of wearing a mask, such as a nose-out masked face with various degrees of nose exposure, a face whose mouth is covered by the mask, and a face whose mouth is not covered by the mask.
  • the face detection device 100 can also detect a face region in captured images of faces in various wearing modes of these masks. As described above, when the face is shielded by the mask, the face detection device 100 can detect the face region according to various shielding modes as compared with the prior art.
  • since it is not necessary to provide the face detection device 100 with a detector dedicated to detecting the nose-out masked face, the nose-out masked face can be detected while suppressing an increase in load and processing time. For example, when the face detection device 100 is used to measure the arousal level of a worker, such as drowsiness, real-time processing in which the face detection process is performed immediately after the worker's face is imaged is required.
  • in such an application, the face detection device 100 is advantageous because it can suppress an increase in load and processing time.
  • when the analysis unit of the face detection device 100 determines that the first region has not been detected by the first detector but that the second region has been detected by the second detector, it may specify the second region as the object region. In this way, the face detection device 100 can also detect the face region from a captured image of the face of a person who is not wearing a mask.
  • in the above, the face detection system 1 applied to factory use has been described.
  • the present disclosure is not limited to this.
  • the face detection system 1 may be used in an office or the like.
  • the face detection system 1 may perform control such as playing an announcement from a speaker urging the person to take a break. This makes it possible to reduce the risk of making mistakes in desk work.
  • the face detection system 1 may be applied to in-vehicle applications.
  • the camera 3 is mounted in front of the driver, such as near the steering column cover, dashboard, and rearview mirror.
  • the position of the camera 3 is not limited to this, and may be any position as long as it can capture the driver's face.
  • for example, when the driver's arousal level decreases, the face detection system 1 may execute control such as vibrating a vibration device attached to the seat and/or outputting a warning sound or an announcement prompting a break from a speaker. Further, for example, the face detection system 1 may perform automatic driving control or automatic braking control by controlling the steering and braking of the vehicle when the driver's arousal level decreases. As a result, it is possible to prevent an accident caused by a decrease in the driver's alertness.
  • the face detection system 1 may be applied to medical applications. For example, it is known that the frequency of saccades increases in dementia patients such as patients with Lewy body dementias and Alzheimer's disease, and in persons with mild cognitive impairment. Therefore, the face detection system 1 may be used for diagnosing dementia, mild cognitive impairment, etc. by detecting the frequency of saccades with the line-of-sight detection device 60.
  • the face detection device 100 can also be applied to the automatic face detection function of a digital camera. Further, the face detection device 100 can be used for detecting the face of a pedestrian on a road or in a building such as a station yard for security purposes, for example. In this case, the camera 3 may be arranged so as to take a picture on the road or in a building such as a station yard.
  • in the above embodiment, rectangular face regions such as the masked face candidate rectangle R1, the unmasked face candidate rectangle R2, the intermediate face rectangle A, and the final face rectangle B have been described.
  • the shape of these face regions is not limited to a rectangle.
  • the shape of these face regions may be a quadrangle other than a rectangle, a polygon, a circle, or an ellipse.
  • the second merge process has been described above as a process of outputting a rectangle that encompasses the merge target rectangle and the merged rectangle. When the shape of the face region is not a rectangle,
  • the second merge process is a process of generating a result region that includes a plurality of merge target regions.
  • for example, the second merge process is a process of generating a result region that includes two merge target regions and is in contact with each of the two merge target regions at least at one point.
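As an illustration of this generalized second merge (an assumption-laden sketch, since the text does not fix the shape of the result region), the smallest axis-aligned bounding region containing two arbitrarily shaped regions given as point sets could be computed as follows:

```python
import numpy as np

def merge_regions(region_a_points, region_b_points):
    """Return the smallest axis-aligned bounding box (x, y, w, h) containing
    both regions, each given as an (N, 2) array of x, y points."""
    pts = np.vstack([np.asarray(region_a_points), np.asarray(region_b_points)])
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0)
    return (float(x1), float(y1), float(x2 - x1), float(y2 - y1))
```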
  • the image analysis apparatus (100) includes: an image acquisition unit (121) that acquires a captured image; a first detector (122) that detects, in the captured image, a first region showing an object partially shielded by a shield; a second detector (123) that detects, in the captured image, a second region showing the object that is not shielded by the shield; and an analysis unit that specifies an object region (B) indicating a region in which the object appears in the captured image.
  • when the analysis unit determines that the first region has been detected by the first detector (122) but that the second region has not been detected by the second detector (123), it specifies the first region as the object region (B).
  • when the analysis unit determines that the first region has been detected by the first detector (122) and that the second region has been detected by the second detector (123),
  • it specifies a region including the first region and the second region as the object region (B).
  • a determination unit (124) for determining whether or not the object is shielded by the shield is further provided.
  • the analysis unit may detect, from the second region, a combination target region satisfying a predetermined condition, and
  • may specify a region including the first region and the combination target region as the object region (B).
  • the condition may include that the combination target region has a predetermined positional relationship with the first region.
  • the analysis unit may determine an intermediate region (A) by combining a plurality of the first regions, and
  • may specify a region including the intermediate region (A) and the combination target region as the object region (B).
  • the condition may include that the combination target region has a predetermined positional relationship with the intermediate region (A).
  • the condition may include that the reliability of the combination target region is equal to or higher than a predetermined threshold value.
  • the object may be a human face.
  • the shield may be a mask.
  • the image analysis method includes: a step (S101) in which a control unit (12) acquires a captured image; a first region detection step (S102a) of detecting, in the captured image, a first region showing an object partially shielded by a shield; a second region detection step (S102b) of detecting, in the captured image, a second region showing the object that is not shielded by the shield; and an analysis step of specifying an object region (B) indicating a region in which the object appears in the captured image.
  • in the analysis step, when the control unit (12) determines that the first region has been detected in the first region detection step (S102a) but that the second region has not been detected in the second region detection step (S102b),
  • the control unit (12) specifies the first region as the object region (B) (S110).
  • when the control unit (12) determines that the first region has been detected in the first region detection step (S102a) and that the second region has been detected in the second region detection step (S102b),
  • the control unit (12) specifies a region including the first region and the second region as the object region (B) (S108).
  • the program according to one aspect of the present disclosure causes the control unit to execute the image analysis method of the above aspect.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

An image analysis device 100 is provided with a first detector that detects a first area in a captured image, the first area indicating an object partially shielded by a shield, a second detector that detects a second area in a captured image, the second area indicating an object that is not shielded by a shield, and an analysis unit that identifies an object area in a captured image, the object area indicating an area containing an object. The analysis unit, upon determining that the first area has been detected by the first detector and that the second area has been detected by the second detector, identifies an area that encompasses the first area and the second area as the object area.

Description

Image analysis device, image analysis method, and program
 The present disclosure relates to an image analysis device, an image analysis method, and a program.
 Image processing techniques, such as template matching, that detect a desired object from a captured image are known. According to the prior art, for example, a human face can be detected from a captured image. However, when the human face is shielded by a mask or the like, the accuracy of face detection deteriorates. As a technique for addressing such a problem, for example, Patent Document 1 discloses an image analysis device that accurately detects the face of a person shielded by a shield such as a mask.
Japanese Unexamined Patent Publication No. 2018-151919
 However, the prior art improves the detection accuracy of the object only when the object and the shield are in a predetermined positional relationship. For example, the prior art enhances the detection accuracy of a normal masked face with the nose and mouth shielded. In some cases, the mask is moved downward and worn so that the nose is not covered, for reasons such as relieving a feeling of suffocation; the conventional technique cannot accurately detect the face in such a case.
 An object of the present disclosure is to provide an image analysis technique capable of detecting an object according to more various shielding modes than the conventional technique when the object is shielded by a shield.
 The image analysis apparatus according to one aspect of the present disclosure includes:
 an image acquisition unit that acquires a captured image;
 a first detector that detects, in the captured image, a first region indicating an object partially shielded by a shield;
 a second detector that detects, in the captured image, a second region indicating the object that is not shielded by the shield; and
 an analysis unit that specifies an object region indicating a region in which the object appears in the captured image.
 The analysis unit
 specifies the first region as the object region when it determines that the first region has been detected by the first detector but that the second region has not been detected by the second detector, and
 specifies a region including the first region and the second region as the object region when it determines that the first region has been detected by the first detector and that the second region has been detected by the second detector.
 The image analysis method according to one aspect of the present disclosure includes:
 a step in which a control unit acquires a captured image;
 a first region detection step of detecting, in the captured image, a first region indicating an object partially shielded by a shield;
 a second region detection step of detecting, in the captured image, a second region indicating the object that is not shielded by the shield; and
 an analysis step of specifying an object region indicating a region in which the object appears in the captured image.
 In the analysis step, the control unit
 specifies the first region as the object region when it determines that the first region has been detected in the first region detection step but that the second region has not been detected in the second region detection step, and
 specifies a region including the first region and the second region as the object region when it determines that the first region has been detected in the first region detection step and that the second region has been detected in the second region detection step.
 According to the present disclosure, when an object is shielded by a shield, it is possible to obtain an image analysis technique that detects the object according to more various shielding modes than the conventional technique.
 FIG. 1 is a diagram for explaining an application example of the face detection device according to the present disclosure. FIG. 2 is a block diagram showing an example of the hardware configuration of the face detection device of FIG. 1. FIG. 3 is a block diagram showing a functional configuration example of the control unit of the face detection device shown in FIG. 2. FIG. 4 is a flowchart showing an example of the face detection process executed by the control unit. FIG. 5 is a flowchart showing an example of the face candidate rectangle detection process of FIG. 4. FIG. 6 is a schematic diagram showing an example of the masked face detector of FIG. 3. FIG. 7 is a schematic diagram illustrating a captured image including the masked face candidate rectangles R1 and the unmasked face candidate rectangles R2. FIG. 8 is a schematic diagram illustrating the intermediate face rectangle A. FIG. 9 is a schematic diagram illustrating the merge target rectangle. FIG. 10 is a schematic diagram illustrating the final face rectangle B.
 Hereinafter, embodiments of the face detection device according to the present disclosure will be described with reference to the attached drawings. In each of the following embodiments, the same reference numerals are given to the same components.
1. Application Example
 First, an example of a scene in which the face detection device 100 according to the embodiment is applied will be described with reference to FIG. 1. FIG. 1 schematically illustrates a face detection system 1, which is an example of an application scene of the face detection device 100. The face detection device 100 is an example of the "image analysis device" of the present disclosure.
 顔検出システム1は、顔検出装置100を備える。顔検出システム1は、例えば、カメラ3、目開閉検出装置50、視線検出装置60、及び顔向き検出装置70を更に備えてもよい。顔検出装置100は、カメラ3によって撮像された撮像画像を取得し、撮像画像内で人の顔が映った領域(以下、「顔領域」という。)Bを抽出する情報処理装置である。人の顔は、本開示の「対象物」の一例であり、顔領域Bは、撮像画像内で対象物が映った領域を示す「対象物領域」の一例である。 The face detection system 1 includes a face detection device 100. The face detection system 1 may further include, for example, a camera 3, an eye opening / closing detection device 50, a line-of-sight detection device 60, and a face orientation detection device 70. The face detection device 100 is an information processing device that acquires a captured image captured by the camera 3 and extracts a region (hereinafter, referred to as “face region”) B in which a human face is reflected in the captured image. The human face is an example of the "object" of the present disclosure, and the face area B is an example of the "object area" indicating the area in which the object is reflected in the captured image.
 顔検出装置100は、例えば工場において製品の組立てや梱包等の作業を行なう作業者の顔を検出する顔検出システム1に利用される。顔検出装置100によって検出された作業者の顔領域Bに対して、例えば、後続の目開閉検出装置50、視線検出装置60、及び顔向き検出装置70等による検出処理が実行されてもよい。目開閉検出装置50は、例えば、顔領域Bを画像解析して、目、上眼瞼、下眼瞼等の位置を検出し、その開閉の回数、頻度等を測定する。視線検出装置60は、例えば、顔領域Bを画像解析して瞳孔の位置を検出し、これにより瞳孔又は視線の位置及び移動速度等を測定する。顔向き検出装置70は、顔領域Bを画像解析し、例えば公知のテンプレートマッチングの手法によって顔が向いている方向を検出する。 The face detection device 100 is used in, for example, a face detection system 1 that detects the face of a worker who assembles or packs a product in a factory. For the face area B of the worker detected by the face detection device 100, for example, detection processing by the subsequent eye opening / closing detection device 50, the line-of-sight detection device 60, the face orientation detection device 70, or the like may be executed. The eye opening / closing detection device 50, for example, analyzes the face region B by an image, detects the positions of the eyes, upper eyelid, lower eyelid, etc., and measures the number of times, frequency, and the like of opening / closing. The line-of-sight detection device 60 detects, for example, the position of the pupil by image-analyzing the face region B, thereby measuring the position of the pupil or the line of sight, the moving speed, and the like. The face orientation detection device 70 analyzes the face region B by an image, and detects the direction in which the face is facing by, for example, a known template matching method.
 目開閉検出装置50及び視線検出装置60の結果は、例えば作業者の覚醒度を検出するために利用される。例えば、覚醒度が低い眠気状態に陥ると、作業者の瞳孔の位置の移動範囲が狭くなり、又はその移動若しくはサッカードの速さが小さくなることが知られている。また、眠気状態に陥ると、例えば作業者の上眼瞼と下眼瞼との距離が小さくなる。すなわち、瞼が閉じかかった状態になる。このような場合、顔検出システム1は、例えば作業者の覚醒度が低いと判断する。また、例えば作業者の目が閉じたままである場合、作業者が眠っていると判断されてもよい。 The results of the eye opening / closing detection device 50 and the line-of-sight detection device 60 are used, for example, to detect the alertness of the worker. For example, it is known that when a person falls into a drowsy state with low alertness, the range of movement of the position of the pupil of the worker becomes narrow, or the speed of the movement or saccade becomes small. In addition, when a person falls into a drowsy state, for example, the distance between the upper eyelid and the lower eyelid of the operator becomes smaller. That is, the eyelids are about to close. In such a case, the face detection system 1 determines, for example, that the worker's alertness is low. Also, for example, if the worker's eyes remain closed, it may be determined that the worker is asleep.
 また、顔向き検出装置70によって、作業者の顔の向きが頻繁に変わっていることが検出された場合、作業者の注意が散漫になっている可能性がある。 Further, when the face orientation detection device 70 detects that the orientation of the worker's face changes frequently, the worker's attention may be distracted.
 上記のような場合、顔検出システム1は、図示しないスピーカから作業者に休憩を促すアナウンスを流す等の制御をしてもよい。顔検出システム1は、工場の作業ラインを制御する制御部を備えてもよい。これにより、例えば、顔検出システム1は、作業者の覚醒度が低下した場合、工場の作業ラインを止めることにより、ミス及び事故の発生を防止することができる。また、顔検出システム1は、作業者の覚醒度が低下した場合、工場管理者、共同作業者、並びに産業医及び看護師等の医療従事者等に通知してもよい。これにより、これらの者が作業計画の見直しをする、といった対応を採ることができる。このようにして、覚醒度の低下に起因する事故、ミス等の発生を防止することができる。 In the above case, the face detection system 1 may control such as sending an announcement urging the operator to take a break from a speaker (not shown). The face detection system 1 may include a control unit that controls a work line in a factory. As a result, for example, the face detection system 1 can prevent the occurrence of mistakes and accidents by stopping the work line of the factory when the alertness of the worker is lowered. In addition, the face detection system 1 may notify the factory manager, collaborators, and medical workers such as industrial physicians and nurses when the arousal level of the worker decreases. As a result, these persons can take measures such as reviewing the work plan. In this way, it is possible to prevent the occurrence of accidents, mistakes, etc. due to a decrease in alertness.
 上記のような目開閉検出装置50、視線検出装置60、及び顔向き検出装置70による検出は、顔領域の中に目等の顔の器官が含まれていることが前提となる。カメラ3によって撮像される作業者は、マスクを着用している可能性がある。すなわち、作業者の顔はマスクにより遮蔽されている可能性がある。マスクは、本開示の「遮蔽物」の一例である。 The detection by the eye opening / closing detection device 50, the line-of-sight detection device 60, and the face orientation detection device 70 as described above is based on the premise that facial organs such as eyes are included in the face region. The worker imaged by the camera 3 may be wearing a mask. That is, the worker's face may be covered by a mask. The mask is an example of the "shield" of the present disclosure.
 マスク等の遮蔽物により遮蔽されていない人の顔(以下、「マスク無し顔」という。)を含む撮像画像2aから顔領域Bを正確に検出することは、従来のマスク無し顔検出器123により実行可能である。また、マスク等の遮蔽物により遮蔽された人の顔(以下、「マスク有り顔」という。)であって、鼻及び口が遮蔽されたものを含む撮像画像2bから顔領域Bを検出することも、従来のマスク有り顔検出器122により実行可能である。 Accurate detection of the face region B from the captured image 2a including the face of a person who is not shielded by a shield such as a mask (hereinafter, referred to as “unmasked face”) is performed by the conventional maskless face detector 123. It is feasible. Further, the face region B is detected from the captured image 2b including the face of a person shielded by a shield such as a mask (hereinafter referred to as "face with mask") whose nose and mouth are shielded. Can also be performed by the conventional masked face detector 122.
 Thus, detecting the face region B from the face of a person wearing a mask covering the nose and mouth can be performed by the conventional masked face detector 122. However, detecting the face region B from a captured image 2c containing a masked face whose nose is not covered (hereinafter, a "nose-out masked face") cannot be performed by either the conventional unmasked face detector 123 or the masked face detector 122. For example, the conventional masked face detector 122 has the problem of erroneously detecting a region that does not contain the eyes as the face region B.
 To solve this problem, one could conceive of constructing a nose-out masked face detector using a trained model built by having a model such as a convolutional neural network (CNN) learn a large number of images of nose-out masked faces. However, there can be many ways of wearing a mask, such as how much of the nose is exposed, even within the category of "nose-out masked face", so it is difficult to cover every variation, and it is also difficult to obtain large numbers of images of nose-out masked faces for each variation. Furthermore, adding a third nose-out masked face detector on top of the unmasked face detector 123 and the masked face detector 122 would increase the amount of computation performed by the face detection device, increasing load and processing time.
 Therefore, the present disclosure provides a face detection device 100 that can detect a nose-out masked face by using the detection results of the unmasked face detector 123 and the masked face detector 122. The face detection device 100 can detect a nose-out masked face while suppressing increases in load and processing time.
2. Configuration Example
[Hardware Configuration]
 FIG. 2 is a block diagram showing an example of the hardware configuration of the face detection device 100 according to the present embodiment. The face detection device 100 includes an input unit 11, a control unit 12, a storage unit 13, and a communication interface (I/F) 14.
 The input unit 11 is an interface circuit that connects the face detection device 100 to an external device such as the camera 3.
 The control unit 12 is an information processing device that includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like, and controls each component of the face detection device 100 in accordance with information processing.
 The storage unit 13 is a medium that accumulates information such as programs by electrical, magnetic, optical, mechanical, or chemical action so that the stored information can be read by a computer or other device or machine. The storage unit 13 is, for example, an auxiliary storage device such as a hard disk drive or a solid state drive, and stores the image processing program, the face detection program, and the like executed by the control unit 12.
 The communication I/F 14 includes an interface circuit for enabling a communication connection between the face detection device 100 and an external device. The communication I/F 14 communicates according to standards such as IEEE 802.3, IEEE 802.11 or Wi-Fi, LTE, 3G, 4G, and 5G. The communication I/F 14 may be an interface circuit that communicates according to standards such as USB (Universal Serial Bus), HDMI (High Definition Multimedia Interface), IEEE 1394, and Bluetooth.
 The camera 3 is connected to the face detection device 100 via the input unit 11. The camera 3 is, for example, an infrared camera having sensitivity to infrared light. In this case, an infrared irradiator that irradiates infrared light toward the imaging range of the camera 3 may be attached to the face detection device 100. The camera 3 may instead be a visible-light camera having sensitivity to visible light.
 For example, when the face detection device 100 is used to detect the faces of workers in a factory, the camera 3 is placed at a location from which the workers in the factory can be photographed. Alternatively, the camera 3 may be a wearable camera worn on the worker's head, such as a glasses-type camera or a head-mounted camera. The camera 3 may also be connected to the face detection device 100 via a network and the communication I/F 14.
[Functional Configuration]
 FIG. 3 is a block diagram showing an example of the functional configuration of the control unit 12 of the face detection device 100 shown in FIG. 2. The control unit 12 includes an image acquisition unit 121, a masked face detector 122, an unmasked face detector 123, a mask presence/absence determination unit 124, a first merge processing unit 125, a merge target search unit 126, and a second merge processing unit 127.
 The image acquisition unit 121 acquires, via the input unit 11, a captured image captured by the camera 3. The masked face detector 122 detects a masked face in the acquired captured image. The unmasked face detector 123 detects an unmasked face in the acquired captured image. The mask presence/absence determination unit 124 determines whether the face in the captured image is a masked face based on the detection results of the masked face detector 122 and the unmasked face detector 123. The masked face detector 122 is an example of the "first detector" of the present disclosure, and the unmasked face detector 123 is an example of the "second detector" of the present disclosure.
 When the mask presence/absence determination unit 124 determines that the face in the captured image is a masked face, the first merge processing unit 125 combines (merges) the detected masked face candidate rectangles R1 (see FIG. 7) to generate an intermediate face rectangle A (see FIG. 8). The region enclosed by a masked face candidate rectangle R1 is an example of the "first region" of the present disclosure. Hereinafter, the merge processing executed by the first merge processing unit 125 is referred to as the "first merge processing".
 The merge target search unit 126 searches the unmasked face candidate rectangles R2 (see FIG. 7) for a merge target rectangle that satisfies the conditions described below. The region enclosed by an unmasked face candidate rectangle R2 is an example of the "second region" of the present disclosure. When a merge target rectangle is detected, the second merge processing unit 127 takes the intermediate face rectangle A (see FIG. 9) as the merged rectangle, and takes a rectangle that encompasses the merge target rectangle and the intermediate face rectangle A as the final face rectangle B (see FIG. 10). The region enclosed by the final face rectangle B is an example of the "object region" of the present disclosure.
 The functional block including the mask presence/absence determination unit 124, the first merge processing unit 125, the merge target search unit 126, and the second merge processing unit 127 is an example of the "analysis unit" of the present disclosure. The functional block including the first merge processing unit 125, the merge target search unit 126, and the second merge processing unit 127 is an example of the "region identification unit" of the present disclosure.
 A detailed operation example of the control unit 12 will be described later.
 Each process performed by the image acquisition unit 121, the masked face detector 122, the unmasked face detector 123, the mask presence/absence determination unit 124, the first merge processing unit 125, the merge target search unit 126, and the second merge processing unit 127 may be executed by the control unit 12 running the necessary programs. These programs may be stored in the storage unit 13. When executing a necessary program, the control unit 12 loads the program stored in the storage unit 13 into the RAM, interprets and executes the loaded program with the CPU, and thereby controls each component of the face detection device 100.
 In the present embodiment, an example in which each function of the control unit 12 is realized by a CPU is described. However, some or all of the above functions may be realized by one or more dedicated processors. In addition, functions may be omitted, replaced, or added to the components of the control unit 12 as appropriate depending on the embodiment. The control unit 12 may be composed of various semiconductor integrated circuits such as a CPU, MPU, GPU, microcontroller, DSP, FPGA, or ASIC.
3. Operation Example
 FIG. 4 is a flowchart showing an example of the face detection processing executed by the control unit 12 of the face detection device 100. The processing procedure described below is merely an example, and the procedure and each process may be changed wherever possible.
(Step S101)
 First, the control unit 12 operates as the image acquisition unit 121 and acquires, via the input unit 11, a captured image captured by the camera 3 (S101). For example, the camera 3 captures images at a constant frame rate. In step S101, the control unit 12 may acquire a plurality of captured images. In the following, a processing example in which the control unit 12 proceeds to the next step S102 after acquiring one captured image will be described. However, the present disclosure is not limited to this; for example, in step S101 the control unit 12 may acquire a moving image composed of a plurality of frames, or a plurality of still images.
(Step S102)
 Next, the control unit 12 executes face candidate rectangle detection processing that detects unmasked face candidate rectangles and masked face candidate rectangles (S102). The face candidate rectangle detection processing of step S102 will be described in detail with reference to FIG. 5.
 FIG. 5 is a flowchart showing an example of the face candidate rectangle detection processing of step S102. For example, the control unit 12 detects masked face candidate rectangles with the masked face detector 122 under the conditions of detected face size iSize, detected face rotation angle iAngle, and detected face position iPos (S102a), and then detects unmasked face candidate rectangles with the unmasked face detector 123 (S102b). Unlike in FIG. 5, step S102a may be executed after step S102b.
 In steps S102a and S102b, the control unit 12 cuts out a part of the captured image acquired in step S101 under the conditions of the detected face size iSize, the detected face rotation angle iAngle, and the detected face position iPos. The detected face size iSize indicates the size of the image cut out from the captured image, and is specified, for example, in vertical and horizontal pixels. In the loop over the detected face size iSize in FIG. 5, the size of the image cut out from the captured image is changed on each iteration.
 The detected face rotation angle iAngle indicates the angle of the image cut out from the captured image, and is expressed, for example, as an angle of 0° or more and less than 360°. The detected face position iPos indicates the position of the image cut out from the captured image. The control unit 12 detects whether the cut-out part of the captured image matches a template image stored in advance in the storage unit 13.
 The control unit 12 determines that the cut-out part of the captured image matches the template image when a score indicating the reliability of the cut-out part is equal to or greater than a predetermined threshold. The score is an example of the "reliability" of the present disclosure. The score is, for example, an index indicating the degree of similarity between the cut-out part of the captured image and the template image. The score takes a value in the range of 0 to 1, for example, and a larger value means a higher degree of similarity between the cut-out part of the captured image and the template image.
 The control unit 12 stores, for example, the border of that part of the captured image in the storage unit 13 as a masked face candidate rectangle or an unmasked face candidate rectangle, and also stores the score corresponding to that face candidate rectangle in the storage unit 13.
 The control unit 12 performs the detection processing of steps S102a and S102b under a large number of conditions of detected face size iSize, detected face rotation angle iAngle, and detected face position iPos, while varying i by incrementing it, for example.
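 As a concrete illustration of this search loop, the following Python sketch scans a captured image over candidate positions, scores each cut-out patch against a template, and keeps patches whose score clears a threshold. For brevity only the position loop (iPos) is written out; the size (iSize) and rotation (iAngle) loops of FIG. 5 would wrap around it in the same way. The threshold value and the normalized-correlation scoring are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

SCORE_THRESHOLD = 0.5  # assumed value; the disclosure only says "predetermined threshold"

def template_score(patch, template):
    """Normalized correlation in [0, 1] between a patch and the template (same shape)."""
    p = (patch - patch.mean()) / (patch.std() + 1e-8)
    t = (template - template.mean()) / (template.std() + 1e-8)
    corr = float((p * t).mean())          # in [-1, 1]
    return (corr + 1.0) / 2.0             # map to [0, 1], like the score in the text

def detect_face_candidates(image, template, step=8):
    """Sketch of the S102a/S102b search: returns ((x, y, w, h), score) candidate rectangles."""
    h, w = template.shape
    candidates = []
    for y in range(0, image.shape[0] - h + 1, step):      # detected face position iPos (rows)
        for x in range(0, image.shape[1] - w + 1, step):  # detected face position iPos (cols)
            patch = image[y:y + h, x:x + w]
            score = template_score(patch, template)
            if score >= SCORE_THRESHOLD:
                candidates.append(((x, y, w, h), score))
    return candidates
```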
 The method of detecting face candidate rectangles is not limited to the above example. For example, the control unit 12 may detect face candidate rectangles by a known template matching method. Alternatively, face candidate rectangles may be detected by a machine learning algorithm such as AdaBoost, or by a detector constructed through machine learning of a model such as a convolutional neural network (CNN).
 FIG. 6 is a schematic diagram showing an example of the masked face detector 122 of FIG. 3 constructed in this way. The masked face detector 122 is, for example, a detector using a known cascade method, and includes first to N-th classifiers 2-1 to 2-N.
 Each classifier identifies whether an input target image is an image of a masked face, as follows. The target image is, for example, a part of the captured image cut out under the conditions of detected face size iSize, detected face rotation angle iAngle, and detected face position iPos described above.
 First, the first classifier 2-1 identifies whether the target image is a masked face. If it identifies the image as a masked face, the first classifier 2-1 outputs the target image to the second classifier 2-2. If the first classifier 2-1 identifies that the target image is not a masked face, it discards the target image, for example by deleting it. The subsequent second to N-th classifiers 2-2 to 2-N likewise identify whether the input target image is a masked face, in the same manner as the first classifier 2-1. The final-stage N-th classifier 2-N outputs the target image when it identifies the input target image as a masked face. The output target image has thus been identified as a masked face by the masked face detector 122. In other words, the target image is judged to show a masked face only when all of the first to N-th classifiers 2-1 to 2-N consistently identify it as a masked face.
 The masked face detector 122 determines, for example, the border of the target image output from the N-th classifier 2-N as a masked face candidate rectangle. In this way, the masked face candidate rectangle detection processing of step S102a is completed.
 Each of the first to N-th classifiers 2-1 to 2-N has an identification condition for identifying whether the target image is a masked face, and these identification conditions have different degrees of strictness. The identification condition of the first classifier 2-1 is the most lenient, and the identification condition of the second classifier 2-2 is the next most lenient after that of the first classifier 2-1. In this way, later classifiers have stricter identification conditions, and the final-stage N-th classifier 2-N has the strictest. Identification under a lenient condition can be performed with a small number of features, so the amount of computation is small. Therefore, by starting with lenient classifiers and arranging progressively stricter classifiers as described above, target images that are not masked faces can be rejected by the earlier classifiers, and thus with a small amount of computation. This reduces the processing load of the masked face detector 122 and increases the processing speed.
 The unmasked face detector 123 of FIG. 3 has the same configuration as the masked face detector 122 and detects unmasked face candidate rectangles.
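 The cascade structure described above can be sketched as follows in Python. Here each stage is modeled as a scoring function paired with its own acceptance threshold, ordered from lenient to strict; the stage functions and threshold values are illustrative assumptions, since the disclosure does not specify how each classifier is implemented.

```python
from typing import Callable, List, Tuple
import numpy as np

Stage = Tuple[Callable[[np.ndarray], float], float]  # (stage score function, threshold)

def cascade_detect(target_image: np.ndarray, stages: List[Stage]) -> bool:
    """Return True only if every stage, from lenient to strict, accepts the image."""
    for score_fn, threshold in stages:
        if score_fn(target_image) < threshold:
            return False        # rejected early: cheap early stages discard most non-faces
    return True                 # accepted by all N stages -> treated as a masked face

# Example usage with two toy stages (assumed, not taken from the disclosure):
stages = [
    (lambda img: float(img.mean()), 0.1),   # stage 1: very lenient, very cheap
    (lambda img: float(img.std()),  0.2),   # stage 2: stricter
]
is_masked = cascade_detect(np.random.rand(24, 24), stages)
```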
(Step S103)
 If neither a masked face candidate rectangle nor an unmasked face candidate rectangle is detected in step S102, the control unit 12 ends the processing of FIG. 4 (S103). For example, when the captured image contains no human face, neither a masked face candidate rectangle nor an unmasked face candidate rectangle is detected in step S102.
(Step S104)
 Next, the control unit 12 operates as the mask presence/absence determination unit 124 and determines whether the face in the captured image is a masked face, based on the masked face candidate rectangles and the unmasked face candidate rectangles detected in step S102 (S104). An example of the masked face determination processing of step S104 will be described with reference to FIG. 7.
 FIG. 7 is a schematic diagram illustrating a captured image containing the masked face candidate rectangles R1 and the unmasked face candidate rectangles R2 detected in step S102. In FIG. 7, the masked face candidate rectangles R1 are drawn with double lines, and the unmasked face candidate rectangles R2 with broken lines. In step S104, whether the face in the captured image is a masked face is determined based on the one or more masked face candidate rectangles R1 and/or the one or more unmasked face candidate rectangles R2 detected in step S102. In the example shown in FIG. 7, three masked face candidate rectangles R1 and two unmasked face candidate rectangles R2 are shown.
 As described for step S102 with reference to FIG. 5, each masked face candidate rectangle R1 and each unmasked face candidate rectangle R2 has a score. In step S104, the control unit 12 counts a weighted number for the masked face candidate rectangles R1 and for the unmasked face candidate rectangles R2. For example, the control unit 12 calculates each weighted number by summing the scores of the masked face candidate rectangles R1 and the scores of the unmasked face candidate rectangles R2, respectively. In the example shown in FIG. 7, if the scores of the three masked face candidate rectangles R1 are, for example, 0.7, 0.8, and 0.75, the summed score for the masked face candidate rectangles R1 is 0.7 + 0.8 + 0.75 = 2.25. On the other hand, if the scores of the two unmasked face candidate rectangles R2 are, for example, 0.2 and 0.1, the summed score for the unmasked face candidate rectangles R2 is 0.2 + 0.1 = 0.3.
 Then, for example, when the summed score for the masked face candidate rectangles R1 is equal to or greater than the summed score for the unmasked face candidate rectangles R2, the control unit 12 determines that the face in the captured image is a masked face. Conversely, when the summed score for the masked face candidate rectangles R1 is less than the summed score for the unmasked face candidate rectangles R2, the control unit 12 determines that the face in the captured image is an unmasked face. In the above example, the summed score of 2.25 for the masked face candidate rectangles R1 is greater than the summed score of 0.3 for the unmasked face candidate rectangles R2, so the control unit 12 determines that the face in the captured image is a masked face.
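 A minimal sketch of this score-weighted determination (step S104), assuming each candidate rectangle is paired with its score as in the detection sketch above:

```python
def is_masked_face(masked_candidates, unmasked_candidates):
    """Weighted count by score summation, as in step S104.

    Each argument is a list of (rectangle, score) pairs. Returns True when the
    masked-face evidence is at least as strong as the unmasked-face evidence.
    """
    masked_weight = sum(score for _, score in masked_candidates)
    unmasked_weight = sum(score for _, score in unmasked_candidates)
    return masked_weight >= unmasked_weight

# Example with the FIG. 7 scores: 0.7, 0.8, 0.75 versus 0.2, 0.1 -> masked face
masked = [((10, 10, 50, 50), 0.7), ((12, 11, 50, 50), 0.8), ((9, 12, 50, 50), 0.75)]
unmasked = [((40, 40, 30, 30), 0.2), ((80, 15, 30, 30), 0.1)]
assert is_masked_face(masked, unmasked)  # 2.25 >= 0.3
```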
(Step S105)
 When it is determined in step S104 that the face in the captured image is a masked face, the control unit 12 merges the masked face candidate rectangles R1 to generate the intermediate face rectangle A (S105). The case where the control unit 12 determines that the face in the captured image is a masked face (Yes in S104) is an example of "when the analysis unit determines that the first region has been detected by the first detector" in the present disclosure. FIG. 8 is a schematic diagram illustrating the intermediate face rectangle A generated by the first merge processing in step S105. FIG. 8 schematically illustrates the intermediate face rectangle A generated by executing the first merge processing on the three masked face candidate rectangles R1 of FIG. 7.
 A known method may be used for the first merge processing executed in step S105. For example, the control unit 12 operating as the first merge processing unit 125 calculates the position of the center of gravity of the intermediate face rectangle A based on the positions of the centers of gravity of the three masked face candidate rectangles R1 in FIG. 7 and their respective scores, and takes a rectangle of predetermined shape whose center of gravity is the calculated position as the intermediate face rectangle A.
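 A sketch of one possible first merge processing, under the assumption that the center of the intermediate rectangle is the score-weighted centroid of the candidate centers and that its size is the score-weighted average of the candidate sizes; the disclosure only requires that some known merging method be used, so these choices are illustrative.

```python
def first_merge(candidates):
    """Merge (x, y, w, h) rectangles with scores into one intermediate rectangle A."""
    total = sum(score for _, score in candidates)
    # Score-weighted centroid of the candidate centers.
    cx = sum((x + w / 2) * s for (x, y, w, h), s in candidates) / total
    cy = sum((y + h / 2) * s for (x, y, w, h), s in candidates) / total
    # Score-weighted average size (an assumption; any predetermined shape would do).
    aw = sum(w * s for (x, y, w, h), s in candidates) / total
    ah = sum(h * s for (x, y, w, h), s in candidates) / total
    return (cx - aw / 2, cy - ah / 2, aw, ah)

# Example usage with the FIG. 7 masked face candidate rectangles:
candidates = [((10, 10, 50, 50), 0.7), ((12, 11, 50, 50), 0.8), ((9, 12, 50, 50), 0.75)]
intermediate_A = first_merge(candidates)
```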
(Steps S106, S107)
 Next, the control unit 12 operates as the merge target search unit 126 and searches the unmasked face candidate rectangles R2 (see FIG. 7) for a merge target rectangle satisfying predetermined conditions (S106). If a merge target rectangle is detected, the processing proceeds to step S108; otherwise, it proceeds to step S110 (S107). The case where a merge target rectangle is detected in step S107 (Yes in S107) is an example of "when the analysis unit determines that the first region has been detected by the first detector and that the second region has been detected by the second detector" in the present disclosure, and the case where no merge target rectangle is detected (No in S107) is an example of "when the analysis unit determines that the first region has been detected by the first detector but the second region has not been detected by the second detector". In step S107, for example, the processing proceeds to Yes when the captured image shows a nose-out masked face, and to No when it shows an ordinary masked face whose nose and mouth are covered.
 The predetermined conditions of step S106 will be described with reference to FIG. 9. FIG. 9 is a schematic diagram illustrating a merge target rectangle. The control unit 12 selects, from among the unmasked face candidate rectangles R2, one that satisfies the following four conditions and takes it as the merge target rectangle. In other words, the merge target rectangle satisfies all of the following conditions.
(Condition 1)
 The upper edge of the unmasked face candidate rectangle R2 is located above the upper edge of the intermediate face rectangle A.
(Condition 2)
 The distance between the upper edge of the unmasked face candidate rectangle R2 and the upper edge of the intermediate face rectangle A is equal to or less than a predetermined threshold H.
(Condition 3)
 The distance between the left edge of the unmasked face candidate rectangle R2 and the left edge of the intermediate face rectangle A, and the distance between the right edge of the unmasked face candidate rectangle R2 and the right edge of the intermediate face rectangle A, are each equal to or less than a predetermined threshold W.
(Condition 4)
 The score of the unmasked face candidate rectangle R2 is equal to or greater than a predetermined threshold.
 Regarding condition 4, the score of the unmasked face candidate rectangle R2 indicates, for example, the probability that the image shown within the unmasked face candidate rectangle R2 is an unmasked face.
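 The following sketch checks conditions 1 to 4 for each unmasked face candidate rectangle against the intermediate rectangle A. The threshold values H, W, and the score threshold are assumptions, since the disclosure leaves them as predetermined parameters, and image coordinates are assumed to increase downward, so a smaller y means "above".

```python
def find_merge_target(unmasked_candidates, intermediate_A,
                      thr_H=20.0, thr_W=20.0, thr_score=0.3):
    """Return the first (rectangle, score) pair satisfying conditions 1-4, or None."""
    ax, ay, aw, ah = intermediate_A
    for (x, y, w, h), score in unmasked_candidates:
        cond1 = y < ay                           # condition 1: upper edge above A's upper edge
        cond2 = (ay - y) <= thr_H                # condition 2: upper edges within threshold H
        cond3 = (abs(x - ax) <= thr_W and
                 abs((x + w) - (ax + aw)) <= thr_W)  # condition 3: left/right edges within W
        cond4 = score >= thr_score               # condition 4: score above threshold
        if cond1 and cond2 and cond3 and cond4:
            return (x, y, w, h), score
    return None
```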
(Step S108)
 When a merge target rectangle is detected in step S106, the control unit 12 merges the merge target rectangle into the intermediate face rectangle A and determines the final face rectangle B (S108). Hereinafter, the merge processing executed in step S108 is referred to as the "second merge processing". FIG. 10 is a schematic diagram illustrating the final face rectangle B generated by the second merge processing in step S108.
 The second merge processing outputs a rectangle that encompasses the merge target rectangle and the merged rectangle. In step S108, the control unit 12 operating as the second merge processing unit 127 takes the intermediate face rectangle A (see FIG. 9) as the merged rectangle, and takes a rectangle that encompasses the merge target rectangle and the intermediate face rectangle A as the final face rectangle B.
 Thus, in step S108, when the control unit 12 determines that an unmasked face candidate rectangle has been detected by the unmasked face detector 123 and that a masked face candidate rectangle has been detected by the masked face detector 122, it specifies the final face rectangle B so as to combine the merge target rectangle and the intermediate face rectangle A.
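 A minimal sketch of the second merge processing, taking the axis-aligned bounding rectangle of the intermediate rectangle A and the merge target rectangle as the final rectangle B:

```python
def second_merge(intermediate_A, merge_target):
    """Return the smallest rectangle encompassing both input (x, y, w, h) rectangles."""
    ax, ay, aw, ah = intermediate_A
    mx, my, mw, mh = merge_target
    x0 = min(ax, mx)
    y0 = min(ay, my)
    x1 = max(ax + aw, mx + mw)
    y1 = max(ay + ah, my + mh)
    return (x0, y0, x1 - x0, y1 - y0)  # final face rectangle B
```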
(Step S110)
 When no merge target rectangle is detected in step S106, the control unit 12 does not perform the second merge processing or the like, and specifies the intermediate face rectangle A as the final face rectangle B (S110).
(Step S109)
 Returning to step S104, when it is determined that the face in the captured image is not a masked face, the control unit 12 executes the first merge processing on all of the unmasked face candidate rectangles R2 and determines the final face rectangle B (S109).
4. Operation and Effects
 As described above, the face detection device 100, which is an example of the image analysis device, includes the image acquisition unit 121 that acquires a captured image, the masked face detector 122 as an example of the first detector, the unmasked face detector 123 as an example of the second detector, and the control unit 12 operating as the analysis unit. The masked face detector 122 detects, in the captured image, a first region showing a face (an example of the object) partially shielded by a mask (an example of the shield). The unmasked face detector 123 detects, in the captured image, a second region showing a face not shielded by a mask. The control unit 12 specifies a face region indicating the region in which the face appears in the captured image. When the control unit 12 determines that the first region has been detected by the masked face detector 122 (Yes in S104) but that the second region has not been detected by the unmasked face detector 123 (No in S107), it specifies the first region as the face region (S110). When the control unit 12 determines that the first region has been detected by the masked face detector 122 (Yes in S104) and that the second region has been detected by the unmasked face detector 123 (Yes in S107), it specifies a region encompassing the first region and the second region as the face region (S108). Specifying a region encompassing the first region and the second region as the face region is an example of the second merge processing of the present disclosure.
 With the face detection device 100 described above, for a captured image of the face of a person wearing a mask covering the nose and mouth, the first region, that is, the detection result of the masked face detector 122 alone, can be specified as the face region. In addition, for a captured image of the face of a person wearing a mask so that the nose is not covered (a nose-out masked face), the detection results of both the masked face detector 122 and the unmasked face detector 123 are used, and a region encompassing the first region and the second region can be specified as the face region. Nose-out masked faces can involve diverse ways of wearing the mask, such as different degrees of nose exposure, the mouth being covered by the mask, or the mouth also being uncovered, but the face detection device 100 can detect the face region in captured images of faces with these various ways of wearing the mask. In this way, when the face is shielded by a mask, the face detection device 100 can detect the face region in response to a wider variety of shielding modes than the prior art.
 Further, since the face detection device 100 does not need a dedicated detector for detecting nose-out masked faces, it can detect nose-out masked faces while suppressing increases in load and processing time. For example, when the face detection device 100 is used to measure a worker's alertness, such as drowsiness, real-time processing that performs face detection immediately after the worker's face is imaged is required, and the face detection device 100 of the present disclosure is advantageous because it can suppress increases in load and processing time.
 Note that, when the analysis unit of the face detection device 100 determines that the first region has not been detected by the first detector but that the second region has been detected by the second detector, it may specify the second region as the object region. In this way, the face detection device 100 can also detect the face region in a captured image of the face of a person not wearing a mask.
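 Putting the pieces together, the overall decision of the analysis unit can be sketched as follows, reusing the helper functions defined in the sketches above (is_masked_face, first_merge, find_merge_target, second_merge). The control flow mirrors steps S103 to S110 of FIG. 4, while the helpers themselves remain illustrative assumptions.

```python
def analyze(masked_candidates, unmasked_candidates):
    """Return the final face rectangle B, or None if no face was detected."""
    if not masked_candidates and not unmasked_candidates:
        return None                                             # S103: no face in the image
    if is_masked_face(masked_candidates, unmasked_candidates):  # S104: masked face?
        A = first_merge(masked_candidates)                      # S105: intermediate rectangle A
        target = find_merge_target(unmasked_candidates, A)      # S106: search merge target
        if target is not None:                                  # S107 Yes: nose-out masked face
            return second_merge(A, target[0])                   # S108: encompassing rectangle B
        return A                                                # S110: ordinary masked face
    return first_merge(unmasked_candidates)                     # S109: unmasked face
```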
5. Modifications
 Although an embodiment of the present disclosure has been described in detail above, the foregoing description is in all respects merely an illustration of the present disclosure. Various improvements and modifications can be made without departing from the scope of the present disclosure; for example, the following changes are possible. In the following, the same reference numerals are used for components similar to those of the above embodiment, and description of points similar to the above embodiment is omitted as appropriate. The following modifications can be combined as appropriate.
[5-1. First Modification]
 In the above embodiment, the face detection system 1 applied to factory use has been described. However, the present disclosure is not limited to this. For example, the face detection system 1 may be used in an office or the like. For example, when the alertness of a worker who keeps working in the same place, such as a desk worker doing desk work in an office or at home, has decreased, the face detection system 1 may perform control such as playing an announcement from a speaker urging the worker to take a break. This can reduce the risk of mistakes and the like in desk work.
 The face detection system 1 may be applied to in-vehicle use. For example, when the face detection device 100 is used to detect the face of a driver driving a vehicle, the camera 3 is mounted in front of the driver, for example near the steering column cover, the dashboard, or the rearview mirror. The position of the camera 3 is not limited to these, and may be any position from which the driver's face can be imaged.
 When the driver's alertness has decreased, for example when the driver feels drowsy, the face detection system 1 may execute control to vibrate a vibration device attached to the seat and/or to have a speaker output a warning sound or an announcement prompting a break. Further, for example, when the driver's alertness has decreased, the face detection system 1 may control the steering, brakes, and the like of the vehicle to perform automatic driving control and automatic brake control. This makes it possible to prevent accidents caused by a decrease in the driver's alertness.
 The face detection system 1 may be applied to medical use. For example, it is known that the frequency of saccades increases in dementia patients, such as patients with dementia with Lewy bodies or Alzheimer-type dementia, and in people with mild cognitive impairment. The face detection system 1 may therefore be used in the diagnosis of dementia, mild cognitive impairment, and the like by detecting the frequency of saccades with the line-of-sight detection device 60.
 The face detection device 100 can also be applied to the automatic face detection function of a digital camera. Furthermore, the face detection device 100 can be used to detect the faces of pedestrians on roads and inside buildings such as station premises, for example for security purposes. In this case, the camera 3 may be arranged so as to photograph roads and the inside of buildings such as station premises.
[5-2. Second Modification]
 In the above embodiment, rectangular face regions such as the masked face candidate rectangle R1, the unmasked face candidate rectangle R2, the intermediate face rectangle A, and the final face rectangle B have been described. However, the shapes of these face regions are not limited to rectangles. For example, the shapes of these face regions may be quadrilaterals other than rectangles, polygons, circles, or ellipses. In the above embodiment, the second merge processing was described as processing that outputs a rectangle encompassing the merge target rectangle and the merged rectangle; when the shape of the face region is not a rectangle, the second merge processing is processing that generates a result region encompassing a plurality of merge target regions. For example, the second merge processing is processing that generates a result region which encompasses two merge target regions and which is in contact with each of the two merge target regions at at least one point.
(Supplementary Notes)
 Various aspects of the present disclosure are noted below.
 An image analysis device (100) according to one aspect of the present disclosure includes:
 an image acquisition unit (121) that acquires a captured image;
 a first detector (122) that detects, in the captured image, a first region showing an object partially shielded by a shield;
 a second detector (123) that detects, in the captured image, a second region showing the object not shielded by the shield; and
 an analysis unit that specifies an object region (B) indicating the region in which the object appears in the captured image.
 The analysis unit
 specifies the first region as the object region (B) when it determines that the first region has been detected by the first detector (122) but that the second region has not been detected by the second detector (123), and
 specifies a region encompassing the first region and the second region as the object region (B) when it determines that the first region has been detected by the first detector (122) and that the second region has been detected by the second detector (123).
 The image analysis device (100) may further include a determination unit (124) that determines whether, in the captured image, the object is shielded by the shield, and
 the analysis unit may, when the determination unit (124) determines that the object is shielded by the shield, detect a merge target region satisfying predetermined conditions from among the second regions and specify a region encompassing the first region and the merge target region as the object region (B).
 The conditions may include that the merge target region is in a predetermined positional relationship with the first region.
 The analysis unit may determine an intermediate region (A) by combining a plurality of the first regions, and specify a region encompassing the intermediate region (A) and the merge target region as the object region (B).
 The conditions may include that the merge target region is in a predetermined positional relationship with the intermediate region (A).
 The conditions may include that the reliability of the merge target region is equal to or greater than a predetermined threshold.
 The object may be a human face.
 The shield may be a mask.
 An image analysis method according to one aspect of the present disclosure includes:
 a step (S101) in which a control unit (12) acquires a captured image;
 a first region detection step (S102a) of detecting, in the captured image, a first region showing an object partially shielded by a shield;
 a second region detection step (S102b) of detecting, in the captured image, a second region showing the object not shielded by the shield; and
 an analysis step of specifying an object region (B) indicating the region in which the object appears in the captured image.
 In the analysis step, the control unit (12)
 specifies the first region as the object region (B) when it determines that the first region has been detected in the first region detection step (S102a) but that the second region has not been detected in the second region detection step (S102b) (S110), and
 specifies a region encompassing the first region and the second region as the object region (B) when it determines that the first region has been detected in the first region detection step (S102a) and that the second region has been detected in the second region detection step (S102b) (S108).
 A program according to one aspect of the present disclosure causes a control unit to execute the image analysis method of the above aspect.
1 Face detection system
3 Camera
11 Input unit
12 Control unit
13 Storage unit
14 Communication I/F
50 Eye opening/closing detection device
60 Line-of-sight detection device
70 Face orientation detection device
100 Face detection device (image analysis device)
121 Image acquisition unit
122 Masked face detector (first detector)
123 Unmasked face detector (second detector)
124 Mask presence/absence determination unit
125 First merge processing unit
126 Merge target search unit
127 Second merge processing unit

Claims (10)

  1.  An image analysis device comprising:
      an image acquisition unit that acquires a captured image;
      a first detector that detects, in the captured image, a first region showing an object partially shielded by a shield;
      a second detector that detects, in the captured image, a second region showing the object not shielded by the shield; and
      an analysis unit that specifies an object region indicating the region in which the object appears in the captured image,
      wherein the analysis unit
      specifies the first region as the object region when it determines that the first region has been detected by the first detector but that the second region has not been detected by the second detector, and
      specifies a region encompassing the first region and the second region as the object region when it determines that the first region has been detected by the first detector and that the second region has been detected by the second detector.
  2.  The image analysis device according to claim 1, further comprising a determination unit that determines whether, in the captured image, the object is shielded by the shield,
      wherein the analysis unit, when the determination unit determines that the object is shielded by the shield, detects a merge target region satisfying predetermined conditions from among the second regions and specifies a region encompassing the first region and the merge target region as the object region.
  3.  The image analysis device according to claim 2, wherein the conditions include that the merge target region is in a predetermined positional relationship with the first region.
  4.  The image analysis device according to claim 2 or 3, wherein the analysis unit determines an intermediate region by combining a plurality of the first regions, and specifies a region encompassing the intermediate region and the merge target region as the object region.
  5.  The image analysis device according to claim 4, wherein the conditions include that the merge target region is in a predetermined positional relationship with the intermediate region.
  6.  The image analysis device according to any one of claims 2 to 5, wherein the conditions include that the reliability of the merge target region is equal to or greater than a predetermined threshold.
  7.  The image analysis device according to any one of claims 1 to 6, wherein the object is a human face.
  8.  The image analysis device according to claim 7, wherein the shield is a mask.
  9.  An image analysis method comprising:
      a step in which a control unit acquires a captured image;
      a first region detection step of detecting, in the captured image, a first region showing an object partially shielded by a shield;
      a second region detection step of detecting, in the captured image, a second region showing the object not shielded by the shield; and
      an analysis step of specifying an object region indicating the region in which the object appears in the captured image,
      wherein in the analysis step, the control unit
      specifies the first region as the object region when it determines that the first region has been detected in the first region detection step but that the second region has not been detected in the second region detection step, and
      specifies a region encompassing the first region and the second region as the object region when it determines that the first region has been detected in the first region detection step and that the second region has been detected in the second region detection step.
  10.  A program for causing a control unit to execute the image analysis method according to claim 9.
PCT/JP2020/028671 2019-08-30 2020-07-27 Image analysis device, image analysis method, and program WO2021039231A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-158658 2019-08-30
JP2019158658A JP7276013B2 (en) 2019-08-30 2019-08-30 Image analysis device, image analysis method, and program

Publications (1)

Publication Number Publication Date
WO2021039231A1 true WO2021039231A1 (en) 2021-03-04

Family

ID=74683399

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/028671 WO2021039231A1 (en) 2019-08-30 2020-07-27 Image analysis device, image analysis method, and program

Country Status (2)

Country Link
JP (1) JP7276013B2 (en)
WO (1) WO2021039231A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009217607A (en) * 2008-03-11 2009-09-24 Seiko Epson Corp Calculation for reliability in detecting face region in image
JP2013196034A (en) * 2012-03-15 2013-09-30 Toshiba Corp Human image processing apparatus, and human image processing method
JP2018151919A (en) * 2017-03-14 2018-09-27 オムロン株式会社 Image analysis apparatus, image analysis method, and image analysis program

Also Published As

Publication number Publication date
JP7276013B2 (en) 2023-05-18
JP2021039422A (en) 2021-03-11

Similar Documents

Publication Publication Date Title
Rahman et al. Real time drowsiness detection using eye blink monitoring
Ahmed et al. Robust driver fatigue recognition using image processing
CN104573622B (en) Human face detection device, method
CN112754498B (en) Driver fatigue detection method, device, equipment and storage medium
US20220309808A1 (en) Driver monitoring device, driver monitoring method, and driver monitoring-use computer program
Hasan et al. State-of-the-art analysis of modern drowsiness detection algorithms based on computer vision
Khan et al. Efficient Car Alarming System for Fatigue Detection during Driving
CN112208544B (en) Driving capability judgment method for driver, safe driving method and system thereof
WO2021039231A1 (en) Image analysis device, image analysis method, and program
US11995898B2 (en) Occupant monitoring device for vehicle
JP2020149499A (en) Occupant observation device
Tarba et al. The driver's attention level
Agarkar et al. Driver Drowsiness Detection and Warning using Facial Features and Hand Gestures
Swetha et al. Vehicle Accident Prevention System Using Artificial Intelligence
Keyvanara et al. Robust real-time driver drowsiness detection based on image processing and feature extraction methods
Thummar et al. A real time driver fatigue system based on eye gaze detection
Rajput et al. Accident Prevention Using Drowsiness Detection
Bhargava et al. Drowsiness detection while driving using eye tracking
CN111696312A (en) Passenger observation device
Amanullah et al. Accident prevention by eye-gaze tracking using imaging Constraints
Varghese et al. Drowsiness Detection and Alert Android App Using OpenCV
Reddy et al. Real-Time Fatigue Detection System using OpenCV and Deep Learning
JP7255691B2 (en) Display system, display method, and program
Huu et al. Detecting Drivers Falling Asleep Algorithm Based on Eye and Head States
Priya et al. Machine Learning-Based System for Detecting and Tracking Driver Drowsiness

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20856325

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20856325

Country of ref document: EP

Kind code of ref document: A1