CN111917967A - Door monitoring system and control method thereof - Google Patents

Door monitoring system and control method thereof

Info

Publication number
CN111917967A
Authority
CN
China
Prior art keywords
image data
door
region
image
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910373374.0A
Other languages
Chinese (zh)
Inventor
袁坡
潘生俊
赵俊能
丹尼尔马里尼克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Eyecloud Technology Co ltd
Original Assignee
Hangzhou Eyecloud Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Eyecloud Technology Co ltd filed Critical Hangzhou Eyecloud Technology Co ltd
Priority to CN201910373374.0A (CN111917967A)
Priority to US16/503,452 (US20190340904A1)
Publication of CN111917967A


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N 7/183 Closed-circuit television [CCTV] systems for receiving images from a single remote source
    • H04N 7/186 Video door telephones
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00 Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/80 Camera processing pipelines; Components thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30196 Human being; Person
    • G06T 2207/30201 Face
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30232 Surveillance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Lock And Its Accessories (AREA)

Abstract

The present application relates to a door monitoring system and a control method thereof. The door monitoring system acquires image data of a visiting object near the door through a camera system arranged in a cat-eye hole in the door, processes and analyzes the acquired image data with an artificial intelligence algorithm, and selectively sends the image data of the visiting object to a mobile terminal carried by the homeowner once a preset condition is determined to be met, thereby allowing the homeowner to intelligently monitor the area adjacent to the door. An interactive interface is also arranged in the cat-eye hole to receive an interaction request operation from the visiting object; after such an operation is received, the image data of the visiting object and the interaction request are sent to the mobile terminal carried by the homeowner so as to implement remote interaction between the homeowner and the visiting object. The interaction request includes, but is not limited to, an unlocking request, a voice call request, and a video call request.

Description

Door monitoring system and control method thereof
Technical Field
The present invention relates generally to a door monitoring system, and more particularly, to a door monitoring system capable of implementing a remote interactive function and a control method thereof.
Background
Door monitoring systems are of great significance for safeguarding personal and property safety. Mainstream door monitoring systems today are motion-triggered: they detect whether an object moves in the area near the door and turn on a video monitoring function once movement is detected. However, such door monitoring systems have a number of drawbacks in practical applications.
First, any moving object can trigger the video monitoring function. That is, existing door monitoring systems cannot discern whether the moving object is an object of interest, resulting in a large number of false detections and alarms. For example, when a cat or a dog enters the monitoring area of the door monitoring system, the video monitoring function is triggered and a corresponding warning signal is generated, which undoubtedly causes great trouble to the user.
Second, when the moving object is a visitor (or another object that wishes to interact with the homeowner), the person standing in front of the door may wish to interact with the homeowner, e.g., to make a voice call or a video call, or to request unlocking. However, the interaction supported by existing door monitoring systems is limited to scenarios in which the homeowner is in the house. For example, a visitor may press a doorbell to request unlocking; in response to the doorbell being triggered, an indoor unit communicatively coupled to the doorbell generates a prompt to the homeowner. If the homeowner is in the house, he or she can interact with the visitor standing outdoors through the indoor unit after receiving the prompt; if the homeowner is not in the house, the visitor's interaction needs cannot be met. That is, existing door monitoring systems lack the ability to implement remote interaction.
Accordingly, a need exists for a door monitoring system that facilitates remote interactive functionality.
Disclosure of Invention
The present application is proposed to solve the above-mentioned technical problems. The embodiment of the application provides a door monitoring system and a control method thereof, wherein a camera system arranged in a cat eye hole on a door is used for acquiring image data of a visiting object adjacent to the door, the image data acquired by the camera system is processed and analyzed based on an artificial intelligence algorithm, and then the image data of the visiting object is selectively sent to a mobile terminal carried by a homeowner after a preset condition is determined to be met, so that the homeowner is allowed to intelligently monitor the area near the door. Meanwhile, an interactive interface is also arranged in the cat eye hole and used for receiving interactive request operation of the visiting object, wherein after the interactive request operation of the visiting object is received, the image data of the visiting object and the interactive request are sent to a mobile terminal carried by a homeowner so as to implement remote interaction between the homeowner and the visiting object. In the embodiment of the present application, the interaction request includes, but is not limited to, an unlocking request, a voice call request, a video call request, and the like.
According to an aspect of the present application, there is provided a door monitoring system including: a camera system arranged in a cat-eye hole in a door of a residence, the camera system including a motion detector for detecting whether an object moves within the field of view of the camera system, and a first camera device facing the outer side of the door for acquiring image data of a visiting object in an area adjacent to the outer side of the door; an interactive interface arranged in the cat-eye hole for receiving an interaction request operation of the visiting object; and a door controller including a processor and a memory having computer program instructions stored thereon which, when executed by the processor, cause the processor to: process at least a part of the image data of the visiting object to determine that at least one condition is met, wherein the at least one condition includes that the objects contained in the image data include a human or that the image data includes a human face region; in response to determining that the at least one condition is met, output at least a portion of the image data of the visiting object to a mobile terminal; and/or, in response to receiving an interaction request operation of the visiting object, output at least a part of the image data of the visiting object together with the interaction request to the mobile terminal.
In the above door monitoring system, the interactive request includes any one of an unlocking request, a voice call request, and a video call request.
In the above door monitoring system, the processor is further configured to: receive an unlock control command from the mobile terminal, the unlock control command being used to trigger unlocking of an electronically controlled door lock installed on the door; and, in response to receiving the unlock control command, unlock the electronically controlled door lock to open the door of the dwelling.
In the above door monitoring system, the camera system further comprises a second camera device, wherein the second camera device faces the inner side of the door for collecting the image data of the visiting object adjacent to the inner area of the door.
In the above door monitoring system, the processor is further configured to: processing at least a portion of the image data using a first deep neural network model to determine that an object contained in the image data is a human; processing the at least a portion of the image data using a second deep neural network model to determine that the image data includes a face region; and determining that at least one condition is satisfied in response to determining that a human is included in an object included in the image data or that a face region is included in the image data.
In the above door monitoring system, the first deep neural network model and the second deep neural network model each include N depthwise separable convolutional layers for obtaining a feature map of the image data, where N is a positive integer from 4 to 12, and each depthwise separable convolutional layer includes a depthwise convolutional layer for applying a single filter to each input channel and a pointwise convolutional layer for linearly combining the outputs of the depthwise convolution to obtain an updated feature map.
In the above door monitoring system, the processor is further configured to: identify the image regions that differ between a first image and a second image contained in at least a portion of the image data; aggregate the differing image regions to obtain at least one region of interest; perform grayscale processing on the at least one region of interest; process the grayscale-processed at least one region of interest using the first deep neural network model so as to classify the object contained in the at least one region of interest; and determine that the object contained in the at least one region of interest includes a human.
In the above door monitoring system, the processor is further configured to: identify the image regions that differ between a first image and a second image contained in at least a portion of the image data; aggregate the differing image regions to obtain at least one region of interest; perform grayscale processing on the at least one region of interest; and process the grayscale-processed at least one region of interest using the second deep neural network model to determine that the at least one region of interest includes a human face region.
According to another aspect of the present application, there is provided a control method including: detecting whether an object moves within the field of view of a camera system including a first camera device, wherein the camera system is disposed in a cat-eye hole in a door of a residence; capturing, by the camera system, image data of the visiting object in response to detecting object movement within the field of view of the camera system; receiving an interaction request operation of the visiting object through an interactive interface; processing, by a door controller, at least a part of the image data of the visiting object to determine that at least one condition is satisfied, wherein the at least one condition includes determining that the objects contained in the image data include a human or determining that a face region is included in the image data; outputting at least a portion of the image data to a mobile terminal via the door controller in response to determining that the at least one condition is satisfied; and outputting at least a part of the image data together with the interaction request to the mobile terminal via the door controller in response to receiving the interaction request operation of the visiting object.
In the above control method, the interaction request includes an unlocking request, and the method further includes: receiving, by the door controller, an unlock control command from the mobile terminal, the unlock control command being used to trigger unlocking of an electronically controlled door lock installed on the door, the electronically controlled door lock being communicatively connected to the door controller for controlling the opening and closing of the door; and unlocking the electronically controlled door lock to open the door of the dwelling in response to receiving the unlock control command.
In the above control method, the camera system further includes a second camera device disposed at a door and facing an inner side of the door, for acquiring image data of the visiting object adjacent to an inner area of the door.
In the above control method, processing at least a part of the image data of the visiting object by a door controller to determine that at least one condition is satisfied includes: processing at least a portion of the image data using a first deep neural network model to determine that an object contained in the image data is a human; processing the at least a portion of the image data using a second deep neural network model to determine that the image data includes a face region; and determining that at least one condition is satisfied in response to determining that a human is included among the objects contained in the image data or that a face region is included in the image data.
In the above control method, the first deep neural network model and the second deep neural network model each include N depthwise separable convolutional layers for obtaining a feature map of the image data, where N is a positive integer from 4 to 12, and each depthwise separable convolutional layer includes a depthwise convolutional layer for applying a single filter to each input channel and a pointwise convolutional layer for linearly combining the outputs of the depthwise convolution to obtain an updated feature map.
In the above control method, processing, by the door controller, at least a part of the image data using the first deep neural network model to determine that the objects contained in the image data include a human includes: identifying the image regions that differ between a first image and a second image contained in at least a portion of the image data; aggregating the differing image regions to obtain at least one region of interest; performing grayscale processing on the at least one region of interest; processing the grayscale-processed at least one region of interest using the first deep neural network model so as to classify the object contained in the at least one region of interest; and determining that the object contained in the at least one region of interest includes a human.
In the above control method, processing, by the door controller, the at least a part of the image data using the second deep neural network model to determine that the image data includes a face region includes: identifying the image regions that differ between a first image and a second image contained in at least a portion of the image data; aggregating the differing image regions to obtain at least one region of interest; performing grayscale processing on the at least one region of interest; and processing the grayscale-processed at least one region of interest using the second deep neural network model to determine that the at least one region of interest includes a human face region.
The door monitoring system provided by the present application effectively acquires image data of a visiting object near the door through the camera system arranged in the cat-eye hole in the door, processes and analyzes the acquired image data based on an artificial intelligence algorithm, and selectively sends the image data of the visiting object to the mobile terminal carried by the homeowner after determining that a preset condition is met, so as to allow the homeowner to intelligently perform video monitoring of the area near the door. Meanwhile, an interactive interface is also arranged in the cat-eye hole to receive an interaction request operation of the visiting object; after such an operation is received, the image data of the visiting object and the interaction request are sent to the mobile terminal carried by the homeowner so as to implement remote interaction between the homeowner and the visiting object. In the embodiment of the present application, the interaction request includes, but is not limited to, an unlocking request, a voice call request, and a video call request.
Drawings
These and/or other aspects and advantages of the present invention will become more apparent and more readily appreciated from the following detailed description of the embodiments of the invention, taken in conjunction with the accompanying drawings of which:
FIG. 1 illustrates a first schematic diagram of a door monitoring system according to an embodiment of the present application.
FIG. 2 illustrates a second schematic diagram of a door monitoring system according to an embodiment of the present application.
Fig. 3 illustrates a first schematic diagram of a camera system, an interactive interface, and an optical peep hole integrally configured in a door according to an embodiment of the present application.
Fig. 4 illustrates a second schematic diagram of a camera system, an interactive interface, and an optical peep hole integrally configured in a door according to an embodiment of the present application.
FIG. 5 is a flow chart illustrating a process in which the door controller processes at least a portion of the image data using the first deep neural network model to determine that a human is included among the objects contained in the image data, according to an embodiment of the application.
FIG. 6 is a flow chart illustrating a process in which the door controller processes at least a portion of the image data using the second deep neural network model to determine that a face region is included in the image data, according to an embodiment of the application.
Fig. 7 illustrates a flow chart of a control method according to an embodiment of the application.
Detailed Description
Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are merely some embodiments of the present application and not all embodiments of the present application, with the understanding that the present application is not limited to the example embodiments described herein.
According to the technical content disclosed by the application, the application provides a door monitoring system which is used for remotely monitoring an area near a door and implementing remote interaction between a visiting object and a house owner, wherein the door monitoring system collects image data of the visiting object adjacent to the door through a camera system arranged in a cat eye hole in the door, processes and analyzes the image data collected by the camera system based on an artificial intelligence algorithm, and then selectively sends the image data of the visiting object to a mobile terminal carried by the house owner after the condition that a preset condition is met is determined, so that the house owner can intelligently monitor the area near the door through video. Meanwhile, an interactive interface is further arranged in the cat eye hole and used for receiving interactive request operation of the visiting object, wherein after the interactive request operation of the visiting object is received, the image data of the visiting object and the interactive request are sent to a mobile terminal carried by a homeowner, so that remote interaction between the homeowner and the visiting object is implemented. In particular, in the embodiment of the present application, the camera system and the interactive interface are integrally installed in a peep hole in the door, so that on one hand, the integral structure of the door can be ensured to be kept intact, and on the other hand, the camera system is effectively hidden in the peep hole to be protected.
In particular, in the embodiment of the present application, the interaction request includes, but is not limited to, an unlocking request, a voice call request, a video call request, and the like. That is, in the embodiment of the present application, the visiting object may remotely interact with the homeowner through the interactive interface, including but not limited to voice call, video call, request for unlocking, and the like.
In particular, in the embodiment of the present application, the at least one preset condition includes: that the objects included in the image data include a human, or that the image data includes a face region. In this way, the homeowner can view the visiting object in the image data through the mobile terminal, i.e., perform video monitoring of the visiting object located in the area near the door. In addition, in the embodiment of the present application, the door monitoring system processes the image data through an artificial intelligence algorithm and transmits at least a part of the image data to the remote mobile terminal only after determining that at least one preset condition is met. In this way, the transmission of invalid or erroneous image data to the mobile terminal can be effectively reduced, lowering the power consumption of the door monitoring system.
In particular, the artificial intelligence algorithm employs a specific deep neural network model, which can achieve a good balance between computational cost and detection accuracy. And the deep neural network model has a relatively small model size, can be directly deployed at a programmable embedded chip end, and is used for analyzing and processing the image data so as to be beneficial to application and popularization of a deep learning network in an embedded terminal.
Exemplary door monitoring System
FIG. 1 illustrates a first schematic diagram of a door monitoring system 10 according to an embodiment of the present application. As shown in fig. 1, the door monitoring system 10 according to an embodiment of the present application includes: an electronically controlled door lock 11, a door lock control interface 12, an interactive interface 13, a door controller 14, a camera system 15, and a mobile terminal 16.
In the embodiment of the present application, as shown in fig. 1, the electronically controlled door lock 11 is installed on a door of a residence and is controllably switched between an open position and a closed position to control the opening or closing of the door. The door lock control interface 12 is communicatively connected to the electronically controlled door lock for implementing a safety confirmation mechanism, so as to controllably drive the electronically controlled door lock 11 between the open position and the closed position. Optionally, the door lock control interface 12 includes a keypad (e.g., a numeric keypad, an alphanumeric keypad, or another type of keypad) for receiving an input code (e.g., manually entered by a visiting object), and the electronically controlled door lock 11 is selectively switched between the open position and the closed position in response to a match between the input code and an unlock code. In other examples of the present application, the door lock control interface 12 may be implemented as a biometric interface, such as a voice recognition interface, a fingerprint recognition interface, or an iris recognition interface, for implementing a biometric-based safety confirmation mechanism. In the embodiment of the present application, the arrangement position of the door lock control interface 12 does not limit the present invention; for example, the door lock control interface 12 may be integrally provided at a specific position (e.g., an upper end portion or a lower end portion) of the electronically controlled door lock, or separately provided on the door near the electronically controlled door lock 11.
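A minimal sketch of the keypad-based safety confirmation described above, assuming hypothetical lock and code-store interfaces not specified by the patent. The lock is driven to the open position only when the entered code matches the unlock code.

```python
import hmac

def try_keypad_unlock(entered_code: str, unlock_code: str, door_lock) -> bool:
    # compare_digest performs a constant-time comparison, avoiding timing leaks
    if hmac.compare_digest(entered_code, unlock_code):
        door_lock.switch_to_open()   # assumed method on the lock driver
        return True
    return False
```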
In this embodiment of the present application, the camera system 15 is integrally disposed at the door for acquiring image data of a visiting object adjacent to the door. As shown in fig. 1, the camera system 15 is integrally configured in the cat-eye hole 220 of the door for collecting image data of the visiting object adjacent to the door. In other words, the camera system 15 can be regarded as a door monitoring camera system integrated in the cat-eye hole 220 for monitoring the area near the door. It will be appreciated that, since the camera system 15 is integrally mounted in the door's cat-eye hole 220, on the one hand, the overall structure of the door remains intact, and on the other hand, the camera system 15 is effectively hidden within the cat-eye hole 220 for protection.
More specifically, in the embodiment of the present application, the camera system 15 includes a motion detector 151 and at least one camera device, wherein the motion detector 151 is configured to detect whether there is movement of an object within the field of view of the camera system 15, and the at least one camera device is configured to acquire image data of the visiting object in the area near the door. Here, the image data of the visiting object represents video data and/or still picture data of the moving object, and the visiting object represents a moving object adjacent to the door. The motion detection result of the motion detector 151 serves as a control signal for triggering the at least one camera device to acquire image data. In particular, the at least one camera device is activated to capture image data of the moving object only in response to the motion detector 151 detecting object movement within the field of view of the camera system 15; in this way, the energy consumption of the camera system 15 can be effectively reduced, as sketched below.
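A sketch of the trigger logic just described: only the motion detector runs continuously, and a camera device is woken to record only when movement is detected. All interface names here are assumptions for illustration, not APIs from the patent.

```python
import time

def camera_system_loop(motion_detector, camera_device, door_controller):
    while True:
        if motion_detector.movement_in_view():    # assumed interface
            camera_device.wake()
            image_data = camera_device.capture()  # video and/or still frames
            door_controller.process(image_data)   # AI analysis step (below)
            camera_device.sleep()                 # back to low-power state
        time.sleep(0.1)                           # detector poll interval
```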
In the embodiment of the present application, the camera system 15 includes a first camera device 153, and the first camera device 153 faces the outer side of the door and is used for acquiring image data of a visiting object adjacent to the outer area of the door. Specifically, the first camera device 153 has a first field of view that covers a range from the area outside the door (e.g., 1 meter, 1.5 meters, etc. from the outside of the door), such that when the motion detector 151 detects that there is subject movement within the field of view of the first camera device 153, the first camera device 153 is activated to acquire image data of the visiting subject adjacent to the area outside the door in response to the motion detection result of the motion detector 151. That is, in this specific example, the image pickup system 15 includes the first image pickup device 153 facing the outside of the door for monitoring the vicinity of the outside of the door.
Optionally, in other examples of the present application, the camera system 15 further includes a second camera device 155, the second camera device 155 is embedded in the cat-eye hole 220 on the door and faces the inner side of the door, wherein the second camera device 155 has a second field of view covering a range from an area inside the door (e.g., 1 meter, 1.5 meters, etc. from the inner side of the door), so that when the motion detector 151 detects that there is an object moving within the field of view of the second camera device 155, the second camera device 155 is activated to acquire image data of the visiting object adjacent to the area inside the door in response to the motion detection result of the motion detector 151. That is, in this specific example, the image pickup system 15 includes the second image pickup apparatus 155, wherein the second image pickup apparatus 155 is disposed opposite to and cooperates with the first image pickup apparatus 153 to perform video monitoring of the door outer side and the outer side vicinity area at the same time.
It is noted that, in implementations, the camera system 15 can include a greater or smaller number of camera devices. For example, to increase the overall field of view of the camera system 15, the camera system 15 may further include a third camera device which, like the first camera device 153, faces the outside of the door but has a different installation height (i.e., is installed at a different height in the cat-eye hole), so that the first camera device 153 and the third camera device cover different field-of-view regions and the overall field of view of the camera system 15 is increased. It is worth mentioning that, in other examples, the third camera device and the first camera device 153 may instead be implemented with different field angles so as to cover different field-of-view regions, likewise increasing the overall field of view of the camera system 15.
In the present embodiment, the camera module (the first camera device 153 or the second camera device 155) can be implemented as and/or include any imaging sensor or device for capturing image data (e.g., video data or still image data, etc.) of a moving object in response to detecting the presence of object movement within the camera module's field of view. The camera module can store a certain amount of image data in a data buffer, for example, a circular buffer (which can store a certain amount of image data within a preset time period). In some embodiments of the present application, the image data can be stored in a computer readable storage medium of the door controller 14.
Accordingly, in this embodiment of the present application, the door controller 14 includes a processor and a memory (a computer readable storage medium) having stored thereon computer program instructions that, when executed by the processor, configure the processor to implement the control functions described below. Here, the processor may include, but is not limited to, a microprocessor, a controller, a digital signal processor, an application specific integrated circuit, a programmable gate array, or other separate or integrated logic circuits having data processing capabilities. The computer-readable storage medium may take any combination of one or more readable media. For example, the computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a specific operation process, the door controller 14 first processes at least a part of the image data based on a preset algorithm to determine whether the processing result satisfies at least one condition. In the embodiment of the present application, the at least one condition includes: a human being is included among the objects contained in the image data; or the image data includes a face region. When the processing result does not satisfy any condition (i.e., no human is included among the objects contained in the image data and no human face area is included in the image data), the door controller 14 stops further operations and controls the camera system 15 to return to its original state: only the motion detector 151 is active and the at least one camera device in the camera system 15 is in a sleep state. Conversely, when the processing result satisfies at least one condition, in response to determining that the at least one condition is satisfied, the door controller 14 outputs at least a portion of the image data to a mobile terminal 16 (e.g., a smart phone, a tablet computer, etc.) via a wireless communication network to allow the homeowner to video monitor the area near the door via the mobile terminal 16. Here, the wireless communication network may include, but is not limited to, a satellite communication network, a cellular communication network, a wireless internet communication network (e.g., WiFi), a wireless radio frequency communication network, and the like.
In particular, in this embodiment of the subject application, the door controller 14 processes at least a portion of the image data via an artificial intelligence algorithm to determine whether at least one condition is satisfied, namely: whether a human is included in an object included in the image data, or whether a face region is included in the image data. That is, the door controller 14 performs human object detection and face detection by an artificial intelligence algorithm to determine whether a human is included in the objects included in the image data or whether a face area is included in the image data.
For example, the door controller 14 may process at least a part of the image data using a motion-based object detection method as disclosed in application No. US 16/078,253 to perform human object detection, i.e., to determine whether a human is included among the objects contained in the image data. The method mainly includes the following steps.
First, at least a portion of the image data is processed to obtain at least one region of interest. In the field of image processing, a region of interest refers to an image region containing candidate objects potentially belonging to a given category, which is part of the overall image. As mentioned above, the object included in the image data is a moving visiting object, and thus, the at least one region of interest can be obtained by identifying a moving portion in the image data acquired by the camera system 15. For ease of subsequent understanding and explanation, in the present application, the method is defined as a motion-based region of interest extraction method.
Those skilled in the art will appreciate that, in image terms, the moving parts of a moving visiting object correspond to image regions whose content differs between successive images. Therefore, to obtain a region of interest in the image data, at least two images are needed so that the moving part can be obtained by comparison between them. That is, in the embodiment of the present application, the image data to be processed includes at least two frames of images (for convenience of description, defined as a first image and a second image), where the image regions that differ between the second image and the first image represent the moving parts of the visiting object in the image data. The first image and the second image are then compared to identify the moving parts, and the moving parts are aggregated to obtain the at least one region of interest.
In a specific implementation, the first image and the second image may be two images captured by the camera system 15 at a specific time interval; for example, the interval between capturing the first image and the second image may be set to 0.5 s. Of course, the time interval may be set to other values. For example, the first image and the second image may come from video data captured by the camera system 15 (within a specific time window, e.g., 15 s), with the first image and the second image being two consecutive frames of the video; in that case, the shooting interval between the two images is one frame period of the video.
Further, while the camera system 15 captures the first image and the second image, the camera device (the first camera device 153 or the second camera device 155) itself may undergo slight physical movement (e.g., translation, rotation, etc.), causing the background areas in the first image and the second image to shift. To avoid the adverse effects of such movement, in the embodiment of the present application, the physical movement of the camera device is compensated before the differing image regions between the first image and the second image are identified. For example, the second image may be translated, using position data acquired by a position sensor (e.g., a gyroscope) integrated with the camera device, to compensate for the physical movement. It will be appreciated that the purpose of translating the second image is to align the background in the second image with the background in the first image.
Further, after the at least one region of interest is obtained by the motion-based region-of-interest extraction method, grayscale processing is applied to convert the at least one region of interest into a grayscale image. As will be appreciated by those skilled in the art, in order to characterize a subject more richly, images captured by conventional camera devices are typically color images (e.g., in RGB or YUV format), which contain both luminance information and color information. Compared to a grayscale image, a color image has more data channels (the three channels R, G, and B). However, the color characteristics of an object contribute little to detecting the class to which the object belongs.
In the embodiment of the present application, the purpose of performing grayscale processing on the at least one region of interest is twofold: on one hand, converting the at least one region of interest into a grayscale image filters out its color information, reducing the computation cost of the deep neural network model; on the other hand, it prevents the color information from adversely affecting object detection and identification.
To further reduce the computational cost of the deep neural network, the size of the at least one region of interest may also be reduced to a particular size, e.g., 128 × 128 pixels. Here, the reduced size depends on the accuracy requirements for object detection in the specific application scenario and on the first deep neural network model, mentioned below, that processes the grayscale image. In other words, the reduced size needs to be chosen based on the architecture of the first deep neural network model and the accuracy requirements of object detection; the present application is not limited in this respect. A preprocessing sketch follows.
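A sketch of the motion-based region-of-interest extraction and preprocessing described above, using OpenCV frame differencing: identify the differing regions, aggregate them, convert each region to grayscale, and resize to 128 × 128. The thresholds and morphology settings are illustrative assumptions, not values from the patent, and the inputs are assumed to be already background-aligned.

```python
import cv2

def extract_rois(first_img, second_img, min_area=500, out_size=(128, 128)):
    # Assumes the second image has already been translated to align its
    # background with the first (compensating camera movement, see above).
    diff = cv2.absdiff(first_img, second_img)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)
    mask = cv2.dilate(mask, None, iterations=2)       # aggregate nearby blobs
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    rois = []
    for c in contours:
        if cv2.contourArea(c) < min_area:             # ignore tiny motions
            continue
        x, y, w, h = cv2.boundingRect(c)
        roi = cv2.cvtColor(second_img[y:y + h, x:x + w], cv2.COLOR_BGR2GRAY)
        rois.append(cv2.resize(roi, out_size))        # grayscale, 128 x 128
    return rois
```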
Further, in the embodiment of the present application, the first deep neural network model, which classifies the at least one region of interest after grayscale processing and thereby determines whether the visiting object contained therein includes a human, is configured based on depthwise separable convolution layers. Those skilled in the art will appreciate that a depthwise separable convolutional layer replaces the conventional convolution operation with a depthwise separable convolution operation in order to address the computational efficiency and parameter count of deep neural network models. Here, the depthwise separable convolution operation decomposes a conventional convolution into a depthwise convolution, which applies a single filter to each input channel, and a pointwise convolution, which linearly combines the outputs of the depthwise convolution to obtain an updated feature map. This decomposition effectively reduces both the computation cost and the model size of the deep neural network model. In other words, in this embodiment of the present application, each depthwise separable convolutional layer comprises a depthwise convolutional layer for applying a single filter to each input channel and a pointwise convolutional layer for linearly combining the outputs of the depthwise convolution to obtain an updated feature map. The first deep neural network model is thus compressed by restructuring its convolution operations so that it meets the application requirements of an embedded platform.
More specifically, in this embodiment of the present application, the first deep neural network model includes N depthwise separable convolution layers for obtaining a feature map of the at least one region of interest, where N is a positive integer from 4 to 12. Here, the number of depthwise separable convolutional layers depends on the latency and accuracy requirements of the specific application scenario. In particular, taking the use of the object detection method in the security monitoring field as an example, the deep neural network model includes 5 depthwise separable convolutional layers: the first includes 32 filters of size 3 × 3 (the depthwise convolutional layer) and a corresponding number of 1 × 1 filters (the pointwise convolutional layer); the second, connected to the first, includes 64 filters of size 3 × 3 and a corresponding number of 1 × 1 filters; the third, connected to the second, includes 128 filters of size 3 × 3 and a corresponding number of 1 × 1 filters; the fourth, connected to the third, includes 256 filters of size 3 × 3 and a corresponding number of 1 × 1 filters; and the fifth, connected to the fourth, includes 1024 filters of size 3 × 3 and a corresponding number of 1 × 1 filters.
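A PyTorch sketch of the five-layer depthwise-separable architecture just described (filter counts 32/64/128/256/1024, each a 3 × 3 depthwise convolution followed by a 1 × 1 pointwise convolution). The strides, activations, and the Softmax classifier head (anticipating the next paragraph) are assumptions; the patent specifies only the layer structure and filter counts.

```python
import torch
import torch.nn as nn

def depthwise_separable(in_ch, out_ch, stride=1):
    # depthwise 3x3: one filter per input channel (groups=in_ch);
    # pointwise 1x1: linear combination of the depthwise outputs
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1, groups=in_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1),
        nn.ReLU(inplace=True),
    )

class HumanClassifier(nn.Module):
    def __init__(self, num_classes=2):                # human / not human
        super().__init__()
        self.features = nn.Sequential(
            depthwise_separable(1, 32, stride=2),     # grayscale input
            depthwise_separable(32, 64, stride=2),
            depthwise_separable(64, 128, stride=2),
            depthwise_separable(128, 256, stride=2),
            depthwise_separable(256, 1024, stride=2),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):                             # x: (B, 1, 128, 128)
        z = self.pool(self.features(x)).flatten(1)
        return torch.softmax(self.fc(z), dim=1)       # Softmax classification
```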
After the feature map of the grayscale image is obtained by the predetermined number of depthwise separable convolution layers, the first deep neural network model further classifies the objects contained in the at least one region of interest and determines whether they include a human. In particular, in this embodiment of the present application, the deep neural network model classifies the candidate objects contained in the grayscale image using a Softmax multi-classification model.
In summary, the process by which the door controller 14 processes at least a portion of the image data with the first deep neural network model to determine that the objects contained in the image data include a human has been illustrated. Fig. 5 shows a flow chart of this process according to the embodiment of the application. As shown in fig. 5, the process includes: S310, identifying the image regions that differ between a first image and a second image contained in at least a part of the image data; S320, aggregating the differing image regions to obtain at least one region of interest; S330, performing grayscale processing on the at least one region of interest; S340, processing the grayscale-processed at least one region of interest using the first deep neural network model to classify the object contained therein; and S350, determining that the object contained in the at least one region of interest includes a human.
In this embodiment of the present application, the at least one condition further includes: the image data includes a face region. Accordingly, the door controller 14 can process at least a part of the image data in the following manner to determine that the image data includes a human face region.
First, at least a part of the image data is processed by the motion-based region-of-interest extraction method to extract at least one region of interest, and grayscale processing is performed on the at least one region of interest to reduce the computation cost of the second deep neural network model subsequently used to process it. This extraction and grayscale processing are consistent with the description above and are therefore not repeated here.
Further, the grayscale-processed at least one region of interest is processed by the second deep neural network model to determine that it includes a human face region. Here, the second deep neural network model may share the same basic architecture as the first deep neural network model; that is, it may also be configured based on depthwise separable convolution layers. For example, the first and second deep neural network models may share the same front layers and differ only in the last few layers, or differ only in the last layer, as sketched below. In this way, the two models are jointly compressed, reducing their storage footprint.
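A sketch of the weight-sharing idea above: the two models share their front layers and differ only in the final layer(s). It reuses the `HumanClassifier` backbone and imports from the previous sketch; the head sizes are assumptions for illustration.

```python
class SharedDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = HumanClassifier().features    # shared front layers
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.human_head = nn.Linear(1024, 2)          # first model's output
        self.face_head = nn.Linear(1024, 2)           # second model's output

    def forward(self, x):
        z = self.pool(self.backbone(x)).flatten(1)
        return self.human_head(z), self.face_head(z)
```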
During operation, the door controller 14 may process at least a portion of the image data in parallel to determine that at least one condition is satisfied. For example, the door controller 14 processes at least a portion of the image data with the first deep neural network model in a first thread to determine whether an object contained in the image data is a human; simultaneously, in a second thread, it processes the same portion with the second deep neural network model to determine whether the image data contains a face region; and, in response to determining that a human is included among the objects or that a face region is included in the image data, it determines that at least one condition is satisfied.
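A sketch of this parallel evaluation: the human check and the face check run concurrently, and the condition is met if either holds. A thread pool is one plausible reading of the "first thread"/"second thread" above; the model callables are assumed wrappers around the two networks.

```python
from concurrent.futures import ThreadPoolExecutor

def condition_satisfied(rois, is_human, has_face):
    # is_human / has_face: callables wrapping the first and second models
    with ThreadPoolExecutor(max_workers=2) as pool:
        human_future = pool.submit(lambda: any(is_human(r) for r in rois))
        face_future = pool.submit(lambda: any(has_face(r) for r in rois))
        return human_future.result() or face_future.result()
```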
Fig. 6 is a flow chart illustrating the process by which the door controller 14 processes at least a portion of the image data using the second deep neural network model to determine that a face region is included in the image data, according to the embodiment of the application. As shown in fig. 6, the process includes the steps of: S410, identifying the image regions that differ between a first image and a second image contained in at least a part of the image data; S420, aggregating the differing image regions to obtain at least one region of interest; S430, performing grayscale processing on the at least one region of interest; and S440, processing the grayscale-processed at least one region of interest using the second deep neural network model to determine that it includes a human face region.
In summary, in the embodiment of the present application, the door controller 14 processes at least a portion of the image data through an artificial intelligence algorithm to determine whether at least one condition is satisfied: whether a human is included in an object included in the image data, or whether a face region is included in the image data. When the result of the processing of the image data satisfies at least one condition, in response to determining that the at least one condition is satisfied, the door controller 14 outputs at least a portion of the image data to a mobile terminal 16 (e.g., a smart phone, a tablet computer, etc.) via a wireless communication network to allow a homeowner to video monitor an area near the door via the mobile terminal 16. Here, the wireless communication network may include, but is not limited to, a satellite communication network, a cellular communication network, a wireless internet communication network (e.g., WiFi), a wireless radio frequency communication network, and the like.
It should be understood that the door controller 14 determines whether a human or a human face area is included in the collected image data through an artificial intelligence algorithm to intelligently turn on/off the video monitoring function, in this way, false detection can be effectively filtered out, and the energy consumption of the camera system 15 can be reduced.
Further, after receiving the image data from the door monitoring system through the mobile terminal 16, the homeowner can examine the object information contained in the image data through the mobile terminal 16. Also, in the embodiment of the present application, the homeowner can actively interact with the visiting object via the mobile terminal 16. For example, when the homeowner finds that the visiting object contained in the image data is a potential intruder (e.g., a stranger), the homeowner may send an alert signal (e.g., a voice message) to the door controller 14 through the mobile terminal 16 to warn the potential intruder. For another example, when the homeowner determines that the object contained in the image data is a trusted object (e.g., a relative, friend, or family member of the homeowner), the homeowner may confirm via the mobile terminal 16 whether that object needs to enter the room and, after confirming the need, send an unlock control command to the door controller 14 through the mobile terminal 16 to remotely unlock the door. Accordingly, after receiving the unlock control command from the mobile terminal 16, the door controller 14 controls the electronically controlled door lock 11 to switch to the open position, allowing the trusted object to enter the room. It should be understood that the interaction patterns between the homeowner and the visiting object are not limited to the above examples, and the application is not limited thereto.
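A minimal sketch of the message handling just described: the door controller reacts to an alert or an unlock control command sent from the mobile terminal. The message format and method names are assumptions for illustration only.

```python
def on_mobile_message(door_controller, message):
    if message.get("type") == "alert":
        door_controller.play_audio(message["payload"])    # warn the intruder
    elif message.get("type") == "unlock":
        # drive the electronically controlled door lock to the open position
        door_controller.electronic_lock.switch_to_open()
```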
It should be appreciated that in a particular application scenario, when a visiting subject comes in front of a door with a definite purpose (e.g., visits a homeowner), there is often a need to actively interact with the homeowner. To meet this requirement, in the embodiment of the present application, the door monitoring system further includes an interaction interface 13 for receiving an interaction request operation of a visiting object, and outputting at least a portion of the image data and the interaction request of the visiting object to the mobile terminal 16 through the door controller 14 in response to receiving the interaction request operation of the visiting object. In other words, in the embodiment of the present application, the door monitoring system further has a function of actively performing remote interaction, that is, allowing a visiting object to actively issue an interaction request to remotely interact with a homeowner. In particular, in the embodiment of the present application, the interactive request includes, but is not limited to, a voice call request, a video call request, an unlocking request, and the like.
In one possible implementation, the interaction request is implemented as a voice call request; that is, by triggering the interactive interface 13, the visiting object may transmit the voice call request through the door controller 14 to the mobile terminal 16 carried by the homeowner. In this way, after granting the voice call request, the homeowner can make a voice call with the visiting object through the mobile terminal 16.
In one possible implementation, the interaction request is implemented as a video call request; that is, by triggering the interactive interface 13, the visiting object may transmit the video call request through the door controller 14 to the mobile terminal 16 carried by the homeowner. In this way, after granting the video call request, the homeowner can make a video call with the visiting object through the mobile terminal 16.
In one possible implementation, the interaction request is implemented as an unlocking request; that is, by triggering the interactive interface 13, the visiting object can transmit the unlocking request through the door controller 14 to the mobile terminal 16 carried by the homeowner. In this way, when the homeowner determines that the object included in the image data is a security object (e.g., a relative, friend, or family member of the homeowner), the homeowner can send an unlocking control command to the door controller 14 through the mobile terminal 16 to remotely unlock and open the door for the security object. Accordingly, in response to receiving the unlocking control command from the mobile terminal 16, the door controller 14 controls the electronically controlled door lock 11 to switch to the open position, unlocking the electronically controlled door lock 11 so that the security object can enter the room.
It should be understood that, in other implementations of the embodiment of the present application, the interactive interface 13 may integrate a plurality of interaction requests; for example, the interactive interface 13 may integrate a combination of at least two of the voice call request, the video call request, and the unlocking request. These examples are not intended to limit the scope of the present application.
In a specific implementation, the interactive interface 13 includes, but is not limited to, a press control interface (e.g., a button), a touch control interface (e.g., a touch screen), a voice control interface, a gesture control interface, and the like, each of which is configured to receive an interaction request operation of a visiting object and trigger the corresponding interaction request.
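As a rough sketch of how the interactive interface 13 might bundle a triggered request with the captured image data (the message format and all names below are assumptions; the disclosure does not prescribe them):

```python
from enum import Enum, auto

class InteractionRequest(Enum):
    VOICE_CALL = auto()
    VIDEO_CALL = auto()
    UNLOCK = auto()

def on_interface_trigger(request: InteractionRequest, frame: bytes) -> dict:
    """Bundle the visiting object's request with the image data so the
    door controller can forward both to the homeowner's mobile terminal."""
    return {"request": request.name, "image_data": frame}

# e.g. the visiting object presses the voice-call button:
payload = on_interface_trigger(InteractionRequest.VOICE_CALL, b"<jpeg>")
```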
In particular, in the present embodiment, the interactive interface 13 is integrated in the cat-eye hole 220 provided in the door. Specifically, fig. 3 illustrates a first schematic diagram of an integrated configuration of the camera system, the interactive interface, and the optical peephole on the door according to an embodiment of the present application. As shown in fig. 3, the interactive interface 13, the camera system 15, and the optical peephole 17 are integrally disposed in the cat-eye hole 220, wherein the optical peephole 17 is implemented as a common optical peephole comprising two optical lenses 170 disposed at the two ends of the cat-eye hole 220. In particular, in this example, the camera system 15 includes only the first camera device 153, which is mounted on the upper half of the optical lens 170 on the outer side of the door and faces the outside of the door; the interactive interface 13 is likewise mounted on the optical lens 170 on the outer side of the door, at the lower half of that optical lens 170. With such an arrangement (the relative positions of the first camera device 153 and the interactive interface 13), it is effectively ensured that the first camera device faces the visiting object when the visiting object actively triggers an interaction request operation, so that image data of the visiting object can be effectively acquired.
Further, other necessary elements, such as a battery (not shown), are also accommodated in the cat-eye hole 220. It is worth mentioning that, in the embodiment of the present application, since the camera devices 153, 155 of the camera system 15 process the image data through a deep neural network model with a special network architecture, they have relatively low power consumption, and the camera devices 153, 155 start the image acquisition operation only after the motion detector 151 integrated in the camera system 15 detects object movement. Therefore, compared with a conventional camera device integrated in a peephole, the camera devices 153 and 155 of the present application have the advantage of low energy consumption, so the power supply requirement of the door monitoring system can be met by a battery, which a conventional camera module cannot achieve.
Fig. 4 illustrates a second schematic diagram of the camera system, the interactive interface, and the optical peephole integrally configured in the door according to an embodiment of the present application. As shown in fig. 4, in this specific example, the camera system 15 includes the first camera device 153 and the second camera device 155, wherein the first camera device 153 is mounted on the upper half of the optical lens 170 on the outer side of the door and faces the outside of the door, and the second camera device 155 is mounted on the upper half of the optical lens 170 on the inner side of the door and faces the inside of the door; the interactive interface 13 is likewise mounted on the optical lens 170 on the outer side of the door, at the lower half of that optical lens 170. With such an arrangement (the relative positions of the first camera device 153 and the interactive interface 13), it is effectively ensured that the first camera device faces the visiting object when the visiting object actively triggers an interaction request operation, so that image data of the visiting object can be effectively acquired.
Further, other necessary elements, such as a battery (not shown), are likewise accommodated in the cat-eye hole 220. It is worth mentioning that, in the embodiment of the present application, since the camera devices 153, 155 of the camera system 15 process image data through a deep neural network model with a special network architecture, they have relatively low power consumption, and the camera devices 153, 155 are configured to start the image acquisition operation only after the motion detector 151 integrated in the camera system 15 detects object movement. In other words, compared with a conventional camera device integrated in a cat-eye hole, the camera devices 153 and 155 of the present application have the advantage of low energy consumption, so the power supply requirement of the door monitoring system can be met by a battery, which a conventional camera module cannot achieve.
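The power-saving behaviour described in the two preceding paragraphs amounts to gating image acquisition on the motion detector 151. A minimal sketch, assuming hypothetical `motion_detected` and `capture_frame` hardware hooks:

```python
import time
from typing import Callable

class MotionGatedCamera:
    """Sleep until the motion detector reports movement, then capture.
    Both callables are hypothetical hardware hooks, not part of the
    original disclosure."""

    def __init__(self,
                 motion_detected: Callable[[], bool],
                 capture_frame: Callable[[], bytes]) -> None:
        self.motion_detected = motion_detected
        self.capture_frame = capture_frame

    def next_frame(self, poll_interval: float = 0.2) -> bytes:
        while not self.motion_detected():   # low-power wait state
            time.sleep(poll_interval)
        return self.capture_frame()         # wake and acquire image data
```

Because the camera spends most of its time in the wait loop rather than streaming, the average draw stays within what a battery in the cat-eye hole can supply.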
In particular, in an implementation, the battery may be mounted on the lower half of the optical lens 170 on the inner side of the door and electrically connected to the second camera device 155 to supply it with power. The first camera device 153 is connected to the battery through a wire, so that it, too, is powered by the battery. It should be appreciated that the wire extends between the two optical lenses 170 for ease of routing.
In operation, in response to receiving an interaction request operation from the visiting object (e.g., the visiting object pressing a doorbell), the door controller 14 outputs the image data of the visiting object captured by the camera system to the mobile terminal 16, so that the homeowner can view the visiting object in the image data through the mobile terminal 16 and decide whether to interact with it.
In summary, the exemplary door monitoring system 10 of the present application has been illustrated. It collects image data of a visiting object adjacent to the door through a camera system disposed in a cat-eye hole in the door, processes and analyzes the image data collected by the camera system based on an artificial intelligence algorithm, and then, after determining that a preset condition is met, selectively sends the image data of the visiting object to a mobile terminal carried by the homeowner, allowing the homeowner to intelligently video-monitor the area near the door. Meanwhile, an interactive interface is also arranged in the cat-eye hole to receive interaction request operations of the visiting object; after an interaction request operation of the visiting object is received, the image data of the visiting object and the interaction request are sent to the mobile terminal carried by the homeowner to enable remote interaction between the homeowner and the visiting object. In the embodiment of the present application, the interaction request includes, but is not limited to, an unlocking request, a voice call request, a video call request, and the like.
In particular, in the embodiment of the present application, the camera system 15 and the interactive interface 13 are integrally installed in the cat-eye hole 220 of the door. In this way, on the one hand, the overall structure of the door remains intact, and on the other hand, the camera system 15 is effectively hidden in the cat-eye hole 220 for protection.
Schematic control method of the door monitoring system
Fig. 7 illustrates a flowchart of a control method of the door monitoring system according to an embodiment of the present application.
As shown in fig. 7, the control method according to an embodiment of the present application includes: S510, detecting whether an object moves within the field of view of a camera system comprising a first camera device, wherein the camera system is disposed in a cat-eye hole in a door of a residence; S520, in response to detecting that there is object movement within the field of view of the camera system, capturing image data of the visiting object through the camera system; S530, receiving an interaction request operation of the visiting object through an interactive interface; S540, processing at least a portion of the image data of the visiting object through a door controller to determine that at least one condition is satisfied, wherein the at least one condition includes determining that a human is included among the objects contained in the image data, or determining that a human face region is included in the image data; S550, in response to determining that at least one condition is satisfied, outputting at least a portion of the image data to the mobile terminal through the door controller; and S560, in response to receiving the interaction request operation of the visiting object, outputting at least a portion of the image data and the interaction request to the mobile terminal through the door controller.
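Read as ordinary control flow, steps S510 to S560 can be sketched as follows; the four parameters and every method on them are hypothetical stand-ins for the camera system, the interactive interface, the door controller, and the mobile terminal:

```python
def control_cycle(camera, interface, controller, terminal) -> None:
    if not camera.motion_detected():           # S510: watch for object movement
        return
    frames = camera.capture()                  # S520: acquire image data
    request = interface.poll_request()         # S530: None if nothing triggered
    if controller.human_in(frames) or controller.face_in(frames):  # S540
        terminal.push(frames)                  # S550: condition satisfied
    if request is not None:
        terminal.push(frames, request)         # S560: forward the request too
```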
In one example, in the above control method, the interaction request includes an unlocking request, and the method further includes: receiving, by the door controller, an unlocking control command from the mobile terminal, wherein the unlocking control command is used to trigger unlocking of an electronically controlled door lock installed on the door, the electronically controlled door lock being communicatively connected to the door controller to control the opening and closing of the door; and in response to receiving the unlocking control command, unlocking the electronically controlled door lock to open the door of the dwelling.
In one example, in the above control method, the camera system further includes a second camera device provided on the door and facing the inner side of the door, for acquiring image data of the visiting object in the area adjacent to the inner side of the door.
In one example, in the above control method, processing at least a part of the image data of the visiting object by a door controller to determine that at least one condition is satisfied includes: processing at least a portion of the image data using a first deep neural network model to determine that an object contained in the image data is a human; processing the at least a portion of the image data using a second deep neural network model to determine that the image data includes a face region; and determining that at least one condition is satisfied in response to determining that a human is included in an object included in the image data or that a face region is included in the image data.
In one example, in the above control method, the first neural network model and the second neural network model each include N depthwise separable convolutional layers for obtaining a feature map of the image data, where N is a positive integer from 4 to 12, and each depthwise separable convolutional layer includes a depthwise convolutional layer, which applies a single filter to each input channel, and a pointwise convolutional layer, which linearly combines the outputs of the depthwise convolution to obtain an updated feature map.
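For concreteness, an N-layer stack of depthwise separable convolutions of the kind just described can be sketched in PyTorch. The kernel size, channel widths, strides, and batch normalization below are illustrative assumptions, not taken from the disclosure:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """One depthwise separable layer: a depthwise convolution applies a
    single filter per input channel, and a pointwise (1x1) convolution
    linearly combines its outputs into the updated feature map."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1) -> None:
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class FeatureExtractor(nn.Module):
    """N stacked depthwise separable layers, with N in 4..12."""
    def __init__(self, n_layers: int = 8, base_ch: int = 16) -> None:
        super().__init__()
        assert 4 <= n_layers <= 12
        layers, ch = [], 3
        for i in range(n_layers):
            out = base_ch * min(2 ** (i // 2), 8)
            layers.append(DepthwiseSeparableConv(ch, out, stride=2 if i % 2 else 1))
            ch = out
        self.features = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x)

feature_map = FeatureExtractor(n_layers=8)(torch.randn(1, 3, 224, 224))
```

Factoring each standard convolution into a depthwise and a pointwise stage is what keeps the per-frame computation small, which is consistent with the low-power, battery-fed operation emphasized earlier.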
In one example, in the above control method, processing, by the door controller, at least a portion of the image data using a first deep neural network model to determine that an object included in the image data includes a human includes: identifying different image regions between a first image and a second image contained in at least a portion of the image data; aggregating the different image regions between the first and second images to obtain at least one region of interest; performing gray-scale processing on the at least one region of interest; processing the gray-scale-processed at least one region of interest using the first deep neural network model to classify the object contained in the at least one region of interest; and determining that the object contained in the at least one region of interest includes a human.
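The frame-differencing pre-processing in this example maps naturally onto standard OpenCV primitives. A sketch, assuming `first` and `second` are consecutive BGR frames and with illustrative threshold and kernel sizes:

```python
import cv2
import numpy as np

def extract_rois(first: np.ndarray, second: np.ndarray,
                 min_area: int = 500) -> list:
    """Identify differing image regions between two frames, aggregate
    them into regions of interest, and gray-scale each region."""
    diff = cv2.absdiff(first, second)                  # pixel-wise difference
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)
    # Dilate so nearby changed pixels merge into contiguous blobs
    mask = cv2.dilate(mask, np.ones((15, 15), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    rois = []
    for c in contours:
        if cv2.contourArea(c) < min_area:              # ignore tiny changes
            continue
        x, y, w, h = cv2.boundingRect(c)
        rois.append(cv2.cvtColor(second[y:y + h, x:x + w],
                                 cv2.COLOR_BGR2GRAY))
    return rois

# Each gray-scale ROI would then go to the first deep neural network
# model (hypothetical `classify`) to decide whether it contains a human:
# human_found = any(classify(roi) == "human" for roi in extract_rois(a, b))
```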
In one example, in the above control method, processing, by the door controller, the at least a portion of the image data using a second deep neural network model to determine that a face region is included in the image data includes: identifying different image regions between a first image and a second image contained in at least a portion of the image data; aggregating the different image regions between the first and second images to obtain at least one region of interest; performing gray-scale processing on the at least one region of interest; and processing the gray-scale-processed at least one region of interest using the second deep neural network model to determine that the at least one region of interest includes a human face region.
In summary, the exemplary control method of the door monitoring system of the present application has been set forth. It acquires image data of a visiting object adjacent to the door through a camera system disposed in a cat-eye hole in the door, processes and analyzes the image data acquired by the camera system based on an artificial intelligence algorithm, and then, after determining that a preset condition is met, selectively sends the image data of the visiting object to a mobile terminal carried by the homeowner, allowing the homeowner to intelligently video-monitor the area near the door. Meanwhile, an interactive interface is also arranged in the cat-eye hole to receive interaction request operations of the visiting object; after an interaction request operation of the visiting object is received, the image data of the visiting object and the interaction request are sent to the mobile terminal carried by the homeowner to enable remote interaction between the homeowner and the visiting object. In the embodiment of the present application, the interaction request includes, but is not limited to, an unlocking request, a voice call request, a video call request, and the like.

Claims (15)

1. A door monitoring system, comprising:
the camera system is arranged in a cat eye hole in a door of a residence, and comprises a motion detector for detecting whether an object moves in a field of view of the camera system, and a first camera device facing the outer side of the door and used for acquiring image data of a visiting object in an area adjacent to the outer side of the door;
the interactive interface is arranged in the cat eye hole and used for receiving interactive request operation of a visiting object;
a door controller comprising a processor and a memory having stored thereon computer program instructions which, when executed by the processor, cause the processor to:
process at least a portion of the image data of the visiting object to determine that at least one condition is satisfied, wherein the at least one condition includes that an object contained in the image data includes a human being or that the image data includes a human face region;
output at least a portion of the image data of the visiting object to a mobile terminal in response to determining that the at least one condition is satisfied; and/or
in response to receiving the interaction request operation of the visiting object, output at least a portion of the image data and the interaction request of the visiting object to the mobile terminal.
2. The door monitoring system of claim 1, wherein the interactive request comprises any one of an unlock request, a voice call request, and a video call request.
3. The door monitoring system of claim 2, wherein the processor is further configured to:
receiving an unlocking control command from the mobile terminal, wherein the unlocking control command is used for triggering unlocking of an electronic control type door lock installed on the door; and
in response to receiving the unlock control command, unlocking the electronically controlled door lock to open the door of the dwelling.
4. The door monitoring system according to claim 1, wherein the camera system further comprises a second camera device, wherein the second camera device is directed towards the inside of the door for acquiring image data of the visiting object adjacent to an area inside the door.
5. The door monitoring system of any of claims 1-4, wherein the processor is further configured to:
processing at least a portion of the image data with a first deep neural network model to determine that an object contained in the image data is a human;
processing the at least a portion of the image data with a second deep neural network model to determine that the image data includes a face region; and
determining that at least one condition is satisfied in response to determining that a human is included in an object included in the image data or that a face region is included in the image data.
6. The door monitoring system according to claim 5, wherein the first and second neural network models each comprise N depthwise separable convolutional layers for obtaining a feature map of the image data, where N is a positive integer from 4 to 12, and wherein each depthwise separable convolutional layer comprises a depthwise convolutional layer configured to apply a single filter to each input channel and a pointwise convolutional layer configured to linearly combine the outputs of the depthwise convolution to obtain an updated feature map.
7. The door monitoring system of claim 6, wherein the processor is further configured to:
identifying a different image region between a first image and a second image contained in at least a portion of the image data;
aggregating different image regions between the first and second images to obtain at least one region of interest;
carrying out gray level processing on the at least one region of interest;
processing the at least one region of interest after the gray scale processing by using the first deep neural network model so as to classify the object contained in the at least one region of interest; and
determining that the object contained in the at least one region of interest includes a human.
8. The door monitoring system of claim 6, wherein the processor is further configured to:
identifying a different image region between a first image and a second image contained in at least a portion of the image data;
aggregating different image regions between the first and second images to obtain at least one region of interest;
carrying out gray level processing on the at least one region of interest; and
processing the at least one region of interest after the gray scale processing with the second deep neural network model to determine that the at least one region of interest includes a human face region.
9. A control method, comprising:
detecting whether an object moves in a field of view of a camera system including a first camera device, wherein the camera system is disposed in a cat eye hole in a door of a residence;
capturing, by the camera system, image data of the visiting subject in response to detecting that there is subject movement within the field of view of the camera system;
receiving an interactive request operation of a visiting object through an interactive interface;
processing, by a door controller, at least a portion of the image data of the visiting subject to determine that at least one condition is satisfied, wherein the at least one condition includes determining that a human is included in a subject included in the image data or determining that a face region is included in the image data;
outputting at least a portion of the image data to a mobile terminal via the door controller in response to determining that at least one condition is satisfied; and
in response to receiving an interaction request operation of the visiting object, outputting at least a portion of the image data of the visiting object captured by the camera system and the interaction request to the mobile terminal through the door controller.
10. The control method of claim 9, wherein the interactive request comprises an unlock request, wherein the method further comprises:
the door controller receives an unlocking control command from the mobile terminal, wherein the unlocking control command is used for triggering unlocking of an electronic control type door lock installed on the door, and the electronic control type door lock is connected with the door controller in a communication mode and used for controlling opening and closing of the door; and
in response to receiving the unlock control command, unlocking the electronically controlled door lock to open the door of the dwelling.
11. The control method according to claim 9, wherein the camera system further comprises a second camera device provided at a door and facing an inner side of the door for collecting image data of the visiting object adjacent to an inner area of the door.
12. The control method of claim 9, wherein processing at least a portion of the image data of the visiting subject by a door controller to determine that at least one condition is satisfied comprises:
processing at least a portion of the image data using a first deep neural network model to determine that an object contained in the image data is a human;
processing the at least a portion of the image data using a second deep neural network model to determine that the image data includes a face region; and
determining that at least one condition is satisfied in response to determining that a human is included in an object included in the image data or that a face region is included in the image data.
13. The control method according to claim 12, wherein the first and second neural network models each comprise N depthwise separable convolutional layers for obtaining a feature map of the image data, where N is a positive integer from 4 to 12, and wherein each depthwise separable convolutional layer comprises a depthwise convolutional layer configured to apply a single filter to each input channel and a pointwise convolutional layer configured to linearly combine the outputs of the depthwise convolution to obtain an updated feature map.
14. The control method of claim 13, wherein processing at least a portion of the image data using a first deep neural network model to determine that an object contained in the image data is a human comprises:
identifying a different image region between a first image and a second image contained in at least a portion of the image data;
aggregating different image regions between the first and second images to obtain at least one region of interest;
carrying out gray level processing on the at least one region of interest;
processing the at least one region of interest after the gray scale processing by using the first deep neural network model so as to classify the object contained in the at least one region of interest; and
determining that the object contained in the at least one region of interest includes a human.
15. The control method of claim 13, wherein processing, by the door controller, the at least a portion of the image data using a second deep neural network model to determine that a face region is included in the image data comprises:
identifying a different image region between a first image and a second image contained in at least a portion of the image data;
aggregating different image regions between the first and second images to obtain at least one region of interest;
carrying out gray level processing on the at least one region of interest; and
processing the at least one region of interest after the gray scale processing with the second deep neural network model to determine that the at least one region of interest includes a face region.
CN201910373374.0A 2018-05-07 2019-05-07 Door monitoring system and control method thereof Pending CN111917967A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910373374.0A CN111917967A (en) 2019-05-07 2019-05-07 Door monitoring system and control method thereof
US16/503,452 US20190340904A1 (en) 2018-05-07 2019-07-03 Door Surveillance System and Control Method Thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910373374.0A CN111917967A (en) 2019-05-07 2019-05-07 Door monitoring system and control method thereof

Publications (1)

Publication Number Publication Date
CN111917967A true CN111917967A (en) 2020-11-10

Family

ID=73241664

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910373374.0A Pending CN111917967A (en) 2018-05-07 2019-05-07 Door monitoring system and control method thereof

Country Status (1)

Country Link
CN (1) CN111917967A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008245095A (en) * 2007-03-28 2008-10-09 Aiphone Co Ltd Intercom system
CN108629870A (en) * 2017-03-17 2018-10-09 江苏微捷付网络科技有限公司宁波分公司 A kind of Gate-ban Monitoring System
WO2018193977A1 (en) * 2017-04-21 2018-10-25 パナソニックIpマネジメント株式会社 Identification system, identification method and program
CN107506755A (en) * 2017-09-26 2017-12-22 云丁网络技术(北京)有限公司 Monitoring video recognition methods and device
CN207473711U (en) * 2017-12-06 2018-06-08 四川住易物联科技有限公司 Internet visible intercom door inhibition
CN208521347U (en) * 2018-08-10 2019-02-19 湖南华唯识界科技有限公司 A kind of villa recognition of face gate inhibition security protection videophone integrated system
CN208798111U (en) * 2018-11-20 2019-04-26 深圳市中阳通讯有限公司 A kind of recognition of face cloud talk-back host

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112965926A (en) * 2021-03-05 2021-06-15 张玉禄 SPI interface safety chip and SPI interface electron device
CN112965926B (en) * 2021-03-05 2024-04-30 张玉禄 SPI interface safety chip and SPI interface electronic device
CN113053002A (en) * 2021-06-01 2021-06-29 德施曼机电(中国)有限公司 Door lock control method and device and storage medium
TWI827356B (en) * 2022-11-11 2023-12-21 大陸商廣州印芯半導體技術有限公司 Behavior image sensor system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201110