CN109151387A - A webRTC-based low-latency solution for mobile camera face recognition - Google Patents
A webRTC-based low-latency solution for mobile camera face recognition
Info
- Publication number
- CN109151387A (application CN201810980968.3A)
- Authority
- CN
- China
- Prior art keywords
- face
- transcoder
- mobile terminal
- webrtc
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/18—Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/025—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/14—Session management
- H04L67/141—Setup of application sessions
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/44—Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N5/00—Details of television systems
- H04N5/222—Studio circuitry; Studio devices; Studio equipment
- H04N5/262—Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
- H04N5/268—Signal distribution or switching
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
- Telephonic Communication Services (AREA)
- Studio Devices (AREA)
Abstract
The invention discloses a webRTC-based low-latency solution for mobile camera face recognition. It comprises the following steps: the mobile terminal initiates a face detection request; the monitoring server initiates a transcoding task to the transcoder; the transcoder requests the RTC server to create a chat room; the RTC server returns the room number to the transcoder; the transcoder passes the room number to the monitoring server; the monitoring server passes the room number on to the mobile terminal; the mobile terminal connects to the RTC server with the room number and joins the room; the RTC server and the communication cloud establish a data transmission node for that room and carry out webRTC-based low-latency data transmission; the mobile terminal starts sending data to the transcoder through the communication cloud; the transcoder creates a one-input, two-output task that performs face capture and real-time pass-through. The beneficial effect of the invention is that picture delay is effectively reduced: the observed delay is roughly 200 ms to 300 ms and can in theory be brought below 100 ms.
Description
Technical field
The present invention relates to the technical field of video encoding and decoding, and in particular to a webRTC-based low-latency solution for mobile camera face recognition.
Background art
While developing a mobile-phone face monitoring project, it was found that when the phone sends an RTMP stream to a server for face recognition, the picture delay is excessive. The farther away the phone is, the higher the delay of the stream over the public network, which can exceed ten seconds.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention provides a webRTC-based low-latency solution for mobile camera face recognition that effectively shortens the delay.
To achieve the above object, the invention adopts the following technical solution:
A webRTC-based low-latency solution for mobile camera face recognition, comprising the following steps:
(1) the mobile terminal initiates a face detection request;
(2) the monitoring server initiates a transcoding task to the transcoder;
(3) the transcoder requests the RTC server to create a chat room;
(4) the RTC server returns the room number to the transcoder;
(5) the transcoder passes the room number to the monitoring server;
(6) the monitoring server passes the room number on to the mobile terminal;
(7) the mobile terminal connects to the RTC server with the room number and joins the room;
(8) the RTC server and the communication cloud establish a data transmission node for that room and carry out webRTC-based low-latency data transmission;
(9) the mobile terminal starts sending data to the transcoder through the communication cloud;
(10) the transcoder creates a one-input, two-output task that performs face capture and real-time pass-through.
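The room handshake in steps (1) to (7) can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class and method names (`RtcServer`, `Transcoder`, `MonitorServer`, `run_handshake`) and the `room-N` numbering are hypothetical, and the sketch assumes the transcoder itself joins the room it created, which the text implies but does not state.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class RtcServer:
    """Stand-in for the RTC server: creates rooms and records who joined."""
    rooms: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1))

    def create_room(self) -> str:
        room = f"room-{next(self._ids)}"  # hypothetical room-number format
        self.rooms[room] = []
        return room

    def join(self, room: str, peer: str) -> None:
        self.rooms[room].append(peer)


@dataclass
class Transcoder:
    rtc: RtcServer

    def start_task(self) -> str:
        # Steps (3)-(5): ask for a room, receive its number, report it back.
        room = self.rtc.create_room()
        self.rtc.join(room, "transcoder")  # assumption: transcoder joins too
        return room


@dataclass
class MonitorServer:
    transcoder: Transcoder

    def handle_face_detection_request(self) -> str:
        # Step (2) starts the transcoding task; step (6) returns the room number.
        return self.transcoder.start_task()


def run_handshake():
    rtc = RtcServer()
    monitor = MonitorServer(Transcoder(rtc))
    room = monitor.handle_face_detection_request()  # step (1)
    rtc.join(room, "mobile")                        # step (7)
    return room, rtc.rooms[room]
```

The point of the indirection is that the mobile terminal never talks to the transcoder directly during setup; it only learns a room number and then meets the transcoder inside that room.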
With the above webRTC-based low-latency solution for mobile camera face recognition, picture delay is effectively reduced. Displaying the RGB24 data obtained by decoding the video with OpenCV shows a delay of roughly 200 ms to 300 ms, which in theory can be reduced to below 100 ms; a mobile phone on a 4G network shows almost the same delay.
Preferably, in step (8), the webRTC-based part specifically comprises RtcMessage, communication, the communication cloud and hardware. RtcMessage is a signaling set through which the mobile terminal asks the communication cloud to create a room or join a room. After the communication cloud has successfully created the room, a communication connection is established with the mobile terminal, and the audio and video data captured by the hardware are sent to the communication cloud, or data are received from the communication cloud.
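A signaling set of this kind might look like the sketch below. The patent does not give the RtcMessage wire format, so the JSON encoding, the `action` and `room` field names, and the `CommunicationCloud` class are all assumptions made purely for illustration.

```python
import json


def make_message(action, room=None):
    """Encode a create/join request; the JSON layout is hypothetical."""
    if action not in ("create_room", "join_room"):
        raise ValueError(f"unsupported action: {action}")
    message = {"action": action}
    if room is not None:
        message["room"] = room
    return json.dumps(message, sort_keys=True)


class CommunicationCloud:
    """Toy stand-in that only tracks room membership."""

    def __init__(self):
        self.rooms = {}

    def handle(self, peer, raw):
        message = json.loads(raw)
        if message["action"] == "create_room":
            room = str(len(self.rooms) + 1)
            self.rooms[room] = {peer}
            return {"ok": True, "room": room}
        # Anything else is treated as a join request.
        room = message.get("room")
        if room not in self.rooms:
            return {"ok": False, "error": "unknown room"}
        self.rooms[room].add(peer)
        return {"ok": True, "room": room}
```

Only after a successful reply would the media connection be set up; the sketch stops at the signaling stage.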
Preferably, in step (10), the transcoder uses low-level transcoding technology when creating the one-input, two-output task, implemented by inheriting the dshow (DirectShow) framework, as follows. First, the Source module accesses the RTC server to obtain the mobile terminal's video data. The infTee module then distributes the data to the video data decoder (decoder) and to the video frame assembling module (frame wrapper). In the first branch, the video data decoder parses the bitstream data and passes it to the video encoder (encoder), which converts it into RGB24 images; these are passed to the face recognition module for feature comparison so as to capture faces. In the second branch, the video frame assembling module (frame wrapper) passes the data to the FLVmux module, which generates an RTMP live stream and adds silent audio packets, for real-time pass-through.
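The one-input, two-output structure can be sketched without any media library. The code below is only a structural illustration: `tee`, `RecognitionBranch` and `PassThroughBranch` are hypothetical names, the decoder is passed in as a plain function, and no real DirectShow filter, H264 decoder or FLV muxer is involved.

```python
from typing import Callable, Iterable, List, Tuple


def tee(source: Iterable[bytes], branches: List[Callable[[bytes], None]]) -> None:
    """Deliver every packet from the single input to each branch in turn."""
    for packet in source:
        for branch in branches:
            branch(packet)


class RecognitionBranch:
    """First branch: decode each packet and keep the frame for recognition."""

    def __init__(self, decode: Callable[[bytes], bytes]):
        self.decode = decode
        self.frames: List[bytes] = []

    def __call__(self, packet: bytes) -> None:
        self.frames.append(self.decode(packet))


class PassThroughBranch:
    """Second branch: forward the packet unchanged to a muxer queue."""

    def __init__(self):
        self.muxed: List[Tuple[str, bytes]] = []

    def __call__(self, packet: bytes) -> None:
        self.muxed.append(("video", packet))
```

Because the second branch never decodes, its cost is independent of how slow recognition is, provided the two branches run on separate threads or queues in a real system; this sketch runs them sequentially for clarity.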
Preferably, in step (10), the video data is received and decoded into raw H264 data, the H264 video data is then converted into RGB24 images, and the images are continuously refreshed and displayed with OpenCV's cv::imshow method, achieving real-time viewing.
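The conversion to RGB24 is worth a short illustration. H264 decoders commonly emit planar YUV 4:2:0 (I420) frames, so a colour-space conversion sits between decoding and display. The sketch below assumes I420 input and full-range BT.601 coefficients; the patent does not say which matrix or pixel layout is used, and the function names are hypothetical. Note also that OpenCV conventionally treats three-channel images as BGR, so a real pipeline may need to swap channels before calling cv::imshow.

```python
def _clamp(value: float) -> int:
    """Round and clip to the 0..255 range of one 8-bit channel."""
    return max(0, min(255, int(round(value))))


def yuv_to_rgb(y: int, u: int, v: int):
    """Convert one pixel using full-range BT.601 coefficients (assumed)."""
    d, e = u - 128, v - 128
    return (
        _clamp(y + 1.402 * e),
        _clamp(y - 0.344136 * d - 0.714136 * e),
        _clamp(y + 1.772 * d),
    )


def i420_to_rgb24(y_plane, u_plane, v_plane, width: int, height: int) -> bytes:
    """Expand an I420 frame into packed RGB24 (3 bytes per pixel)."""
    out = bytearray()
    chroma_width = (width + 1) // 2
    for row in range(height):
        for col in range(width):
            y = y_plane[row * width + col]
            # Each chroma sample is shared by a 2x2 block of luma samples.
            c = (row // 2) * chroma_width + col // 2
            out.extend(yuv_to_rgb(y, u_plane[c], v_plane[c]))
    return bytes(out)
```

A production decoder would do this with a vectorised or hardware routine; the per-pixel loop here only makes the arithmetic visible.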
Preferably, in step (10), face capture comprises four parts: face detection, face tracking, face recognition and liveness verification. Face detection detects faces in still images and returns the face box coordinates, landmark coordinates and quality score information. Face tracking performs millisecond-level face tracking and detection on surveillance or dynamic video in complex scenes, obtaining in real time the face box coordinates, landmark coordinates and quality score information of all faces in each frame, unaffected by occlusion, blur or profile views. Face recognition performs 1:1 and 1:N face comparison: the 1:1 comparison has a false acceptance rate below one in one hundred thousand at a recall rate of 96%, and the 1:N comparison achieves millisecond-level retrieval on a large-scale portrait gallery database with no restriction on ethnicity or age. Liveness verification checks whether a real person is operating in front of the mobile terminal's camera.
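The difference between 1:1 and 1:N comparison can be sketched with feature vectors. The patent does not describe its feature extractor or similarity measure, so cosine similarity, the 0.6 threshold and the names `verify` and `identify` are assumptions chosen only to illustrate the two modes.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(probe, reference, threshold: float = 0.6) -> bool:
    """1:1 comparison: is the probe the same person as one reference?"""
    return cosine_similarity(probe, reference) >= threshold


def identify(probe, gallery: dict, threshold: float = 0.6):
    """1:N comparison: best-matching gallery identity, or None if none passes."""
    best_id, best_score = None, threshold
    for person_id, feature in gallery.items():
        score = cosine_similarity(probe, feature)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id
```

Raising the threshold lowers the false acceptance rate at the cost of recall, which is the trade-off behind figures such as those quoted above. A linear scan like this would not reach millisecond retrieval on a large gallery; an index structure would be needed there.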
The beneficial effect of the invention is that picture delay is effectively reduced. Displaying the RGB24 data obtained by decoding the video with OpenCV shows a delay of roughly 200 ms to 300 ms, which in theory can be reduced to below 100 ms.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a schematic diagram of the webRTC-based part;
Fig. 3 is a schematic diagram of the low-level transcoding technology.
Detailed description of the embodiments
The present invention is further described below with reference to the drawings and a specific embodiment.
In the embodiment shown in Fig. 1, a webRTC-based low-latency solution for mobile camera face recognition comprises the following steps:
(1) the mobile terminal (Mobile App) initiates a face detection request;
(2) the monitoring server (monitor server) initiates a transcoding task to the transcoder (transcoder);
(3) the transcoder (transcoder) requests the RTC server to create a chat room;
(4) the RTC server returns the room number (session id) to the transcoder (transcoder);
(5) the transcoder (transcoder) passes the room number (session id) to the monitoring server (monitor server);
(6) the monitoring server (monitor server) passes the room number (session id) on to the mobile terminal (Mobile App);
(7) the mobile terminal (Mobile App) connects to the RTC server with the room number (session id) and joins the room;
(8) the RTC server and the communication cloud establish a data transmission node for that room and carry out webRTC-based low-latency data transmission;
As shown in Fig. 2, the webRTC-based part specifically comprises RtcMessage, communication, the communication cloud and hardware. RtcMessage is a signaling set through which the mobile terminal asks the communication cloud to create a room or join a room. After the communication cloud has successfully created the room, a communication connection is established with the mobile terminal, and the audio and video data captured by the hardware are sent to the communication cloud, or data are received from the communication cloud.
(9) the mobile terminal starts sending data to the transcoder through the communication cloud;
(10) the transcoder creates a one-input, two-output task that performs face capture and real-time pass-through;
When creating the one-input, two-output task, the transcoder uses low-level transcoding technology, implemented by inheriting the dshow (DirectShow) framework. As shown in Fig. 3, the implementation is as follows. First, the Source module accesses the RTC server to obtain the mobile terminal's video data. The infTee module then distributes the data to the video data decoder (decoder) and to the video frame assembling module (frame wrapper). In the first branch, the video data decoder parses the bitstream data and passes it to the video encoder (encoder), which converts it into RGB24 images; these are passed to the face recognition module for feature comparison so as to capture faces. In the second branch, the video frame assembling module (frame wrapper) passes the data to the FLVmux module, which generates an RTMP live stream and adds silent audio packets to accommodate RTMP stream players that require an audio track, and then carries out real-time pass-through. Because a pure-video transmission mechanism is used, the time that audio/video synchronization would otherwise require is eliminated.
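The silent-audio step can be illustrated as a timestamp interleave. The sketch below assumes AAC-style framing (1024 samples per frame at 44.1 kHz, about 23 ms per frame); the patent names neither the audio codec nor the packet layout, so the tuple format, the empty placeholder payload and the function names are hypothetical. A real muxer would insert an encoder-specific silent frame instead of an empty byte string.

```python
def silent_audio_timestamps(duration_ms, sample_rate=44100, samples_per_frame=1024):
    """Yield millisecond timestamps of silent audio frames covering duration_ms."""
    frame_ms = samples_per_frame * 1000 / sample_rate
    index = 0
    while index * frame_ms < duration_ms:
        yield round(index * frame_ms)
        index += 1


def mux(video_packets, duration_ms):
    """Interleave (timestamp, payload) video packets with placeholder silence.

    Returns (timestamp, kind, payload) tuples sorted by timestamp; at equal
    timestamps the audio packet comes first because "audio" < "video".
    """
    audio = [(ts, "audio", b"") for ts in silent_audio_timestamps(duration_ms)]
    video = [(ts, "video", payload) for ts, payload in video_packets]
    return sorted(audio + video, key=lambda packet: (packet[0], packet[1]))
```

Since the audio timestamps are generated from the clock rather than from a captured audio stream, the muxer never has to wait for audio to arrive, which is consistent with the synchronization time saved in the text above.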
DirectShow is a streaming media framework on the Windows platform (this method inherits the framework and implements it under Linux) that provides high-quality capture and playback of media streams. It supports a wide variety of media file formats, including ASF, MPEG, AVI, MP3 and WAV files, and supports media stream capture using WDM drivers or the earlier VFW drivers. DirectShow integrates other DirectX technologies and automatically detects and uses available audio and video hardware acceleration, while also supporting systems without hardware acceleration. DirectShow greatly simplifies media playback, format conversion and capture. At the same time, it exposes the underlying stream control architecture for user-defined solutions, allowing users to create their own DirectShow components that support new file formats or serve other purposes. Typical applications written with DirectShow include DVD players, video editing applications, AVI-to-ASF converters, MP3 players and digital video capture applications.
The video data is received and decoded into raw H264 data, the H264 video data is then converted into RGB24 images, and the images are continuously refreshed and displayed with OpenCV's cv::imshow method, achieving real-time viewing.
Face capture comprises four parts: face detection, face tracking, face recognition and liveness verification. Face detection detects faces in still images and returns the face box coordinates, landmark coordinates and quality score information; on the FDDB test set, detection performance reaches a leading level. Face tracking performs millisecond-level face tracking and detection on surveillance or dynamic video in complex scenes, obtaining in real time the face box coordinates, landmark coordinates and quality score information of all faces in each frame, unaffected by occlusion, blur or profile views. Face recognition performs 1:1 and 1:N face comparison: the 1:1 comparison has a false acceptance rate below one in one hundred thousand at a recall rate of 96%, and the 1:N comparison achieves millisecond-level retrieval on a large-scale portrait gallery database with no restriction on ethnicity or age. Real-time recognition and alarm for multiple video channels and multiple faces in dynamic, complex scenes can be achieved; on the LFW test set, accuracy reaches 99.87%. Liveness verification checks whether a real person is operating in front of the mobile terminal's camera, preventing spoofing with high-definition photos, three-dimensional models, video recordings, face swapping and the like, and meeting the security requirements that sensitive industries place on face recognition.
With the above webRTC-based low-latency solution for mobile camera face recognition, picture delay is effectively reduced. Displaying the RGB24 data obtained by decoding the video with OpenCV shows a delay of roughly 200 ms to 300 ms, which in theory can be reduced to below 100 ms. A mobile phone on a 4G network shows almost the same delay; when the mobile terminal is far away, the delay over 4G is somewhat higher, at about 2 s.
Claims (5)
1. A webRTC-based low-latency solution for mobile camera face recognition, characterized in that it comprises the following steps:
(1) the mobile terminal initiates a face detection request;
(2) the monitoring server initiates a transcoding task to the transcoder;
(3) the transcoder requests the RTC server to create a chat room;
(4) the RTC server returns the room number to the transcoder;
(5) the transcoder passes the room number to the monitoring server;
(6) the monitoring server passes the room number on to the mobile terminal;
(7) the mobile terminal connects to the RTC server with the room number and joins the room;
(8) the RTC server and the communication cloud establish a data transmission node for that room and carry out webRTC-based low-latency data transmission;
(9) the mobile terminal starts sending data to the transcoder through the communication cloud;
(10) the transcoder creates a one-input, two-output task and, using OpenCV for display, performs face capture and real-time pass-through.
2. The webRTC-based low-latency solution for mobile camera face recognition according to claim 1, characterized in that, in step (8), the webRTC-based part specifically comprises RtcMessage, communication, the communication cloud and hardware, wherein RtcMessage is a signaling set through which the mobile terminal asks the communication cloud to create a room or join a room; after the communication cloud has successfully created the room, a communication connection is established with the mobile terminal, and the audio and video data captured by the hardware are sent to the communication cloud, or data are received from the communication cloud.
3. The webRTC-based low-latency solution for mobile camera face recognition according to claim 1 or 2, characterized in that, in step (10), the transcoder uses low-level transcoding technology when creating the one-input, two-output task, implemented by inheriting the dshow (DirectShow) framework, as follows: first, the Source module accesses the RTC server to obtain the mobile terminal's video data; the infTee module then distributes the data to the video data decoder (decoder) and to the video frame assembling module (frame wrapper); in the first branch, the video data decoder parses the bitstream data and passes it to the video encoder (encoder), which converts it into RGB24 images, which are passed to the face recognition module for feature comparison so as to capture faces; in the second branch, the video frame assembling module (frame wrapper) passes the data to the FLVmux module, which generates an RTMP live stream and adds silent audio packets, for real-time pass-through.
4. The webRTC-based low-latency solution for mobile camera face recognition according to claim 3, characterized in that, in step (10), the video data is received and decoded into raw H264 data, the H264 video data is then converted into RGB24 images, and the images are continuously refreshed and displayed with OpenCV's cv::imshow method, achieving real-time viewing.
5. The webRTC-based low-latency solution for mobile camera face recognition according to claim 1, characterized in that, in step (10), face capture comprises four parts: face detection, face tracking, face recognition and liveness verification; face detection detects faces in still images and returns the face box coordinates, landmark coordinates and quality score information; face tracking performs millisecond-level face tracking and detection on surveillance or dynamic video in complex scenes, obtaining in real time the face box coordinates, landmark coordinates and quality score information of all faces in each frame, unaffected by occlusion, blur or profile views; face recognition performs 1:1 and 1:N face comparison, wherein the 1:1 comparison has a false acceptance rate below one in one hundred thousand at a recall rate of 96%, and the 1:N comparison achieves millisecond-level retrieval on a large-scale portrait gallery database with no restriction on ethnicity or age; liveness verification checks whether a real person is operating in front of the mobile terminal's camera.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810980968.3A CN109151387B (en) | 2018-08-27 | 2018-08-27 | webRTC-based low-delay solution for face recognition of mobile camera |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810980968.3A CN109151387B (en) | 2018-08-27 | 2018-08-27 | webRTC-based low-delay solution for face recognition of mobile camera |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109151387A true CN109151387A (en) | 2019-01-04 |
CN109151387B CN109151387B (en) | 2020-10-23 |
Family
ID=64828178
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810980968.3A Active CN109151387B (en) | 2018-08-27 | 2018-08-27 | webRTC-based low-delay solution for face recognition of mobile camera |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109151387B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110868609A (en) * | 2019-12-02 | 2020-03-06 | 杭州当虹科技股份有限公司 | Method for monitoring and standardizing live video |
CN112491924A (en) * | 2020-12-09 | 2021-03-12 | 威创集团股份有限公司 | Cross-platform face recognition login method, system and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2017118241A (en) * | 2015-12-22 | 2017-06-29 | 西日本電信電話株式会社 | Audio video communication system, server, virtual client, audio video communication method, and audio video communication program |
CN107027045A (en) * | 2017-04-11 | 2017-08-08 | 广州华多网络科技有限公司 | Pushing video streaming control method, device and video flowing instructor in broadcasting end |
CN107995187A (en) * | 2017-11-30 | 2018-05-04 | 上海哔哩哔哩科技有限公司 | Video main broadcaster, live broadcasting method, terminal and system based on HTML5 browsers |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110868609A (en) * | 2019-12-02 | 2020-03-06 | 杭州当虹科技股份有限公司 | Method for monitoring and standardizing live video |
CN112491924A (en) * | 2020-12-09 | 2021-03-12 | 威创集团股份有限公司 | Cross-platform face recognition login method, system and storage medium |
CN112491924B (en) * | 2020-12-09 | 2022-03-22 | 威创集团股份有限公司 | Cross-platform face recognition login method, system and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109151387B (en) | 2020-10-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11463779B2 (en) | Video stream processing method and apparatus, computer device, and storage medium | |
US11622149B2 (en) | Methods and apparatus for an embedded appliance | |
US9478256B1 (en) | Video editing processor for video cloud server | |
US10951857B2 (en) | Method and system for video recording | |
RU2497298C2 (en) | System and method to store multimedia presentations having several sources | |
CN110740386B (en) | Live broadcast switching method and device and storage medium | |
WO2018166162A1 (en) | System and method for detecting playing status of client in audio and video live broadcast | |
CN109089173B (en) | Method and system for detecting advertisement delivery of smart television terminal | |
CN109151387A (en) | A kind of dollying head recognition of face low latency solution based on webRTC | |
CN106792154A (en) | The frame-skipping synchronization system and its control method of video player | |
WO2020215454A1 (en) | Screen recording method, client, and terminal device | |
CN103188474A (en) | Video intelligent analysis system and storing and playing method of surveillance video thereof | |
CN108234940A (en) | A kind of video monitoring server-side, system and method | |
TWM257575U (en) | Encoder and decoder for audio and video information | |
CN108665749A (en) | The display device and multimedia education system of multimedia education system under cloud desktop | |
KR102248097B1 (en) | Method for transmiting contents and terminal apparatus using the same | |
US11785278B1 (en) | Methods and systems for synchronization of closed captions with content output | |
US20220398216A1 (en) | Appliances and methods to provide robust computational services in addition to a/v encoding, for example at edge of mesh networks | |
US9628870B2 (en) | Video system with customized tiling and methods for use therewith | |
CN115037951B (en) | Live broadcast processing method and device | |
US20220394323A1 (en) | Supplmental audio generation system in an audio-only mode | |
US20230071585A1 (en) | Video compression and streaming | |
US20210258656A1 (en) | Technologies for communicating an enhanced event experience | |
CN114222161A (en) | Immersive image synchronous playing system with interactive function | |
FI20206100A1 (en) | Live channel supplementing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||