CN109151387A - A WebRTC-based low-latency solution for mobile camera face recognition - Google Patents


Publication number
CN109151387A
Authority
CN
China
Prior art keywords
face
transcoder
mobile terminal
webrtc
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810980968.3A
Other languages
Chinese (zh)
Other versions
CN109151387B (en)
Inventor
叶�武
潘瑶斌
方垚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dang Hong Polytron Technologies Inc
Hangzhou Arcvideo Technology Co ltd
Original Assignee
Hangzhou Dang Hong Polytron Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Dang Hong Polytron Technologies Inc
Priority to CN201810980968.3A
Publication of CN109151387A
Application granted
Publication of CN109151387B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H04L67/025 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/141 Setup of application sessions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/268 Signal distribution or switching

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a WebRTC-based low-latency solution for mobile camera face recognition. It comprises the following steps: the mobile terminal initiates a face detection request; the monitoring server initiates a transcoding task on the transcoder; the transcoder requests the RTC server to create a chat room; the RTC server returns the room number to the transcoder; the transcoder passes the room number to the monitoring server; the monitoring server passes the room number on to the mobile terminal; the mobile terminal connects to the RTC server with the room number and joins the room; the RTC server and the communication cloud establish the data transmission node for that room and carry out WebRTC-based low-latency data transmission; the mobile terminal starts sending data to the transcoder through the communication cloud; the transcoder sets up a one-in, two-out task that performs face capture and real-time pass-through. The beneficial effect of the invention is that picture delay is effectively reduced: the measured delay is roughly 200 ms to 300 ms, and in theory it can be brought below 100 ms.

Description

A WebRTC-based low-latency solution for mobile camera face recognition
Technical field
The present invention relates to the technical field of video encoding and decoding, and in particular to a WebRTC-based low-latency solution for mobile camera face recognition.
Background
While developing a face monitoring project for mobile phones, it was found that when the phone sends an RTMP stream to the server for face recognition, the picture delay is excessive. The farther away the phone is, the higher the delay of the stream over the public network, which can reach more than ten seconds.
Summary of the invention
To overcome the above deficiency in the prior art, the present invention provides a WebRTC-based low-latency solution for mobile camera face recognition that effectively shortens the delay.
To achieve this object, the invention adopts the following technical solution:
A WebRTC-based low-latency solution for mobile camera face recognition, comprising the following steps:
(1) the mobile terminal initiates a face detection request;
(2) the monitoring server initiates a transcoding task on the transcoder;
(3) the transcoder requests the RTC server to create a chat room;
(4) the RTC server returns the room number to the transcoder;
(5) the transcoder passes the room number to the monitoring server;
(6) the monitoring server passes the room number on to the mobile terminal;
(7) the mobile terminal connects to the RTC server with the room number and joins the room;
(8) the RTC server and the communication cloud establish the data transmission node for that room and carry out WebRTC-based low-latency data transmission;
(9) the mobile terminal starts sending data to the transcoder through the communication cloud;
(10) the transcoder sets up a one-in, two-out task that performs face capture and real-time pass-through.
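The room-number handshake in steps (1) to (7) can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the class names, method names and the starting room number 1000 are hypothetical and do not come from the patent, and real components would talk over the network rather than through direct method calls.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class RtcServer:
    """Hypothetical stand-in for the RTC server: issues room numbers, tracks members."""
    rooms: dict = field(default_factory=dict)
    _ids: count = field(default_factory=lambda: count(1000))

    def create_room(self, owner: str) -> int:
        room = next(self._ids)            # step (4): room number is returned
        self.rooms[room] = [owner]
        return room

    def join_room(self, room: int, member: str) -> None:
        self.rooms[room].append(member)   # step (7): join by room number


@dataclass
class Transcoder:
    rtc: RtcServer

    def start_task(self) -> int:
        # step (3): ask the RTC server for a room; the return value is step (5)
        return self.rtc.create_room("transcoder")


@dataclass
class MonitorServer:
    transcoder: Transcoder

    def request_face_detection(self) -> int:
        # step (2): start a transcoding task; the return value is step (6)
        return self.transcoder.start_task()


def run_handshake() -> tuple[int, list[str]]:
    rtc = RtcServer()
    monitor = MonitorServer(Transcoder(rtc))
    room = monitor.request_face_detection()   # step (1), issued by the mobile terminal
    rtc.join_room(room, "mobile")              # step (7)
    return room, rtc.rooms[room]
```

Calling `run_handshake()` returns the room number together with the members of that room, which makes the ordering visible: the transcoder is in the room before the mobile terminal learns the number.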
With the above WebRTC-based low-latency solution for mobile camera face recognition, picture delay is effectively reduced. Displaying the decoded video, converted to RGB24 data, with OpenCV gives a measured delay of roughly 200 ms to 300 ms, which in theory can be reduced to under 100 ms; a mobile phone on a 4G network sees almost the same delay.
Preferably, in step (8), the WebRTC-based part comprises RtcMessage, a communication component, the communication cloud and hardware. RtcMessage serves as a signaling set with which the mobile terminal asks the communication cloud to create a room or join a room. After the communication cloud has created the room successfully, the communication component establishes its connection with the mobile terminal, and audio and video data acquired by the hardware are sent to the communication cloud, or data are received from the communication cloud.
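A minimal sketch of the create-room and join-room signaling follows. The patent does not specify the RtcMessage wire format, so the JSON fields `type`, `room` and `peer`, the `CommunicationCloud` class and the error strings are all assumptions made for illustration.

```python
import json


class CommunicationCloud:
    """Hypothetical signaling endpoint that tracks which peers are in which room."""

    def __init__(self) -> None:
        self.rooms: dict[str, set[str]] = {}

    def handle(self, raw: str) -> str:
        msg = json.loads(raw)
        room, peer = msg["room"], msg["peer"]
        if msg["type"] == "create":
            if room in self.rooms:
                return json.dumps({"ok": False, "error": "room exists"})
            self.rooms[room] = {peer}
        elif msg["type"] == "join":
            if room not in self.rooms:
                return json.dumps({"ok": False, "error": "no such room"})
            self.rooms[room].add(peer)
        else:
            return json.dumps({"ok": False, "error": "unknown type"})
        # Media transport would only be set up after this success reply.
        return json.dumps({"ok": True, "room": room, "peers": sorted(self.rooms[room])})


def rtc_message(kind: str, room: str, peer: str) -> str:
    """Build one signaling message (assumed format, not the patent's)."""
    return json.dumps({"type": kind, "room": room, "peer": peer})
```

The point of the sketch is the ordering constraint stated in the text: a join is rejected until the room has been created, and only a successful reply would trigger the actual connection setup.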
Preferably, in step (10), when setting up the one-in, two-out task, the transcoder uses a low-level transcoding technique built on the dshow (DirectShow) framework. The implementation is as follows: the Source module first connects to the RTC server to obtain the mobile terminal's video data; the infTee module then distributes the data to the video decoder (decoder) and the video frame-assembly module (frame wrapper). In the first branch, the decoder parses the bitstream and passes the result to the video encoder (encoder), which produces RGB24 images that are sent to the face recognition module for feature comparison in order to capture faces. In the second branch, the frame wrapper passes the video to the FLVmux module, which generates an RTMP live stream and adds silent audio packets for real-time pass-through.
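The fan-out from one input to two branches can be sketched as a tee that hands every packet to each branch in order. The `tee` function and the two branch classes below are hypothetical placeholders, not the patent's infTee, decoder or FLVmux modules; the branches only record what they receive.

```python
from typing import Callable, Iterable

Packet = bytes
Sink = Callable[[Packet], None]


def tee(source: Iterable[Packet], sinks: list[Sink]) -> int:
    """Deliver every packet from one input to each output branch; return packet count."""
    delivered = 0
    for packet in source:
        for sink in sinks:
            sink(packet)
        delivered += 1
    return delivered


class FaceBranch:
    """Placeholder for branch 1: decode, convert to RGB24, compare faces."""

    def __init__(self) -> None:
        self.frames: list[Packet] = []

    def __call__(self, packet: Packet) -> None:
        # A real branch would decode the packet here before recognition.
        self.frames.append(packet)


class PassthroughBranch:
    """Placeholder for branch 2: forward the compressed video without decoding."""

    def __init__(self) -> None:
        self.out = bytearray()

    def __call__(self, packet: Packet) -> None:
        self.out += packet
```

Because the second branch never decodes, its cost is independent of the recognition branch, which is consistent with the text's description of it as a pass-through path.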
Preferably, in step (10), the video data is received and decoded into raw H264 data, the H264 video data is then converted into RGB24 images, and the images are refreshed continuously with the cv::imshow method of OpenCV so that the video can be watched in real time.
Preferably, in step (10), face capture comprises four parts: face detection, face tracking, face recognition and liveness verification. Face detection means detecting faces in a still picture and returning the face box coordinates, landmark coordinates and quality score. Face tracking means millisecond-level face tracking and detection on surveillance or dynamic video in complex scenes, obtaining in real time the face box coordinates, landmark coordinates and quality score of every face in each frame, without being affected by occlusion, blur or side-face factors. Face recognition means 1:1 and 1:N face comparison: in 1:1 comparison the false acceptance rate is below one in one hundred thousand at a recall rate of 96%, and 1:N comparison achieves millisecond-level retrieval on a large-scale portrait database with no restriction on ethnicity or age. Liveness verification means verifying whether a real person is operating in front of the mobile terminal's camera.
The beneficial effect of the invention is that picture delay is effectively reduced: displaying the decoded video, converted to RGB24 data, with OpenCV gives a measured delay of roughly 200 ms to 300 ms, and in theory the delay can be brought below 100 ms.
Description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a schematic diagram of the WebRTC-based part;
Fig. 3 is a schematic diagram of the low-level transcoding technique.
Detailed description
The present invention is further described below with reference to the drawings and specific embodiments.
In the embodiment shown in Fig. 1, a WebRTC-based low-latency solution for mobile camera face recognition comprises the following steps:
(1) the mobile terminal (Mobile App) initiates a face detection request;
(2) the monitoring server (monitor server) initiates a transcoding task on the transcoder (transcoder);
(3) the transcoder (transcoder) requests the RTC server to create a chat room;
(4) the RTC server returns the room number (session id) to the transcoder (transcoder);
(5) the transcoder (transcoder) passes the room number (session id) to the monitoring server (monitor server);
(6) the monitoring server (monitor server) passes the room number (session id) on to the mobile terminal (Mobile App);
(7) the mobile terminal (Mobile App) connects to the RTC server with the room number (session id) and joins the room;
(8) the RTC server and the communication cloud establish the data transmission node for that room and carry out WebRTC-based low-latency data transmission;
As shown in Fig. 2, the WebRTC-based part comprises RtcMessage, a communication component, the communication cloud and hardware. RtcMessage serves as a signaling set with which the mobile terminal asks the communication cloud to create a room or join a room. After the communication cloud has created the room successfully, the communication component establishes its connection with the mobile terminal, and audio and video data acquired by the hardware are sent to the communication cloud, or data are received from the communication cloud.
(9) the mobile terminal starts sending data to the transcoder through the communication cloud;
(10) the transcoder sets up a one-in, two-out task that performs face capture and real-time pass-through;
When setting up the one-in, two-out task, the transcoder uses a low-level transcoding technique built on the dshow (DirectShow) framework. As shown in Fig. 3, the implementation is as follows: the Source module first connects to the RTC server to obtain the mobile terminal's video data; the infTee module then distributes the data to the video decoder (decoder) and the video frame-assembly module (frame wrapper). In the first branch, the decoder parses the bitstream and passes the result to the video encoder (encoder), which produces RGB24 images that are sent to the face recognition module for feature comparison in order to capture faces. In the second branch, the frame wrapper passes the video to the FLVmux module, which generates an RTMP live stream for real-time pass-through. Because the transmission mechanism is video-only, the time needed for audio/video synchronization is eliminated; silent audio packets are added so that the stream also works with RTMP players that require an audio track.
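To illustrate what adding a silent audio packet at the FLV muxing stage can look like, the sketch below wraps an audio payload in an FLV audio tag. The tag layout (11-byte tag header, body, 4-byte previous-tag size) follows the published FLV file format; the choice of AAC, the function name and the payload bytes are assumptions, since the patent does not say which codec or silent frame FLVmux uses. The caller must supply a real encoded silent frame.

```python
import struct

FLV_TAG_AUDIO = 8      # FLV tag type for audio
AAC_AUDIO_FLAGS = 0xAF  # sound format 10 (AAC), 44 kHz, 16-bit, stereo


def flv_audio_tag(timestamp_ms: int, aac_payload: bytes, raw: bool = True) -> bytes:
    """Wrap an AAC payload in one FLV audio tag followed by PreviousTagSize."""
    # Second body byte is AACPacketType: 0 = sequence header, 1 = raw frame.
    body = bytes([AAC_AUDIO_FLAGS, 1 if raw else 0]) + aac_payload
    header = (
        bytes([FLV_TAG_AUDIO])
        + len(body).to_bytes(3, "big")                   # DataSize
        + (timestamp_ms & 0xFFFFFF).to_bytes(3, "big")   # Timestamp, low 24 bits
        + bytes([(timestamp_ms >> 24) & 0xFF])           # TimestampExtended
        + b"\x00\x00\x00"                                # StreamID, always 0
    )
    return header + body + struct.pack(">I", len(header) + len(body))
```

A muxer along these lines would emit one such tag at a fixed cadence alongside the video tags, so that players expecting an audio track see one even though no audio was captured.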
DirectShow is a streaming media framework on the Windows platform (this method follows the framework's design and implements it under Linux) that provides high-quality capture and playback of media streams. It supports a wide variety of media file formats, including ASF, MPEG, AVI, MP3 and WAV files, and supports capture of media streams using WDM drivers or the earlier VFW drivers. DirectShow integrates other DirectX technologies and can automatically detect and use available audio and video hardware acceleration, while also supporting systems without hardware acceleration. DirectShow greatly simplifies media playback, format conversion and capture. At the same time, it exposes the underlying stream control architecture for user-defined solutions, so that users can create their own DirectShow components to support new file formats or other purposes. Typical applications written with DirectShow include DVD players, video editing applications, AVI-to-ASF converters, MP3 players and digital video capture applications.
The video data is received and decoded into raw H264 data, the H264 video data is then converted into RGB24 images, and the images are refreshed continuously with the cv::imshow method of OpenCV so that the video can be watched in real time.
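The conversion to RGB24 can be illustrated with a pure-Python colour conversion. This sketch assumes the H264 decoder outputs planar YUV 4:2:0 (I420) with BT.601 limited-range levels and an even frame width, which is common but is not stated in the patent; the function names are hypothetical. Note also that OpenCV's display functions expect BGR channel order, so a real pipeline would swap channels or convert directly to BGR.

```python
def _clip(x: int) -> int:
    """Clamp an intermediate value to the 0..255 byte range."""
    return 0 if x < 0 else 255 if x > 255 else x


def yuv_to_rgb(y: int, u: int, v: int) -> tuple[int, int, int]:
    """Convert one BT.601 limited-range YUV sample to 8-bit RGB (integer math)."""
    c, d, e = y - 16, u - 128, v - 128
    r = _clip((298 * c + 409 * e + 128) >> 8)
    g = _clip((298 * c - 100 * d - 208 * e + 128) >> 8)
    b = _clip((298 * c + 516 * d + 128) >> 8)
    return r, g, b


def i420_to_rgb24(width: int, height: int, y: bytes, u: bytes, v: bytes) -> bytes:
    """Convert an I420 frame (full-size Y, quarter-size U and V) to packed RGB24."""
    out = bytearray(width * height * 3)
    for row in range(height):
        for col in range(width):
            # Each 2x2 block of luma samples shares one U and one V sample.
            chroma = (row // 2) * (width // 2) + col // 2
            pixel = (row * width + col) * 3
            out[pixel:pixel + 3] = bytes(yuv_to_rgb(y[row * width + col], u[chroma], v[chroma]))
    return bytes(out)
```

In practice this per-pixel loop would be replaced by an optimized library routine; it is shown only to make the data layout of an RGB24 image explicit.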
Face capture comprises four parts: face detection, face tracking, face recognition and liveness verification. Face detection means detecting faces in a still picture and returning the face box coordinates, landmark coordinates and quality score; on the FDDB test set, the detection performance reaches a leading level. Face tracking means millisecond-level face tracking and detection on surveillance or dynamic video in complex scenes, obtaining in real time the face box coordinates, landmark coordinates and quality score of every face in each frame, without being affected by occlusion, blur or side-face factors. Face recognition means 1:1 and 1:N face comparison: in 1:1 comparison the false acceptance rate is below one in one hundred thousand at a recall rate of 96%; 1:N comparison achieves millisecond-level retrieval on a large-scale portrait database with no restriction on ethnicity or age, and supports real-time recognition and alarms for multiple video channels and multiple faces in dynamic, complex scenes. On the LFW test set, the accuracy reaches 99.87%. Liveness verification means verifying whether a real person is operating in front of the mobile terminal's camera, preventing spoofing with high-definition photos, three-dimensional models, video recordings, face swapping and the like, and meeting the security requirements that sensitive industries place on face recognition.
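The difference between 1:1 and 1:N comparison can be sketched with feature vectors and a similarity threshold. The patent does not describe the feature extractor or the comparison metric, so cosine similarity, the threshold of 0.8 and the function names below are assumptions chosen for illustration.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def verify(a: list[float], b: list[float], threshold: float = 0.8) -> bool:
    """1:1 comparison: do two feature vectors belong to the same person?"""
    return cosine(a, b) >= threshold


def identify(probe: list[float], gallery: dict[str, list[float]], threshold: float = 0.8):
    """1:N comparison: return the best-matching identity, or None if below threshold."""
    best = max(gallery, key=lambda name: cosine(probe, gallery[name]), default=None)
    if best is None or cosine(probe, gallery[best]) < threshold:
        return None
    return best
```

Raising the threshold lowers the false acceptance rate at the cost of recall, which is the trade-off behind figures such as a false acceptance rate quoted at a fixed recall. The linear scan in `identify` is for clarity only; millisecond-level retrieval over a large database would need an index.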
With the above WebRTC-based low-latency solution for mobile camera face recognition, picture delay is effectively reduced. Displaying the decoded video, converted to RGB24 data, with OpenCV gives a measured delay of roughly 200 ms to 300 ms, which in theory can be reduced to under 100 ms. A mobile phone on a 4G network sees almost the same delay; when the mobile terminal is far away and uses 4G, the delay is somewhat higher, at around 2 s.

Claims (5)

1. A WebRTC-based low-latency solution for mobile camera face recognition, characterized by comprising the following steps:
(1) the mobile terminal initiates a face detection request;
(2) the monitoring server initiates a transcoding task on the transcoder;
(3) the transcoder requests the RTC server to create a chat room;
(4) the RTC server returns the room number to the transcoder;
(5) the transcoder passes the room number to the monitoring server;
(6) the monitoring server passes the room number on to the mobile terminal;
(7) the mobile terminal connects to the RTC server with the room number and joins the room;
(8) the RTC server and the communication cloud establish the data transmission node for that room and carry out WebRTC-based low-latency data transmission;
(9) the mobile terminal starts sending data to the transcoder through the communication cloud;
(10) the transcoder sets up a one-in, two-out task and, using OpenCV for display, performs face capture and real-time pass-through.
2. The WebRTC-based low-latency solution for mobile camera face recognition according to claim 1, characterized in that, in step (8), the WebRTC-based part comprises RtcMessage, a communication component, the communication cloud and hardware, wherein RtcMessage serves as a signaling set with which the mobile terminal asks the communication cloud to create a room or join a room; after the communication cloud has created the room successfully, the communication component establishes its connection with the mobile terminal, and audio and video data acquired by the hardware are sent to the communication cloud, or data are received from the communication cloud.
3. The WebRTC-based low-latency solution for mobile camera face recognition according to claim 1 or 2, characterized in that, in step (10), when setting up the one-in, two-out task, the transcoder uses a low-level transcoding technique built on the dshow framework, implemented as follows: the Source module first connects to the RTC server to obtain the mobile terminal's video data; the infTee module then distributes the data to the video decoder (decoder) and the video frame-assembly module (frame wrapper); in the first branch, the decoder parses the bitstream and passes the result to the video encoder (encoder), which produces RGB24 images that are sent to the face recognition module for feature comparison in order to capture faces; in the second branch, the frame wrapper passes the video to the FLVmux module, which generates an RTMP live stream and adds silent audio packets for real-time pass-through.
4. The WebRTC-based low-latency solution for mobile camera face recognition according to claim 3, characterized in that, in step (10), the video data is received and decoded into raw H264 data, the H264 video data is then converted into RGB24 images, and the images are refreshed continuously with the cv::imshow method of OpenCV so that the video can be watched in real time.
5. The WebRTC-based low-latency solution for mobile camera face recognition according to claim 1, characterized in that, in step (10), face capture comprises four parts: face detection, face tracking, face recognition and liveness verification; face detection means detecting faces in a still picture and returning the face box coordinates, landmark coordinates and quality score; face tracking means millisecond-level face tracking and detection on surveillance or dynamic video in complex scenes, obtaining in real time the face box coordinates, landmark coordinates and quality score of every face in each frame, without being affected by occlusion, blur or side-face factors; face recognition means 1:1 and 1:N face comparison, wherein in 1:1 comparison the false acceptance rate is below one in one hundred thousand at a recall rate of 96%, and 1:N comparison achieves millisecond-level retrieval on a large-scale portrait database with no restriction on ethnicity or age; liveness verification means verifying whether a real person is operating in front of the mobile terminal's camera.
CN201810980968.3A 2018-08-27 2018-08-27 webRTC-based low-delay solution for face recognition of mobile camera Active CN109151387B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810980968.3A CN109151387B (en) 2018-08-27 2018-08-27 webRTC-based low-delay solution for face recognition of mobile camera

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810980968.3A CN109151387B (en) 2018-08-27 2018-08-27 webRTC-based low-delay solution for face recognition of mobile camera

Publications (2)

Publication Number Publication Date
CN109151387A true CN109151387A (en) 2019-01-04
CN109151387B CN109151387B (en) 2020-10-23

Family

ID=64828178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810980968.3A Active CN109151387B (en) 2018-08-27 2018-08-27 webRTC-based low-delay solution for face recognition of mobile camera

Country Status (1)

Country Link
CN (1) CN109151387B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110868609A (en) * 2019-12-02 2020-03-06 杭州当虹科技股份有限公司 Method for monitoring and standardizing live video
CN112491924A (en) * 2020-12-09 2021-03-12 威创集团股份有限公司 Cross-platform face recognition login method, system and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017118241A (en) * 2015-12-22 2017-06-29 西日本電信電話株式会社 Audio video communication system, server, virtual client, audio video communication method, and audio video communication program
CN107027045A (en) * 2017-04-11 2017-08-08 广州华多网络科技有限公司 Pushing video streaming control method, device and video flowing instructor in broadcasting end
CN107995187A (en) * 2017-11-30 2018-05-04 上海哔哩哔哩科技有限公司 Video main broadcaster, live broadcasting method, terminal and system based on HTML5 browsers

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110868609A (en) * 2019-12-02 2020-03-06 杭州当虹科技股份有限公司 Method for monitoring and standardizing live video
CN112491924A (en) * 2020-12-09 2021-03-12 威创集团股份有限公司 Cross-platform face recognition login method, system and storage medium
CN112491924B (en) * 2020-12-09 2022-03-22 威创集团股份有限公司 Cross-platform face recognition login method, system and storage medium

Also Published As

Publication number Publication date
CN109151387B (en) 2020-10-23

Similar Documents

Publication Publication Date Title
US11463779B2 (en) Video stream processing method and apparatus, computer device, and storage medium
US11622149B2 (en) Methods and apparatus for an embedded appliance
US9478256B1 (en) Video editing processor for video cloud server
US10951857B2 (en) Method and system for video recording
RU2497298C2 (en) System and method to store multimedia presentations having several sources
CN110740386B (en) Live broadcast switching method and device and storage medium
WO2018166162A1 (en) System and method for detecting playing status of client in audio and video live broadcast
CN109089173B (en) Method and system for detecting advertisement delivery of smart television terminal
CN109151387A (en) A kind of dollying head recognition of face low latency solution based on webRTC
CN106792154A (en) The frame-skipping synchronization system and its control method of video player
WO2020215454A1 (en) Screen recording method, client, and terminal device
CN103188474A (en) Video intelligent analysis system and storing and playing method of surveillance video thereof
CN108234940A (en) A kind of video monitoring server-side, system and method
TWM257575U (en) Encoder and decoder for audio and video information
CN108665749A (en) The display device and multimedia education system of multimedia education system under cloud desktop
KR102248097B1 (en) Method for transmiting contents and terminal apparatus using the same
US11785278B1 (en) Methods and systems for synchronization of closed captions with content output
US20220398216A1 (en) Appliances and methods to provide robust computational services in addition to a/v encoding, for example at edge of mesh networks
US9628870B2 (en) Video system with customized tiling and methods for use therewith
CN115037951B (en) Live broadcast processing method and device
US20220394323A1 (en) Supplmental audio generation system in an audio-only mode
US20230071585A1 (en) Video compression and streaming
US20210258656A1 (en) Technologies for communicating an enhanced event experience
CN114222161A (en) Immersive image synchronous playing system with interactive function
FI20206100A1 (en) Live channel supplementing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant