EP3271744B1 - Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint - Google Patents

Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint

Info

Publication number
EP3271744B1
EP3271744B1
Authority
EP
European Patent Office
Prior art keywords
spatial region
ultrasonic
people
signal
estimate
Legal status
Active
Application number
EP16713210.9A
Other languages
German (de)
French (fr)
Other versions
EP3271744A1 (en)
Inventor
Oystein Birkenes
Lennart Burenius
Kristian Tangeland
Current Assignee
Cisco Technology Inc
Original Assignee
Cisco Technology Inc
Application filed by Cisco Technology Inc
Publication of EP3271744A1
Application granted
Publication of EP3271744B1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/02 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems using reflection of acoustic waves
    • G01S15/04 Systems determining presence of a target
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/02 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems using reflection of acoustic waves
    • G01S15/50 Systems of measurement, based on relative movement of the target
    • G01S15/52 Discriminating between fixed and moving objects or between objects moving at different speeds
    • G01S15/523 Discriminating between fixed and moving objects or between objects moving at different speeds for presence detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H04L65/4038 Arrangements for multi-party communication, e.g. for conferences with floor control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142 Constructional details of the terminal equipment, e.g. arrangements of the camera and the display

Description

    TECHNICAL FIELD
  • The present disclosure relates to detecting the presence of people using ultrasonic sound.
  • BACKGROUND
  • A video conference endpoint includes a camera and a microphone to capture video and audio of a participant in a meeting room, and a display to present video. While no participant is in the meeting room, the endpoint may be placed in a standby or sleep mode to conserve power. In standby, components of the endpoint, such as the camera and display, may be deactivated or turned off. When a participant initially enters the meeting room, the endpoint remains in standby until the participant manually wakes up the endpoint using a remote control or other touch device. If the participant is unfamiliar with the endpoint or if the touch device is not readily available, the simple act of manually activating the endpoint may frustrate the participant and diminish his or her experience.
  • US 2010/0226487 A1 discloses a video conferencing endpoint which controls its power state using information received by environmental sensors. In a lower powered state a microphone is active while a video camera is inactive. Upon detecting sound energy above a threshold level and above a threshold frequency, the system transitions to a higher power state by applying power to the video camera. Captured video information is analysed to detect motion. If motion is detected, the system automatically transitions to a yet higher power state.
  • SUMMARY OF THE INVENTION
  • The invention is defined by the attached independent claims. Embodiments of the invention are defined by the dependent claims. Any embodiments described herein which do not fall within the scope of the claims are to be interpreted as examples.
  • BRIEF DESCRIPTION OF THE DRAWINGS
    • FIG. 1 is a block diagram of a video conference (e.g., teleconference) environment in which embodiments to automatically detect the presence of people proximate a video conference endpoint in a room and selectively wakeup the video conference endpoint or put the endpoint to sleep may be implemented, according to an example embodiment.
    • FIG. 2 is an illustration of a video conference endpoint deployed in a room, according to an example embodiment.
    • FIG. 3 is a block diagram of a controller of the video conference endpoint, according to an example embodiment.
    • FIG. 4 is a block diagram of an ultrasonic echo canceler implemented in the video conference endpoint to detect whether people are present in a room, according to an example.
    • FIG. 5 is a flowchart of a method of detecting whether people are present in a room using the ultrasonic echo canceler of the video conference endpoint, and using the detections to selectively wakeup the endpoint or put the endpoint to sleep, according to an example.
    • FIG. 6 is a series of operations expanding on detection and wakeup/sleep control operations from the method of FIG. 5, according to an example.
    DESCRIPTION OF EXAMPLE EMBODIMENTS
  • With reference to FIG. 1, there is depicted a block diagram of an example video conference (e.g., teleconference) environment 100 in which embodiments to automatically detect the presence of people (i.e., "people presence") proximate a video conference endpoint (EP) and selectively wakeup the endpoint or put the endpoint to sleep may be implemented. Video conference environment 100 includes video conference endpoints 104 operated by local users/participants 106 (also referred to as "people" 106) and configured to establish audio-visual teleconference collaboration sessions with each other over a communication network 110. Communication network 110 may include one or more wide area networks (WANs), such as the Internet, and one or more local area networks (LANs). A conference server 102 may also be deployed to coordinate the routing of audio-video streams among the video conference endpoints.
  • Each video conference endpoint 104 may include multiple video cameras (VC) 112, a video display 114, a loudspeaker (LDSPKR) 116, and one or more microphones (MIC) 118. Endpoints 104 may be wired or wireless communication devices equipped with the aforementioned components, such as, but not limited to laptop and tablet computers, smartphones, etc. In a transmit direction, endpoints 104 capture audio/video from their local participants 106 with microphone 118/VC 112, encode the captured audio/video into data packets, and transmit the data packets to other endpoints or to the conference server 102. In a receive direction, endpoints 104 decode audio/video from data packets received from the conference server 102 or other endpoints and present the audio/video to their local participants 106 via loudspeaker 116/display 114.
  • Referring now to FIG. 2, there is depicted an illustration of video conference endpoint 104 deployed in a conference room 204 (depicted simplistically as an outline in FIG. 2), according to an embodiment. Video conference endpoint 104 includes video cameras 112A and 112B positioned proximate and centered on display 114. Cameras 112A and 112B (collectively referred to as "cameras 112") are each operated under control of endpoint 104 to capture video of participants 106 seated around a table 206 opposite from or facing (i.e., in front of) the cameras (and display 114). The combination of two center video cameras depicted in FIG. 2 is only one example of many possible camera combinations that may be used, including video cameras spaced-apart from display 114, as would be appreciated by one of ordinary skill in the relevant arts having read the present description. As depicted in the example of FIG. 2, microphone 118 is positioned adjacent to, and centered along, a bottom side of display 114 (i.e., below the display) so as to receive audio from participants 106 in room 204, although other positions for the microphone are possible.
  • According to examples presented herein, video conference endpoint 104 includes an ultrasonic echo canceler to detect whether participants are present (i.e., to detect "people presence") in room 204. Also, endpoint 104 may use people presence detection decisions from the ultrasonic echo canceler to transition the endpoint from sleep to awake or vice versa, as appropriate. The ultrasonic echo canceler is described below in connection with FIG. 4.
  • Reference is now made to FIG. 3, which shows an example block diagram of a controller 308 of video conference endpoint 104 configured to perform techniques described herein. There are numerous possible configurations for controller 308 and FIG. 3 is meant to be an example. Controller 308 includes a network interface unit 342, a processor 344, and memory 348. The network interface (I/F) unit 342 is, for example, an Ethernet card or other interface device that allows the controller 308 to communicate over communication network 110. Network I/F unit 342 may include wired and/or wireless connection capability.
  • Processor 344 may include a collection of microcontrollers and/or microprocessors, for example, each configured to execute respective software instructions stored in the memory 348. The collection of microcontrollers may include, for example: a video controller to receive, send, and process video signals related to display 114 and video cameras 112; an audio processor to receive, send, and process audio signals (in human audible and ultrasonic frequency ranges) related to loudspeaker 116 and microphone array 118; and a high-level controller to provide overall control. Portions of memory 348 (and the instructions therein) may be integrated with processor 344. As used herein, the terms "audio" and "sound" are synonymous and interchangeable.
  • Processor 344 may send pan, tilt, and zoom commands to video cameras 112 to control the cameras. Processor 344 may also send wakeup (i.e., activate) and sleep (i.e., deactivate) commands to video cameras 112. The camera wakeup command is used to wakeup cameras 112 to a fully powered-on operational state so they can capture video, while the camera sleep command is used to put the cameras to sleep to save power. In the sleep state, portions of cameras 112 are powered-off or deactivated and the cameras are unable to capture video. Processor 344 may similarly send wakeup and sleep commands to display 114 to wakeup the display or put the display to sleep. In another embodiment, processor 344 may selectively wakeup and put to sleep portions of controller 308 while the processor remains active. When any of cameras 112, display 114, and portions of controller 308 are asleep, endpoint 104 is said to be in standby or asleep (i.e., in the sleep mode). Conversely, when all of the components of endpoint 104 are awake and fully operational, endpoint 104 is said to be awake. Operation of the aforementioned components of endpoint 104 in sleep and awake modes, and sleep and wakeup commands that processor 344 may issue to transition the components between the sleep and awake modes are known to those of ordinary skill in the relevant arts.
  • The memory 348 may comprise read only memory (ROM), random access memory (RAM), magnetic disk storage media devices, optical storage media devices, flash memory devices, electrical, optical, or other physical/tangible (e.g., non-transitory) memory storage devices. Thus, in general, the memory 348 may comprise one or more computer readable storage media (e.g., a memory device) encoded with software comprising computer executable instructions and when the software is executed (by the processor 344) it is operable to perform the operations described herein. For example, the memory 348 stores or is encoded with instructions for control logic 350 to perform operations described herein to (i) implement an ultrasonic echo canceler to detect a change in people presence, and (ii) wakeup endpoint 104 or put the endpoint to sleep based on the detected people presence.
  • In addition, memory 348 stores data/information 356 used and generated by logic 350, including, but not limited to, adaptive filter coefficients, power estimate thresholds indicative of people presence, predetermined timeouts, and current operating modes of the various components of endpoint 104 (e.g., sleep and awake states), as described below.
  • With reference to FIG. 4, there is depicted a block diagram of example ultrasonic echo canceler 400 implemented in endpoint 104 to detect people presence. Ultrasonic echo canceler 400 includes loudspeaker 116, microphone 118, analysis filter banks 404 and 406, a differencer S (i.e., a subtractor S), an adaptive filter 407 associated with adaptive filter coefficients, a power estimator 408, and a people presence detector 410. Analysis filter banks 404 and 406, differencer S, adaptive filter 407, power estimator 408, and detector 410 represent ultrasonic sound signal processing modules that may be implemented in controller 308. As will be described in detail below, ultrasonic echo canceler 400 detects people presence in room 204 (i.e., when people are and are not present in the room), and controller 308 uses the people presence indications to selectively wakeup endpoint 104 when people are present (e.g., have entered the room) or put the endpoint to sleep when people are not present (e.g., have left the room), as indicated by the echo canceler. Echo canceler 400 and controller 308 perform the aforementioned operations automatically, i.e., without manual intervention. Also, echo canceler 400 and controller 308 are operational to perform the operations described herein while endpoint 104 (or components thereof) is both awake and asleep.
  • Ultrasonic echo canceler 400 operates as follows. Controller 308 generates an ultrasonic signal x(n), where n is a time index that increases with time, and provides the ultrasonic signal x(n) to an input of loudspeaker 116. Loudspeaker 116 transmits ultrasonic signal x(n) into a spatial region (e.g., room 204). Ultrasonic signal x(n) has a frequency in an audio frequency range that is generally beyond the frequency range of human hearing, but which can be transmitted from most loudspeakers and picked up by most microphones. This frequency range is generally accepted as approximately 20 kHz and above; however, embodiments described herein may also operate at frequencies below 20 kHz (e.g., 19 kHz) that most people would not be able to hear. The transmitted ultrasonic signal bounces around in room 204 before it is received and thereby picked up by microphone 118 via an echo path 420. Microphone 118 transduces sound received at the microphone into a microphone signal y(n), comprising ultrasonic echo u(n), local sound v(n), and background noise w(n). Microphone 118 provides microphone signal y(n) to analysis filter bank 406, which transforms the microphone signal y(n) into a time-frequency domain including multiple ultrasonic frequency subbands Y(m,1)-Y(m,N) spanning an ultrasonic frequency range. Also, analysis filter bank 404 transforms ultrasonic signal x(n) into a time-frequency domain including multiple ultrasonic frequency subbands X(m,1)-X(m,N) spanning the same ultrasonic frequency range.
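For illustration only, the following is a minimal sketch of such an analysis filter bank step, assuming an STFT-style transform; the function name, frame/hop sizes, 48 kHz sample rate, and 19-23 kHz band edges are assumptions for this sketch and are not taken from the patent.

```python
import numpy as np

def analysis_filter_bank(x, frame_len=512, hop=256, fs=48000,
                         band=(19000.0, 23000.0)):
    """Transform a time signal into complex time-frequency subband samples
    (an STFT), keeping only the bins inside the ultrasonic band of interest.

    Returns an array of shape (num_frames, num_subbands) holding X(m, k)."""
    window = np.hanning(frame_len)
    num_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[m * hop : m * hop + frame_len] * window
                       for m in range(num_frames)])
    spectrum = np.fft.rfft(frames, axis=1)          # per-frame spectrum
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    keep = (freqs >= band[0]) & (freqs <= band[1])  # ultrasonic bins only
    return spectrum[:, keep]
```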
  • In a k'th one of the ultrasonic frequency subbands X(m,k), adaptive filter 407 generates an estimate Û(m,k) of the subband echo signal U(m,k), where m denotes the time frame index. Differencer S subtracts the echo estimate Û(m,k) from the subband microphone signal Y(m,k) output by analysis filter bank 406 to form an error (signal) Z(m,k) that is fed back into adaptive filter 407. Adaptive filter coefficients of adaptive filter 407 are adjusted responsive to the fed-back error signal. Power estimator 408 computes a running estimate of the mean squared error (power) E|Z(m,k)|^2 of the error signal Z(m,k), and detector 410 detects a changing people presence, e.g., when somebody walks into room 204 where nobody has been for a while, based on the mean squared power.
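As a rough sketch of this closed loop, the code below assumes the complex subband arrays X(m,k) and Y(m,k) from the analysis filter banks, and uses an NLMS coefficient update (the update rule named later in this description); the filter length M, step size mu, and regularization eps are illustrative parameters, not values from the patent.

```python
import numpy as np

def subband_echo_canceler(X, Y, M=8, mu=0.1, eps=1e-8):
    """Per-subband adaptive echo canceler sketch: for each frame m and
    subband k, predict the echo from the last M loudspeaker subband samples,
    subtract it from the microphone subband Y(m, k), and adapt the filter
    with NLMS using the fed-back error Z(m, k)."""
    num_frames, num_bands = X.shape
    H = np.zeros((num_bands, M), dtype=complex)  # adaptive coefficients Ĥ_k
    Z = np.zeros_like(Y)
    for m in range(num_frames):
        for k in range(num_bands):
            # delay line X_k(m) = [X(m,k), X(m-1,k), ..., X(m-M+1,k)]
            lo = max(0, m - M + 1)
            xk = np.zeros(M, dtype=complex)
            xk[: m - lo + 1] = X[lo : m + 1, k][::-1]
            u_hat = np.vdot(H[k], xk)            # echo estimate Û(m,k) = Ĥ^H x
            Z[m, k] = Y[m, k] - u_hat            # error fed back to the filter
            norm = np.real(np.vdot(xk, xk)) + eps
            H[k] += (mu / norm) * xk * np.conj(Z[m, k])  # NLMS update
    return Z
```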
  • The following is an explanation of how the mean squared error E|Z(m,k)|^2 is a good indicator of whether someone enters the room 204. Let X_k(m) = [X(m,k), X(m-1,k), ..., X(m-M+1,k)]^T denote a delay line for adaptive filter 407, where M denotes the number of adaptive filter coefficients employed in the adaptive filter. Furthermore, let Ĥ_k(m) denote the vector of the M adaptive filter coefficients. The echo estimate can then be written as equation (1):

    $$\hat{U}(m,k) = X_k^H(m)\,\hat{H}_k(m) \tag{1}$$

    where (·)^H denotes the Hermitian operator. The time-frequency domain transformation of the microphone signal y(n) is given by equation (2):

    $$Y(m,k) = X_k^H(m)\,H_k(m) + V(m,k) + W(m,k) \tag{2}$$

    where H_k(m) is the unknown optimal linear filter, and where it is assumed that the error introduced by analysis filter bank 406 is negligible. The error is then given by equation (3):

    $$Z(m,k) = Y(m,k) - \hat{U}(m,k) = X_k^H(m)\big(H_k(m) - \hat{H}_k(m)\big) + V(m,k) + W(m,k) \tag{3}$$

    and the mean squared error can be written as equation (4):

    $$E|Z(m,k)|^2 = \big(H_k(m) - \hat{H}_k(m)\big)^H R_{X_k}(m)\big(H_k(m) - \hat{H}_k(m)\big) + \sigma_V^2(m,k) + \sigma_W^2(m,k) \tag{4}$$

    where R_{X_k}(m) = E[X_k(m) X_k^H(m)] is the correlation matrix, σ_V^2(m,k) = E|V(m,k)|^2, and σ_W^2(m,k) = E|W(m,k)|^2. Assuming x(n) is stationary, the correlation matrix will be constant and independent of m, i.e., R_{X_k}(m) = R_{X_k}. Then the following relationship, equation (5), applies:

    $$E|Z(m,k)|^2 = \big\|H_k(m) - \hat{H}_k(m)\big\|_{R_{X_k}}^2 + \sigma_V^2(m,k) + \sigma_W^2(m,k) \tag{5}$$

    where ||·||_A^2 denotes the A-norm. From equation (5) it is seen that E|Z(m,k)|^2 comprises three main terms. The first term is the R_{X_k}-norm of the divergence between the optimal and estimated filter coefficients, the second term is the power of the local sound signal v(n), and the third term is the power of the background noise w(n). It is assumed in the following that the power of the background noise is stationary and time-invariant.
  • When nobody is in room 204, and nobody has been in the room for a while, the acoustic room impulse response will be approximately static (no change) and adaptive filter 407 will be in a well converged state and provide a good estimate Û(m,k) of subband echo signal U(m,k). Therefore, ||H_k(m) - Ĥ_k(m)||_{R_{X_k}}^2 is approximately equal to 0. Moreover, there are no or minimal local sounds coming from within room 204 (i.e., σ_V^2(m,k) = 0) except for the ultrasonic background noise σ_W^2(m,k). Thus the mean squared error E|Z(m,k)|^2 will be small, comprising small residual echo and background noise.
  • When somebody enters room 204, the acoustic room impulse response will change abruptly and adaptive filter 407 will no longer be in a converged state; it will therefore provide a poor estimate Û(m,k) of subband echo signal U(m,k). Also, as long as there is movement in the room, adaptive filter 407 attempts to track the continuously changing impulse response and may never achieve the same depth of convergence. Furthermore, movement in the room may cause Doppler shift, so that some of the energy in one frequency subband leaks over to a neighboring subband. The Doppler effect can result in both a changed impulse response for a subband and a mismatch between the audio content in the loudspeaker subband output from analysis filter bank 404 and the microphone subband output from analysis filter bank 406. Both of these effects lead to residual echo and thus ||H_k(m) - Ĥ_k(m)||_{R_{X_k}}^2 > 0. Moreover, if the person entering the room makes some sound in the ultrasonic frequency range, the power σ_V^2(m,k) > 0 of this sound will also contribute to a large error signal. The mean squared error (power) is a theoretical variable that is useful in theoretical analysis. In practice, power estimator 408 estimates the power of error signal Z(m,k). To do this, either a rectangular window of length L may be used, as in equation (6):

    $$P_Z(m,k) = \frac{1}{L}\sum_{l=0}^{L-1} |Z(m-l,k)|^2 \tag{6}$$

    or an exponential recursive weighting may be used, as in equation (7):

    $$P_Z(m,k) = \alpha P_Z(m-1,k) + (1-\alpha)\,|Z(m,k)|^2 \tag{7}$$

    where α is a forgetting factor in the range [0, 1].
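A minimal sketch of both estimators, equations (6) and (7), assuming Z is an array of complex error samples indexed by frame m and subband k; the window length L and forgetting factor alpha shown are illustrative defaults.

```python
import numpy as np

def power_estimate_rectangular(Z, L=32):
    """Per-subband power estimate with a rectangular window of length L,
    as in equation (6): P_Z(m,k) = (1/L) * sum_{l=0}^{L-1} |Z(m-l,k)|^2."""
    P = np.zeros(Z.shape, dtype=float)
    for m in range(Z.shape[0]):
        lo = max(0, m - L + 1)               # shorter window at start-up
        P[m] = np.mean(np.abs(Z[lo : m + 1]) ** 2, axis=0)
    return P

def power_estimate_exponential(Z, alpha=0.95):
    """Per-subband power estimate with exponential recursive weighting,
    as in equation (7); alpha is the forgetting factor in [0, 1]."""
    P = np.zeros(Z.shape, dtype=float)
    P[0] = np.abs(Z[0]) ** 2                 # initialize with first frame
    for m in range(1, Z.shape[0]):
        P[m] = alpha * P[m - 1] + (1 - alpha) * np.abs(Z[m]) ** 2
    return P
```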
  • As mentioned above, detector 410 receives the power estimates of the error signal and performs people presence detection based on the power estimates. As indicated above, as soon as a person enters room 204, the power estimate of the error signal will change from a relatively small level to a relatively large level. Thus, detection may be performed by comparing the power estimate of the error signal over time with a threshold that may be set, for example, to a few dBs above the steady-state power (e.g., the steady-state power is the power corresponding to when adaptive filter 407 is in a converged state, such that the threshold is indicative of the steady-state or converged state of the adaptive filter), or, if a running estimate of the variance of the power signal is also computed, to a fixed number of standard deviations above the steady-state power. Another example estimates a statistical model of the power signal and bases the decision on likelihood evaluations. It is desirable to design adaptive filter 407 for deep convergence instead of fast convergence. This is a well-known tradeoff that can be controlled in stochastic gradient descent algorithms like normalized least mean squares (NLMS) with a step size and/or a regularization parameter.
  • With a single adaptive filter in one narrow ultrasonic frequency subband, e.g., k, as depicted in FIG. 4, the detection performance may be degraded due to a notch in the frequency response of loudspeaker 116, a notch in the frequency response of microphone 118, or an absorbent in room 204 within that particular frequency subband. Therefore, according to an embodiment of the invention a more robust method may be achieved using an individual adaptive filter in each of multiple frequency subbands X(m,1)-X(m,N) (i.e., replicating adaptive filter 407 for each frequency subband), to produce an estimate of echo for each frequency subband. In that case, an error signal is generated corresponding to each of the frequency subbands from analysis filter bank 404, and an error signal for each frequency subband is produced based on the estimate of the echo for that frequency subband and a corresponding one of the transformed microphone signal frequency subbands Y(m,k) from analysis filter bank 406. Power estimator 408 computes a power estimate of the error signal for each of the frequency subbands and then combines them all into a total power estimate across the frequency subbands. For example, the total power estimate across the subbands for a given frame may be computed according to: P_Z(m) = α P_Z(m-1) + (1-α) Σ_k |Z(m,k)|^2, where Σ_k indicates the sum over all subbands k that are in use. Alternatively, the total power estimate across the subbands may be computed according to: P_Z(m) = Σ_k P_Z(m,k). In an embodiment, each power estimate may be a moving average of power estimates so that the total power estimate is a total of the moving average of power estimates.
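The per-frame combination across subbands might look like the sketch below, which implements the recursive form just given; alpha is again an assumed forgetting factor.

```python
import numpy as np

def total_power_estimate(Z, alpha=0.95):
    """Total running power across all subbands in use, per the recursion
    P_Z(m) = alpha * P_Z(m-1) + (1 - alpha) * sum_k |Z(m,k)|^2."""
    inst = np.sum(np.abs(Z) ** 2, axis=1)    # instantaneous power summed over k
    P = np.zeros(len(inst))
    P[0] = (1 - alpha) * inst[0]
    for m in range(1, len(inst)):
        P[m] = alpha * P[m - 1] + (1 - alpha) * inst[m]
    return P
```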
  • Ultrasonic signal x(n) may either be an ultrasonic signal that is dedicated to the task of detecting people presence, or an existing ultrasonic signal, such as an ultrasonic pairing signal, as long as endpoint 104 is able to generate and transmit the ultrasonic signal while the endpoint is asleep, i.e., in standby. Best performance may be achieved when ultrasonic signal x(n) is stationary and when there is minimal autocorrelation of the non-zero lags of the subband transmitted loudspeaker signal. The correlation matrix R_{X_k} of ultrasonic signal x(n) may be used to a certain degree to control the relative sensitivity of the people presence detection to the adaptive filter mismatch and the local sound from within the room.
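One way to obtain such a stationary probe with small autocorrelation at non-zero lags is band-limited random-phase noise. The sketch below and its parameters (band edges, sample rate) are assumptions for illustration, not the patent's actual signal.

```python
import numpy as np

def ultrasonic_probe(duration_s, fs=48000, band=(19000.0, 23000.0), seed=0):
    """Generate a stationary ultrasonic probe signal as band-limited white
    noise: unit magnitude and random phase inside `band`, zero outside,
    which keeps the autocorrelation at non-zero lags small."""
    rng = np.random.default_rng(seed)
    n = int(duration_s * fs)
    spectrum = np.zeros(n // 2 + 1, dtype=complex)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    keep = (freqs >= band[0]) & (freqs <= band[1])
    phases = rng.uniform(0.0, 2.0 * np.pi, keep.sum())
    spectrum[keep] = np.exp(1j * phases)      # flat magnitude, random phase
    x = np.fft.irfft(spectrum, n=n)
    return x / np.max(np.abs(x))              # normalize to full scale
```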
  • With reference to FIG. 5, there is a flowchart of an example method 500 of detecting people presence in a spatial region (e.g., room 204) using ultrasonic echo canceler 400, and using the detections to selectively wakeup or put to sleep endpoint 104. Echo canceler 400 and controller 308 are fully operational while endpoint 104 is asleep and awake.
  • At 505, controller 308 generates an ultrasonic signal (e.g., x(n)).
  • At 510, loudspeaker 116 transmits the ultrasonic signal into a spatial region (e.g., room 204).
  • At 515, microphone 118 transduces sound, including ultrasonic sound that includes an echo of the transmitted ultrasonic signal, into a received ultrasonic signal (e.g., y(n)).
  • At 520, analysis filter banks 404 and 406 transform the ultrasonic signal (e.g., x(n)) and the received ultrasonic microphone signal into respective time-frequency domains each having respective ultrasonic frequency subbands.
  • At 525, differencer S computes an error signal, representative of an estimate of an echo-free received ultrasonic signal, based on the transformed ultrasonic signal and the transformed received ultrasonic signal. More specifically, differencer S subtracts an estimate of the echo signal in the time-frequency domain from the transformed received ultrasonic signal to produce the error signal. This is a closed-loop ultrasonic echo canceling operation performed in at least one ultrasonic frequency subband using adaptive filter 407, which produces the estimate of the echo signal, where the error signal is fed back to the adaptive filter.
  • At 530, power estimator 408 computes power estimates of the error signal over time, e.g., the power estimator repetitively performs the power estimate computation as time progresses to produce a time sequence of power estimates. The power estimates may be a moving average of power estimates based on a current power estimate and one or more previous power estimates.
  • At 535, detector 410 detects people presence in the spatial region (e.g., room 204) over time based on the power estimates of the error signal over time. In an example, detector 410 may detect a change in people presence in the spatial region over time based on a change in the power estimates (or a change in the moving average power estimates) of the error signal over time.
  • At 540, processor 344 issues commands to selectively wakeup endpoint 104 or put the endpoint to sleep as appropriate based on the detections at 535.
  • According to the invention, the detection of people presence as described above may activate only those components of endpoint 104, comprising video cameras 112, required by the endpoint to aid in additional processing by processor 344, comprising detecting faces and motion in room 204 based on video captured by the activated/awakened cameras. In other words, the people presence detection triggers face and motion detection by endpoint 104. If faces and/or motion are detected subsequent to people presence being detected, only then does processor 344 issue commands to fully wakeup endpoint 104. Thus, the face and motion detection is a confirmation that people have entered room 204, which may avoid unnecessary wakeups due to false (initial) detections of people presence. Any known or hereafter developed technique to perform face and motion detection may be used in the confirmation operation.
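In pseudocode form, this staged wakeup might look like the sketch below; `endpoint` and all of its methods are hypothetical placeholders used only to show the control flow (partial wake, confirm with face/motion detection, then full wake or return to standby).

```python
def handle_presence_event(endpoint):
    """Staged wakeup sketch: an ultrasonic presence detection wakes only
    the cameras; the endpoint fully wakes only if face/motion detection on
    the captured video confirms that people really entered the room.
    All endpoint methods here are illustrative placeholders."""
    endpoint.wake_cameras()                  # partial wakeup for confirmation
    frame = endpoint.capture_frame()
    if endpoint.detect_faces(frame) or endpoint.detect_motion(frame):
        endpoint.wake_all()                  # confirmed: full wakeup
    else:
        endpoint.sleep_cameras()             # false alarm: stay in standby
```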
  • With reference to FIG. 6, there is a flowchart of operations 600, which expand on operations 530 and 535 of method 500.
  • To detect people presence (or a change in people presence), at 605 detector 410 compares power estimates (or a moving average of power estimates computed using a rectangular window as in equation (6) or an exponentially decaying window as in equation (7)) to a power estimate (or moving average) threshold indicative of people presence over time. One way to detect people presence is to set a detection threshold a few dBs (e.g., 2-5 dBs) above the steady-state power of the power estimates. The steady-state power occurs, or corresponds to, when adaptive filter 407 is in a steady state, i.e., a converged state. Another way would be to compute the mean and variance over time of the power estimates in steady state, and to set the threshold automatically as a few standard deviations (e.g., 2-5) above the mean (steady-state power). These methods for detection apply both to the case when a single subband is used and to the case when multiple subbands are used.
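A sketch of the second thresholding approach, assuming a buffer `P_steady` of power estimates collected while the adaptive filter is converged; `num_std` is an illustrative choice in the 2-5 range mentioned above.

```python
import numpy as np

def presence_threshold(P_steady, num_std=3.0):
    """Set the detection threshold automatically as a few standard
    deviations above the mean steady-state (converged-filter) power."""
    return float(np.mean(P_steady) + num_std * np.std(P_steady))

def people_present(P_m, threshold):
    """Declare presence when the current (total) power estimate of the
    error signal reaches or exceeds the threshold."""
    return P_m >= threshold
```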
  • At 610, if the power estimates transition from a first level that is less than the power estimate threshold to a second level that is greater than or equal to the power estimate threshold, processor 344 issues commands to wakeup endpoint 104 if the endpoint was previously asleep.
  • At 615, if the power estimates transition from a first level that is greater than or equal to the threshold to a second level that is less than the threshold, processor 344 issues commands to put endpoint 104 to sleep if the endpoint was previously awake.
  • In operations 610 and 615, controller 308 may respectively issue wakeup and sleep commands to cameras 112, display 114, and/or portions of the controller that may be selectively awoken and put to sleep responsive to the commands. Also, timers may be used in operations 610 and 615 to ensure a certain level of hysteresis to dampen frequent switching between awake and sleep states of endpoint 104. For example, operation 610 may require that the power estimate level remain above the threshold for a first predetermined time (e.g., on the order of several seconds, such as 3 or more seconds) measured from the time that the level reaches the threshold before issuing a command to wakeup endpoint 104, and operation 615 may require that the power estimate level remain below the threshold for a second predetermined time (e.g., also on the order of several seconds) measured from the time the level falls below the threshold before issuing a command to put endpoint 104 to sleep.
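A sketch of such debouncing, assuming per-frame boolean detections and frame counts standing in for the predetermined times; all names and defaults are illustrative.

```python
def apply_hysteresis(detections, frames_to_wake=90, frames_to_sleep=90):
    """Debounce raw per-frame presence decisions: wake only after the
    detection has stayed True for `frames_to_wake` consecutive frames, and
    sleep only after it has stayed False for `frames_to_sleep` frames."""
    awake, run, states = False, 0, []
    for present in detections:
        # count consecutive frames that disagree with the current state
        run = run + 1 if present == (not awake) else 0
        if not awake and present and run >= frames_to_wake:
            awake, run = True, 0              # sustained presence: wake up
        elif awake and not present and run >= frames_to_sleep:
            awake, run = False, 0             # sustained absence: sleep
        states.append(awake)
    return states
```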
  • In summary, embodiments presented herein perform the following operations: play/transmit a stationary ultrasonic signal from a loudspeaker; convert sound picked up by a microphone (i.e., a microphone signal) and the ultrasonic signal from the loudspeaker into the time-frequency domain; estimate an echo-free near-end signal (i.e., error signal) at the microphone with an ultrasonic frequency sub-band adaptive filter (this is an ultrasonic echo canceling operation); compute an estimate of the power of the error signal (or a running estimate thereof); and detect people presence (or a change in people presence) from the estimated power (or from changes/variations in the estimated power).
  • According to the invention, detections are used to wakeup a camera that was previously asleep, and also cause additional processing to occur, comprising detection of faces and motion using video captured by the awakened camera.
  • In summary, in one form, a method is provided as defined in claim 1.
  • In another form, a video conference endpoint is provided as defined in claim 7.
  • In yet another form, a (non-transitory) processor readable medium is provided as defined in claim 9.
  • The above description is intended by way of example only. Various modifications and structural changes may be made therein without departing from the scope of the concepts described herein, the invention being defined solely by the scope of the claims.

Claims (12)

  1. A method performed by a video conference endpoint, the method comprising:
    transmitting (510) an ultrasonic signal into a spatial region;
    transducing (515) ultrasonic sound, including an echo of the transmitted ultrasonic signal, received from the spatial region at a microphone into a received ultrasonic signal;
    transforming (520) the ultrasonic signal and the received ultrasonic signal into respective time-frequency domains that cover respective ultrasonic frequency subbands;
    computing (525) an error signal, representative of an estimate of an echo-free received ultrasonic signal, based on the transformed ultrasonic signal and the transformed received ultrasonic signal by:
    adaptively filtering multiple ultrasonic frequency subbands of the transformed ultrasonic signal based on a set of adaptive filter coefficients adjusted responsive to the error signal individually to produce a respective estimate of the echo for each of the ultrasonic frequency subbands;
    differencing each echo estimate and a corresponding one of multiple ultrasonic frequency subbands of the transformed received ultrasonic signal to produce the error signal for each of the ultrasonic frequency subbands; and
    feeding-back the error signal to the adaptively filtering operation;
    repetitively computing (530) a total power estimate based on the error signals across the ultrasonic frequency subbands over time;
    detecting (535) a change in people presence in the spatial region over time based on a change in the total power estimates of the error signal computed across the ultrasonic frequency subbands over time;
    if the detecting indicates that people are present in the spatial region, issuing (540) a command to wakeup a video camera that was previously asleep so that the video camera is able to capture video of at least a portion of the spatial region;
    performing face and motion detection based on video of the spatial region captured by the video camera; and
    issuing one or more commands to wakeup the video conference endpoint only if the face and motion detection confirms the presence of people in the spatial region.
  2. The method of claim 1, wherein:
    repetitively computing the power estimate includes computing a moving average of power estimates over time, wherein the moving average is based on a current power estimate of the error signal and one or more previous power estimates of the error signal; and
    the detecting includes detecting a change in people presence over time based on a change in the moving average over time.
  3. The method of claim 2, wherein the detecting includes:
    comparing the moving average of power estimates to a moving average power threshold indicative of a change in people presence in the spatial region; and
    if the comparing indicates that the moving average of power estimates has changed from a first level below the threshold to a second level equal to or greater than the threshold, then declaring that people are present in the spatial region.
  4. The method of claim 1, wherein the detecting includes:
    comparing the total power estimate to a power estimate threshold indicative of a change in people presence in the spatial region; and
    if the comparing indicates that the total power estimate has changed from a first level below the threshold to a second level equal to or greater than the threshold, then declaring people are present in the spatial region.
  5. A video conference endpoint (104) comprising:
    a loudspeaker (116) configured to transmit (510) an ultrasonic signal into a spatial region;
    a video camera (112) arranged to be able to capture video of at least a portion of the spatial region;
    a microphone (118) configured to transduce (515) ultrasonic sound, including an echo of the transmitted ultrasonic signal, received from the spatial region into a received ultrasonic signal;
    a processor coupled to the loudspeaker (116), the video camera (112) and the microphone (118) wherein the processor is configured to:
    transform (520) the ultrasonic signal and the received ultrasonic signal into respective time-frequency domains that cover respective ultrasonic frequency subbands;
    compute (525) an error signal, representative of an estimate of an echo-free received ultrasonic signal, based on the transformed ultrasonic signal and the transformed received ultrasonic signal by:
    adaptively filtering multiple ultrasonic frequency subbands of the transformed ultrasonic signal based on a set of adaptive filter coefficients adjusted responsive to the error signal individually to produce a respective estimate of the echo for each of the ultrasonic frequency subbands;
    differencing each echo estimate and a corresponding one of multiple ultrasonic frequency subbands of the transformed received ultrasonic signal to produce the error signal for each of the ultrasonic frequency subbands; and
    feeding-back the error signal to the adaptively filtering operation;
    repetitively compute (530) a total power estimate based on the error signals across the ultrasonic frequency subbands over time; and
    detect (535) a change in people presence in the spatial region over time based on a change in the total power estimates of the error signal computed across the ultrasonic frequency subbands over time;
    if the detect operation indicates people are present in the spatial region, issue (540) a wakeup command to the video camera to wakeup if the video camera was previously asleep so that the video camera is able to capture video of at least a portion of the spatial region;
    perform face and motion detection based on video of the spatial region captured by the video camera; and
    issue one or more commands to wakeup the video conference endpoint only if the face and motion detection confirms the presence of people in the spatial region.
  6. The apparatus of claim 5, wherein:
    the processor is further configured to repetitively compute the total power estimate by computing a moving average of power estimates over time, wherein the moving average is based on a current power estimate of the error signal and one or more previous power estimates of the error signal; and
    the detect operation includes detecting a change in people presence over time based on a change in the moving average over time.
  7. The apparatus of claim 6, wherein the detect operation includes:
    comparing the moving average of power estimates to a moving average power threshold indicative of a change in people presence in the spatial region; and
    if the comparing indicates that the moving average of power estimates has changed from a first level below the threshold to a second level equal to or greater than the threshold, then declaring that people are present in the spatial region.
  8. The apparatus of claim 5, wherein the detect operation includes:
    comparing the total power estimate to a power estimate threshold indicative of a change in people presence in the spatial region; and
    if the comparing indicates that the total power estimate has changed from a first level below the threshold to a second level equal to or greater than the threshold, then declaring people are present in the spatial region.
  9. A non-transitory processor readable medium storing instructions that, when executed by a processor, cause the processor to:
    cause a loudspeaker (116) to transmit (510) an ultrasonic signal into a spatial region;
    access a received ultrasonic signal representative of transduced ultrasonic sound, including an echo of the transmitted ultrasonic signal, received from the spatial region at a microphone (118);
    transform (520) the ultrasonic signal and the received ultrasonic signal into respective time-frequency domains that cover respective ultrasonic frequency subbands;
    compute (525) an error signal, representative of an estimate of an echo-free received ultrasonic signal, based on the transformed ultrasonic signal and the transformed received ultrasonic signal by:
    adaptively filtering multiple ultrasonic frequency subbands of the transformed ultrasonic signal based on a set of adaptive filter coefficients adjusted responsive to the error signal individually to produce a respective estimate of the echo for each ultrasonic frequency subband;
    differencing each echo estimate and a corresponding one of multiple ultrasonic frequency subbands of the transformed received ultrasonic signal to produce the error signal for each ultrasonic frequency subband; and
    feeding-back the error signal to the adaptively filtering operation;
    repetitively compute (530) a total power estimate based on the error signals across the ultrasonic frequency subbands over time;
    detect (535) a change in people presence in the spatial region over time based on a change in the total power estimates of the error signal computed across the ultrasonic frequency subbands over time;
    if the detect operation indicates that people are present in the spatial region, issue (540) a command to wakeup a video camera that was previously asleep so that the video camera is able to capture video of at least a portion of the spatial region;
    perform face and motion detection based on video of the spatial region captured by the video camera; and
    issue one or more commands to wakeup the video conference endpoint only if the face and motion detection confirms the presence of people in the spatial region.
  10. The processor readable medium of claim 9, wherein the instructions further cause the processor to:
    repetitively compute the power estimate by computing a moving average of power estimates over time, wherein the moving average is based on a current power estimate of the error signal and one or more previous power estimates of the error signal; and
    detect a change in people presence over time based on a change in the moving average over time.
  11. The processor readable medium of claim 10, wherein the instructions further cause the processor to:
    compare the moving average of power estimates to a moving average power threshold indicative of a change in people presence in the spatial region; and
    if the comparing indicates that the moving average of power estimates has changed from a first level below the threshold to a second level equal to or greater than the threshold, then declare that people are present in the spatial region.
  12. The processor readable medium of claim 9, wherein the instructions further cause the processor to:
    compare the total power estimate to a power estimate threshold indicative of a change in people presence in the spatial region; and
    if the comparing indicates that the total power estimate has changed from a first level below the threshold to a second level equal to or greater than the threshold, then declare people are present in the spatial region.
EP16713210.9A 2015-03-19 2016-03-15 Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint Active EP3271744B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/662,691 US9319633B1 (en) 2015-03-19 2015-03-19 Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint
PCT/US2016/022422 WO2016149245A1 (en) 2015-03-19 2016-03-15 Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint

Publications (2)

Publication Number Publication Date
EP3271744A1 EP3271744A1 (en) 2018-01-24
EP3271744B1 true EP3271744B1 (en) 2020-08-26

Family

ID=55642880

Family Applications (1)

Application Number Title Priority Date Filing Date
EP16713210.9A Active EP3271744B1 (en) 2015-03-19 2016-03-15 Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint

Country Status (3)

Country Link
US (1) US9319633B1 (en)
EP (1) EP3271744B1 (en)
WO (1) WO2016149245A1 (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2989986B1 (en) * 2014-09-01 2019-12-18 Samsung Medison Co., Ltd. Ultrasound diagnosis apparatus and method of operating the same
US9838646B2 (en) 2015-09-24 2017-12-05 Cisco Technology, Inc. Attenuation of loudspeaker in microphone array
US10024712B2 (en) * 2016-04-19 2018-07-17 Harman International Industries, Incorporated Acoustic presence detector
US10473751B2 (en) 2017-04-25 2019-11-12 Cisco Technology, Inc. Audio based motion detection
US10141973B1 (en) 2017-06-23 2018-11-27 Cisco Technology, Inc. Endpoint proximity pairing using acoustic spread spectrum token exchange and ranging information
CN109429136A (en) * 2017-08-31 2019-03-05 台南科技大学 Mute ultralow frequency sound wave sleeping system and device
CN107785027B (en) * 2017-10-31 2020-02-14 维沃移动通信有限公司 Audio processing method and electronic equipment
CN108093350B (en) * 2017-12-21 2020-12-15 广东小天才科技有限公司 Microphone control method and microphone
US10267912B1 (en) 2018-05-16 2019-04-23 Cisco Technology, Inc. Audio based motion detection in shared spaces using statistical prediction
US10297266B1 (en) 2018-06-15 2019-05-21 Cisco Technology, Inc. Adaptive noise cancellation for multiple audio endpoints in a shared space
GB2587231B (en) 2019-09-20 2024-04-17 Neatframe Ltd Ultrasonic-based person detection system and method
US11395091B2 (en) * 2020-07-02 2022-07-19 Cisco Technology, Inc. Motion detection triggered wake-up for collaboration endpoints
US10992905B1 (en) * 2020-07-02 2021-04-27 Cisco Technology, Inc. Motion detection triggered wake-up for collaboration endpoints
US12044810B2 (en) 2021-12-28 2024-07-23 Samsung Electronics Co., Ltd. On-device user presence detection using low power acoustics in the presence of multi-path sound propagation

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3938428C1 (en) * 1989-11-18 1991-04-18 Standard Elektrik Lorenz Ag, 7000 Stuttgart, De
JPH07146988A (en) * 1993-11-24 1995-06-06 Nippon Telegr & Teleph Corp <Ntt> Body movement detecting device
US20090046538A1 (en) * 1995-06-07 2009-02-19 Automotive Technologies International, Inc. Apparatus and method for Determining Presence of Objects in a Vehicle
US6108028A (en) * 1998-11-02 2000-08-22 Intel Corporation Method of activating and deactivating a screen saver in a video conferencing system
US6374145B1 (en) 1998-12-14 2002-04-16 Mark Lignoul Proximity sensor for screen saver and password delay
US20080224863A1 (en) * 2005-10-07 2008-09-18 Harry Bachmann Method for Monitoring a Room and an Apparatus For Carrying Out the Method
US20100226487A1 (en) * 2009-03-09 2010-09-09 Polycom, Inc. Method & apparatus for controlling the state of a communication system
US8842153B2 (en) * 2010-04-27 2014-09-23 Lifesize Communications, Inc. Automatically customizing a conferencing system based on proximity of a participant
CN102893175B (en) 2010-05-20 2014-10-29 皇家飞利浦电子股份有限公司 Distance estimation using sound signals
US8907929B2 (en) 2010-06-29 2014-12-09 Qualcomm Incorporated Touchless sensing and gesture recognition using continuous wave ultrasound signals
US9313454B2 (en) * 2011-06-07 2016-04-12 Intel Corporation Automated privacy adjustments to video conferencing streams
US9363386B2 (en) 2011-11-23 2016-06-07 Qualcomm Incorporated Acoustic echo cancellation based on ultrasound motion detection
WO2013187869A1 (en) * 2012-06-11 2013-12-19 Intel Corporation Providing spontaneous connection and interaction between local and remote interaction devices

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None *

Also Published As

Publication number Publication date
US9319633B1 (en) 2016-04-19
WO2016149245A1 (en) 2016-09-22
EP3271744A1 (en) 2018-01-24

Similar Documents

Publication Publication Date Title
EP3271744B1 (en) Ultrasonic echo canceler-based technique to detect participant presence at a video conference endpoint
EP2783504B1 (en) Acoustic echo cancellation based on ultrasound motion detection
US10473751B2 (en) Audio based motion detection
EP3257236B1 (en) Nearby talker obscuring, duplicate dialogue amelioration and automatic muting of acoustically proximate participants
EP2692123B1 (en) Determining the distance and/or acoustic quality between a mobile device and a base unit
US10267912B1 (en) Audio based motion detection in shared spaces using statistical prediction
US10924872B2 (en) Auxiliary signal for detecting microphone impairment
EP3061242B1 (en) Acoustic echo control for automated speaker tracking systems
US9462552B1 (en) Adaptive power control
US8103011B2 (en) Signal detection using multiple detectors
KR102409536B1 (en) Event detection for playback management on audio devices
Enzner Bayesian inference model for applications of time-varying acoustic system identification
EP2700161B1 (en) Processing audio signals
US20190132452A1 (en) Acoustic echo cancellation based sub band domain active speaker detection for audio and video conferencing applications
JP2022542962A (en) Acoustic Echo Cancellation Control for Distributed Audio Devices
KR20170017381A (en) Terminal and method for operaing terminal
US9225937B2 (en) Ultrasound pairing signal control in a teleconferencing system
Favrot et al. Adaptive equalizer for acoustic feedback control
EP3332558B1 (en) Event detection for playback management in an audio device
US20230421952A1 (en) Subband domain acoustic echo canceller based acoustic state estimator
Ahgren et al. A study of doubletalk detection performance in the presence of acoustic echo path changes
KR20230087525A (en) Method and device for variable pitch echo cancellation
CN116156256A (en) Equipment state control method, device, equipment and medium
Fozunbal et al. A decision-making framework for acoustic echo cancellation

Legal Events

Code  Title / Description
STAA  Status events: the international publication has been made; request for examination was made; examination is in progress; grant of patent is intended; the patent has been granted; no opposition filed within time limit.
PUAI  Public reference made under article 153(3) EPC to a published international application that has entered the European phase (original code: 0009012).
17P   Request for examination filed (effective date: 20170718).
AK    Designated contracting states (kind codes A1 and B1): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR.
AX    Request for extension of the European patent (extension states: BA ME).
DAV   Request for validation of the European patent (deleted).
DAX   Request for extension of the European patent (deleted).
17Q   First examination report despatched (effective date: 20190611).
GRAP  Despatch of communication of intention to grant a patent (original code: EPIDOSNIGR1).
INTG  Intention to grant announced (effective date: 20200320).
GRAS  Grant fee paid (original code: EPIDOSNIGR3).
GRAA  (Expected) grant (original code: 0009210).
REG   References to national codes: GB FG4D; CH EP, later PL; DE R096 and R097 (ref. document 602016042695); AT REF (ref. document 1306887, kind code T, effective date: 20200915) and MK05 (same document, effective date: 20200826); IE FG4D; LT MG4D; NL MP (effective date: 20200826); BE MM (effective date: 20210331).
PLBE  No opposition filed within time limit (original code: 0009261).
26N   No opposition filed (effective date: 20210527).
P01   Opt-out of the competence of the unified patent court (UPC) registered (effective date: 20230525).
PG25  Lapsed in a contracting state for failure to submit a translation of the description or to pay the fee within the prescribed time limit: SE, HR, LT, FI, NL, LV, RS, PL, EE, RO, SM, DK, CZ, AL, AT, ES, SK, IT, SI, MC, CY, MK, TR (effective date: 20200826); BG, NO (20201126); GR (20201127); IS (20201226); PT (20201228); HU, invalid ab initio (20160315).
PG25  Lapsed in a contracting state because of non-payment of due fees: IE, LU (effective date: 20210315); CH, LI, BE (20210331).
PGFP  Annual fee paid to national office (year of fee payment: 9): DE (payment date: 20240319); GB (20240320); FR (20240327).