CN113023512A - Intelligent elevator voice control method and device, elevator equipment and storage medium - Google Patents

Publication number
CN113023512A
Authority
CN
China
Prior art keywords
voice
elevator
video information
sound source
command
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110224729.7A
Other languages
Chinese (zh)
Inventor
林梦圆
李晖
邱钊鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Polytechnic
Original Assignee
Beijing Polytechnic
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Polytechnic
Priority to CN202110224729.7A
Publication of CN113023512A

Classifications

    • B66B1/14 Control systems without regulation, i.e. without retroactive action, electric, with devices, e.g. push-buttons, for indirect control of movements
    • B66B1/3415 Control system configuration and the data transmission or communication within the control system
    • B66B5/0006 Monitoring devices or performance analysers
    • G10L15/1822 Parsing for meaning understanding
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • B66B2201/4646 Wherein the call is registered without making physical contact with the elevator system using voice recognition
    • G10L2015/223 Execution procedure of a spoken command
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Automation & Control Theory (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • Indicating And Signalling Devices For Elevators (AREA)

Abstract

The invention belongs to the technical field of elevator equipment. It addresses the energy waste and poor user experience that arise when elevators are hard to operate by button during periods of concentrated use, and provides an intelligent elevator voice control method, an intelligent elevator voice control device, elevator equipment and a storage medium. The method comprises: collecting audio information and video information of a target area; analyzing the audio information in combination with the video information, and outputting each voice instruction; and parsing each voice instruction to generate a control command that controls the operation of the elevator. The invention also comprises a device for carrying out the method, elevator equipment and a storage medium. By combining video and audio, the voice instructions are obtained accurately and contactless control of the elevator is achieved, with high reliability and a good user experience.

Description

Intelligent elevator voice control method and device, elevator equipment and storage medium
Technical Field
The invention relates to the technical field of elevator equipment, in particular to an intelligent elevator voice control method and device, elevator equipment and a storage medium.
Background
Elevators have become an indispensable part of daily life and work. In many office buildings and shopping malls, the density of people is very high, so elevators must be used to move people up and down.
In the prior art, buttons are generally used to directly control the elevator's travel up and down. However, during periods of concentrated use, such as commuting hours or busy shopping periods, the flow of people is large and some passengers cannot reach the button area, so they cannot select a floor. This wastes energy and degrades the user experience.
Disclosure of Invention
In view of this, embodiments of the present invention provide an intelligent elevator voice control method and device, elevator equipment, and a storage medium, so as to solve the technical problems of energy waste and poor user experience caused by the difficulty of operating an elevator by button during periods of concentrated use.
The technical scheme adopted by the invention is as follows:
The invention provides an intelligent elevator voice control method, which comprises the following steps:
S1: collecting audio information and video information of a target area;
S2: analyzing the audio information in combination with the video information, and outputting each voice instruction;
S3: parsing each voice instruction, and generating a control command to control the operation of the elevator.
Preferably, the target area comprises a first area inside the elevator car and/or a second area outside the elevator car.
Preferably, S1 includes:
S11: acquiring a first instruction and a second instruction while the elevator is running;
S12: controlling the image sensor, according to the first instruction, to start acquiring video information of the target area;
S13: controlling the sound sensor, according to the second instruction, to start collecting audio information of the target area;
wherein the first instruction is executed prior to the second instruction.
Preferably, S2 includes:
S21: determining the initial sound source position of each voice according to the audio information;
S22: determining a sound source target corresponding to each voice according to each initial sound source position and the video information;
S23: processing the voice corresponding to each sound source target according to the video information, and outputting each reliable voice;
S24: analyzing each reliable voice and outputting the voice instructions.
Preferably, S21 includes:
S211: performing simultaneous sound elimination processing on each voice;
S212: performing anti-reverberation processing on each voice that has undergone simultaneous sound elimination processing;
S213: performing sound enhancement processing and automatic gain control on each voice that has undergone anti-reverberation processing, and outputting the initial sound source position of each voice.
Preferably, S23 includes:
S231: outputting the moving track of each sound source target according to the video information;
S232: obtaining all sound source positions corresponding to each voice according to each moving track;
S233: processing the audio information according to all sound source positions of each voice, and outputting each reliable voice.
Alternatively, S23 includes:
S234: outputting the category information and the moving track of each sound source target according to the video information;
S235: screening the voices in the audio information according to the category information and the moving track, and outputting each reliable voice.
The invention also provides an intelligent elevator voice control device, which comprises:
a data acquisition module, configured to collect audio information and video information of a target area;
a data processing module, configured to analyze the audio information in combination with the video information and output each voice instruction;
a data conversion module, configured to parse each voice instruction and generate a control command to control the operation of the elevator.
The present invention also provides an electronic device, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory that, when executed by the processor, implement the method of any of the above.
The invention also provides a storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method of any of the above.
In conclusion, the beneficial effects of the invention are as follows:
According to the intelligent elevator voice control method, device, elevator equipment and storage medium of the invention, audio information and video information are acquired, the voice instructions in the audio information are determined in combination with the video information, and the voice instructions are then parsed to generate the corresponding control commands, which control the operation of the elevator. Combining video and audio allows the voice instructions to be obtained accurately and the elevator to be controlled without contact, with high reliability and a good user experience. In addition, the time the elevator door stays open can be controlled accurately according to the video information and the audio information, which saves energy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below, and for those skilled in the art, without any creative effort, other drawings may be obtained according to the drawings, and these drawings are all within the protection scope of the present invention.
Fig. 1 is a flow chart of a voice control method of an intelligent elevator in embodiment 1 of the invention;
Fig. 2 is a schematic structural diagram of a voice-controlled intelligent elevator in embodiment 1 of the present invention;
Fig. 3 is a schematic structural diagram of an interaction mechanism installation in embodiment 1 of the present invention;
Fig. 4 is a schematic structural diagram of an interaction area in embodiment 1 of the present invention;
Fig. 5 is a schematic flowchart of acquiring audio information and video information according to embodiment 1 of the present invention;
Fig. 6 is a schematic structural view of an interaction zone baffle in embodiment 1 of the present invention;
Fig. 7 is a flowchart illustrating a process of obtaining a voice command according to embodiment 1 of the present invention;
Fig. 8 is a schematic flowchart of acquiring the sound source position in embodiment 1 of the present invention;
Fig. 9 is a schematic flowchart of acquiring reliable speech in embodiment 1 of the present invention;
Fig. 10 is a schematic flow chart of reliable speech screening in embodiment 1 of the present invention;
Fig. 11 is a schematic structural diagram of voice control in embodiment 1 of the present invention;
Fig. 12 is a flowchart illustrating a previous deletion of audio and video data according to embodiment 1 of the present invention;
Fig. 13 is a schematic structural view of a sound-absorbing band on the inner wall of an elevator cage in embodiment 1 of the present invention;
Fig. 14 is a schematic structural view of an intelligent elevator voice control device in embodiment 2 of the present invention;
Fig. 15 is a schematic structural view of an elevator apparatus in embodiment 3 of the present invention.
Reference numerals of fig. 1 to 15:
1. an inner wall; 11. a key area; 12. an interaction area; 131. a second display area; 132. a first display area; 133. an information acquisition area; 1331. a microphone; 1332. a camera; 1333. a baffle plate; 1334. a chute; 1335. a first baffle plate; 1336. a second baffle; 134. an interaction mechanism; 135. mounting holes; 136. a groove; 14. a first inner wall; 15. a second inner wall; 16. a sound absorbing band; 2. a sound absorbing layer; 3. a housing.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. In the description of the present invention, it is to be understood that the terms "center", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings. They are used only for convenience and simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, or be constructed and operated in a particular orientation; they are therefore not to be construed as limiting the present invention. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element. Provided there is no conflict, the embodiments of the present invention and the various features of the embodiments may be combined with each other within the scope of the present invention.
Example 1
The intelligent elevator voice control method mainly realizes non-contact control of an elevator in a voice instruction mode from the perspective of combining hardware and a software algorithm.
Referring to fig. 1, fig. 1 is a schematic flow chart of an intelligent elevator voice control method in embodiment 1 of the present invention.
S1: collecting audio information and video information of a target area;
Specifically, at least one sound sensor for acquiring audio information, such as a microphone 1331, and at least one image sensor for acquiring video information, such as a camera 1332, are installed inside and/or outside the elevator car to cover the target area. The target area comprises a first area inside the elevator car and/or a second area outside the elevator car. After the elevator door opens, or when the elevator has stopped at a position but the door has not yet opened, the camera 1332 starts to acquire video information of the target area; after the elevator door opens, the microphone 1331 starts to acquire audio information of the target area.
Referring to fig. 2 and 3, the elevator car comprises a housing 3, an inner wall 1 and an intermediate layer. The inner wall 1 is provided with a key area 11 and an interaction area 12, and the intermediate layer is provided with a groove 136. A mounting hole 135 for mounting the interaction mechanism 134 of the interaction area 12 is arranged in the inner wall 1 at a position corresponding to the groove 136; each component of the interaction area 12 is directly or indirectly fixed at the mounting hole 135 and accommodated in the groove 136.
Referring to fig. 4, the interaction area 12 includes a first display area 132, a second display area 131 and an information acquisition area 133. The first display area 132 is used for displaying the current status of the elevator door, the second display area 131 is used for displaying the current operating status of the elevator, and the information acquisition area 133 is used for installing the camera 1332 and the microphone 1331. The first display screen of the first display area 132, the second display screen of the second display area 131, the camera 1332 and the microphone 1331 are all electrically connected with a controller (not shown). The intermediate layer is a sound absorbing layer 2, which includes at least one protrusion with an arc-shaped surface; the sound absorbing layer 2 is formed by firing a fiber material.
In one embodiment, referring to fig. 5, the S1 includes:
S11: acquiring a first instruction and a second instruction while the elevator is running;
Specifically, the elevator executes the first instruction before stopping, and the first instruction is used to start the camera 1332; the elevator executes the second instruction after stopping, and the second instruction is used to start the microphone 1331. Referring to fig. 6, a chute 1334 and a baffle 1333 are provided at the opening of the information acquisition area 133, and the baffle 1333 can slide along the chute 1334 to hide or expose the camera 1332 and/or the microphone 1331. When the camera 1332 and/or the microphone 1331 need to be turned on, the baffle 1333 moves along the chute 1334 to expose them; when they do not need to be on, the baffle 1333 moves along the chute 1334 to hide them. The baffle 1333 may be driven directly by a motor, or moved and reset by a mechanical structure. The baffle 1333 covering the camera 1332 and the microphone 1331 may be a single piece (left of fig. 6), or the baffle 1333 may include a first baffle 1335 for the camera 1332 and a second baffle 1336 for the microphone 1331 (right of fig. 6). If the baffle 1333 is a single piece, the chute 1334 may be arranged along the X direction or along the Y direction.
S12: controlling the image sensor to start to acquire video information of a target area according to the first instruction;
Specifically, when the elevator is about to stop, the first instruction is executed: the baffle 1333 slides along the chute 1334 so that the camera 1332, or both the camera 1332 and the microphone 1331, is exposed, and the camera 1332 then starts to acquire video information of the target area. The position of each target in the elevator car can be determined from the video information of the first area, and the order in which the targets will reach their floors can be inferred from their movement paths: a target about to arrive instinctively moves toward the elevator door, while a target arriving later instinctively moves away from it. The targets outside the elevator car that need to take the elevator can be determined from the video information of the second area. In addition, if the video information of the second area shows that a target needing the elevator has arrived while the door is closing, or after it has closed but before the elevator has started moving, the elevator can open the door again and wait for the target to enter. This spares a target who could not reach the button in time from having to wait for the next elevator, saves resources, and improves the user experience.
S13: controlling the sound sensor to start to collect the audio information of the target area according to the second instruction;
Specifically, after the elevator stops, the second instruction is executed, and the microphone 1331 then starts to collect audio information inside and outside the elevator car.
Wherein the first instruction is executed prior to the second instruction.
Specifically, the first instruction is generated before the second instruction, or the two instructions are generated simultaneously and the first instruction is executed before the second. Before the elevator stops, no new valid voice instruction is produced for the elevator car, so the microphone 1331 and the camera 1332 do not acquire data; this reduces the amount of data to be processed, prevents voice instructions from being recognized by mistake, and improves the efficiency of voice control. Meanwhile, the camera 1332 starts working first in order to capture how the targets in the elevator car change position: for example, a target bound for a lower floor who is standing far from the door moves toward the door, and a target bound for a higher floor who is standing near the door moves away from it, so that these position adjustments are completed before the elevator door opens. After the elevator door opens, target detection by the camera 1332 is concentrated mainly near the elevator door, which reduces the processing load and improves detection accuracy.
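The ordering constraint between the two instructions can be illustrated with a short sketch. This is only an illustration under assumptions, not the patented implementation: the class name `SensorSequencer` and the event names `approaching_stop` and `stopped` are hypothetical and do not appear in the source.

```python
class SensorSequencer:
    """Hypothetical sketch: the camera must start before the microphone."""

    def __init__(self):
        self.camera_on = False
        self.microphone_on = False
        self.log = []  # order in which the sensors were started

    def first_instruction(self):
        # First instruction: expose and start the camera (image sensor).
        self.camera_on = True
        self.log.append("camera")

    def second_instruction(self):
        # Second instruction: start the microphone, only once the camera runs.
        if not self.camera_on:
            raise RuntimeError("first instruction must run before the second")
        self.microphone_on = True
        self.log.append("microphone")

    def on_event(self, event):
        # Assumed event names: the car is about to stop, then has stopped.
        if event == "approaching_stop":
            self.first_instruction()
        elif event == "stopped":
            self.second_instruction()


seq = SensorSequencer()
for ev in ["approaching_stop", "stopped"]:
    seq.on_event(ev)
```

Raising an error when the second instruction arrives first is one possible design choice; a real controller might instead queue the second instruction until the camera is running.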
S2: analyzing the audio information by combining the video information, and outputting each voice instruction;
Specifically, the audio information collected by the microphone 1331 is processed to obtain at least one voice instruction for controlling the operation of the elevator, where the voice instruction includes at least one of the following: go up, go down, go to floor A, close the door, and so on. Combining the video information makes it possible to determine which voice instruction was issued by which target and improves the accuracy of audio analysis. For example, if a target issues a voice instruction while entering the elevator, the sound source position of the audio changes; with the video information, the source of each voice instruction can be distinguished quickly and accurately, and targeted noise elimination can also be achieved.
In one embodiment, referring to fig. 7, the S2 includes:
S21: determining the initial sound source position of each voice according to the audio information;
Specifically, the sounding position corresponding to each voice is determined from the audio information received by the microphone 1331 and recorded as the initial sound source position of that voice.
In one embodiment, referring to fig. 8, the S21 includes:
S211: performing simultaneous sound elimination processing on each voice;
Specifically, at least two microphones 1331 are provided. The position of the sound source is determined from the time difference with which the microphones 1331 receive the same voice, and beamforming is then used to pick up the voice from that sound source position. This improves the reliability and accuracy of voice instruction acquisition, reduces interference from ambient noise, and improves the response rate and reliability of voice instructions.
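The time-difference idea can be sketched for a single pair of microphones in plain Python. The sketch below rests on several assumptions that are not in the source: a far-field model, a speed of sound of 343 m/s, a brute-force cross-correlation, and the function names `estimate_delay`, `arrival_angle` and `delay_and_sum`. A single pair yields only an arrival direction; locating a position would need more microphones or the video information.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def estimate_delay(sig_a, sig_b, max_lag):
    """Return the lag (in samples) by which sig_b trails sig_a."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]  # cross-correlation at this lag
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle(delay_samples, sample_rate, mic_spacing):
    """Far-field angle (degrees) from broadside for a two-microphone pair."""
    path_diff = delay_samples / sample_rate * SPEED_OF_SOUND
    ratio = max(-1.0, min(1.0, path_diff / mic_spacing))
    return math.degrees(math.asin(ratio))


def delay_and_sum(sig_a, sig_b, delay_samples):
    """Align sig_b to sig_a by the estimated delay and average the two."""
    out = []
    for i, a in enumerate(sig_a):
        j = i + delay_samples
        b = sig_b[j] if 0 <= j < len(sig_b) else 0.0
        out.append(0.5 * (a + b))
    return out


# The same pulse reaches microphone B three samples after microphone A.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag = estimate_delay(mic_a, mic_b, max_lag=5)
angle = arrival_angle(lag, sample_rate=16000, mic_spacing=0.1)
beam = delay_and_sum(mic_a, mic_b, lag)
```

Delay-and-sum is the simplest form of beamforming: signals from the steered direction add coherently, while sound from other directions is attenuated by the averaging.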
S212: performing anti-reverberation processing on each voice subjected to simultaneous acoustic elimination processing;
specifically, by performing anti-reverberation processing on each voice subjected to the simultaneous acoustic cancellation processing, target detection under a reverberation background can be effectively realized, and adverse effects caused by Doppler mismatch can be well overcome.
S213: and carrying out sound enhancement processing and automatic gain control on each voice subjected to the anti-reverberation processing, and outputting the initial sound source position of each voice.
S22: determining a sound source target corresponding to each voice according to each initial sound source position and the video information;
Specifically, according to each initial sound source position, the target that produced each voice is located in the video information and recorded as the sound source target.
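One simple way to associate a voice with a target seen in the video is nearest-neighbour matching in a shared floor-plane coordinate system. The sketch below assumes such a shared coordinate system exists; the function name `match_sound_sources`, the `max_distance` threshold and the sample coordinates are hypothetical.

```python
import math


def match_sound_sources(initial_positions, video_targets, max_distance=1.0):
    """Map each voice id to the id of the nearest detected target, or None."""
    matches = {}
    for voice_id, (vx, vy) in initial_positions.items():
        best_id, best_dist = None, max_distance
        for target_id, (tx, ty) in video_targets.items():
            dist = math.hypot(vx - tx, vy - ty)
            if dist <= best_dist:
                best_id, best_dist = target_id, dist
        matches[voice_id] = best_id  # None if no target is close enough
    return matches


# Positions in metres; all values are made up for illustration.
voices = {"v1": (0.2, 0.3), "v2": (1.5, 0.1), "v3": (5.0, 5.0)}
targets = {"person_a": (0.25, 0.35), "person_b": (1.4, 0.0)}
matches = match_sound_sources(voices, targets)
```

A voice with no target within `max_distance` is left unmatched, which gives later steps a way to treat it as unreliable.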
S23: processing the voice corresponding to each sound source target according to the video information, and outputting each reliable voice;
In one embodiment, referring to fig. 9, S23 includes:
S231: outputting the moving track of each sound source target according to the video information;
S232: obtaining all sound source positions corresponding to each voice according to each moving track;
Specifically, the moving track of each sound source target is determined from the video information, the positions at which the utterance of a given voice starts and ends are extracted, and all positions occupied by the target while uttering that voice are thereby determined.
S233: and processing the audio information according to all sound source positions of each voice, and outputting each reliable voice.
Specifically, each voice is composed of audio signals produced by the target at different positions. Because the sound source position changes, the audio of the same voice reaches the microphone 1331 with different time differences. Using these time differences, other audio signals arriving from a previous sounding position are ignored, which reduces the amount of noise data to be processed, and each processed voice is output as a high-reliability voice.
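Steps S231 to S233 can be sketched as follows. The source does not specify how the time differences are used to discard audio, so this sketch covers only the part that is clear: extracting the track samples that fall inside an utterance and computing when sound emitted at each position would reach a microphone. The function names, the `(t, x, y)` track format and the 343 m/s speed of sound are assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed


def positions_during_voice(track, t_start, t_end):
    """Keep the (t, x, y) track samples that fall inside the utterance."""
    return [(t, x, y) for (t, x, y) in track if t_start <= t <= t_end]


def expected_arrival_times(samples, mic_xy):
    """Time at which sound emitted at each sample reaches the microphone."""
    mx, my = mic_xy
    return [t + math.hypot(x - mx, y - my) / SPEED_OF_SOUND
            for (t, x, y) in samples]


# A target walks toward a microphone at the origin while speaking.
track = [(0.0, 3.0, 0.0), (0.5, 2.0, 0.0), (1.0, 1.0, 0.0), (1.5, 0.5, 0.0)]
samples = positions_during_voice(track, t_start=0.4, t_end=1.2)
arrivals = expected_arrival_times(samples, mic_xy=(0.0, 0.0))
```

Audio whose arrival time does not fit any expected arrival for the tracked positions could then be treated as coming from elsewhere; that filtering rule is an interpretation, not something the source states.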
In another embodiment, referring to fig. 10, S23 includes:
S234: outputting the category information and the moving track of each sound source target according to the video information;
Specifically, the category of a sound source target includes at least one of the following: human, pet, toy, doll.
S235: and screening all the voices of the audio information according to the category information and the moving track, and outputting all the reliable voices.
Specifically, the moving direction of each target is determined from its moving track. The moving direction is either into the elevator or out of the elevator, and all audio corresponding to targets moving out of the elevator is ignored. The targets moving into the elevator are then classified, all audio corresponding to pets and dolls is ignored, and the voices of the remaining targets moving into the elevator are output as reliable voices. This improves the efficiency and reliability of reliable-voice screening.
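The screening in S234 and S235 can be sketched as a two-stage filter: first by moving direction, then by category. The data layout is an assumption: each track is a list of depth values where a larger value means farther inside the car, and the names `moving_direction`, `screen_reliable_voices` and `IGNORED_CATEGORIES` are hypothetical. The ignored set follows the description (pets and dolls); whether toys are also ignored is not stated, so the set is left easy to change.

```python
IGNORED_CATEGORIES = {"pet", "doll"}  # per the description; adjust as needed


def moving_direction(track):
    """track: list of depth values, larger = farther inside the car (assumed)."""
    if len(track) < 2 or track[-1] == track[0]:
        return "unknown"
    return "in" if track[-1] > track[0] else "out"


def screen_reliable_voices(voices, targets):
    """Return ids of voices whose source target moves in and is not ignored."""
    reliable = []
    for voice_id, target_id in voices.items():
        target = targets.get(target_id)
        if target is None:
            continue  # no matching sound source target in the video
        if moving_direction(target["track"]) != "in":
            continue  # drop audio from targets leaving the elevator
        if target["category"] in IGNORED_CATEGORIES:
            continue  # drop audio attributed to pets and dolls
        reliable.append(voice_id)
    return reliable


targets = {
    "t1": {"category": "human", "track": [-1.0, -0.2, 0.5]},  # walking in
    "t2": {"category": "human", "track": [0.8, 0.1, -0.6]},   # walking out
    "t3": {"category": "pet", "track": [-0.5, 0.4]},          # pet walking in
}
voices = {"v1": "t1", "v2": "t2", "v3": "t3"}
reliable = screen_reliable_voices(voices, targets)
```

Filtering by direction before category mirrors the order given in the text and avoids classifying targets whose audio will be discarded anyway.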
S24: analyzing each reliable voice and outputting the voice commands.
Specifically, each processed reliable voice is analyzed and compared with the voice instructions stored in the database, and each matching voice instruction is output. It is then converted into the corresponding control instruction, such as floor selection or opening or closing the elevator door.
S3: analyzing each voice command, and generating a control command to control the elevator to operate;
Specifically, the voice command extracted from the audio information is analyzed and matched against the stored elevator control commands to generate the control command corresponding to the voice command. For example, as shown in fig. 11, with the elevator stopped at floor 1, if the voice command is "floor 10" or the like, key 10 in the key area 11 lights up according to the control command, the elevator door opens, the first display area 132 displays the door-opening identifier, and the second display area 131 displays the parked state or the floor number 1. The elevator door closes when the camera 1332 detects that all people have entered the elevator, or that nobody is entering, or when the door-open time reaches the threshold, or when a door-closing voice command is received; the first display area 132 then displays the door-closing identifier and the second display area 131 displays the upward identifier. Each identifier comprises at least one of: numbers, words, symbols.
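The matching of a recognized voice command against stored control commands can be sketched as a table lookup. The phrases, the instruction names (`OPEN_DOOR`, `CLOSE_DOOR`, `SELECT_FLOOR`), and the function `to_control_instruction` are assumptions made for illustration; the patent does not define the stored command format.

```python
import re

# Hypothetical stored command table; phrases and instruction names are illustrative only.
FIXED_COMMANDS = {
    "open the door": ("OPEN_DOOR", None),
    "close the door": ("CLOSE_DOOR", None),
}
FLOOR_PATTERN = re.compile(r"\bfloor\s+(\d+)\b")


def to_control_instruction(voice_command):
    """Map recognized text to an (instruction, argument) pair, or None if nothing matches."""
    text = voice_command.strip().lower()
    if text in FIXED_COMMANDS:
        return FIXED_COMMANDS[text]
    match = FLOOR_PATTERN.search(text)
    if match:
        return ("SELECT_FLOOR", int(match.group(1)))
    return None
```

Returning `None` for unmatched text lets the caller ignore speech that is not an elevator command instead of guessing an action.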
In an embodiment, the target area comprises a first area inside the elevator car and/or a second area outside the elevator car.
Specifically, setting up a first area and a second area makes it possible to track accurately who is in and around the elevator car. The video information and audio information of the first area show which targets are riding inside the car, and the video information and audio information of the second area identify targets that still need to board. This also covers special boarding situations. For example, when the car door is about to close or has closed but the elevator has not yet started, the video information and audio information of the second area are used to determine whether a target has not yet reached the keys; if so, the car door is opened again, which saves resources and improves the user experience. The audio and video information can also rule out erroneous door openings, such as those triggered by a target that has just left another elevator, or by a target whose desired riding direction is opposite to the elevator's current running direction. For instance, when the elevator door is about to close and the video information of the second area shows a target that has not entered, the target is analyzed to determine whether it came from another elevator and/or whether it has a corresponding voice, and whether the voice content is consistent with the running direction of the elevator. If a target that has not entered does need to take the elevator, the elevator starts only after all such targets have entered, which improves operating efficiency and saves cost.
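The door-hold decision for the second area can be sketched as follows. This is one possible reading of the paragraph above, not the patent's stated algorithm: the `WaitingTarget` record, its fields, and `should_hold_door` are hypothetical, and a target with no recognized voice is assumed not to trigger a hold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaitingTarget:
    from_other_elevator: bool           # True if the video shows the target leaving another elevator
    requested_direction: Optional[str]  # "up", "down", or None if no voice was recognized


def should_hold_door(waiting_targets, elevator_direction):
    """Hold the door only if some waiting target wants to ride in the elevator's direction."""
    return any(
        not t.from_other_elevator and t.requested_direction == elevator_direction
        for t in waiting_targets
    )
```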
In an embodiment, referring to fig. 12, after step S3 the method further includes:
S4: acquiring a third instruction during operation of the elevator;
Specifically, the elevator issues the third instruction after it starts to move in response to an ascend or descend instruction in the voice command. The third instruction controls the camera 1332 and the microphone 1331 to stop collecting data and enter a hidden state.
S5: turning off the image sensor and the sound sensor according to the third instruction;
Specifically, according to the third instruction, the camera 1332 stops collecting images of the target area, the microphone 1331 stops collecting sound data, and the baffle 1334 slides along the chute so that the camera 1332 and the microphone 1331 are hidden. The camera 1332 and the microphone 1331 are exposed only while the elevator door is open, when they collect voice commands, and are hidden while the elevator is ascending or descending. This removes users' concern about privacy leakage and improves the user experience.
S6: deleting the previous video data collected by the image sensor and the previous audio data collected by the sound sensor according to the third instruction.
Specifically, when the elevator starts to ascend or descend according to the voice command, the previously collected audio information and video information are deleted according to the third instruction, which protects user privacy and frees storage space on the device.
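Steps S4 to S6 can be sketched as a single handler that stops both sensors and clears the buffered data. The `CaptureSession` class and its attribute names are hypothetical; the sketch models sensor state with booleans and buffered data with in-memory lists, and does not model the baffle 1334.

```python
class CaptureSession:
    """Holds sensor state and buffered data for one door-open phase (illustrative only)."""

    def __init__(self):
        self.camera_on = True
        self.microphone_on = True
        self.video_frames = []
        self.audio_chunks = []

    def handle_third_instruction(self):
        # S5: stop both sensors once the elevator starts to move.
        self.camera_on = False
        self.microphone_on = False
        # S6: delete the previously collected video and audio data.
        self.video_frames.clear()
        self.audio_chunks.clear()
```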
In one embodiment, referring to fig. 13, the inner wall 1 of the elevator car is further provided with a sound absorption band 16, which divides the inner wall into a first inner wall 14 and a second inner wall 15. The first inner wall 14 is offset toward the inside of the elevator car and, together with the second inner wall 15, forms an installation station (not shown) matched to the sound absorption band 16. The sound absorption band 16 is installed on the installation station and faces the ground.
Specifically, the inner wall 1 of the elevator car is divided into the first inner wall 14 and the second inner wall 15, with a strip-shaped installation station at their junction on which the sound absorption band 16 is installed. Sound in the elevator car passes through the sound absorption band into the sound absorption layer, so noise in the elevator is reduced quickly and the user experience improves. Because the sound absorption band 16 faces the ground, dust does not accumulate on it.
With the intelligent elevator voice control method of this embodiment, audio information and video information are collected, the voice commands in the audio information are determined with the help of the video information, and the voice commands are then analyzed to generate the corresponding control commands that control the operation of the elevator. Combining video and audio yields accurate voice commands and contactless control of the elevator, with high reliability and a good user experience. The door-open time of the elevator can also be controlled accurately from the video information and the audio information, which saves energy.
Example 2
The invention also provides an intelligent elevator voice control device. Referring to fig. 14, the device comprises:
a data acquisition module: used for collecting audio information and video information of a target area;
a data processing module: used for analyzing the audio information in combination with the video information and outputting each voice instruction;
a data conversion module: used for analyzing each voice command and generating a control command to control the elevator to operate.
With the intelligent elevator voice control device of this embodiment, audio information and video information are collected, the voice commands in the audio information are determined with the help of the video information, and the voice commands are then analyzed to generate the corresponding control commands that control the operation of the elevator. Combining video and audio yields accurate voice commands and contactless control of the elevator, with high reliability and a good user experience. The door-open time of the elevator can also be controlled accurately from the video information and the audio information, which saves energy.
In one embodiment, the data acquisition module comprises:
an instruction acquisition unit: acquiring a first command and a second command in the running process of the elevator;
a video acquisition unit: controlling the image sensor to start to acquire video information of a target area according to the first instruction;
the audio acquisition unit: controlling the sound sensor to start to collect the audio information according to the second instruction;
wherein the first instruction is executed prior to the second instruction.
In one embodiment, the data processing module comprises:
a sound source position acquisition unit: determining the initial sound source position of each voice according to the audio information;
sound source target unit: determining a sound source target corresponding to each voice according to each initial sound source position and the video information;
reliable speech output unit: processing the voice corresponding to each sound source target according to the video information, and outputting each reliable voice;
a voice analysis unit: and analyzing each reliable voice and outputting the voice command.
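The association performed by the sound source target unit can be sketched as a nearest-neighbour match between the initial sound source position and the target positions seen in the video. Nearest-neighbour matching is an assumption chosen for illustration; the patent does not state which matching rule is used, and `match_sound_source` is a hypothetical name.

```python
import math


def match_sound_source(initial_position, targets):
    """Return the id of the video target closest to the estimated sound source position.

    `targets` maps a target id to its (x, y) position derived from the video.
    Returns None when no targets are visible.
    """
    if not targets:
        return None
    return min(targets, key=lambda tid: math.dist(initial_position, targets[tid]))
```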
Preferably, the sound source position acquiring unit includes:
the simultaneous sound elimination unit: carrying out simultaneous elimination processing on each voice;
an anti-reverberation unit: performing anti-reverberation processing on each voice subjected to simultaneous acoustic elimination processing;
a sound enhancement unit: and carrying out sound enhancement processing and automatic gain control on each voice subjected to the anti-reverberation processing, and outputting the initial sound source position of each voice.
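Of the processing stages listed above, automatic gain control is the simplest to illustrate. The sketch below scales a block of samples toward a target RMS level with a capped gain. The target level, the gain cap, and the function names are assumptions; the patent does not describe how its gain control is implemented, and the elimination, anti-reverberation, and enhancement stages are not modelled here.

```python
import math


def rms(samples):
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def automatic_gain_control(samples, target_rms=0.1, max_gain=10.0):
    """Scale samples toward target_rms, never amplifying by more than max_gain."""
    level = rms(samples)
    if level == 0.0:
        return list(samples)  # silence: nothing to scale
    gain = min(target_rms / level, max_gain)
    return [s * gain for s in samples]
```

Capping the gain keeps near-silent blocks, which are mostly noise, from being amplified to the same level as speech.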
In one embodiment, the reliable speech output unit includes:
a movement trajectory unit: outputting the moving track of each sound source target according to the video information;
sound source position output means: obtaining all sound source positions corresponding to all voices according to all the moving tracks;
reliable speech unit: and processing the audio information according to all sound source positions of each voice, and outputting each reliable voice.
In one embodiment, the reliable speech output unit includes:
a video information analysis unit: outputting the category information and the moving track of each sound source target according to the video information;
the voice screening unit: and screening all the voices of the audio information according to the category information and the moving track, and outputting all the reliable voices.
In one embodiment, the data conversion module further comprises:
an operation instruction unit: acquiring a third instruction in the running process of the elevator;
the information acquisition control unit: turning off the image sensor and the sound sensor according to the third instruction;
an information processing unit: and deleting the previous video data collected by the image sensor and the previous audio data collected by the sound sensor according to the third instruction.
Example 3
The present invention provides an elevator apparatus and a storage medium. As shown in fig. 15, the elevator apparatus comprises at least one processor, at least one memory, and computer program instructions stored in the memory.
In particular, the processor may include a Central Processing Unit (CPU), or an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits that may be configured to implement embodiments of the present invention.
The memory may include mass storage for data or instructions. By way of example, and not limitation, memory may include a Hard Disk Drive (HDD), floppy disk drive, flash memory, optical disk, magneto-optical disk, magnetic tape, or Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory may include removable or non-removable (or fixed) media, where appropriate. The memory may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory is non-volatile solid-state memory. In a particular embodiment, the memory includes Read Only Memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor reads and executes the computer program instructions stored in the memory to realize the intelligent elevator voice control method in any one of the above embodiment modes.
In one example, the electronic device may also include a communication interface and a bus. The processor, the memory and the communication interface are connected through a bus and complete mutual communication.
The communication interface is mainly used for realizing communication among modules, devices, units and/or equipment in the embodiment of the invention.
A bus comprises hardware, software, or both that couple components of an electronic device to one another. By way of example, and not limitation, a bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or other suitable bus, or a combination of two or more of these. A bus may include one or more buses, where appropriate. Although specific buses have been described and shown in the embodiments of the invention, any suitable buses or interconnects are contemplated by the invention.
In summary, the embodiments of the present invention provide an intelligent elevator voice control method, an intelligent elevator voice control device, an elevator apparatus, and a storage medium.
According to the intelligent elevator voice control method, the intelligent elevator voice control device, the elevator equipment and the storage medium, audio information and video information are collected, the voice commands in the audio information are determined with the help of the video information, and the voice commands are then analyzed to generate the corresponding control commands that control the operation of the elevator. Combining video and audio yields accurate voice commands and contactless control of the elevator, with high reliability and a good user experience. The door-open time of the elevator can also be controlled accurately from the video information and the audio information, which saves energy.
It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An intelligent elevator voice control method, characterized in that the method comprises:
S1: collecting audio information and video information of a target area;
S2: analyzing the audio information by combining the video information, and outputting each voice instruction;
S3: analyzing each voice command to generate a control command to control the elevator to operate.
2. The intelligent elevator voice control method according to claim 1, characterized in that the target area comprises a first area inside the elevator car and/or a second area outside the elevator car.
3. The intelligent elevator voice control method according to claim 2, wherein the S1 includes:
S11: acquiring a first command and a second command in the running process of the elevator;
S12: controlling the image sensor to start to collect video information of a target area according to the first instruction;
S13: controlling a sound sensor to start to collect the audio information of the target area according to the second instruction;
wherein the first instruction is executed prior to the second instruction.
4. The intelligent elevator voice control method according to claim 3, wherein the S2 includes:
S21: determining the initial sound source position of each voice according to the audio information;
S22: determining a sound source target corresponding to each voice according to each initial sound source position and the video information;
S23: processing the voice corresponding to each sound source target according to the video information, and outputting each reliable voice;
S24: analyzing each reliable voice and outputting the voice command.
5. The intelligent elevator voice control method according to claim 4, wherein the S21 includes:
S211: carrying out simultaneous elimination processing on each voice;
S212: performing anti-reverberation processing on each voice subjected to simultaneous acoustic elimination processing;
S213: carrying out sound enhancement processing and automatic gain control on each voice subjected to the anti-reverberation processing, and outputting the initial sound source position of each voice.
6. The intelligent elevator voice control method according to claim 5, wherein the S23 includes:
S231: outputting the moving track of each sound source target according to the video information;
S232: obtaining all sound source positions corresponding to all voices according to all the moving tracks;
S233: processing the audio information according to all sound source positions of each voice, and outputting each reliable voice.
7. The intelligent elevator voice control method according to claim 5 or 6, wherein the S23 includes:
S234: outputting the category information and the moving track of each sound source target according to the video information;
S235: screening all the voices of the audio information according to the category information and the moving track, and outputting all the reliable voices.
8. An intelligent elevator voice control device, comprising:
a data acquisition module: used for collecting audio information and video information of a target area;
a data processing module: used for analyzing the audio information in combination with the video information and outputting each voice instruction;
a data conversion module: used for analyzing each voice command and generating a control command to control the elevator to operate.
9. An elevator installation, comprising: at least one processor, at least one memory, and computer program instructions stored in the memory that, when executed by the processor, implement the method of any of claims 1-7.
10. A storage medium having computer program instructions stored thereon, which when executed by a processor implement the method of any one of claims 1-7.
CN202110224729.7A 2021-03-01 2021-03-01 Intelligent elevator voice control method and device, elevator equipment and storage medium Pending CN113023512A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110224729.7A CN113023512A (en) 2021-03-01 2021-03-01 Intelligent elevator voice control method and device, elevator equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110224729.7A CN113023512A (en) 2021-03-01 2021-03-01 Intelligent elevator voice control method and device, elevator equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113023512A true CN113023512A (en) 2021-06-25

Family

ID=76466335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110224729.7A Pending CN113023512A (en) 2021-03-01 2021-03-01 Intelligent elevator voice control method and device, elevator equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113023512A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114360515A (en) * 2021-12-09 2022-04-15 北京声智科技有限公司 Information processing method, information processing apparatus, electronic device, information processing medium, and computer program product
CN114368654A (en) * 2021-12-06 2022-04-19 北京声智科技有限公司 Data processing method, device, equipment and computer readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020020586A1 (en) * 2000-08-08 2002-02-21 Georg Bauer Speech-controlled location-familiar elevator
CN208814428U (en) * 2018-08-01 2019-05-03 迅达(中国)电梯有限公司 Elevator call system
CN110950204A (en) * 2019-11-28 2020-04-03 星络智能科技有限公司 Call calling method based on intelligent panel, intelligent panel and storage medium
CN111977475A (en) * 2020-07-08 2020-11-24 三洋电梯(珠海)有限公司 Intelligent calling landing method, system and medium
CN112299211A (en) * 2020-10-29 2021-02-02 联想(北京)有限公司 Elevator control method, device and system
CN212581260U (en) * 2020-05-09 2021-02-23 森赫电梯股份有限公司 Non-contact man-machine interaction system for elevator

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination