WO2014112226A1 - Electronic Device and Vacuum Cleaner - Google Patents
Electronic Device and Vacuum Cleaner
- Publication number
- WO2014112226A1
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- unit
- voice
- electronic device
- word
- sentence
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- A—HUMAN NECESSITIES
- A47—FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
- A47L—DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
- A47L9/00—Details or accessories of suction cleaners, e.g. mechanical means for controlling the suction or for effecting pulsating action; Storing devices specially adapted to suction cleaners or parts thereof; Carrying-vehicles specially adapted for suction cleaners
- A47L9/28—Installation of the electric equipment, e.g. adaptation or attachment to the suction cleaner; Controlling suction cleaners by electric means
- A47L9/2836—Installation of the electric equipment, e.g. adaptation or attachment to the suction cleaner; Controlling suction cleaners by electric means characterised by the parts which are controlled
- A47L9/2842—Suction motors or blowers
-
- A—HUMAN NECESSITIES
- A47—FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
- A47L—DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
- A47L9/00—Details or accessories of suction cleaners, e.g. mechanical means for controlling the suction or for effecting pulsating action; Storing devices specially adapted to suction cleaners or parts thereof; Carrying-vehicles specially adapted for suction cleaners
- A47L9/28—Installation of the electric equipment, e.g. adaptation or attachment to the suction cleaner; Controlling suction cleaners by electric means
- A47L9/2836—Installation of the electric equipment, e.g. adaptation or attachment to the suction cleaner; Controlling suction cleaners by electric means characterised by the parts which are controlled
- A47L9/2847—Surface treating elements
-
- A—HUMAN NECESSITIES
- A47—FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
- A47L—DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
- A47L9/00—Details or accessories of suction cleaners, e.g. mechanical means for controlling the suction or for effecting pulsating action; Storing devices specially adapted to suction cleaners or parts thereof; Carrying-vehicles specially adapted for suction cleaners
- A47L9/28—Installation of the electric equipment, e.g. adaptation or attachment to the suction cleaner; Controlling suction cleaners by electric means
- A47L9/2857—User input or output elements for control, e.g. buttons, switches or displays
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- A—HUMAN NECESSITIES
- A47—FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
- A47L—DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
- A47L2201/00—Robotic cleaning machines, i.e. with automatic control of the travelling movement or the cleaning operation
- A47L2201/04—Automatic control of the travelling movement; Automatic obstacle detection
-
- A—HUMAN NECESSITIES
- A47—FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
- A47L—DOMESTIC WASHING OR CLEANING; SUCTION CLEANERS IN GENERAL
- A47L2201/00—Robotic cleaning machines, i.e. with automatic control of the travelling movement or the cleaning operation
- A47L2201/06—Control of the cleaning action for autonomous devices; Automatic detection of the surface condition before, during or after cleaning
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
Definitions
- the present invention relates to an electronic device or the like, and more particularly to an electronic device or the like provided with voice recognition means.
- operation buttons, remote controllers, and the like have been used as user interfaces for inputting instructions for various operations to electronic devices.
- electronic devices equipped with voice recognition means for inputting instructions by voice uttered by a user have been developed.
- Patent Document 1 describes a speech recognition device that responds to a user's utterance when the utterance has not been successfully recognized.
- FIG. 10 is a block diagram illustrating a main configuration of the controller 302 included in the voice recognition device 301 described in Patent Document 1.
- the voice recognition device 301 includes a microphone 303 to which voice is input, a certainty factor calculation unit 304 that calculates the certainty factor of each recognized word, a sentence identifying unit 305 that identifies the sentence spoken by the talker based on the word certainty factors calculated by the certainty factor calculation unit 304, and a first answer determination unit 306 that determines, based on the certainty factors of the words included in the identified sentence, whether a reply to the talker is needed.
- the first answer determination unit 306 judges that no reply is needed when the word certainty factor is equal to or greater than a predetermined threshold, and judges that it is necessary to ask the user to repeat the utterance more clearly when the word certainty factor is less than the threshold.
- Japanese Patent Laid-Open No. 2008-52178 (published March 6, 2008)
- the present invention has been made in view of the above circumstances, and an object thereof is to provide an electronic device or the like that can appropriately ask the user to repeat an utterance when recognizing the voice uttered by the user.
- an electronic device according to the present invention includes: voice input means for converting input voice into voice data; voice recognition means for specifying a word or sentence included in the voice data by analyzing the voice data and for calculating a certainty factor of the specified word or sentence; reaction determination means for determining, based on the certainty factor, whether the user needs to be asked to repeat the utterance; and reply means for performing the asking back. The reaction determination means decides to ask back when the certainty factor is equal to or greater than a second threshold and less than a first threshold, and decides not to ask back when the certainty factor is less than the second threshold.
- this provides an electronic device or the like that can appropriately ask the user to repeat an utterance when recognizing the voice uttered by the user.
- Embodiment 1 The electronic apparatus according to Embodiment 1 of the present invention will be described below.
- the electronic apparatus includes a traveling unit and a blower unit; it is a self-propelled vacuum cleaner that travels by means of the traveling unit and cleans by sucking dust on the floor surface with an airflow generated by the blower unit.
- the electronic device includes a voice recognition unit, recognizes the voice uttered by the user, and performs various reactions based on instructions included in the voice. For example, when “cleaned” is included in the voice uttered by the user, the electronic device performs a predetermined cleaning operation by controlling the traveling unit and the air blowing unit.
- the electronic device asks the user to repeat the utterance (asks back) when it determines during voice recognition that asking back is necessary.
- asking back prompts the user to speak again.
- the asking back is performed, for example, by voice and/or by an action.
- FIG. 1 is a perspective view of an electronic apparatus 1 according to the first embodiment.
- the advancing direction when the electronic device 1 is self-propelled and performs cleaning is defined as the front, and is indicated by an arrow in FIG.
- the direction opposite to the traveling direction is the rear.
- the electronic device 1 includes a housing 2 that is circular in plan view. An upper surface 2a of the housing 2 is provided with an exhaust port 2b, through which air from which dust has been removed is exhausted, and a panel operation unit 4 for inputting instructions to the electronic device 1.
- the panel operation unit 4 includes an operation unit for inputting various instructions to the electronic device 1 and a display unit for displaying various information.
- the operation unit is provided with a plurality of operation buttons. The user can use both instruction input via the operation unit and instruction input by voice recognition.
- a feedback signal receiving unit 5 for receiving a feedback signal from the charging stand is provided on the front side of the upper surface 2a of the housing 2. When the electronic device 1 judges, for example, that cleaning of the floor surface has been completed, it can autonomously return to the charging stand by receiving the feedback signal via the feedback signal receiving unit 5.
- the side surface 2c of the housing 2 is divided into two in the front-rear direction.
- the front part of the side surface 2c is configured to be slidable in the front-rear direction with respect to the other parts of the housing 2, and functions as a buffer member when the electronic device 1 collides with an obstacle.
- an audio output unit 31 is provided on the side surface 2c of the housing 2.
- the sound output unit 31 outputs sound such as sound or music.
- the audio output unit 31 is composed of, for example, a speaker.
- the audio output unit 31 is an example of the asking-back means according to the present invention.
- a side brush 34 b is provided on the bottom surface of the electronic device 1 so as to protrude from the housing 2. The side brush 34b will be described in detail later.
- FIG. 2 is a bottom view of the electronic device 1. Also in FIG. 2, the direction of travel when the electronic device 1 self-propels and performs cleaning is indicated by an arrow.
- a suction port 2e for sucking dust on the floor is recessed in the bottom surface 2d of the housing 2.
- a traveling portion 32, a cleaning brush portion 34, a front wheel 6a, and a rear wheel 6b are provided on the bottom surface 2d of the housing 2.
- the traveling unit 32 is a part that causes the electronic device 1 to travel.
- the traveling unit 32 includes, for example, drive wheels provided so as to protrude from the bottom surface 2d, a motor that drives the drive wheels, and the like. FIG. 2 shows the part of each drive wheel of the traveling unit 32 that protrudes from the bottom surface 2d.
- the traveling unit 32 is an example of traveling means according to the present invention.
- the cleaning brush part 34 is a part that sweeps and cleans the floor surface.
- the cleaning brush part 34 is composed of, for example, a brush that sweeps the floor and a motor that drives the brush.
- as the brush, a rotating brush 34a that is provided in the suction port 2e and rotates on a rotating shaft pivotally supported parallel to the floor surface and the bottom surface 2d, and side brushes 34b that are provided so as to protrude from the housing 2 on both obliquely forward sides and rotate on rotating shafts pivotally supported perpendicular to the floor surface, can be used.
- the front wheel 6a and the rear wheel 6b are driven wheels that follow the traveling of the traveling unit 32.
- FIG. 3 is a block diagram illustrating a main configuration of the electronic apparatus 1.
- the electronic device 1 includes a voice input unit 3 and a blower unit 33.
- the voice input unit 3 is a part that generates voice data by digitally converting voice as it is input.
- the voice input unit 3 includes, for example, a microphone and an analog / digital conversion device.
- as the microphone, a directional microphone that collects sound coming from a predetermined direction with particularly high sensitivity may be used, or an omnidirectional microphone that collects sound with constant sensitivity regardless of the direction from which the sound comes may be used.
- the voice input unit 3 can be provided on the back side of the upper surface 2a of the housing 2, for example.
- the blower 33 generates an air flow for sucking dust.
- the generated airflow is guided from the suction port 2e to a dust collection unit (not shown); after the dust is separated by the dust collection unit, the air is discharged from the exhaust port 2b to the outside of the electronic device 1.
- the electronic device 1 further includes a storage unit 20.
- the storage unit 20 will be described in detail below.
- the storage unit 20 stores various programs executed by the control unit 10 to be described later, various data used and created when the various programs are executed, various data input to the electronic device 1, and the like.
- the storage unit 20 includes, for example, a non-volatile storage device such as a ROM (Read Only Memory), a flash memory, or an HDD (Hard Disk Drive), and a volatile storage device such as a RAM (Random Access Memory) that constitutes a work area.
- the storage unit 20 includes an acoustic feature storage unit 21, a dictionary storage unit 22, and a grammar storage unit 23.
- the acoustic feature storage unit 21 is a part that stores acoustic features that indicate the acoustic features of the speech to be recognized.
- the type of acoustic feature can be selected as appropriate.
- the acoustic feature is, for example, a speech waveform or a frequency spectrum of speech power.
- the speech recognition unit 11 recognizes the voice uttered by the user by comparing the acoustic features included in the speech data generated by the speech input unit 3 with the acoustic features stored in the acoustic feature storage unit 21.
- the dictionary storage unit 22 is a part that stores a dictionary in which each word to be speech-recognized and phonological information related to the word are registered.
- the grammar storage unit 23 is a part that stores grammar rules that describe how words registered in the dictionary of the dictionary storage unit 22 are linked.
- the grammatical rules are based on, for example, statistically obtained probabilities that words are linked to one another.
- the electronic device 1 further includes a control unit 10.
- the control unit 10 will be described in detail below.
- the control unit 10 controls each unit of the electronic device 1 based on programs and data stored in the storage unit 20. By executing the programs, a speech recognition unit 11, a reaction determination unit 12, a speech synthesis unit 13, and a motion generation unit 14 are constructed in the control unit 10.
- the voice recognition unit 11 is a part that performs voice recognition on the voice uttered by the user.
- the voice recognition unit 11 outputs information on the word or sentence included in the voice data and the certainty of the word or sentence as a result of the voice recognition.
- the voice recognition unit 11 includes a voice section detection unit 111, an acoustic feature extraction unit 112, and an acoustic feature comparison unit 113.
- the information regarding the word or sentence includes, for example, phonological information of the word or sentence.
- the voice section detection unit 111 is a part that detects the start and end of the voice to be recognized. While no voice is detected, the voice section detection unit 111 monitors whether the power of the voice data generated by the voice input unit 3 is equal to or higher than a predetermined threshold stored in the storage unit 20, and determines that voice has been detected when the power of the voice data becomes equal to or higher than the threshold. The voice section detection unit 111 then determines that the voice has ended when the power of the voice data falls below the threshold.
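- the power-threshold detection described above can be sketched as follows. This is a minimal illustration: the frame size, threshold value, and function name are assumptions, not values given in the patent.

```python
def detect_voice_sections(samples, frame_size=160, power_threshold=0.01):
    """Return (start, end) frame indices of detected voice sections.

    Mirrors the behaviour of the voice section detection unit 111:
    voice starts when frame power rises to or above the threshold,
    and ends when it falls back below the threshold.
    """
    sections = []
    start = None
    n_frames = len(samples) // frame_size
    for i in range(n_frames):
        frame = samples[i * frame_size:(i + 1) * frame_size]
        power = sum(s * s for s in frame) / frame_size  # mean-square power
        if start is None and power >= power_threshold:
            start = i                    # voice detected
        elif start is not None and power < power_threshold:
            sections.append((start, i))  # voice ended
            start = None
    if start is not None:                # voice still active at end of data
        sections.append((start, n_frames))
    return sections
```

In practice the threshold would be tuned to the microphone's noise floor; a fixed value is used here only to keep the sketch self-contained.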
- the acoustic feature extraction unit 112 is a part that extracts acoustic features for each appropriate frame of the voice data generated by the voice input unit 3.
- the acoustic feature comparison unit 113 is a part that identifies a word or sentence included in the speech data by comparing the acoustic features extracted by the acoustic feature extraction unit 112 with the acoustic features stored in the acoustic feature storage unit 21, and that calculates the certainty factor of the identified word or sentence.
- the acoustic feature comparison unit 113 can refer to the dictionary stored in the dictionary storage unit 22 and / or the grammar rules stored in the grammar storage unit 23 as necessary. Information regarding the word or sentence specified by the acoustic feature comparison unit 113 and the certainty factor of the specified word or sentence is output to the reaction determination unit 12.
- the acoustic feature comparison unit 113 compares the acoustic feature extracted from the audio data with the acoustic feature stored in the acoustic feature storage unit 21 for each frame extracted by the acoustic feature extraction unit 112. Then, the acoustic feature comparison unit 113 calculates the word certainty factor for each candidate word stored in the storage unit 20 and identifies the word having the highest word certainty factor. Further, the acoustic feature comparison unit 113 refers to the dictionary stored in the dictionary storage unit 22 and acquires phonological information of the identified word.
- the acoustic feature comparison unit 113 creates a sentence by appropriately connecting words determined for each of the plurality of frames. Then, the acoustic feature comparison unit 113 calculates the certainty of the sentence for each created sentence and identifies the sentence having the highest certainty of the sentence.
- the acoustic feature comparison unit 113 can calculate the certainty of the sentence by referring to the grammar rules stored in the grammar storage unit 23.
- the reaction determination unit 12 is a part that determines the reaction of the electronic device 1 based on the result of the voice recognition input from the voice recognition unit 11. Specifically, the reaction determination unit 12 determines the reaction based on the certainty factor of the specified word or sentence. When the certainty factor is high enough that the recognition result is unambiguous, the reaction determination unit 12 decides to perform the reaction corresponding to the word or sentence. When the certainty factor leaves the recognition result somewhat ambiguous, it decides to ask back. When the certainty factor is lower still, it decides to perform neither the reaction corresponding to the word or sentence nor the asking back.
- the voice synthesis unit 13 is a part that synthesizes voice data corresponding to the reaction determined by the reaction determination unit 12.
- the voice synthesizer 13 outputs the synthesized voice data to the voice output unit 31.
- the voice synthesizer 13 can refer to the dictionary stored in the dictionary storage unit 22 and / or the grammar rules stored in the grammar storage unit 23 as necessary.
- the motion generation unit 14 is a part that generates a motion pattern corresponding to the reaction determined by the reaction determination unit 12.
- the motion generation unit 14 outputs the generated motion pattern to the traveling unit 32, the air blowing unit 33, and / or the cleaning brush unit 34.
- FIG. 4 is a flowchart showing the flow of voice recognition processing performed by the electronic device 1.
- step is represented by “S”.
- the voice section detection unit 111 monitors voice data input from the voice input unit 3 and determines whether or not voice to be recognized has been detected (S1).
- the acoustic feature extraction unit 112 extracts acoustic features for each appropriate frame of the speech data input from the speech input unit 3 (S2).
- the voice section detection unit 111 continues to monitor voice data input from the voice input unit 3.
- the acoustic feature comparison unit 113 compares the acoustic features extracted by the acoustic feature extraction unit 112 with the acoustic features stored in the acoustic feature storage unit 21, specifies a word or sentence included in the speech data, and calculates the certainty factor of the identified word or sentence (S3).
- the voice section detection unit 111 monitors the voice data input from the voice input unit 3 and determines whether or not the voice has ended (S4). If the end of the voice is not detected (NO in S4), the voice section detection unit 111 continues to monitor the voice data input from the voice input unit 3.
- when voice is detected more than once, the voice recognition unit 11 may output to the reaction determination unit 12 the certainty factor calculated for the voice detected first, the certainty factor calculated for the voice detected later, or a certainty factor calculated from both the earlier and the later detected voice.
- the reaction determination unit 12 determines whether or not the certainty factor of the word or sentence specified by the acoustic feature comparison unit 113 is equal to or greater than the first threshold (S5). When the certainty factor of the word or sentence is equal to or greater than the first threshold, the reaction determination unit 12 decides to perform the reaction corresponding to the recognized word or sentence, and the reaction is carried out via the speech synthesis unit 13 and the motion generation unit 14 (S6).
- when the certainty factor is less than the first threshold (NO in S5), the reaction determination unit 12 determines whether or not the certainty factor of the word or sentence is equal to or greater than the second threshold (S7). When the certainty factor is equal to or greater than the second threshold (YES in S7), the reaction determination unit 12 decides to ask back, and the asking back is performed via the speech synthesis unit 13 and the motion generation unit 14 (S8).
- when the certainty factor is less than the second threshold (NO in S7), the reaction determination unit 12 decides to perform neither the reaction corresponding to the word or sentence nor the asking back, and ends the process. Note that the second threshold is smaller than the first threshold.
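- the decision flow of S5 to S8 can be sketched as follows; the concrete threshold values are illustrative assumptions, since the patent does not specify them:

```python
def decide_reaction(certainty, first_threshold=0.8, second_threshold=0.4):
    """Two-threshold decision of Embodiment 1.

    Threshold values are illustrative; the only constraint from the
    patent is that the second threshold is smaller than the first.
    """
    if certainty >= first_threshold:
        return "react"     # S5 YES -> perform the recognized action (S6)
    if certainty >= second_threshold:
        return "ask_back"  # S7 YES -> prompt the user to repeat (S8)
    return "ignore"        # too ambiguous even to ask back
```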
- FIG. 5 is a schematic diagram illustrating a specific example of the asking back performed by the electronic device 1.
- FIG. 5A illustrates a case where the asking back is performed by voice
- FIG. 5B illustrates a case where the asking back is performed by an action
- when asking back by voice, the voice synthesizer 13 synthesizes voice data corresponding to, for example, “What did you say?” and outputs it to the voice output unit 31.
- the audio output unit 31 performs analog conversion on the input audio data and outputs “What did you say?” as audio.
- when asking back by an action, for example, the motion generation unit 14 generates a motion pattern that rotates the electronic device 1 left and right by a certain angle on the spot, and controls the traveling unit 32 so that the device moves in that motion pattern.
- as described above, the electronic device 1 asks the user to repeat the utterance when the certainty factor of the word or sentence specified by the speech recognition unit 11 is less than the first threshold and equal to or greater than the second threshold. The electronic device 1 thus prevents misrecognition by asking back while the recognition result is ambiguous, and reduces useless asking back by not asking back when the certainty factor is lower still.
- although this embodiment describes the case where the electronic device 1 asks back if a word or sentence is recognized even once with a certainty factor in the predetermined range, the invention is not limited to this.
- the electronic device 1 may ask back only when a word or sentence is recognized multiple times in succession with a certainty factor in that range. Configuring the electronic device 1 in this manner further reduces unnecessary asking back.
- Embodiment 2 An electronic apparatus 1 according to Embodiment 2 of the present invention will be described with reference to the drawings.
- the electronic device 1 according to the present embodiment differs from Embodiment 1 in that it performs different asking back depending on the certainty factor of the word or sentence recognized by the voice recognition unit 11.
- components described in Embodiment 1 have the same functions as in Embodiment 1, and their description is omitted.
- FIG. 6 is a flowchart showing the flow of voice recognition processing performed by the electronic device 1.
- steps having the same functions as in Embodiment 1 are not described again.
- when the certainty factor of the word or sentence is less than the first threshold (NO in S5), the reaction determination unit 12 determines whether or not the certainty factor is equal to or greater than a third threshold (S11). When the certainty factor is equal to or greater than the third threshold (YES in S11), the reaction determination unit 12 decides to perform the first asking back, which is carried out via the speech synthesis unit 13 and the motion generation unit 14 (S12).
- the third threshold is smaller than the first threshold.
- when the certainty factor is less than the third threshold (NO in S11), the reaction determination unit 12 determines whether or not the certainty factor is equal to or greater than a fourth threshold (S13). When the certainty factor is equal to or greater than the fourth threshold (YES in S13), the reaction determination unit 12 decides to perform the second asking back, which is carried out via the speech synthesis unit 13 and the motion generation unit 14 (S14). Note that the fourth threshold is smaller than the third threshold.
- when the certainty factor of the word or sentence calculated by the voice recognition unit 11 is less than the fourth threshold (NO in S13), the reaction determination unit 12 decides to perform neither the reaction corresponding to the recognized word or sentence nor the asking back, and ends the process.
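- the tiered decision of S11 to S14 can be sketched as follows; the threshold values and the confirmation phrasing are illustrative assumptions, not taken from the patent:

```python
def decide_reaction_tiered(certainty, best_sentence,
                           first=0.8, third=0.6, fourth=0.3):
    """Tiered decision of Embodiment 2.

    A certainty factor that is ambiguous but fairly high yields a
    confirmation echoing the best recognition candidate; a lower one
    yields a generic prompt to repeat; lower still is ignored.
    """
    if certainty >= first:
        return ("react", best_sentence)
    if certainty >= third:
        # first asking back: confirm the most likely candidate (S12)
        return ("ask_back", f"Did you say '{best_sentence}'?")
    if certainty >= fourth:
        # second asking back: generic prompt to repeat (S14)
        return ("ask_back", "What did you say?")
    return ("ignore", None)
```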
- FIG. 7 is a schematic diagram showing a specific example of the asking back performed by the electronic device 1.
- FIG. 7A shows a case where the first asking back is performed
- FIG. 7B shows a case where the second asking back is performed.
- when performing the first asking back, the voice synthesizer 13 synthesizes voice data corresponding to, for example, “Did you say ‘clean up’?” and outputs it to the voice output unit 31.
- the audio output unit 31 converts the input audio data into an analog signal and outputs “Did you say ‘clean up’?” as audio.
- the voice of the first asking back is synthesized based on the word or sentence with the highest certainty factor specified by the voice recognition unit 11. For example, when the sentence with the highest certainty factor is “clean up”, the reaction determination unit 12 decides to ask back “Did you say ‘clean up’?” based on that sentence.
- when performing the second asking back, the speech synthesizer 13 synthesizes speech data corresponding to “What did you say?” and outputs it to the speech output unit 31.
- the audio output unit 31 performs analog conversion on the input audio data and outputs “What did you say?” As audio.
- as described above, the electronic device 1 performs different asking back depending on the certainty factor of the word or sentence recognized by the voice recognition unit 11. The user can therefore tell from the voice and/or action of the asking back how well the electronic device 1 recognized the utterance, and can choose, for example, whether to input the instruction again by voice or to input it via the panel operation unit 4 or the like; the convenience of the user is thus improved.
- Embodiment 3 An electronic device 1a according to Embodiment 3 of the present invention includes a communication unit 6 that communicates with an external device 200, and, by communicating with the external device 200, has the external device 200 perform speech recognition processing on the voice uttered by the user.
- components described in Embodiment 1 have the same functions as in Embodiment 1, and their description is omitted.
- FIG. 8 is a block diagram illustrating main configurations of the electronic apparatus 1a and the external device 200.
- the electronic device 1a further includes a communication unit 6 in addition to the components described in the first embodiment. In FIG. 8, only some of the components described in the first embodiment are shown.
- the communication unit 6 transmits / receives information to / from the external device 200.
- the communication unit 6 is connected to the communication network 300 and is connected to the external device 200 via the communication network 300.
- The communication network 300 is not particularly limited and can be selected as appropriate: for example, the Internet can be used, or infrared communication such as IrDA or a remote control, or wireless communication such as Bluetooth (registered trademark), WiFi (registered trademark), or IEEE 802.11 may be used.
- The reaction determination unit 12a determines the reaction of the electronic device 1a based on the result of speech recognition input from the speech recognition unit 11 and the result of speech recognition received from the speech recognition unit 11a of the external device 200 described later.
- the external device 200 includes a communication unit 206, a storage unit 220, and a control unit 210.
- the communication unit 206 transmits and receives information to and from the electronic device 1a.
- the communication unit 206 is connected to the communication network 300, and is connected to the electronic device 1a via the communication network 300.
- the storage unit 220 stores various programs executed by the control unit 210 described later, various data used and created when executing the various programs, various data input to the external device 200, and the like.
- the storage unit 220 includes, for example, a nonvolatile storage device such as a ROM, a flash memory, and an HDD, and a volatile storage device such as a RAM that constitutes a work area.
- the storage unit 220 includes an acoustic feature storage unit 21a, a dictionary storage unit 22a, and a grammar storage unit 23a.
- the acoustic feature storage unit 21a stores data similar to that of the acoustic feature storage unit 21 described above.
- the dictionary storage unit 22a stores the same data as the dictionary storage unit 22 described above.
- the grammar storage unit 23a stores the same data as the grammar storage unit 23 described above.
- the control unit 210 controls each unit of the external device 200 based on a program or data stored in the storage unit 220. By executing the program, the speech recognition unit 11a is constructed in the control unit 210.
- the voice recognition unit 11a includes a voice section detection unit 111a, an acoustic feature extraction unit 112a, and an acoustic feature comparison unit 113a.
- the voice segment detection unit 111a has the same function as the voice segment detection unit 111 described above.
- the acoustic feature extraction unit 112a has the same function as the acoustic feature extraction unit 112 described above.
- the acoustic feature comparison unit 113a has the same function as the acoustic feature comparison unit 113 described above.
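The three units mirrored on the external device (voice section detection, acoustic feature extraction, acoustic feature comparison) form a pipeline that turns raw samples into a recognized word plus a certainty. The following toy sketch only illustrates that pipeline shape under assumed data representations; real recognizers use spectral features and statistical models, and every name and formula below is an assumption, not the patent's method.

```python
import math

def detect_voice_section(samples, threshold=0.1):
    """Keep only samples whose amplitude exceeds a silence threshold (toy VAD)."""
    return [s for s in samples if abs(s) > threshold]

def extract_features(samples):
    """Toy 'acoustic feature': mean absolute amplitude and mean energy."""
    if not samples:
        return (0.0, 0.0)
    mean_abs = sum(abs(s) for s in samples) / len(samples)
    energy = sum(s * s for s in samples) / len(samples)
    return (mean_abs, energy)

def compare_features(features, templates):
    """Score the features against each stored word template; return the
    best-matching word and a certainty in (0, 1] (closer templates score higher)."""
    best_word, best_dist = None, float("inf")
    for word, template in templates.items():
        dist = math.dist(features, template)
        if dist < best_dist:
            best_word, best_dist = word, dist
    certainty = 1.0 / (1.0 + best_dist)
    return best_word, certainty
```

For example, chaining the three functions over a sample buffer yields the (word, certainty) pair that the reaction determination unit then compares against the thresholds.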
- FIG. 9 is a flowchart showing the flow of voice recognition processing performed by the electronic device 1a.
- Steps that are the same as in Embodiment 1 have the same functions, and their description is omitted.
- The control unit 10 transmits the voice data input from the voice input unit 3 to the external device 200 via the communication unit 6 (S21).
- The voice recognition unit 11a performs voice recognition by the same processing as S2 and S3 shown in FIGS. 4 and 6, thereby specifying a word or sentence included in the voice data and calculating the certainty of the specified word or sentence.
- The control unit 210 transmits information on the specified word or sentence and its certainty to the electronic device 1a via the communication unit 206.
- The electronic device 1a receives this information from the external device 200 (S22).
- The reaction determination unit 12a determines whether the certainty of the word or sentence received from the external device 200 is equal to or greater than the first threshold (S23). When it is (YES in S23), the reaction determination unit 12a decides to perform the reaction corresponding to the recognized word or sentence, and the reaction is performed via the voice synthesis unit 13 and the action generation unit 14 (S6).
- When the certainty of the word or sentence is less than the first threshold (NO in S23), the reaction determination unit 12a determines whether the certainty is equal to or greater than the second threshold (S24). When it is (YES in S24), the reaction determination unit 12a decides to ask back, and the ask-back is performed via the voice synthesis unit 13 and the action generation unit 14 (S8).
- When the certainty of the word or sentence is less than the second threshold (NO in S24), the reaction determination unit 12a decides to perform neither the reaction corresponding to the recognized word or sentence nor the ask-back, and the processing ends.
- As described above, when the certainty of the word or sentence calculated by the electronic device 1a is less than the first threshold, the electronic device 1a receives information on the certainty of the word or sentence calculated by the external device 200 and judges again, based on the received information, whether that certainty reaches the first threshold. Therefore, when the result of the speech recognition performed by the electronic device 1a itself is ambiguous, the electronic device 1a performs speech recognition again via the external device 200 instead of immediately asking back, so that unnecessary ask-backs can be reduced.
- In addition, the number of acoustic features, dictionary entries, and/or grammatical rules stored in the storage unit 220 can be made larger than the number stored in the electronic device 1a. In that case, the accuracy of speech recognition can be improved compared with the case where speech recognition is performed by the electronic device 1a alone.
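The flow of S21 to S24 amounts to local-first recognition with a remote fallback: trust the on-device result when it is confident, otherwise ship the audio to the external device and re-judge with its certainty. A hedged sketch, in which the two recognizers are modeled as plain callables returning a (word, certainty) pair; the names and threshold values are assumptions, not the patent's:

```python
# Sketch of the local-then-remote flow of FIG. 9 (S21-S24).
# The remote recognizer stands in for the external device 200.

FIRST_THRESHOLD = 0.8
SECOND_THRESHOLD = 0.5

def recognize_with_fallback(voice_data, local_recognizer, remote_recognizer):
    """Return one of 'react', 'ask_back', or 'ignore'."""
    word, certainty = local_recognizer(voice_data)
    if certainty >= FIRST_THRESHOLD:
        return "react"                      # local result is trusted
    # Local result is ambiguous: send the audio to the external device (S21)
    # and re-judge using the certainty it returns (S22-S23).
    word, certainty = remote_recognizer(voice_data)
    if certainty >= FIRST_THRESHOLD:
        return "react"                      # S23 YES -> S6
    if certainty >= SECOND_THRESHOLD:
        return "ask_back"                   # S24 YES -> S8
    return "ignore"                         # neither reaction nor ask-back
```

In practice the remote call would be a network round trip through the communication unit 6; here it is collapsed into a function call to keep the decision structure visible.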
- The time when the predetermined condition is satisfied is, for example, while the electronic device 1 is driving the traveling unit 32, the air blowing unit 33, and/or the cleaning brush unit 34.
- While the electronic device 1 drives the traveling unit 32, the air blowing unit 33, and/or the cleaning brush unit 34, these units generate noise and the accuracy of voice recognition is lowered; asking back is therefore withheld in order to prevent unnecessary ask-backs.
- The time when the predetermined condition is satisfied may also be a predetermined time zone, such as nighttime.
- In a predetermined time zone such as nighttime, the electronic device 1 refrains from asking back, which prevents the user from finding the device troublesome.
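The two suppression conditions just described (drive-unit noise and a quiet time zone) can be captured in one predicate that the reaction determination unit consults before asking back. A minimal sketch; the quiet-hour values and parameter names are assumptions, not taken from the patent:

```python
# Even when the certainty falls in the ask-back range, stay silent while
# the drive units run or during an assumed nighttime window.

NIGHT_START, NIGHT_END = 22, 7  # assumed quiet hours (22:00 to 07:00)

def should_suppress_ask_back(driving_units_active, hour_of_day):
    """True when the device should skip the ask-back entirely."""
    if driving_units_active:
        # traveling / blowing / brushing noise degrades recognition
        # accuracy, so an ask-back would likely be wasted
        return True
    # nighttime: avoid bothering the user
    return hour_of_day >= NIGHT_START or hour_of_day < NIGHT_END
```

The predicate would be checked just before the ask-back decision, leaving the threshold comparison itself unchanged.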
- In the above-described embodiments, the electronic device 1 decides whether to ask back by comparing the certainty of the word or sentence specified by the voice recognition unit 11 with the predetermined first to fourth thresholds.
- However, the electronic device 1 may be configured to change the first to fourth thresholds according to the conditions under which speech recognition is performed or to the content of the specified word or sentence.
- For example, when noise is present, the second threshold can be set to a lower or a higher value than when it is not.
- Whether the second threshold is set lower or higher may be selected as appropriate according to the type of the electronic device 1 or its use environment.
- When the content of the specified word or sentence is an instruction accompanied by an operation, the electronic device 1 can set the first threshold to a higher value than when the content is not accompanied by an operation.
- When the electronic device 1 is configured in this way, erroneous recognition can be prevented for voice instructions accompanied by operations, for which preventing erroneous recognition is particularly necessary.
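The condition-dependent thresholds described above can be sketched as a small adjustment function. The base values and offsets below are illustrative assumptions; as noted, the patent leaves the direction of the noise adjustment to the device type and use environment.

```python
# Raise the first threshold for instructions that trigger a physical
# operation (misrecognition is costly), and shift the second threshold
# while noisy drive units run. All numbers are illustrative assumptions.

BASE_FIRST = 0.8
BASE_SECOND = 0.5

def effective_thresholds(instruction_moves_device, drive_units_active):
    first = BASE_FIRST
    second = BASE_SECOND
    if instruction_moves_device:
        first += 0.1   # demand more certainty before acting
    if drive_units_active:
        second -= 0.1  # lowered here so the device still asks back despite
                       # noise; raising it instead is equally valid,
                       # depending on device type and environment
    return round(first, 2), round(second, 2)
```

The recognized certainty would then be compared against these effective values instead of the fixed bases.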
- In each of the above-described embodiments, the electronic device 1a may receive from the external device 200, for example, information on the acoustic features, dictionaries, and/or grammatical rules referred to in the voice recognition processing.
- When the electronic device 1a is configured in this way, the number of words or sentences that the electronic device 1a can recognize can be increased.
- the electronic device 1a may receive, for example, audio data corresponding to the audio output from the audio output unit 31 from the external device 200.
- the electronic device 1a is configured as described above, the user can change the sound output from the sound output unit 31.
- The received information may be information created on the external device 200 by the user.
- the user instructs the external device 200 to create information such as a desired dictionary or audio data by accessing the external device 200 via a terminal device such as a smartphone.
- the control unit 210 of the external device 200 generates the information based on the program or data stored in the storage unit 220.
- In creating audio data, the user can use various existing sound data, such as sound data recorded by the user, sound data acquired via the Internet or the like, or music data such as that from a music CD.
- the created information may be provided to the electronic device 1 by supplying a recording medium storing the information to the electronic device 1.
- the recording medium is not particularly limited.
- As the recording medium, for example, a tape such as a magnetic tape, a magnetic disk such as an HDD, an optical disc such as a CD-ROM, a card such as an IC card, a semiconductor memory such as a flash ROM, or a logic circuit such as a PLD (Programmable Logic Device) can be used.
- the vacuum cleaner has been described as the electronic device, but the invention is not limited thereto.
- the electronic device may be an AVC device such as a TV or a PC (Personal Computer) or a home appliance such as an electronic cooker or an air conditioner.
- As described above, an electronic device according to the present invention includes: voice input means for converting input voice into voice data; speech recognition means for specifying a word or sentence included in the voice data by analyzing the voice data and calculating the certainty of the specified word or sentence; reaction determination means for determining, based on the certainty, whether it is necessary to ask the user back; and ask-back means for performing the ask-back.
- The reaction determination means decides to ask back when the certainty is equal to or greater than the second threshold and less than the first threshold, and decides not to ask back when the certainty is less than the second threshold.
- The reaction determination means may select one of a plurality of ask-backs based on the certainty.
- With this configuration, the user can tell from the voice and/or operation of the ask-back how well the electronic device has recognized the speech, and can choose, for example, whether to input the instruction again by voice or to input it via the panel operation unit or the like; the convenience for the user is thus improved.
- the electronic device may further include a communication unit that transmits the voice data to the external device and receives the certainty of the word or sentence included in the voice data from the external device.
- With this configuration, when the result of the voice recognition performed by the electronic device itself is ambiguous, the electronic device performs voice recognition again via the external device instead of immediately asking back, so that unnecessary ask-backs can be reduced.
- The ask-back means may perform the ask-back by outputting a predetermined sound and/or performing a predetermined operation.
- A vacuum cleaner may comprise any one of the above electronic devices, together with self-propelled means for causing the electronic device to travel by itself, blower means for sucking in dust, and/or cleaning brush means for sweeping and cleaning a floor surface.
- With this configuration, whether to ask back can be determined based on the certainty of words or sentences even in a vacuum cleaner, which is often used in noisy situations such as while the self-propelled means, the blower means, and/or the cleaning brush means are driven.
- The reaction determination means may change the second threshold while the self-propelled means, the blower means, and/or the cleaning brush means are driven.
- With this configuration, the vacuum cleaner determines whether an ask-back is necessary by comparing the certainty with a second threshold changed according to the noisy situation, and can therefore ask back more appropriately.
- The present invention can be widely applied to electronic devices equipped with voice recognition means.
Abstract
Description
An electronic device according to Embodiment 1 of the present invention is described below.
FIG. 1 is a perspective view of the electronic device 1 according to Embodiment 1. Here, the direction in which the electronic device 1 advances while self-propelling and cleaning is defined as forward, as indicated by the arrow in FIG. 1, and the direction opposite to the traveling direction is defined as backward.
FIG. 3 is a block diagram showing the main configuration of the electronic device 1. The electronic device 1 includes a voice input unit 3 and an air blowing unit 33.
The storage unit 20 stores various programs executed by the control unit 10 described later, various data used and created when those programs are executed, various data input to the electronic device 1, and the like. The storage unit 20 includes, for example, a nonvolatile storage device such as a ROM (Read Only Memory), a flash memory, or an HDD (Hard Disk Drive), and a volatile storage device such as a RAM (Random Access Memory) constituting a work area.
The control unit 10 centrally controls each unit of the electronic device 1 based on programs or data stored in the storage unit 20. By executing a program, a voice recognition unit 11, a reaction determination unit 12, a voice synthesis unit 13, and an action generation unit 14 are constructed in the control unit 10.
The processing described below is performed by the control unit 10 of the electronic device 1 executing a program stored in the storage unit 20.
An electronic device 1 according to Embodiment 2 of the present invention is described with reference to the drawings. The electronic device 1 according to the present invention differs from the above embodiment in that it performs a different ask-back based on the certainty of the word or sentence recognized by the voice recognition unit 11. Components described in Embodiment 1 have the same functions as in Embodiment 1, and their description is omitted except where otherwise noted.
An electronic device 1a according to Embodiment 3 of the present invention is described with reference to the drawings. The electronic device 1a according to the present invention includes a communication unit 6 that communicates with an external device 200, and differs from all of the above embodiments in that, by communicating with the external device 200, the speech recognition processing of the voice uttered by the user is also performed by the external device 200. Components described in Embodiment 1 have the same functions as in Embodiment 1, and their description is omitted except where otherwise noted.
FIG. 8 is a block diagram showing the main configurations of the electronic device 1a and the external device 200. The electronic device 1a further includes a communication unit 6 in addition to the components described in Embodiment 1. Note that FIG. 8 shows only some of the components described in Embodiment 1.
The processing described below is performed by the control unit 10 of the electronic device 1a executing a program stored in the storage unit 20.
In the above embodiments, the case where the electronic device 1 asks the user back when the certainty of the word or sentence specified by the voice recognition unit 11 falls within a predetermined range has been described. However, even when the certainty of the specified word or sentence falls within the predetermined range, the electronic device 1 may be configured not to ask back when a predetermined condition is satisfied.
2 housing
2a top surface
2b exhaust port
2c side surface
2d bottom surface
2e suction port
3 voice input unit
6 communication unit
10 control unit
11, 11a voice recognition unit
111, 111a voice section detection unit
112, 112a acoustic feature extraction unit
113, 113a acoustic feature comparison unit
12 reaction determination unit
13 voice synthesis unit
14 action generation unit
20 storage unit
21, 21a acoustic feature storage unit
22, 22a dictionary storage unit
23, 23a grammar storage unit
31 voice output unit
32 traveling unit
33 air blowing unit
34 cleaning brush unit
200 external device
206 communication unit
210 control unit
220 storage unit
Claims (5)
- Voice input means for converting input voice into voice data;
speech recognition means for specifying a word or sentence included in the voice data by analyzing the voice data, and calculating the certainty of the specified word or sentence;
reaction determination means for determining, based on the certainty, whether it is necessary to ask the user back; and
ask-back means for performing the ask-back,
wherein the reaction determination means decides to ask back when the certainty is equal to or greater than a second threshold and less than a first threshold, and decides not to ask back when the certainty is less than the second threshold. - The electronic device according to claim 1, wherein the reaction determination means selects one of a plurality of ask-backs based on the certainty.
- The electronic device according to claim 1 or 2, further comprising communication means for transmitting the voice data to an external device and receiving, from the external device, the certainty of the word or sentence included in the voice data.
- A vacuum cleaner comprising: the electronic device according to any one of claims 1 to 3; and
self-propelled means for causing the electronic device to travel by itself, blower means for sucking in dust, and/or cleaning brush means for sweeping and cleaning a floor surface. - The vacuum cleaner according to claim 4, wherein the reaction determination means changes the second threshold while the self-propelled means, the blower means, and/or the cleaning brush means are driven.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US14/652,177 US20150332675A1 (en) | 2013-01-16 | 2013-12-03 | Electronic apparatus and vacuum cleaner |
KR1020157016096A KR101707359B1 (ko) | 2013-01-16 | 2013-12-03 | 전자 기기 및 청소기 |
EP13871591.7A EP2947651B1 (en) | 2013-01-16 | 2013-12-03 | Vacuum cleaner |
CN201380066048.6A CN104871239B (zh) | 2013-01-16 | 2013-12-03 | 电子设备和吸尘器 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2013005065A JP2014137430A (ja) | 2013-01-16 | 2013-01-16 | 電子機器及び掃除機 |
JP2013-005065 | 2013-01-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014112226A1 true WO2014112226A1 (ja) | 2014-07-24 |
Family
ID=51209336
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/082441 WO2014112226A1 (ja) | 2013-01-16 | 2013-12-03 | 電子機器及び掃除機 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20150332675A1 (ja) |
EP (1) | EP2947651B1 (ja) |
JP (1) | JP2014137430A (ja) |
KR (1) | KR101707359B1 (ja) |
CN (1) | CN104871239B (ja) |
WO (1) | WO2014112226A1 (ja) |
Families Citing this family (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6659514B2 (ja) * | 2016-10-12 | 2020-03-04 | 東芝映像ソリューション株式会社 | 電子機器及びその制御方法 |
CN106710592B (zh) * | 2016-12-29 | 2021-05-18 | 北京奇虎科技有限公司 | 一种智能硬件设备中的语音识别纠错方法和装置 |
JP6941856B2 (ja) * | 2017-03-31 | 2021-09-29 | 国立大学法人大阪大学 | 対話ロボットおよびロボット制御プログラム |
CN108231069B (zh) * | 2017-08-30 | 2021-05-11 | 深圳乐动机器人有限公司 | 清洁机器人的语音控制方法、云服务器、清洁机器人及其存储介质 |
TWI672690B (zh) * | 2018-03-21 | 2019-09-21 | 塞席爾商元鼎音訊股份有限公司 | 人工智慧語音互動之方法、電腦程式產品及其近端電子裝置 |
CN111369989B (zh) * | 2019-11-29 | 2022-07-05 | 添可智能科技有限公司 | 清洁设备的语音交互方法及清洁设备 |
WO2021049445A1 (ja) * | 2019-09-10 | 2021-03-18 | 日本電気株式会社 | 言語推定装置、言語推定方法、およびプログラム |
KR20210047173A (ko) * | 2019-10-21 | 2021-04-29 | 엘지전자 주식회사 | 오인식된 단어를 바로잡아 음성을 인식하는 인공 지능 장치 및 그 방법 |
JP6858335B2 (ja) * | 2020-02-06 | 2021-04-14 | Tvs Regza株式会社 | 電子機器及びその制御方法 |
JP6858336B2 (ja) * | 2020-02-06 | 2021-04-14 | Tvs Regza株式会社 | 電子機器及びその制御方法 |
JP6858334B2 (ja) * | 2020-02-06 | 2021-04-14 | Tvs Regza株式会社 | 電子機器及びその制御方法 |
JP7471921B2 (ja) | 2020-06-02 | 2024-04-22 | 株式会社日立製作所 | 音声対話装置、音声対話方法、および音声対話プログラム |
US11521604B2 (en) | 2020-09-03 | 2022-12-06 | Google Llc | User mediation for hotword/keyword detection |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH03248199A (ja) * | 1990-02-26 | 1991-11-06 | Ricoh Co Ltd | 音声認識方式 |
JPH11143488A (ja) * | 1997-11-10 | 1999-05-28 | Hitachi Ltd | 音声認識装置 |
JP2000135186A (ja) * | 1998-10-30 | 2000-05-16 | Ym Creation:Kk | 掃除玩具 |
JP2003079552A (ja) * | 2001-09-17 | 2003-03-18 | Toshiba Tec Corp | 掃除装置 |
JP2008009153A (ja) * | 2006-06-29 | 2008-01-17 | Xanavi Informatics Corp | 音声対話システム |
JP2008052178A (ja) | 2006-08-28 | 2008-03-06 | Toyota Motor Corp | 音声認識装置と音声認識方法 |
JP2008233305A (ja) * | 2007-03-19 | 2008-10-02 | Toyota Central R&D Labs Inc | 音声対話装置、音声対話方法及びプログラム |
JP2010055044A (ja) * | 2008-04-22 | 2010-03-11 | Ntt Docomo Inc | 音声認識結果訂正装置および音声認識結果訂正方法、ならびに音声認識結果訂正システム |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5758322A (en) * | 1994-12-09 | 1998-05-26 | International Voice Register, Inc. | Method and apparatus for conducting point-of-sale transactions using voice recognition |
US6292782B1 (en) * | 1996-09-09 | 2001-09-18 | Philips Electronics North America Corp. | Speech recognition and verification system enabling authorized data transmission over networked computer systems |
JP2001075595A (ja) * | 1999-09-02 | 2001-03-23 | Honda Motor Co Ltd | 車載用音声認識装置 |
JP2001175276A (ja) * | 1999-12-17 | 2001-06-29 | Denso Corp | 音声認識装置及び記録媒体 |
JP2003036091A (ja) * | 2001-07-23 | 2003-02-07 | Matsushita Electric Ind Co Ltd | 電化情報機器 |
US20030061053A1 (en) * | 2001-09-27 | 2003-03-27 | Payne Michael J. | Method and apparatus for processing inputs into a computing device |
JP2006205497A (ja) * | 2005-01-27 | 2006-08-10 | Canon Inc | 音声認識手段を持つ複合機 |
ES2346343T3 (es) * | 2005-02-18 | 2010-10-14 | Irobot Corporation | Robot autonomo de limpieza de superficies para una limpieza en seco y en mojado. |
KR101832952B1 (ko) * | 2011-04-07 | 2018-02-28 | 엘지전자 주식회사 | 로봇 청소기 및 이의 제어 방법 |
US9934780B2 (en) * | 2012-01-17 | 2018-04-03 | GM Global Technology Operations LLC | Method and system for using sound related vehicle information to enhance spoken dialogue by modifying dialogue's prompt pitch |
-
2013
- 2013-01-16 JP JP2013005065A patent/JP2014137430A/ja active Pending
- 2013-12-03 KR KR1020157016096A patent/KR101707359B1/ko active IP Right Grant
- 2013-12-03 EP EP13871591.7A patent/EP2947651B1/en not_active Not-in-force
- 2013-12-03 US US14/652,177 patent/US20150332675A1/en not_active Abandoned
- 2013-12-03 WO PCT/JP2013/082441 patent/WO2014112226A1/ja active Application Filing
- 2013-12-03 CN CN201380066048.6A patent/CN104871239B/zh not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
See also references of EP2947651A4 |
Also Published As
Publication number | Publication date |
---|---|
KR20150086339A (ko) | 2015-07-27 |
CN104871239A (zh) | 2015-08-26 |
EP2947651B1 (en) | 2017-04-12 |
JP2014137430A (ja) | 2014-07-28 |
EP2947651A4 (en) | 2016-01-06 |
US20150332675A1 (en) | 2015-11-19 |
KR101707359B1 (ko) | 2017-02-15 |
CN104871239B (zh) | 2018-05-01 |
EP2947651A1 (en) | 2015-11-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2014112226A1 (ja) | 電子機器及び掃除機 | |
US10586534B1 (en) | Voice-controlled device control using acoustic echo cancellation statistics | |
JP4837917B2 (ja) | 音声に基づく装置制御 | |
JP6844608B2 (ja) | 音声処理装置および音声処理方法 | |
JP6122816B2 (ja) | 音声出力装置、ネットワークシステム、音声出力方法、および音声出力プログラム | |
JP5996603B2 (ja) | サーバ、発話制御方法、発話装置、発話システムおよびプログラム | |
WO2017168936A1 (ja) | 情報処理装置、情報処理方法、及びプログラム | |
JP2005084253A (ja) | 音響処理装置、方法、プログラム及び記憶媒体 | |
JP2014191029A (ja) | 音声認識システムおよび音声認識システムの制御方法 | |
JPWO2011055410A1 (ja) | 音声認識装置 | |
JP2008256802A (ja) | 音声認識装置および音声認識方法 | |
US20120278066A1 (en) | Communication interface apparatus and method for multi-user and system | |
KR20210017392A (ko) | 전자 장치 및 이의 음성 인식 방법 | |
WO2003107327A1 (en) | Controlling an apparatus based on speech | |
JP5365530B2 (ja) | 通信機器 | |
JP2018022086A (ja) | サーバ装置、制御システム、方法、情報処理端末、および制御プログラム | |
JP2011221101A (ja) | コミュニケーション装置 | |
JP2008249893A (ja) | 音声応答装置及びその方法 | |
JP4539313B2 (ja) | 音声認識辞書作成システム、音声認識辞書作成方法、音声認識システムおよびロボット | |
Vovos et al. | Speech operated smart-home control system for users with special needs. | |
JP7131362B2 (ja) | 制御装置、音声対話装置及びプログラム | |
JP7429107B2 (ja) | 音声翻訳装置、音声翻訳方法及びそのプログラム | |
JP2018091911A (ja) | 音声対話システム及び音声対話方法 | |
JP2014238486A (ja) | 音声認識装置 | |
JP2021089376A (ja) | 情報処理装置及びプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13871591 Country of ref document: EP Kind code of ref document: A1 |
|
REEP | Request for entry into the european phase |
Ref document number: 2013871591 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2013871591 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14652177 Country of ref document: US |
|
ENP | Entry into the national phase |
Ref document number: 20157016096 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |