CN109979487A - Voice signal detection method and device - Google Patents

Publication number
CN109979487A
Authority
CN
China
Prior art keywords
voice
signal
measured
voice signal
collecting device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910172909.8A
Other languages
Chinese (zh)
Other versions
CN109979487B (en)
Inventor
张腾飞
陈建哲
钟思思
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Intelligent Connectivity Beijing Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910172909.8A
Publication of CN109979487A
Application granted
Publication of CN109979487B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Quality & Reliability (AREA)
  • Electrically Operated Instructional Devices (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

Embodiments of the present invention provide a voice signal detection method and device. The method comprises: sending a playback instruction to a playback device; sending a signal collection instruction to a collecting device according to a voice metric under test; receiving a voice signal under test returned by the collecting device according to the signal collection instruction, the voice signal under test including a signal processed by a voice function node in the collecting device that is related to the voice metric under test; and obtaining an analysis result according to the voice signal under test. Because the signal collection instruction is sent to the collecting device according to the voice metric under test, the voice signal under test processed by the voice function node related to that metric is obtained, which makes it easier to use the voice signal under test to determine whether each voice function node in the collecting device is working normally.

Description

Voice signal detection method and device
Technical field
The present invention relates to the field of multimedia technology, and in particular to a voice signal detection method and device.
Background
At present, in the field of autonomous driving, methods for testing voice system performance, voice quality, and the like are immature. Many vehicle manufacturers have fixed component suppliers for devices such as the head unit hardware (e.g., processor, display), speakers, microphone module, Global Positioning System (GPS), and vehicle intelligent communication system (Telematics BOX, TBOX). If a problem occurs in the head unit hardware, the microphone module, the speakers, or the voice signal path of the head unit system, voice wake-up and recognition performance degrade directly. If the problem is found only after the vehicle leaves the factory, modifying the head unit hardware is difficult.
Summary of the invention
Embodiments of the present invention provide a voice signal detection method and device to solve one or more technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a voice signal detection method, comprising:
sending a playback instruction to a playback device;
sending a signal collection instruction to a collecting device according to a voice metric under test;
receiving a voice signal under test returned by the collecting device according to the signal collection instruction, the voice signal under test including a signal processed by a voice function node in the collecting device that is related to the voice metric under test; and
obtaining an analysis result according to the voice signal under test.
In one embodiment, the playback device and the collecting device are the same device.
In one embodiment, the signal collection instruction instructs the collecting device to collect at least one of: a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application layer software.
In one embodiment, obtaining the analysis result according to the voice signal under test comprises:
analyzing the voice signal under test using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
In one embodiment, the voice metric under test includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, noise floor of the system of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
In a second aspect, an embodiment of the present invention provides a voice signal detection method, comprising:
receiving a signal collection instruction sent by a control device according to a voice metric under test;
while a playback device plays voice content, collecting, according to the signal collection instruction, a signal processed by a voice function node related to the voice metric under test, to obtain a voice signal under test; and
sending the voice signal under test to the control device.
In one embodiment, the method further comprises:
receiving a playback instruction, the playback instruction including the voice content to be played; and
playing the voice content.
In one embodiment, collecting the signal processed by the voice function node related to the voice metric under test comprises:
collecting, in the collecting device, at least one of: a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application layer software.
In one embodiment, the playback device and the collecting device are the same device.
In one embodiment, the voice metric under test includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, noise floor of the system of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
In a third aspect, an embodiment of the present invention provides a voice signal detection device, comprising:
a first sending module, configured to send a playback instruction to a playback device;
a second sending module, configured to send a signal collection instruction to a collecting device according to a voice metric under test;
a first receiving module, configured to receive a voice signal under test returned by the collecting device according to the signal collection instruction, the voice signal under test including a signal processed by a voice function node in the collecting device that is related to the voice metric under test; and
an analysis module, configured to obtain an analysis result according to the voice signal under test.
In one embodiment, the playback device and the collecting device are the same device.
In one embodiment, the analysis module is further configured to analyze the voice signal under test using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
In a fourth aspect, an embodiment of the present invention provides a voice signal detection device, comprising:
a second receiving module, configured to receive a signal collection instruction sent by a control device according to a voice metric under test;
a collection module, configured to collect, according to the signal collection instruction and while a playback device plays voice content, a signal processed by a voice function node related to the voice metric under test, to obtain a voice signal under test; and
a third sending module, configured to send the voice signal under test to the control device.
In one embodiment, the device further comprises:
a third receiving module, configured to receive a playback instruction, the playback instruction including the voice content to be played; and
a playback module, configured to play the voice content.
In one embodiment, the collection module is further configured to collect, in the collecting device, at least one of: a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application layer software.
In a fifth aspect, an embodiment of the present invention provides a voice signal detection apparatus whose functions may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one embodiment, the structure of the apparatus includes a processor and a memory. The memory stores a program that supports the apparatus in executing the above voice signal detection method, and the processor is configured to execute the program stored in the memory. The apparatus may further include a communication interface for communicating with other devices or a communication network.
In a sixth aspect, an embodiment of the present invention provides a computer-readable storage medium for storing computer software instructions used by the voice signal detection apparatus, including a program for executing the above voice signal detection method.
One of the above technical solutions has the following advantage or beneficial effect: a signal collection instruction is sent to the collecting device according to the voice metric under test, so that the voice signal under test processed by the voice function node related to that metric in the collecting device is obtained, which makes it easier to use the voice signal under test to determine whether each voice function node in the collecting device is working normally.
Another of the above technical solutions has the following advantage or beneficial effect: before the vehicle leaves the factory, the signal quality of the head unit's voice, the system voice path, and so on can be tested on the head unit, which helps to find problems early, locate and solve them quickly, and reduce the cost of voice development.
The above summary is provided for the purposes of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar components or elements throughout the several figures. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be regarded as limiting the scope of the present invention.
Fig. 1 shows a flowchart of a voice signal detection method according to an embodiment of the present invention.
Fig. 2 shows a flowchart of a voice signal detection method according to an embodiment of the present invention.
Fig. 3 shows a flowchart of a voice signal detection method according to an embodiment of the present invention.
Fig. 4 shows a flowchart of a voice signal detection method according to an embodiment of the present invention.
Fig. 5 shows a structural schematic diagram of an application example of the voice signal detection method according to an embodiment of the present invention.
Fig. 6 shows a flowchart of an application example of the voice signal detection method according to an embodiment of the present invention.
Fig. 7 shows a structural block diagram of a voice signal detection device according to an embodiment of the present invention.
Fig. 8 shows a structural block diagram of a voice signal detection device according to an embodiment of the present invention.
Fig. 9 shows a structural block diagram of a voice signal detection device according to an embodiment of the present invention.
Fig. 10 shows a structural block diagram of a voice signal detection device according to an embodiment of the present invention.
Detailed description
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature and not restrictive.
Fig. 1 shows a flowchart of a voice signal detection method according to an embodiment of the present invention. The method may be applied to a control device. In embodiments of the present invention, the control device may include, but is not limited to, a personal computer (PC), laptop, handheld computer, mobile phone, or other device with a control function. As shown in Fig. 1, the method may include:
Step S10: sending a playback instruction to a playback device, where the playback instruction may include the voice content to be played;
Step S11: sending a signal collection instruction to a collecting device according to a voice metric under test;
Step S12: receiving a voice signal under test returned by the collecting device according to the signal collection instruction, the voice signal under test including a signal processed by a voice function node in the collecting device that is related to the voice metric under test;
Step S13: obtaining an analysis result according to the voice signal under test.
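Steps S10 to S13 can be sketched on the control side roughly as follows. This is a minimal illustration, not the patent's implementation: the `FakePlaybackDevice` and `FakeCollectingDevice` classes, the node names, and the `run_detection` function are all hypothetical stand-ins for real transport and hardware.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical names for the three collection points mentioned in the text.
MIC, DSP_OUT, APP_LAYER = "mic", "dsp_out", "app_layer"


@dataclass
class FakePlaybackDevice:
    """Stand-in playback device that only records what it was asked to play."""
    played: List[str] = field(default_factory=list)

    def play(self, content: str) -> None:
        self.played.append(content)


@dataclass
class FakeCollectingDevice:
    """Stand-in device under test that returns canned samples per node."""
    captured: Dict[str, List[float]]

    def collect(self, nodes: List[str]) -> Dict[str, List[float]]:
        return {node: self.captured[node] for node in nodes}


def run_detection(player, collector, nodes: List[str], content: str,
                  analyze: Callable[[Dict[str, List[float]]], dict]) -> dict:
    player.play(content)                # S10: playback instruction
    signals = collector.collect(nodes)  # S11 + S12: collection instruction, returned signals
    return analyze(signals)             # S13: analysis result


def peak_per_node(signals: Dict[str, List[float]]) -> Dict[str, float]:
    """Toy analysis: peak absolute sample value at each node."""
    return {node: max(abs(x) for x in samples) for node, samples in signals.items()}
```

In a real setup the two fake classes would be replaced by whatever link connects the control side to the devices; the ordering of the four steps is the only part taken from the text.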
In one example, the head unit may also be called a trip computer, on-board computer, intelligent control device, and so on. The head unit can implement multiple voice functions such as voice entertainment control, in-vehicle voice control, intelligent voice prompts, and voice navigation. The head unit may include software and hardware function nodes such as a microphone module, speakers, a hardware speech noise-reduction module DSP (Digital Signal Processor), and a voice wake-up and recognition function module. According to the various voice metrics under test, the control device can send corresponding signal collection instructions to the collecting device to be tested.
In one example, after the playback device plays the voice content, the microphone of the collecting device can pick up a voice signal, which can be input to the DSP for processing. The voice signal output by the DSP then undergoes system processing such as frequency conversion, resampling, and noise reduction. Application layer software can obtain the system-processed voice signal through an API (Application Programming Interface). The microphone, the DSP, the application layer software, and so on are all voice function nodes related to the voice metrics under test.
In one embodiment, the voice metric under test includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, noise floor of the system of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
In one embodiment, the signal collection instruction instructs the collecting device to collect at least one of: a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application layer software.
Different voice metrics under test may require collecting voice signals processed by different voice function nodes.
For example, in a head unit system, the signal quality of the head unit's voice and the voice path of the head unit system can be tested through the following voice metrics.
(1) Round-trip delay value, which may include the round-trip delay of the voice signal over the uplink and the downlink. For example, in the head unit system, the uplink runs from microphone (mic) pickup, to DSP processing, to MCU (Microcontroller Unit) processing, and then to the application layer of the head unit's system on chip (SoC). The downlink runs from the head unit system layer to MCU processing, then to the DSP for echo cancellation and noise reduction, and finally to devices such as the vehicle speakers. For this metric, the head unit needs to collect the voice signal at each node of the uplink and downlink and send it to the PC. The PC calculates the time at which the voice signal reaches each uplink and downlink node, and from that the round-trip delay value.
(2) mic&Lout node voice quality. The mic&Lout node may be the node at which the DSP outputs to the SoC. For this metric, the head unit therefore needs to collect the voice signal that the DSP outputs to the SoC and send it to the PC. The PC calculates the voice quality at the mic&Lout node.
(3) Near-field voice signal amplitude. Near-field voice is voice close to the device under test, such as the voice of a real user or simulated user voice played through a speaker. For this metric, the head unit needs to collect the signal picked up by the microphone (mic) array, the output signal after DSP noise reduction and amplification, and the signal obtained by the application layer of the system, and send them to the PC. The PC can calculate the amplitudes of these signals.
(4) Noise floor of the head unit system. For this metric, the head unit needs to collect the signal obtained by the application layer of the system and send it to the PC. The PC can use the signal obtained by the application layer to calculate the noise floor of the head unit system.
(5) Mic array frequency consistency. For this metric, the head unit can collect the voice signals picked up by multiple microphones and send them to the PC. The PC uses these microphone signals to analyze the frequency consistency of the mic array.
(6) Mic array phase consistency. For this metric, the head unit can collect the voice signals picked up by multiple microphones and send them to the PC. The PC uses these microphone signals to analyze the phase consistency of the mic array.
(7) Mic and Lout signal saturation detection. Saturation detection may include detecting whether the mic and Lout signals exceed a preset signal maximum, to determine whether the signal is clipped. For this metric, the head unit therefore needs to collect the signal picked up by each microphone in the microphone array and the signal that the DSP outputs to the SoC.
(8) Echo reference delay. For this metric, the head unit needs to collect the signal obtained by the application layer and send it to the PC. The PC calculates the echo reference delay.
(9) Frequency consistency of the mic-in (microphone input) and AEC (Acoustic Echo Cancellation) reference signals. For this metric, the head unit needs to collect the microphone input voice signal and the AEC input voice signal and send them to the PC. The PC calculates the frequency consistency of the microphone input and the AEC reference signal.
(10) Linear AEC effect. For this metric, with linear AEC noise reduction enabled on the DSP, the head unit needs to collect the voice signal output by the DSP and send it to the PC. The PC calculates how many decibels (dB) are removed when linear AEC noise reduction is enabled.
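Several of the metrics above reduce to short computations on the collected samples. The sketch below is illustrative only and assumes samples normalized to the range [-1, 1]; the function names, the cross-correlation approach to delay, the 0.999 clipping threshold, and the RMS-based level difference for the AEC effect are assumptions chosen for the example, not methods stated in the patent.

```python
import math
from typing import Sequence


def estimate_delay(reference: Sequence[float], delayed: Sequence[float], max_lag: int) -> int:
    """Delay in samples: the lag in [0, max_lag] that maximizes the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        score = sum(r * d for r, d in zip(reference, delayed[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def rms_dbfs(samples: Sequence[float], eps: float = 1e-12) -> float:
    """RMS level in dB relative to full scale (amplitude and noise-floor style metrics)."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(max(rms, eps))


def clipped_fraction(samples: Sequence[float], limit: float = 0.999) -> float:
    """Fraction of samples at or above a preset maximum (saturation detection)."""
    return sum(1 for x in samples if abs(x) >= limit) / len(samples)


def echo_reduction_db(aec_off: Sequence[float], aec_on: Sequence[float]) -> float:
    """Level drop in dB between the DSP output with linear AEC disabled and enabled."""
    return rms_dbfs(aec_off) - rms_dbfs(aec_on)
```

Dividing the result of `estimate_delay` by the sample rate gives the delay in seconds. A production tool would more likely use an FFT-based correlation and a perceptual quality measure for metric (2), which is not sketched here.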
For different voice metrics, different signal collection instructions can be pre-configured in the control device. In addition, different voice metrics may require the playback device to play specific voice content. The playback device may be the same device as the collecting device, for example when the playback function is implemented by a loudspeaker or similar component in the collecting device. The playback device may also be a device independent of the collecting device. Therefore, the voice content that the collecting device and/or the playback device needs to play can be pre-configured in the control device for each voice metric.
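One way to picture this pre-configuration is a lookup table from metric to the nodes that must be collected. The table below is a hypothetical sketch: the metric keys, node names, and message layout are invented for illustration, and the round-trip delay entry is simplified to three nodes even though the text describes collecting every uplink and downlink node.

```python
from typing import Dict, List

# Hypothetical metric-to-node table derived from metrics (1) to (10) in the text.
NODES_BY_METRIC: Dict[str, List[str]] = {
    "round_trip_delay": ["mic", "dsp_out", "app_layer"],  # simplified node set
    "node_voice_quality": ["dsp_out"],
    "near_field_amplitude": ["mic", "dsp_out", "app_layer"],
    "noise_floor": ["app_layer"],
    "mic_array_frequency_consistency": ["mic"],
    "mic_array_phase_consistency": ["mic"],
    "saturation_detection": ["mic", "dsp_out"],
    "echo_reference_delay": ["app_layer"],
    "reference_frequency_consistency": ["mic", "aec_ref"],
    "linear_aec_effect": ["dsp_out"],
}


def build_collection_instruction(metric: str) -> dict:
    """Build a signal collection instruction (hypothetical message format) for one metric."""
    if metric not in NODES_BY_METRIC:
        raise ValueError(f"no collection configuration for metric: {metric}")
    return {"type": "collect", "metric": metric, "nodes": list(NODES_BY_METRIC[metric])}
```

Keeping the mapping as data rather than code means a new metric can be added without touching the logic that sends instructions.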
In some scenarios, the collecting device may need to play specified voice content. The control device can send the signal collection instruction and the playback instruction to the collecting device together, or send them separately. For example, with the PC connected to the head unit, the PC can send the head unit the signal collection instruction corresponding to a given voice metric along with a playback instruction. The playback instruction may include the voice content that the head unit needs to play.
In some scenarios, an independent playback device may need to play specified voice content. The control device can send the signal collection instruction to the collecting device and the playback instruction to the playback device. For example, in a speech recognition scenario, an independent playback device can be controlled to play a pre-recorded recording of a user speaking in order to test the speech recognition function of the head unit. In this case, the PC can send the pre-recorded sound, by means of a playback instruction, to a speaker connected to the PC, and the speaker acts as the playback device that plays the pre-recorded sound.
In some scenarios, the collecting device and an independent playback device may need to play specified voice content at the same time. The control device can send the signal collection instruction and a playback instruction to the collecting device, and send a playback instruction to the playback device. The voice content played simultaneously by the collecting device and the playback device may be the same or different.
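The three scenarios differ only in which device receives which instruction. A minimal sketch of that routing, assuming a hypothetical `PlaybackMode` enum and dictionary-shaped messages that do not come from the patent:

```python
from enum import Enum
from typing import Dict, List, Optional


class PlaybackMode(Enum):
    COLLECTOR_PLAYS = "collector"  # the collecting device plays the content itself
    EXTERNAL_PLAYS = "external"    # an independent playback device plays it
    BOTH_PLAY = "both"             # both devices play at the same time


def plan_messages(mode: PlaybackMode, collect_instruction: dict,
                  collector_content: Optional[str] = None,
                  external_content: Optional[str] = None) -> Dict[str, List[dict]]:
    """Return the messages each device should receive for one test run."""
    plan: Dict[str, List[dict]] = {
        "collecting_device": [collect_instruction],
        "playback_device": [],
    }
    if mode in (PlaybackMode.COLLECTOR_PLAYS, PlaybackMode.BOTH_PLAY):
        plan["collecting_device"].append({"type": "play", "content": collector_content})
    if mode in (PlaybackMode.EXTERNAL_PLAYS, PlaybackMode.BOTH_PLAY):
        plan["playback_device"].append({"type": "play", "content": external_content})
    return plan
```

The two content arguments are kept separate because the text allows the two devices to play different content when both are active.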
In one embodiment, as shown in Fig. 2, step S13 further includes step S21: analyzing the voice signal under test using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
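As a small illustration of the spectrum analysis part, the sketch below computes a magnitude spectrum with a direct discrete Fourier transform and derives two simple quantities from it. The function names and the 1e-9 magnitude floor are assumptions for the example; the direct transform is quadratic in the signal length and is only meant for short test frames.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(samples: Sequence[float]) -> List[float]:
    """Magnitudes of DFT bins 0..N/2, computed directly (O(N^2), short frames only)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def dominant_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Frequency in Hz of the strongest non-DC bin."""
    spectrum = magnitude_spectrum(samples)
    k = max(range(1, len(spectrum)), key=spectrum.__getitem__)
    return k * sample_rate / len(samples)


def max_spectral_deviation_db(a: Sequence[float], b: Sequence[float],
                              floor: float = 1e-9) -> float:
    """Largest per-bin level difference in dB between two equal-length channels."""
    spec_a, spec_b = magnitude_spectrum(a), magnitude_spectrum(b)
    return max(
        abs(20.0 * math.log10(max(x, floor) / max(y, floor)))
        for x, y in zip(spec_a, spec_b)
    )
```

A per-bin deviation like this is one possible way to compare two microphone channels for frequency consistency; the text does not specify the comparison it uses.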
In one example, the analysis result may include an analysis report on the above metrics. In addition, the voice signal under test from the collecting device can be saved on the control device as a corpus for subsequent analysis and reference.
In this embodiment, the control device sends the signal collection instruction to the collecting device according to the voice metric under test, so as to obtain the voice signal under test processed by the voice function node related to that metric in the collecting device, which makes it easier to use the voice signal under test to determine whether each voice function node in the collecting device is working normally.
Fig. 3 shows a flowchart of a voice signal detection method according to an embodiment of the present invention. The method may be applied to a voice device under test. As shown in Fig. 3, the method may include:
Step S31: receiving a signal collection instruction sent by a control device according to a voice metric under test;
Step S32: while a playback device plays voice content, collecting, according to the signal collection instruction, a signal processed by a voice function node related to the voice metric under test, to obtain a voice signal under test;
Step S33: sending the voice signal under test to the control device.
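On the device under test, steps S31 to S33 amount to reading the requested nodes and returning the samples. A minimal sketch, assuming a hypothetical instruction dictionary with `metric` and `nodes` keys and caller-supplied reader and send callbacks in place of real audio taps and a real transport:

```python
from typing import Callable, Dict, List


def handle_collection_instruction(
    instruction: dict,
    node_readers: Dict[str, Callable[[], List[float]]],
    send: Callable[[dict], None],
) -> Dict[str, List[float]]:
    """S31: instruction received; S32: read each requested node; S33: send the result back."""
    missing = [node for node in instruction["nodes"] if node not in node_readers]
    if missing:
        raise KeyError(f"unsupported nodes requested: {missing}")
    signals = {node: node_readers[node]() for node in instruction["nodes"]}
    send({"metric": instruction["metric"], "signals": signals})
    return signals
```

Rejecting unknown nodes before reading anything keeps a misconfigured instruction from producing a partial capture.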
In this embodiment, the control device sends a signal collection instruction to the collecting device according to the voice metric under test. After receiving the signal collection instruction, the collecting device under test starts collecting the signal processed by the voice function node related to the voice metric under test, and then sends the collected voice signal under test to the control device. The control device can analyze the voice signal under test using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
For different voice metrics, different signal collection instructions can be pre-configured in the control device. In addition, different voice metrics may require the playback device to play specific voice content. The playback device may be the same device as the collecting device, for example when the playback function is implemented by a loudspeaker or similar component in the collecting device. The playback device may also be a device independent of the collecting device. Therefore, the voice content that the collecting device and/or the playback device needs to play can be pre-configured in the control device for each voice metric.
In one embodiment, as shown in Fig. 4, the method includes:
Step S41: receiving a playback instruction, the playback instruction including the voice content to be played;
Step S42: playing the voice content.
In one embodiment, the playback device and the collecting device are the same device.
In some scenarios, the collecting device may need to play specified voice content. The control device can send the signal collection instruction and the playback instruction to the collecting device together, or send them separately.
In some scenarios, an independent playback device may need to play specified voice content to assist the control device in testing the collecting device. The control device can send the signal collection instruction to the collecting device and the playback instruction to the playback device.
In some scenarios, the collecting device and an independent playback device may need to play specified voice content at the same time. The control device can send the signal collection instruction and a playback instruction to the collecting device, and send a playback instruction to the playback device. The voice content played simultaneously by the collecting device and the playback device may be the same or different.
In one embodiment, collecting the signal processed by the voice function node related to the voice metric under test includes:
collecting, in the collecting device, at least one of: a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application layer software.
In one embodiment, the voice metric under test includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, noise floor of the system of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency. For specific examples of these metrics, see the related description of the above embodiments.
In an application example, referring to Fig. 5, a speaker 53 with good directional consistency of sound can be placed in the front row of the vehicle near the position of the driver's head, and the speaker 53 is connected to a PC 52. As shown in Fig. 6, the voice signal detection process may include: opening the test software on the PC (S61); connecting the PC to the head unit 51 via USB (Universal Serial Bus); and establishing an ADB connection with the head unit system using the ADB (Android Debug Bridge) tool (S62).
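For orientation, a PC-side script might drive the ADB tool as sketched below. `adb devices` and `adb pull` are standard ADB commands, but the remote recording path, the function names, and the idea of retrieving recordings as files are assumptions for illustration; the text does not say how the test software transfers signals.

```python
import subprocess
from typing import Callable, List


def adb_commands(remote_path: str, local_path: str) -> List[List[str]]:
    """Command lines to list attached devices and copy one recording to the PC."""
    return [
        ["adb", "devices"],
        ["adb", "pull", remote_path, local_path],
    ]


def fetch_recording(remote_path: str, local_path: str,
                    run: Callable[..., object] = subprocess.run) -> None:
    """Run the commands in order; with the default runner a failing command raises."""
    for command in adb_commands(remote_path, local_path):
        run(command, check=True)
```

The `run` parameter exists so the sequence can be exercised without a connected device, for example by passing a function that only records the commands.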
Then, the FE (far end) and NE (near end) are calibrated (S63). For example, the FE includes music played by the speakers inside the head unit 51, and the NE includes voice played by the external speaker 53. The volume of both sounds can be calibrated so that the volume of the voice signal received at the center point of the head unit's microphone array meets a set condition. For example, if the volume of both signals is about 80 dBA, calibration succeeds.
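The calibration condition can be expressed as a simple tolerance check. The 80 dBA target comes from the example in the text; the 1 dB tolerance and the function names are assumptions, since the text only says "about".

```python
def is_calibrated(fe_dba: float, ne_dba: float,
                  target_dba: float = 80.0, tolerance_db: float = 1.0) -> bool:
    """True when both far-end and near-end levels are within tolerance of the target."""
    return (abs(fe_dba - target_dba) <= tolerance_db
            and abs(ne_dba - target_dba) <= tolerance_db)


def gain_adjustment_db(measured_dba: float, target_dba: float = 80.0) -> float:
    """Gain change in dB that would bring a measured level to the target."""
    return target_dba - measured_dba
```

For example, a near-end level measured at 76 dBA would fail the check and call for roughly 4 dB more playback gain before the test proceeds.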
Then, the test is executed (S64). Depending on the metric to be measured, the speaker can play a near-field white noise signal, and the vehicle device can play noise signals such as music. The content to be played can be sent from the PC to the speaker and the vehicle device. In addition, the PC can use a software protocol to control the application-layer software of the vehicle device, the DSP software integrated in the vehicle device, and so on, to record and play back in a loop. The microphone signal and the application-layer voice signal collected by the vehicle device are then sent to the PC.
The PC performs calculations using relevant speech algorithms such as speech signal spectrum analysis, delay analysis, and noise suppression, and generates a specific analysis report, completing the automated evaluation process (S65). The analysis report may include the various voice metrics described above. In addition, the PC can also retain the recordings, saving the various voice signals received from the vehicle device.
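The delay analysis mentioned in S65 is not detailed here. One common approach, shown below as a sketch and not as the method of this document, is to find the lag that maximizes the cross-correlation between the played reference and the recorded signal. The function names and the `max_lag` parameter are hypothetical.

```python
def estimate_delay(reference, recorded, max_lag):
    """Return the lag (in samples) at which recorded best matches reference.

    Brute-force cross-correlation over lags 0..max_lag; adequate for short
    test signals, not optimized for long recordings.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate reference[i] with recorded[i + lag] over the overlap.
        n = min(len(reference), len(recorded) - lag)
        if n <= 0:
            break
        score = sum(reference[i] * recorded[i + lag] for i in range(n))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_ms(lag_samples, sample_rate_hz):
    """Convert a lag in samples to milliseconds."""
    return 1000.0 * lag_samples / sample_rate_hz


# Usage: a 7-sample Barker code delayed by 3 samples.
barker7 = [1, 1, 1, -1, -1, 1, -1]
recorded = [0, 0, 0] + barker7 + [0, 0, 0, 0]
lag = estimate_delay(barker7, recorded, max_lag=6)
```

A test signal with a sharp autocorrelation peak, such as the Barker code used above or white noise, makes the correlation maximum unambiguous.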
By calculating the various voice metrics, the embodiments of the present invention can detect the voice signal quality of the vehicle device, the system voice path, and so on. This helps provide complete and objective metrics for the hardware integration of each supplier, thereby clarifying the division of responsibility among the project parties so that problems can be located and solved quickly.
In addition, with the voice signal detection method of the embodiments of the present invention, problems with signal quality, the system path, and so on can be detected at the integration stage, and the OEM (Original Entrusted Manufacture) vendor can be notified to make changes. This helps expose many problems before the OEM vendor submits the product to the vehicle manufacturer for acceptance, reducing the cost of voice development and greatly reducing the time spent on DSP joint debugging and integration.
In this embodiment, the collecting device under test receives the signal collection instruction sent by the control device according to the voice metric to be measured, can collect the voice signal to be measured that is processed by the voice function node related to that metric in the collecting device, and then sends the collected voice signal to be measured to the control device. In this way, the control device can subsequently use the voice signal to be measured to determine whether each voice function node in the collecting device is normal. In addition, with the method of the embodiments of the present invention, the control device can be used before the vehicle leaves the factory to detect the voice signal quality, the system voice path, and so on of the vehicle device, which helps find problems in advance, locate and solve them quickly, and reduce the cost of voice development.
In addition, the method of this embodiment can also be applied to other devices with voice functions, to detect whether each voice function node of the device is normal.
Fig. 7 shows a structural block diagram of a voice signal detection apparatus according to an embodiment of the present invention. The apparatus can be provided in the control device. As shown in Fig. 7, the apparatus may include:
A first sending module 70, configured to send a play instruction to the playback device; the play instruction may include the voice content to be played;
A second sending module 71, configured to send a signal collection instruction to the collecting device according to the voice metric to be measured;
A first receiving module 72, configured to receive the voice signal to be measured returned by the collecting device according to the signal collection instruction, the voice signal to be measured including the signal processed by the voice function node in the collecting device that is related to the voice metric to be measured;
An analysis module 73, configured to obtain an analysis result according to the voice signal to be measured.
In one embodiment, the playback device and the collecting device are the same device.
In one embodiment, the signal collection instruction instructs the collecting device to collect at least one of the voice signal picked up by the microphone, the voice signal output by the DSP, and the voice signal obtained by the application-layer software.
In one embodiment, the analysis module is further configured to analyze the voice signal to be measured using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
In one embodiment, the voice metric to be measured includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, system noise floor of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
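The cooperation of the four control-side modules can be sketched as follows. This is a minimal illustration under stated assumptions: the class name, the dictionary message format, the transport callables, and the analyzer registry are all hypothetical and are not defined in this document.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Signals = Dict[str, List[float]]


@dataclass
class ControlApparatus:
    # Hypothetical transports: callables that deliver a message to a device.
    send_to_player: Callable[[dict], None]
    send_to_collector: Callable[[dict], Signals]
    # Hypothetical registry mapping a metric name to an analysis function.
    analyzers: Dict[str, Callable[[Signals], float]] = field(default_factory=dict)

    def run(self, content: str, metric: str) -> dict:
        # First sending module: play instruction carrying the content to play.
        self.send_to_player({"type": "play", "content": content})
        # Second sending module and first receiving module: request the
        # signals for this metric and receive what the collector returns.
        signals = self.send_to_collector({"type": "collect", "metric": metric})
        # Analysis module: turn the returned signals into an analysis result.
        return {"metric": metric, "value": self.analyzers[metric](signals)}


# Usage with in-memory stand-ins for the two devices.
played = []
apparatus = ControlApparatus(
    send_to_player=played.append,
    send_to_collector=lambda msg: {"mic": [0.5, -1.0, 0.25]},
    analyzers={"near_field_amplitude": lambda s: max(abs(x) for x in s["mic"])},
)
result = apparatus.run("white_noise.wav", "near_field_amplitude")
```

When the playback device and the collecting device are the same device, both callables would simply target the same connection.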
Fig. 8 shows a structural block diagram of a voice signal detection apparatus according to an embodiment of the present invention. The apparatus can be provided in the voice device under test. As shown in Fig. 8, the apparatus may include:
A second receiving module 81, configured to receive the signal collection instruction sent by the control device according to the voice metric to be measured;
A collection module 82, configured to, while the playback device plays voice content, collect according to the signal collection instruction the signal processed by the voice function node related to the voice metric to be measured, obtaining the voice signal to be measured;
A third sending module 83, configured to send the voice signal to be measured to the control device. The control device can obtain an analysis result according to the voice signal to be measured.
As shown in Fig. 9, the apparatus further includes:
A third receiving module 85, configured to receive a play instruction, the play instruction including the voice content to be played;
A playback module 84, configured to play the voice content.
In one embodiment, the collection module is further configured to collect, in the collecting device, at least one of the voice signal picked up by the microphone, the voice signal output by the DSP, and the voice signal obtained by the application-layer software.
In one embodiment, the voice metric to be measured includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, system noise floor of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
In one embodiment, the playback device and the collecting device are the same device.
For the functions of the modules in the apparatuses of the embodiments of the present invention, refer to the corresponding description in the above method; they are not repeated here.
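On the device side, handling a signal collection instruction amounts to selecting the voice function nodes relevant to the requested metric and capturing their signals. The sketch below assumes a metric-to-node mapping and a "tap" callable per node; both are hypothetical, since this document does not state which nodes belong to which metric.

```python
# Hypothetical mapping from metric name to the nodes whose signals it needs.
NODES_BY_METRIC = {
    "signal_saturation": ("microphone",),
    "round_trip_delay": ("microphone", "application_layer"),
    "node_voice_quality": ("microphone", "dsp_output", "application_layer"),
}


def handle_collect_instruction(instruction, taps):
    """Collect the signals of the nodes related to the requested metric.

    instruction: dict with a "metric" key (assumed message format).
    taps: dict mapping a node name to a zero-argument callable that
          returns the samples captured at that node.
    """
    nodes = NODES_BY_METRIC[instruction["metric"]]
    return {node: taps[node]() for node in nodes}


# Usage with stand-in taps that return fixed samples.
taps = {
    "microphone": lambda: [0.1, 0.2],
    "dsp_output": lambda: [0.05, 0.1],
    "application_layer": lambda: [0.04, 0.09],
}
collected = handle_collect_instruction({"metric": "round_trip_delay"}, taps)
```

The returned dictionary is what the third sending module would transmit back to the control device.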
Fig. 10 shows a structural block diagram of a voice signal detection device according to an embodiment of the present invention. As shown in Fig. 10, the voice signal detection device includes a memory 910 and a processor 920, and the memory 910 stores a computer program that can run on the processor 920. When executing the computer program, the processor 920 implements the voice signal detection method of the above embodiments. There may be one or more memories 910 and one or more processors 920.
The device further includes:
A communication interface 930, configured to communicate with external devices for data exchange.
The memory 910 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
If the memory 910, the processor 920, and the communication interface 930 are implemented independently, they can be connected to one another by a bus and communicate with one another through it. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 10, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 910, the processor 920, and the communication interface 930 are integrated on one chip, they can communicate with one another through internal interfaces.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any of the methods in the above embodiments.
In the description of this specification, descriptions referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means two or more, unless otherwise expressly and specifically limited.
Any process or method description in a flowchart, or otherwise described herein, may be understood as representing a module, segment, or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes additional implementations in which functions may be performed out of the order shown or discussed, including substantially simultaneously or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart, or otherwise described herein, may for example be regarded as an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include: an electrical connection (electronic device) with one or more wirings, a portable computer disk cartridge (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques known in the art: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those skilled in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, or each unit may exist physically alone, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various changes or substitutions within the technical scope disclosed by the present invention, and these should be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (18)

1. A voice signal detection method, characterized by comprising:
sending a play instruction to a playback device;
sending a signal collection instruction to a collecting device according to a voice metric to be measured;
receiving a voice signal to be measured returned by the collecting device according to the signal collection instruction, the voice signal to be measured including a signal processed by a voice function node in the collecting device that is related to the voice metric to be measured;
obtaining an analysis result according to the voice signal to be measured.
2. The method according to claim 1, characterized in that the playback device and the collecting device are the same device.
3. The method according to claim 1, characterized in that the signal collection instruction instructs the collecting device to collect at least one of a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application-layer software.
4. The method according to claim 1, characterized in that obtaining an analysis result according to the voice signal to be measured comprises:
analyzing the voice signal to be measured using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
5. The method according to any one of claims 1 to 4, characterized in that the voice metric to be measured includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, system noise floor of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
6. A voice signal detection method, characterized by comprising:
receiving a signal collection instruction sent by a control device according to a voice metric to be measured;
while a playback device plays voice content, collecting, according to the signal collection instruction, a signal processed by a voice function node related to the voice metric to be measured, to obtain a voice signal to be measured;
sending the voice signal to be measured to the control device.
7. The method according to claim 6, characterized by further comprising:
receiving a play instruction, the play instruction including the voice content to be played;
playing the voice content.
8. The method according to claim 6, characterized in that collecting the signal processed by the voice function node related to the voice metric to be measured comprises:
collecting, in a collecting device, at least one of a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application-layer software.
9. The method according to claim 8, characterized in that the playback device and the collecting device are the same device.
10. The method according to any one of claims 6 to 9, characterized in that the voice metric to be measured includes at least one of: round-trip delay value, node voice quality, near-field voice signal amplitude, system noise floor of the device under test, microphone array frequency consistency, microphone array phase consistency, signal saturation detection, echo reference delay, and reference signal frequency consistency.
11. A voice signal detection apparatus, characterized by comprising:
a first sending module, configured to send a play instruction to a playback device;
a second sending module, configured to send a signal collection instruction to a collecting device according to a voice metric to be measured;
a first receiving module, configured to receive a voice signal to be measured returned by the collecting device according to the signal collection instruction, the voice signal to be measured including a signal processed by a voice function node in the collecting device that is related to the voice metric to be measured;
an analysis module, configured to obtain an analysis result according to the voice signal to be measured.
12. The apparatus according to claim 11, characterized in that the playback device and the collecting device are the same device.
13. The apparatus according to claim 11, characterized in that the analysis module is further configured to analyze the voice signal to be measured using a speech algorithm, the speech algorithm including at least one of spectrum analysis, delay analysis, and noise suppression of the voice signal.
14. A voice signal detection apparatus, characterized by comprising:
a second receiving module, configured to receive a signal collection instruction sent by a control device according to a voice metric to be measured;
a collection module, configured to, while a playback device plays voice content, collect according to the signal collection instruction a signal processed by a voice function node related to the voice metric to be measured, to obtain a voice signal to be measured;
a third sending module, configured to send the voice signal to be measured to the control device.
15. The apparatus according to claim 14, characterized by further comprising:
a third receiving module, configured to receive a play instruction, the play instruction including the voice content to be played;
a playback module, configured to play the voice content.
16. The apparatus according to claim 15, characterized in that the collection module is further configured to collect, in a collecting device, at least one of a voice signal picked up by a microphone, a voice signal output by a DSP, and a voice signal obtained by application-layer software.
17. A voice signal detection device, characterized by comprising:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 10.
18. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 10.
CN201910172909.8A 2019-03-07 2019-03-07 Voice signal detection method and device Active CN109979487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910172909.8A CN109979487B (en) 2019-03-07 2019-03-07 Voice signal detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910172909.8A CN109979487B (en) 2019-03-07 2019-03-07 Voice signal detection method and device

Publications (2)

Publication Number Publication Date
CN109979487A true CN109979487A (en) 2019-07-05
CN109979487B CN109979487B (en) 2021-07-30

Family

ID=67078102

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910172909.8A Active CN109979487B (en) 2019-03-07 2019-03-07 Voice signal detection method and device

Country Status (1)

Country Link
CN (1) CN109979487B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390954A (en) * 2019-08-06 2019-10-29 京东方科技集团股份有限公司 Method and device for evaluating quality of voice product
CN112017636A (en) * 2020-08-27 2020-12-01 大众问问(北京)信息科技有限公司 Vehicle-based user pronunciation simulation method, system, device and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000072306A1 (en) * 1999-05-25 2000-11-30 Algorex, Inc. Real-time quality analyzer for voice and audio signals
CN102143524A (en) * 2010-08-31 2011-08-03 华为技术有限公司 Method, system and device for detecting voice quality
CN102368384A (en) * 2011-10-19 2012-03-07 福建联迪商用设备有限公司 Voice module test method and voice module test device
CN103077727A (en) * 2013-01-04 2013-05-01 华为技术有限公司 Method and device used for speech quality monitoring and prompting
CN103716470A (en) * 2012-09-29 2014-04-09 华为技术有限公司 Method and device for speech quality monitoring
CN105989853A (en) * 2015-02-28 2016-10-05 科大讯飞股份有限公司 Audio quality evaluation method and system
CN106816158A (en) * 2015-11-30 2017-06-09 华为技术有限公司 A kind of speech quality assessment method, device and equipment
US20170229122A1 (en) * 2011-02-22 2017-08-10 Speak With Me, Inc. Hybridized client-server speech recognition
CN107886951A (en) * 2016-09-29 2018-04-06 百度在线网络技术(北京)有限公司 A kind of speech detection method, device and equipment
CN108389592A (en) * 2018-02-27 2018-08-10 上海讯飞瑞元信息技术有限公司 A kind of voice quality assessment method and device
WO2018192659A1 (en) * 2017-04-20 2018-10-25 Telefonaktiebolaget Lm Ericsson (Publ) Handling of poor audio quality in a terminal device
CN108877806A (en) * 2018-06-29 2018-11-23 中国航空无线电电子研究所 System is verified in the test for testing instruction type speech control system
CN109147765A (en) * 2018-11-16 2019-01-04 安徽听见科技有限公司 Audio quality comprehensive evaluating method and system
CN109256148A (en) * 2017-07-14 2019-01-22 ***通信集团浙江有限公司 A kind of speech quality assessment method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LEONARDO O.NUNES,ET AL.: "A Parametric Objective Quality Assessment Tool for Speech Signals Degraded by Acoustic Echo", 《IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING》 *
董昕等: "网络实时音频QoS性能分析新方法", 《电讯技术》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110390954A (en) * 2019-08-06 2019-10-29 京东方科技集团股份有限公司 Method and device for evaluating quality of voice product
CN110390954B (en) * 2019-08-06 2022-05-13 京东方科技集团股份有限公司 Method and device for evaluating quality of voice product
CN112017636A (en) * 2020-08-27 2020-12-01 大众问问(北京)信息科技有限公司 Vehicle-based user pronunciation simulation method, system, device and storage medium
CN112017636B (en) * 2020-08-27 2024-02-23 大众问问(北京)信息科技有限公司 User pronunciation simulation method, system, equipment and storage medium based on vehicle

Also Published As

Publication number Publication date
CN109979487B (en) 2021-07-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211012

Address after: 100176 Room 101, 1st floor, building 1, yard 7, Ruihe West 2nd Road, economic and Technological Development Zone, Daxing District, Beijing

Patentee after: Apollo Intelligent Connectivity (Beijing) Technology Co., Ltd.

Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing

Patentee before: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) Co.,Ltd.
