CN112205981B - Hearing assessment method and device based on speech intelligibility index - Google Patents
- Publication number: CN112205981B (application CN202011077820.2A)
- Authority
- CN
- China
- Prior art keywords
- speech
- hearing
- threshold
- double
- confusable
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/12—Audiometering
- A61B5/121—Audiometering evaluating hearing capacity
- A61B5/123—Audiometering evaluating hearing capacity subjective methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
Abstract
The invention discloses a hearing evaluation method and device based on the speech intelligibility index, comprising the following steps: 1) establishing a functional relationship between the hearing threshold and speech recognition performance using the speech intelligibility index; 2) constructing a corpus of confusable bisyllabic word pairs from selected confusable vowel pairs and consonant pairs as the test material for speech audiometry, and measuring the band importance function (BIF) of the test material with a quick band-importance measurement method; 3) performing speech audiometry on the subject with the confusable bisyllabic word pairs constructed in step 2), then selecting the sound intensity condition that maximizes the likelihood of the test results of the confusable bisyllabic word pairs as the subject's final hearing threshold. The invention obtains stable and reliable results in non-professional environments, and its results correlate strongly with those of pure-tone audiometry, making it a feasible scheme for hearing evaluation on mobile terminals.
Description
Technical Field
The invention belongs to the technical field of hearing aids, relates to a hearing evaluation method, and particularly relates to a hearing evaluation method and equipment based on a speech intelligibility index.
Background
Hearing loss is an important public health problem. World Health Organization assessments of hearing disability across its member states indicate more than 460 million people with disabling hearing loss, and according to the second national sampling survey of disability in 2006, 27.8 million people in China have hearing loss, accounting for 34% of the total number of disabled people. Hearing loss causes communication impairment, affecting interpersonal relationships and working capacity and leading to social isolation and reduced quality of life. Studies show that elderly people with hearing loss are at higher risk of falls, dementia, depression and death than individuals without hearing loss. Children with hearing loss may show delayed speech development, with significant adverse effects on academic performance. From a socio-economic perspective, the World Health Organization estimates that hearing loss causes a worldwide loss of US$750 billion per year, including health-sector costs, educational support costs, and productivity losses. Hearing loss thus affects physical and mental health and carries a high socioeconomic cost; early discovery and intervention are needed, and hearing assessment is crucial.
The standard for diagnosing hearing loss is the hearing threshold, obtained by playing fixed-frequency pure-tone signals under controlled environment and equipment conditions and measuring the minimum sound pressure level at which a predetermined percentage of the subject's responses are correct. Hearing loss raises the threshold, manifested as an inability to detect relatively faint sounds, which is its most obvious characteristic. The hearing threshold reflects the degree of hearing loss at each tested frequency. A normal-hearing person has a binaural hearing threshold of 25 dB HL or less; a threshold above this value indicates hearing loss. A threshold of 26 to 40 dB HL is mild hearing loss, 41 to 55 dB HL moderate, 56 to 70 dB HL moderately severe, 71 to 90 dB HL severe, and above 91 dB HL profound. Disabling hearing loss means that the threshold of an adult's better ear exceeds the normal-hearing average by more than 40 dB, or by more than 30 dB in a child's better ear. For hearing loss, the hearing aid is a commonly used compensation device that improves the speech communication ability of the hearing-impaired; the hearing threshold is usually fed into a fitting formula to derive the hearing aid gain. The hearing threshold can therefore not only reflect a person's hearing status but also be applied to the hearing compensation of hearing-impaired persons.
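The grading boundaries above can be captured in a small helper. The sketch below is illustrative (function and label names are ours, using standard WHO-style grade names, not wording fixed by the patent), mapping a better-ear threshold in dB HL to a grade:

```python
def grade_hearing_loss(threshold_db_hl: float) -> str:
    """Map a hearing threshold (dB HL) to the grade boundaries quoted above."""
    if threshold_db_hl <= 25:
        return "normal"
    if threshold_db_hl <= 40:
        return "mild"
    if threshold_db_hl <= 55:
        return "moderate"
    if threshold_db_hl <= 70:
        return "moderately severe"
    if threshold_db_hl <= 90:
        return "severe"
    return "profound"
```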
Measuring the hearing threshold imposes strict requirements on the test environment and audiometric equipment; it usually requires visiting a specialized institution, with a professional administering the test. Also commonly used for adult hearing assessment are the Whispered Voice Test, the Finger Rub Test, questionnaires, and portable audiometers. Previous studies show that although these methods are more convenient than pure-tone audiometry, their accuracy still needs verification, professional personnel are still required, quantitative loss at each frequency cannot be given, and portable audiometers are costly. With the popularization of intelligent devices, audiometry methods based on mobile intelligent terminals have appeared. These methods complete the hearing evaluation on the mobile terminal and are convenient, rapid and low-cost. Some intelligent-terminal audiometry applications implement pure-tone audiometry, but because pure-tone audiometry has strict environment and equipment requirements, the measured results generally deviate.
Disclosure of Invention
Aiming at the defects of the existing method, the invention provides a hearing evaluation method and equipment based on speech intelligibility index. The result of the speech audiometry can reflect the auditory perception ability of the user, and meanwhile, the speech audiometry does not need complex guidance, so that the operation of the user is facilitated. Furthermore, speech audiometry has relatively reduced environmental and equipment requirements compared to pure tone audiometry, and thus may be a more robust hearing assessment approach.
The technical scheme of the invention is as follows:
a hearing assessment method based on speech intelligibility index comprises the following steps:
1) the Speech Intelligibility Index (SII) is used to establish the functional relationship between the hearing threshold and Speech recognition performance, and the frame diagram of the Speech Intelligibility Index is shown in fig. 2.
2) Constructing an easily-confused double-syllable word pair corpus as a test corpus for speech audiometry according to the selected easily-confused vowel pair and the selected consonant pair, and measuring a Band weight function (BIF) of the test corpus by using a Quick Band weight measurement method (qBIF) to quantify the importance degree of each Band in each easily-confused double-syllable word pair;
3) using the confusable double-syllable words constructed in the step 2) to perform speech audiometry on listeners; the listeners comprise normal hearing persons and hearing loss persons;
4) selecting the sound intensity at which the confusable bisyllabic word pair is played, using a random method and an adaptive method: the first K confusable bisyllabic word pairs are tested with intensities chosen by the random method, and subsequent tests choose intensities with the adaptive method. In the random method, the playing intensity of the confusable bisyllabic word pair is randomly selected from the candidate sound intensity conditions each time. In the adaptive method, a gradient descent method selects the sound intensity condition that maximizes the likelihood of the confusable bisyllabic word pairs as the subject's hearing threshold; the speech recognition rate of each candidate sound intensity condition is then predicted with the SII model (i.e., equation 6), and the candidate whose predicted recognition accuracy is closest to a set threshold (e.g., 0.95) is selected as the sound intensity condition for playing the next confusable bisyllabic word pair;
5) selecting the confusable double-syllable word pair of the next round of speech audiometry from the test material, and carrying out speech audiometry on listeners;
6) repeating the steps 3) to 5) until the set termination condition is met; and then selecting the sound intensity condition which enables the confusing double syllable word pair to have the maximum likelihood value of the test result by using a gradient descent method as the final hearing threshold to be tested. The termination conditions are as follows: the number of test words of confusing bisyllabic word pairs reaches a limit value.
Further, the functional relationship is PC = 1/(1 + e^(−s·(SII − W0))), where SII is the speech intelligibility index, W0 is the SII value corresponding to the speech recognition threshold, s is the slope of the logistic function, and PC denotes the speech recognition accuracy.
Further, SII = Σ_f A_f*W_f, where A_f indicates the audibility of frequency band f and W_f represents the band weight function.
Further, A_f = min{max{(E_f − T_f + 15)/30, 0}, 1}, where T_f is the pure-tone threshold at the center frequency of band f and E_f is the played sound intensity of the band-f signal in audiometry.
Further, the weight of each selected confusable bisyllabic word pair in band f is measured with the quick band-importance measurement method to obtain the band weight function W_f.
Further, the method for carrying out speech audiometry on the testee by using the confusable double syllable word comprises the following steps: firstly, selecting K confusable double-syllable word pairs to carry out K rounds of speech audiometry, playing a selected confusable double-syllable word pair in each round, and selecting the sound intensity of the confusable double-syllable word pair to be played in each round by using a random method; then, for each confusable double-syllable word pair played subsequently, the sound intensity of each round of confusable double-syllable word pair to be played is selected by a self-adaptive method.
Furthermore, in the self-adaptive method, a gradient descent method is used for selecting the sound intensity condition which enables the likelihood value of the confusing double syllable word pair to be maximum as the hearing threshold of the testee, then the speech recognition rate of each candidate sound intensity condition is predicted according to the SII model, and the sound intensity condition with the recognition accuracy rate closest to the set threshold is selected as the sound intensity condition when the next confusing double syllable word pair is played.
Furthermore, selecting an easily confused double syllable word pair each time to carry out a round of speech audiometry; in each round of speech audiometry, the same confusable double syllable word pair is tested repeatedly, and one word or syllable in the confusable double syllable word pair is played randomly during each test.
Further, the method for selecting the sound intensity condition that maximizes the likelihood of the test results of the confusable bisyllabic word pairs as the subject's final hearing threshold is as follows: M rounds of speech audiometry are set, yielding M test results, where the intensity of band f in the m-th result is E_f^m, the band weight function of the played confusable bisyllabic word pair in band f is W_f^m, the SII value corresponding to the speech recognition threshold is W_0^m, the subject's recognition result is y_m, the hearing threshold estimated from the M test results is T^M, its value at band f is T_f^M, and PC_m denotes the speech recognition accuracy obtained from the m-th recognition result. The log-likelihood function of the SII model is: LL = Σ_m (y_m*log(PC_m) + (1−y_m)*log(1−PC_m)), m = 1 to M. The likelihood function is then maximized with the gradient descent method: the derivative ∂LL/∂T_f^M of the log-likelihood function is computed, the initial threshold T_f^(0) is randomized, the hearing threshold is updated with the formula T_f^(t+1) = T_f^(t) + η*∂LL/∂T_f^(t), and the threshold T_f^(t) obtained at step t is recorded. When the above process terminates after T iterations, the threshold T_f^(T) obtained at update step T is the final hearing threshold of band f estimated from the M rounds of speech audiometry.
A hearing assessment device based on speech intelligibility index, comprising a speech intelligibility index, a test material and a hearing assessment unit; the hearing evaluation unit comprises a sound intensity selection module and a parameter estimation module; wherein
The speech intelligibility index is used for establishing a functional relation between a hearing threshold and speech recognition performance and calculating the recognition accuracy of the confusable words of the test material under the given sound intensity;
the test material is an easily-confused double-syllable word pair corpus constructed according to the selected easily-confused vowel pair and the consonant pair;
the sound intensity selection module is used for selecting the sound intensity condition of the confusable double-syllable word pair during playing;
and the parameter estimation module is used for selecting the sound intensity condition which enables the confusing double syllables to have the maximum likelihood value of the test result according to the test result as the final hearing threshold of the testee.
Compared with the prior art, the invention has the following positive effects:
the invention uses speech audiometry for hearing evaluation, does not need complex guidance, is convenient for the operation of a user, has weak dependence on environment and audiometric equipment, and is convenient to implement on an intelligent terminal.
The invention can obtain more stable and reliable results in non-professional environment, and has larger correlation with the result of pure-tone audiometry, thereby being a feasible scheme for solving the hearing evaluation of the mobile terminal.
Drawings
FIG. 1 is a block diagram of a hearing assessment method based on speech intelligibility index;
FIG. 2 is a block diagram of a speech intelligibility index;
FIG. 3 is a graph of results under computer simulation conditions;
FIG. 4 is a graph of the results of a real hearing loss test;
fig. 5 is a graph of the results of a hearing loss test using the present invention and pure tone audiometry.
Detailed Description
Specific embodiments of the present invention will be described in more detail below. Fig. 1 is a block diagram of a hearing evaluation method based on speech intelligibility index according to the present invention. The specific implementation steps of the invention comprise the establishment of the relationship between the hearing threshold and the speech recognition, the establishment of speech audiometric materials based on the confusable double syllable word pair, the parameter estimation and the sound intensity condition selection. The specific implementation process of each step is as follows:
1. establishing relationship between hearing threshold and speech recognition
The present work relates the hearing threshold to Speech recognition performance by Speech Intelligibility Index (SII), whose block diagram is shown in fig. 2.
The formula for calculating the speech intelligibility index is as follows:
SII = Σ_f A_f*W_f (1)
where f denotes a frequency band and A_f is the Audibility of band f, a value between 0 and 1; a value of 0 indicates that no speech cue can be heard, and a value of 1 indicates that all available speech cues are received by the listener. The calculation formula is as follows, where SNR_f represents the signal-to-noise ratio within band f: A_f = min{max{(SNR_f + 15)/30, 0}, 1} (2)
W_f is the Band Importance Function (BIF), which represents the importance of band f for speech understanding. As shown in FIG. 2, there are n bands; the weights of the n bands are measured for each confusable bisyllabic word pair and are referred to as the band weight function W_f, which can be measured with the quick band-importance measurement method. After the SII value is calculated, the predicted value of the speech recognition accuracy PC is obtained through a transfer function: PC = 1/(1 + e^(−s·(SII − W0))) (3), where W0 derives from the constant term of the logistic regression; its physical meaning is the SII value corresponding to a recognition accuracy of 50%, i.e., the SII value corresponding to the speech recognition threshold.
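A minimal sketch of the SII computation and transfer function described above, assuming the standard clipped-linear audibility and a logistic transfer function; the slope parameter and the default value 10.0 are our assumptions (in practice W0 and the slope would be fitted per word pair):

```python
import math

def audibility(snr_db):
    """Clipped-linear audibility: 0 at or below -15 dB SNR, 1 at or above +15 dB."""
    return min(max((snr_db + 15.0) / 30.0, 0.0), 1.0)

def sii(snr_db_per_band, bif):
    """Band-importance-weighted sum of per-band audibilities (equation 1)."""
    return sum(audibility(snr) * w for snr, w in zip(snr_db_per_band, bif))

def recognition_accuracy(sii_value, w0, slope=10.0):
    """Logistic transfer function; w0 is the SII at 50% accuracy,
    slope is an assumed shape parameter."""
    return 1.0 / (1.0 + math.exp(-slope * (sii_value - w0)))
```

With 5 octave bands whose BIF weights sum to 1, SII stays in [0, 1] and the transfer function maps it to a predicted recognition accuracy.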
For speech intelligibility index calculation for hearing-impaired persons, the SII introduces the concept of Equivalent Interference. Whereas the SNR_f used when computing the audibility of a normal-hearing person is the difference between the signal and noise spectral intensities, the signal-to-noise ratio used when computing the audibility of a hearing-impaired person is the difference between the signal and the equivalent interference spectral intensity, which is calculated as follows:
D_f = max{T_f, N_f} (4)
where T_f is the pure tone threshold at the center frequency of band f in dB SPL, and N_f is the noise spectral intensity of band f. If prediction is performed under the noise-free condition, equation (4) becomes:
D_f = T_f (5)
At this time, the relationship between the hearing threshold and speech recognition is expressed as follows:
PC = 1/(1 + e^(−s·(SII − W0))), with SII = Σ_f A_f*W_f and A_f = min{max{(E_f − T_f + 15)/30, 0}, 1} (6)
If the band weight function W_f of each confusable bisyllabic word pair is measured with the quick band-importance measurement method, a recognition result PC is obtained through a speech audiometry experiment, and the played sound intensity E_f of the band-f signal of the test stimulus is known, then the hearing threshold T_f at band f can be solved from equation (6), and finally the subject's hearing threshold T is obtained.
2. Construction of speech audiometric material based on confusable double syllable word pair
The invention constructs a speech audiometry corpus from confusable vowels and confusable consonants. First, confusable consonant pairs are paired with the same vowel, and confusable vowel pairs are paired with the same consonant, forming confusable monosyllable pairs; each confusable monosyllable pair is then combined with a second monosyllable that is identical (same pronunciation) in both words, forming a confusable bisyllabic word pair. The tones of the two words are kept the same, and only one vowel or consonant at the same position differs within each confusable bisyllabic word pair.
3. Selection of sound intensity conditions
The invention needs to collect the subject's word-pair recognition results, and each trial presents a stimulus at a different sound intensity, so selecting the sound intensity condition is an important problem. A simple strategy randomly selects a condition between −10 and 90 dB SPL for each band. With 10 dB intervals over −10 to 90 dB SPL, each band has 11 selectable sound intensity conditions; this work uses 5 octave bands, so there are 11^5 possible sound intensity conditions in total. This makes the parameter space of sound intensity conditions large, and an accurate evaluation result is difficult to obtain with few trials and little experimental data. Based on clustering the thresholds of a large number of hearing loss patients, the invention adopts a "range selection" strategy that uses ranges around the thresholds of 4 mild-to-moderate hearing loss types (shown in Table 1) as the optional intensity conditions. Specifically: the lowest sound intensity of each band is set to −10, 20 or 50 dB SPL, and sound intensity conditions are randomly selected within a 40 dB range at 10 dB intervals; for steeply sloping hearing loss, the sound intensity conditions are randomly selected within a range around the threshold, as shown in Table 2. This strategy has 9530 possible sound intensity conditions, which greatly reduces the parameter space.
TABLE 1 Hearing threshold Table for representative types of hearing loss
TABLE 2 Sound intensity Range settings
The "range selection" strategy limits the candidate sound intensity conditions, but if the selection is made randomly from the candidate sound intensity conditions each time a stimulus is played, it is difficult for the parameters to converge in a short time. The invention provides a self-adaptive method to accelerate convergence speed. The method estimates a hearing threshold based on existing data, calculates the predicted speech recognition accuracy of each candidate sound intensity condition through an SII model according to the hearing threshold, and selects the sound intensity condition with the recognition accuracy closest to 0.95. After the stimulus is played, new data is collected and the hearing threshold is updated. This process was repeated until all experiments were completed. According to the SII model, if the audibility of each band is 0.5, i.e., the sound intensity is close to the hearing threshold of the subject, the predicted average speech recognition accuracy is 0.95, and therefore a value of 0.95 is used in the adaptive method.
4. Parameter estimation
The invention estimates the listener's hearing threshold from the test results and uses this value for the next iteration. Suppose M pieces of data are collected in the experiment, where the intensity of band f in the m-th piece is E_f^m, the band weight function of the played confusable bisyllabic word pair in band f is W_f^m, the SII value corresponding to the speech recognition threshold is W_0^m, and the subject's recognition result is y_m. Suppose the hearing threshold estimated from the M pieces of data is T^M, whose value at band f is T_f^M. The log-likelihood function of the SII model is:
LL = Σ_m (y_m*log(PC_m) + (1−y_m)*log(1−PC_m)) (7)
where PC_m denotes the speech recognition accuracy obtained from the m-th recognition result (since each term is conditioned on a single piece of data, PC_m takes a value between 0 and 1). The log-likelihood function is maximized with the gradient descent method; the derivative of the log-likelihood function with respect to T_f^M is:
∂LL/∂T_f^M = Σ_m ((y_m − PC_m)/(PC_m·(1 − PC_m))) · ∂PC_m/∂T_f^M (8)
and the threshold is updated from a randomized initial value T_f^(0) by:
T_f^(t+1) = T_f^(t) + η·∂LL/∂T_f^(t) (9)
where η is the learning rate.
The iteration of equations (7) to (9) stops once the sound intensity condition maximizing the likelihood of the test results of the confusable word pairs is found. Assuming the above process terminates after T iterations, T_f^(T), the threshold obtained at step T, is the final hearing threshold of band f estimated from the M rounds of speech audiometry.
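The estimation loop above (the log-likelihood of equation (7) maximized by gradient updates on the per-band thresholds) can be sketched as follows. This is a sketch under stated assumptions: it reuses the clipped-linear audibility and logistic transfer function with an assumed slope, takes an explicit initial threshold instead of a random one for reproducibility, and uses a central-difference numerical gradient where the patent differentiates analytically:

```python
import math

def band_audibility(e_db, t_db):
    # clipped-linear audibility, with SNR_f taken as E_f - T_f (noise-free case)
    return min(max((e_db - t_db + 15.0) / 30.0, 0.0), 1.0)

def predict_pc(intensities, thresholds, bif, w0, slope=10.0):
    # SII from the current threshold estimate, then logistic transfer function
    s = sum(band_audibility(e, t) * w
            for e, t, w in zip(intensities, thresholds, bif))
    return 1.0 / (1.0 + math.exp(-slope * (s - w0)))

def log_likelihood(data, thresholds):
    # equation (7); data items are (intensities, bif, w0, y)
    ll = 0.0
    for e, bif, w0, y in data:
        pc = min(max(predict_pc(e, thresholds, bif, w0), 1e-6), 1.0 - 1e-6)
        ll += y * math.log(pc) + (1 - y) * math.log(1.0 - pc)
    return ll

def estimate_thresholds(data, init, steps=500, lr=0.5, eps=0.1):
    # ascend the log-likelihood, one numerical partial derivative per band
    t = list(init)
    for _ in range(steps):
        for f in range(len(t)):
            up, down = list(t), list(t)
            up[f] += eps
            down[f] -= eps
            grad = (log_likelihood(data, up) - log_likelihood(data, down)) / (2 * eps)
            t[f] += lr * grad
    return t
```

On synthetic trials generated from a known threshold, the returned estimate should fit the responses much better than the starting point.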
The advantages of the invention are illustrated below with reference to specific embodiments.
The method is evaluated under computer simulation conditions and on real speech audiometry results from hearing-loss subjects. The results of the method are compared with those of pure tone audiometry.
1. Hearing assessment experiment setup
To verify the accuracy and stability of the proposed hearing evaluation method under ideal conditions, a computer simulation experiment was conducted. The computer simulation experiment replaces the subject of a real experiment with a computational procedure. The recognition accuracy of each stimulus is predicted with the SII model, whose parameters are the BIF of the test material and the simulated subject's hearing threshold. The recognition accuracy predicted by the SII is taken as the probability of producing a correct answer, and a recognition result is generated at random. From the experimental data predicted by the SII model, the hearing evaluation method based on the speech intelligibility index proposed in this work evaluates the simulated subject's hearing and compares it with the simulated subject's hearing threshold, i.e., the threshold set in the SII model, to analyze the accuracy of the method. The simulation experiment for each condition can be run multiple times, with the average taken as the final result of the simulated speech audiometry.
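The simulated-subject step described above can be sketched as a single Bernoulli draw: the SII model's predicted accuracy becomes the probability of a correct answer (`predict_pc` stands for the model prediction under the simulated threshold; the names are illustrative):

```python
import random

def simulated_response(intensities, simulated_threshold, bif, w0, predict_pc,
                       rng=random):
    """Stand-in for a human subject: draw a correct (1) / incorrect (0)
    answer with the probability the SII model predicts for this stimulus."""
    pc = predict_pc(intensities, simulated_threshold, bif, w0)
    return 1 if rng.random() < pc else 0
```

Running many such draws per condition and averaging reproduces the "run multiple times and average" protocol of the simulation experiment.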
14 hearing impaired subjects participated in the hearing evaluation process, and all subjects completed speech audiometry and pure tone audiometry of both ears in a sound isolation booth environment. Under the computer simulation condition, the pure tone audiometry results of 14 hearing impairment test subjects (28 test ears) are used as the hearing threshold of the simulation test subjects, and the hearing of the simulation test subjects is evaluated by using the method provided by the invention.
2. Results of the Hearing assessment experiment
To evaluate the effectiveness of the invention, the method is used for hearing evaluation under computer simulation conditions and on real hearing-impaired subjects, respectively. Figures 3 and 4 show the average results under computer simulation conditions and for the real hearing-impaired subjects.
Under computer simulation conditions, the hearing threshold estimated from the speech audiometry results is highly correlated with the pure-tone audiometry results (r = 0.87, p < 0.01), with a mean square error of 13.65. For the real hearing-loss subjects, the method also correlates significantly with pure-tone audiometry (r = 0.54, p < 0.01), with a mean square error of 14.77. Fig. 5 is a graph of the results of a hearing impairment test using the present invention and pure tone audiometry under sound isolation booth conditions.
The experimental results show that the method yields results close to those of pure-tone audiometry. At the same time, the speech audiometry requires no complex instruction, which makes it convenient for the user. In addition, its demands on environment and equipment are lower than those of pure-tone audiometry, so speech audiometry is a feasible scheme for hearing evaluation on mobile terminals.
Although specific embodiments of the invention and the accompanying drawings have been disclosed for illustrative purposes and to provide a further understanding of the invention, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosure of the preferred embodiments and the accompanying drawings.
Claims (9)
1. A hearing assessment method based on speech intelligibility index comprises the following steps:
1) establishing a functional relationship between the hearing threshold and speech recognition performance by using the speech intelligibility index;
2) constructing a confusable double-syllable word-pair corpus as the test corpus for speech audiometry from the selected confusable vowel pairs and consonant pairs, and measuring the band weight function of the test corpus with a fast band-weight measurement method;
3) using the confusable double-syllable word pairs constructed in step 2) to perform speech audiometry on the subject, and then taking the hearing threshold that maximizes the likelihood of the test results as the subject's final hearing threshold; the final hearing threshold is obtained as follows: M rounds of speech audiometry yield M test results; for the m-th test result, the sound intensity in frequency band f is I_f^m, the band weight of the played confusable double-syllable word pair in band f is W_f, the SII value corresponding to the speech recognition threshold is SII_0, the recognition result of the subject is y_m, the hearing threshold estimated from the M test results is T_M, its value at frequency band f is T_f, and PC_m denotes the speech recognition accuracy obtained from the m-th recognition result; the log-likelihood function of the speech intelligibility index SII model is LL = Σ_m [y_m·log(PC_m) + (1 − y_m)·log(1 − PC_m)], m = 1 to M; the likelihood function is then maximized using a gradient descent method: the derivative ∂LL/∂T_f of the log-likelihood with respect to T_f is computed, the initial threshold T_f^0 is randomized, the hearing threshold is updated by T_f^(t+1) = T_f^t + η·∂LL/∂T_f, and the threshold obtained at step t is recorded as T_f^t; when this process terminates after T iteration steps, the threshold T_f^T obtained after T updates is the final hearing threshold of band f estimated from the M rounds of speech audiometry.
3. The method of claim 2, wherein the speech intelligibility index SII = Σ_f A_f·W_f, wherein A_f denotes the audibility of frequency band f and W_f denotes the band weight function.
5. A method as claimed in claim 2 or 3, characterized in that the weight of each selected confusable syllable pair in frequency band f is measured by the fast band-weight measurement method to obtain the band weight function W_f.
6. The method of claim 1, wherein the method of performing speech audiometry on the subject using the confusable double-syllable word pairs constructed in step 2) comprises: first, selecting K confusable double-syllable word pairs for K rounds of speech audiometry, playing one selected pair per round, with the sound intensity of each round's pair chosen at random; then, for each subsequently played confusable double-syllable word pair, choosing the sound intensity of each round's pair by an adaptive method.
7. The method as claimed in claim 6, wherein in the adaptive method a gradient descent method is used to take the hearing threshold of the subject as the threshold that maximizes the likelihood of the confusable double-syllable word-pair results; the speech recognition rate of each candidate sound-intensity condition is predicted from the speech intelligibility index SII model, and the condition whose predicted recognition accuracy is closest to the set threshold is selected as the sound-intensity condition for playing the next confusable double-syllable word pair.
8. The method of claim 1 or 6, wherein one round of speech audiometry is performed for each selected confusable double-syllable word pair; within each round, the same confusable double-syllable word pair is tested repeatedly, and on each trial one word or syllable of the pair is played at random.
9. A hearing assessment device based on the speech intelligibility index, comprising a speech intelligibility index model, test material and a hearing evaluation unit; the hearing evaluation unit comprises a sound-intensity selection module and a parameter estimation module; wherein
The speech intelligibility index is used for establishing a functional relation between a hearing threshold and speech recognition performance and calculating the recognition accuracy of the confusable words of the test material under the given sound intensity;
the test material is a confusable double-syllable word-pair corpus constructed from the selected confusable vowel pairs and consonant pairs;
the sound intensity selection module is used for selecting the sound intensity condition of the confusable double-syllable word pair during playing;
the parameter estimation module is used for selecting the sound intensity condition which enables the confusable double syllables to have the maximum likelihood value of the test result according to the test result as the final hearing threshold of the testee; the method for obtaining the final hearing threshold comprises the following steps: setting M rounds of speech audiometry to obtain M test results, wherein the strength of the frequency band f of the mth test result isThe band weight function of playing confusing syllable doublet pairs in band f isSpeech intelligibility index SII value corresponding to speech recognition threshold isThe result of the identification of the subject is ymThe hearing threshold estimated by the M test results is TMThe hearing threshold at frequency band f isPCmRepresenting the speech recognition accuracy obtained according to the m-th recognition result; the log-likelihood function of the speech intelligibility index SII model is: LL ═ Σm(ym*log(PCm)+(1-ym)*log(1-PCm) M is 1 to M; then, the likelihood function is maximized by using a gradient descent method to obtain a log likelihood function pairDerivative of (2) Then randomizing the initial thresholdUsing formulas Updating the hearing threshold, and recording the t-th step to obtain the hearing thresholdWhen the above-mentioned process is terminated after iteration T steps, then the obtained hearing threshold is updated in T stepI.e. the final hearing threshold of the frequency band f estimated by M rounds of speech audiometry.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011077820.2A CN112205981B (en) | 2020-10-10 | 2020-10-10 | Hearing assessment method and device based on speech intelligibility index |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112205981A CN112205981A (en) | 2021-01-12 |
CN112205981B true CN112205981B (en) | 2021-09-28 |
Family
ID=74053023
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011077820.2A Active CN112205981B (en) | 2020-10-10 | 2020-10-10 | Hearing assessment method and device based on speech intelligibility index |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112205981B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113286242A (en) * | 2021-04-29 | 2021-08-20 | 佛山博智医疗科技有限公司 | Device for decomposing speech signal to modify syllable and improving definition of speech signal |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH07124137A (en) * | 1993-09-03 | 1995-05-16 | Technol Res Assoc Of Medical & Welfare Apparatus | Auditory function inspecting device |
CN202179545U (en) * | 2011-07-22 | 2012-04-04 | 华南理工大学 | Auditory evoked potential audiometry apparatus based on oversampling multiple-frequency multiple-amplitude joint estimation |
CN109327785A (en) * | 2018-10-09 | 2019-02-12 | 北京大学 | A kind of hearing aid gain adaptation method and apparatus based on speech audiometry |
EP3718476A1 (en) * | 2019-04-02 | 2020-10-07 | Mimi Hearing Technologies GmbH | Systems and methods for evaluating hearing health |
Non-Patent Citations (2)
Title |
---|
Analysis of the audiological characteristics of middle-aged and elderly people of different age groups in an area of Beijing; Liu Chenqing et al.; Chinese Journal of Otology; 2018-06-15 (No. 03); full text *
Research on objective evaluation methods of speech intelligibility; Xu Yuzhuo; China Master's Theses Full-text Database, Information Science and Technology; 2015-05-01 (No. 9); full text *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||