CN112205981B - Hearing assessment method and device based on speech intelligibility index - Google Patents


Info

Publication number
CN112205981B
CN112205981B (application CN202011077820.2A)
Authority
CN
China
Prior art keywords
speech
hearing
threshold
double
confusable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011077820.2A
Other languages
Chinese (zh)
Other versions
CN112205981A (en)
Inventor
陈婧
吴玺宏
杜逾凡
牛亚东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University filed Critical Peking University
Priority to CN202011077820.2A priority Critical patent/CN112205981B/en
Publication of CN112205981A publication Critical patent/CN112205981A/en
Application granted granted Critical
Publication of CN112205981B publication Critical patent/CN112205981B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/12Audiometering
    • A61B5/121Audiometering evaluating hearing capacity
    • A61B5/123Audiometering evaluating hearing capacity subjective methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78Detection of presence or absence of voice signals
    • G10L2025/783Detection of presence or absence of voice signals based on threshold decision

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Pathology (AREA)
  • Biophysics (AREA)
  • Otolaryngology (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a hearing evaluation method and device based on the speech intelligibility index, comprising the following steps: 1) establishing a functional relationship between the hearing threshold and speech recognition performance using the speech intelligibility index; 2) constructing a confusable double-syllable word pair corpus from the selected confusable vowel pairs and consonant pairs as the test material for speech audiometry, and measuring the band importance function (BIF) of the test material with a quick band-weight measurement method; 3) performing speech audiometry on the subject with the confusable double-syllable word pairs constructed in step 2); then selecting the sound intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs as the subject's final hearing threshold. The invention obtains stable and reliable results in non-professional environments, and its results correlate strongly with those of pure-tone audiometry, making it a feasible scheme for hearing evaluation on mobile terminals.

Description

Hearing assessment method and device based on speech intelligibility index
Technical Field
The invention belongs to the technical field of hearing aids, relates to a hearing evaluation method, and particularly relates to a hearing evaluation method and equipment based on a speech intelligibility index.
Background
Hearing loss is one of the important public health problems. The World Health Organization's assessment of hearing disability in its member states shows that more than 460 million people have disabling hearing loss, and according to the second national sampling survey of disabled persons in 2006, 27.8 million people in China have hearing loss, accounting for 34% of the total number of disabled persons. Hearing loss can cause communication impairment, thereby affecting interpersonal relationships and working capacity and leading to social isolation and a reduced quality of life. Studies have shown that elderly people with hearing loss are at higher risk of falls, dementia, depression and death compared with individuals without hearing loss. Children with hearing loss may develop delayed speech, with a significant adverse effect on academic performance. From a socio-economic perspective, the World Health Organization estimates that hearing loss results in a worldwide loss of US$750 billion per year, including economic losses from health-sector costs, educational support costs, and productivity losses. Hearing loss affects physical and mental health and carries a high socio-economic cost; early discovery and intervention are needed, and hearing assessment is therefore crucial.
The standard for hearing-loss diagnosis is the hearing threshold, obtained by playing fixed-frequency pure-tone signals under specified environment and equipment conditions and measuring the minimum sound pressure level at which a predetermined percentage of the subject's responses are correct. Hearing loss raises the threshold, evidenced by an inability to detect relatively faint sounds, which is its most obvious characteristic. The hearing threshold reflects the hearing loss at each tested frequency. A normal-hearing person has a binaural hearing threshold of 25 dB HL or less; a threshold above this value indicates hearing loss. A hearing threshold of 26 to 40 dB HL is mild hearing loss, 41 to 55 dB HL moderate, 56 to 70 dB HL moderately severe, 71 to 90 dB HL severe, and above 91 dB HL profound. Disabling hearing loss means that the threshold of the better ear of an adult is more than 40 dB above the average threshold of a normal-hearing person, or more than 30 dB for the better ear of a child. For hearing loss, the hearing aid is a commonly used compensation device that can improve speech communication ability; the hearing threshold is usually fed into a fitting formula to derive the hearing aid's gain. The hearing threshold can therefore not only reflect a person's hearing status but also be applied to the hearing compensation of the hearing-impaired.
Measuring the hearing threshold places strict requirements on the test environment and audiometric equipment; it usually requires going to a specialized institution and a professional to act as the tester. Also commonly used for adult hearing assessment are the Whispered Voice Test, the Finger Rub Test, questionnaires, and portable audiometers. Previous studies show that although these methods are more convenient than pure-tone audiometry, their accuracy needs verification, professional personnel are still needed, quantitative damage at each frequency cannot be given, and portable audiometers are costly. With the popularization of intelligent devices, some audiometry methods based on mobile intelligent terminals have appeared. These methods can complete hearing evaluation on the mobile terminal and are convenient, fast and low-cost. Some intelligent-terminal audiometry applications implement pure-tone audiometry, but because pure-tone audiometry has strict environment and equipment requirements, the measured results generally deviate.
Disclosure of Invention
Aiming at the defects of the existing method, the invention provides a hearing evaluation method and equipment based on speech intelligibility index. The result of the speech audiometry can reflect the auditory perception ability of the user, and meanwhile, the speech audiometry does not need complex guidance, so that the operation of the user is facilitated. Furthermore, speech audiometry has relatively reduced environmental and equipment requirements compared to pure tone audiometry, and thus may be a more robust hearing assessment approach.
The technical scheme of the invention is as follows:
a hearing assessment method based on speech intelligibility index comprises the following steps:
1) The Speech Intelligibility Index (SII) is used to establish the functional relationship between the hearing threshold and speech recognition performance; the block diagram of the speech intelligibility index is shown in Fig. 2.
2) Constructing a confusable double-syllable word pair corpus as the test material for speech audiometry from the selected confusable vowel pairs and consonant pairs, and measuring the Band Importance Function (BIF) of the test material with the quick band-weight measurement method (qBIF) to quantify the importance of each band in each confusable double-syllable word pair;
3) using the confusable double-syllable words constructed in the step 2) to perform speech audiometry on listeners; the listeners comprise normal hearing persons and hearing loss persons;
4) selecting the sound intensity at which the confusable double-syllable word pair is to be played, using a random method and an adaptive method; the tests of the first K confusable double-syllable word pairs select the sound intensity with the random method, and subsequent tests select it with the adaptive method. In the random method, the sound intensity of the confusable double-syllable word pair to be played is randomly selected from the candidate sound intensity conditions each time. In the adaptive method, a gradient method is used to select the sound intensity condition that maximizes the likelihood of the confusable double-syllable word pairs as the subject's hearing threshold; then the speech recognition rate of each candidate sound intensity condition is predicted with the SII model (namely Equation (6)), and the sound intensity condition whose recognition accuracy is closest to a set threshold (such as 0.95) is selected as the sound intensity condition for playing the next confusable double-syllable word pair;
5) selecting the confusable double-syllable word pair of the next round of speech audiometry from the test material, and carrying out speech audiometry on listeners;
6) repeating steps 3) to 5) until the set termination condition is met; then selecting, with a gradient method, the sound intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs as the subject's final hearing threshold. The termination condition is that the number of tested confusable double-syllable word pairs reaches a limit value.
Further, the functional relationship is:
PC = 1 / (1 + e^(−k·(SII − W0)))
where SII is the speech intelligibility index, k is the slope of the logistic regression, W0 is the SII value corresponding to the speech recognition threshold, and PC denotes the speech recognition accuracy.
Further, SII = Σ_f A_f · W_f, where A_f denotes the audibility of frequency band f and W_f the band weight function.
Further,
A_f = 0 if SNR_f < −15; A_f = (SNR_f + 15)/30 if −15 ≤ SNR_f ≤ 15; A_f = 1 if SNR_f > 15
where SNR_f = E_f − T_f, T_f is the pure-tone threshold at the center frequency of frequency band f, and E_f is the played sound intensity of the band-f signal in the audiometry.
Further, the weight of each selected confusable syllable pair in frequency band f is measured with the rapid band-weight measurement method to obtain the band weight function W_f.
Further, the method for carrying out speech audiometry on the testee by using the confusable double syllable word comprises the following steps: firstly, selecting K confusable double-syllable word pairs to carry out K rounds of speech audiometry, playing a selected confusable double-syllable word pair in each round, and selecting the sound intensity of the confusable double-syllable word pair to be played in each round by using a random method; then, for each confusable double-syllable word pair played subsequently, the sound intensity of each round of confusable double-syllable word pair to be played is selected by a self-adaptive method.
Furthermore, in the self-adaptive method, a gradient descent method is used for selecting the sound intensity condition which enables the likelihood value of the confusing double syllable word pair to be maximum as the hearing threshold of the testee, then the speech recognition rate of each candidate sound intensity condition is predicted according to the SII model, and the sound intensity condition with the recognition accuracy rate closest to the set threshold is selected as the sound intensity condition when the next confusing double syllable word pair is played.
Furthermore, selecting an easily confused double syllable word pair each time to carry out a round of speech audiometry; in each round of speech audiometry, the same confusable double syllable word pair is tested repeatedly, and one word or syllable in the confusable double syllable word pair is played randomly during each test.
Further, the method for selecting the sound intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs as the subject's final hearing threshold is as follows: M rounds of speech audiometry are set, yielding M test results; the intensity of frequency band f in the m-th test result is E_f^m, the band weight function of the played confusable double-syllable word pair at band f is W_f^m, and the SII value corresponding to the speech recognition threshold is W_0^m. The subject's recognition result is y_m, the hearing threshold estimated from the M test results is T^M, and the hearing threshold at band f is T_f^M. PC_m represents the speech recognition accuracy obtained from the m-th recognition result. The log-likelihood function of the SII model is LL = Σ_m (y_m·log(PC_m) + (1 − y_m)·log(1 − PC_m)), m = 1 to M. The likelihood function is then maximized with a gradient method: the derivative ∂LL/∂T_f^M of the log-likelihood with respect to T_f^M is computed, the initial threshold T_f^(0) is randomized, and the hearing threshold is updated with T_f^(t+1) = T_f^(t) + η·∂LL/∂T_f^(t), T_f^(t) denoting the hearing threshold obtained at step t. When the above process terminates after T iteration steps, the threshold T_f^(T) obtained at update step T is the final hearing threshold of band f estimated from the M rounds of speech audiometry.
A hearing assessment device based on speech intelligibility index, comprising a speech intelligibility index, a test material and a hearing assessment unit; the hearing evaluation unit comprises a sound intensity selection module and a parameter estimation module; wherein
The speech intelligibility index is used for establishing a functional relation between a hearing threshold and speech recognition performance and calculating the recognition accuracy of the confusable words of the test material under the given sound intensity;
the test material is an easily-confused double-syllable word pair corpus constructed according to the selected easily-confused vowel pair and the consonant pair;
the sound intensity selection module is used for selecting the sound intensity condition of the confusable double-syllable word pair during playing;
and the parameter estimation module is used for selecting, according to the test results, the sound intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs as the subject's final hearing threshold.
Compared with the prior art, the invention has the following positive effects:
the invention uses speech audiometry for hearing evaluation, does not need complex guidance, is convenient for the operation of a user, has weak dependence on environment and audiometric equipment, and is convenient to implement on an intelligent terminal.
The invention can obtain more stable and reliable results in non-professional environment, and has larger correlation with the result of pure-tone audiometry, thereby being a feasible scheme for solving the hearing evaluation of the mobile terminal.
Drawings
FIG. 1 is a block diagram of a hearing assessment method based on speech intelligibility index;
FIG. 2 is a block diagram of a speech intelligibility index;
FIG. 3 is a graph of results under computer simulation conditions;
FIG. 4 is a graph of the results of a real hearing loss test;
fig. 5 is a graph of the results of a hearing loss test using the present invention and pure tone audiometry.
Detailed Description
Specific embodiments of the present invention will be described in more detail below. Fig. 1 is a block diagram of a hearing evaluation method based on speech intelligibility index according to the present invention. The specific implementation steps of the invention comprise the establishment of the relationship between the hearing threshold and the speech recognition, the establishment of speech audiometric materials based on the confusable double syllable word pair, the parameter estimation and the sound intensity condition selection. The specific implementation process of each step is as follows:
1. establishing relationship between hearing threshold and speech recognition
The present work relates the hearing threshold to Speech recognition performance by Speech Intelligibility Index (SII), whose block diagram is shown in fig. 2.
The formula for calculating the speech intelligibility index is as follows:
SII = Σ_f A_f · W_f (1)
where f denotes a frequency band and A_f is the audibility of band f, which lies between 0 and 1: 0 means no speech cue can be heard, and 1 means all available speech cues are received by the listener. It is calculated as follows, where SNR_f denotes the signal-to-noise ratio within band f:
A_f = 0 if SNR_f < −15; A_f = (SNR_f + 15)/30 if −15 ≤ SNR_f ≤ 15; A_f = 1 if SNR_f > 15 (2)
W_f is the Band Importance Function (BIF), which represents the importance of band f for speech understanding. As shown in Fig. 2, there are n bands, and the weights of the n bands are measured for each confusable double-syllable word pair; they are referred to as the band weight function W_f, which can be measured with the quick band-weight measurement method. After the SII value is calculated, the predicted speech recognition accuracy PC is obtained through a transfer function:
PC = 1 / (1 + e^(−k·(SII − W0))) (3)
where k is the slope of the logistic regression and W0 is its constant term, whose physical meaning is the SII value corresponding to 50% recognition accuracy, i.e., the SII value at the speech recognition threshold.
For the speech intelligibility index of hearing-impaired persons, the SII introduces the concept of equivalent interference. The SNR_f used when calculating the audibility of a normal-hearing person is the difference between the signal and noise spectral intensities; the signal-to-noise ratio used when calculating the audibility of a hearing-impaired person is the difference between the signal and the equivalent-interference spectral intensity, which is calculated as follows:
D_f = max{T_f, N_f} (4)
where T_f is the pure-tone threshold at the center frequency of band f in dB SPL and N_f is the noise spectral intensity of band f. If the prediction is made under the noise-free condition, Equation (4) becomes:
D_f = T_f (5)
In this case the relationship between the hearing threshold and speech recognition is:
PC = 1 / (1 + e^(−k·(Σ_f A_f(E_f − T_f) · W_f − W0))) (6)
where A_f(·) is the audibility of Equation (2) computed with SNR_f = E_f − T_f.
If the band weight function W_f of each confusable syllable pair is measured with the rapid band-weight measurement method and the recognition result PC is obtained through a speech audiometry experiment, then, the played sound intensity E_f of the band-f signal in the audiometry being known, the hearing threshold T_f at band f can be solved from Equation (6), finally yielding the subject's hearing threshold T.
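The forward prediction of Equations (1)–(3) and (6) can be sketched in a few lines of Python. This is an illustrative sketch only, assuming the noise-free case (SNR_f = E_f − T_f); the logistic parameters W0 = 0.25 and k = 10 are placeholder values, since the actual transfer-function parameters are fitted per confusable word pair:

```python
import math

def audibility(E_f, T_f):
    """Band audibility A_f of Eq. (2), with SNR_f = E_f - T_f (noise-free case)."""
    snr = E_f - T_f
    return min(1.0, max(0.0, (snr + 15.0) / 30.0))

def predict_pc(E, T, W, W0=0.25, k=10.0):
    """Eq. (6): SII = sum_f A_f * W_f, mapped through the logistic transfer function."""
    sii = sum(audibility(e, t) * w for e, t, w in zip(E, T, W))
    return 1.0 / (1.0 + math.exp(-k * (sii - W0)))
```

Given per-band played intensities E, hearing thresholds T and band weights W (summing to 1), `predict_pc` returns the predicted recognition accuracy for the stimulus.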
2. Construction of speech audiometric material based on confusable double syllable word pair
The invention constructs the speech audiometry corpus from confusable vowels and confusable consonants: first, confusable consonant pairs are combined with the same vowel, and confusable vowel pairs with the same consonant, to form confusable monosyllabic word pairs; then each confusable monosyllabic word pair is combined with another monosyllable of identical pronunciation to form a confusable double-syllable word pair, ensuring that the two words have the same stress and that only one vowel or consonant, at the same position, differs within the pair.
3. Selection of sound intensity conditions
The invention needs to collect the subject's word-pair recognition results; each trial presents a stimulus at a different sound intensity, so the selection of the sound intensity condition is an important problem. A simple strategy is to randomly select a sound intensity condition between −10 and 90 dB SPL for each band. If the random selection is made from −10 to 90 dB SPL at 10 dB intervals, each band has 11 selectable sound intensity conditions; with the 5 octave bands used in this work, there are 11^5 possible sound intensity conditions in total. This makes the parameter space of sound intensity conditions large, and an accurate evaluation result is difficult to obtain with few trials and little experimental data. Based on clustering results of the hearing thresholds of a large number of hearing-loss patients, the invention adopts a "range selection" strategy, which uses ranges around the thresholds of 4 mild-to-moderate hearing-loss types (shown in Table 1) as the optional sound intensity conditions. Specifically: the lowest sound intensity of each band is set at −10, 20 or 50 dB SPL, and the sound intensity condition is randomly selected, at 10 dB intervals, within a 40 dB range; for steep hearing loss, the sound intensity conditions are randomly selected within ranges around the threshold, as shown in Table 2. This strategy has 9530 possible sound intensity conditions, greatly reducing the parameter space.
TABLE 1 Hearing threshold Table for representative types of hearing loss
(The per-band hearing thresholds of the representative hearing-loss types are given as an image in the original and are not reproduced here.)
TABLE 2 Sound intensity Range settings
(The sound intensity range settings are given as an image in the original and are not reproduced here.)
The "range selection" strategy limits the candidate sound intensity conditions, but if the selection is made randomly from the candidate sound intensity conditions each time a stimulus is played, it is difficult for the parameters to converge in a short time. The invention provides a self-adaptive method to accelerate convergence speed. The method estimates a hearing threshold based on existing data, calculates the predicted speech recognition accuracy of each candidate sound intensity condition through an SII model according to the hearing threshold, and selects the sound intensity condition with the recognition accuracy closest to 0.95. After the stimulus is played, new data is collected and the hearing threshold is updated. This process was repeated until all experiments were completed. According to the SII model, if the audibility of each band is 0.5, i.e., the sound intensity is close to the hearing threshold of the subject, the predicted average speech recognition accuracy is 0.95, and therefore a value of 0.95 is used in the adaptive method.
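For a fixed threshold estimate, the adaptive step described above reduces to a nearest-to-target search over the candidate sound intensity conditions. A minimal sketch, reusing the noise-free SII prediction with the same illustrative logistic parameters W0 = 0.25 and k = 10 (placeholders, not the fitted values):

```python
import math

def predict_pc(E, T, W, W0=0.25, k=10.0):
    """SII-predicted recognition accuracy under the noise-free assumption."""
    sii = sum(min(1.0, max(0.0, (e - t + 15.0) / 30.0)) * w
              for e, t, w in zip(E, T, W))
    return 1.0 / (1.0 + math.exp(-k * (sii - W0)))

def select_intensity(candidates, T_hat, W, target=0.95):
    """Pick the candidate intensity condition whose predicted accuracy is
    closest to the target (0.95 in the adaptive method)."""
    return min(candidates, key=lambda E: abs(predict_pc(E, T_hat, W) - target))
```

With a single band of weight 1 and a threshold estimate of 40 dB SPL, the candidate equal to the threshold gives audibility 0.5 and a prediction near the 0.95 target, so it is selected over louder or softer candidates.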
4. Parameter estimation
The invention estimates the listener's hearing threshold from the test results and uses this value in the next iteration. Suppose M pieces of data have been collected by experiment, where the intensity of band f in the m-th piece of data is denoted E_f^m, the band weight function of the played confusable double-syllable word pair at band f is W_f^m, the SII value corresponding to the speech recognition threshold is W_0^m, and the subject's recognition result is y_m. Suppose the hearing threshold estimated from the M pieces of data is T^M, with hearing threshold T_f^M at band f. The log-likelihood function of the SII model is:
LL = Σ_m (y_m·log(PC_m) + (1 − y_m)·log(1 − PC_m)) (7)
where PC_m represents the speech recognition accuracy obtained from the m-th recognition result (since it is restricted to a single piece of data, PC_m here takes a value between 0 and 1). The log-likelihood function is maximized with a gradient method; with the transfer function of Equation (3), its derivative with respect to T_f^M is:
∂LL/∂T_f^M = −(k/30) · Σ_m (y_m − PC_m) · W_f^m (8)
the sum running over the trials m for which −15 ≤ E_f^m − T_f^M ≤ 15 (outside this linear region of Equation (2) the audibility does not depend on T_f^M and the contribution is zero). The initial threshold T_f^(0) is randomized, and the hearing threshold is updated at step t by:
T_f^(t+1) = T_f^(t) + η · ∂LL/∂T_f^(t) (9)
where η is the step size. Equations (7) to (9) stop iterating when a condition maximizing the likelihood of the test results for the confusable word pairs is found. Assuming the above process terminates after T iteration steps, the threshold T_f^(T) is the final hearing threshold of band f estimated from the M rounds of speech audiometry.
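The gradient-based maximum-likelihood estimation of Equations (7)–(9) can be sketched as follows. The logistic parameters W0 = 0.25 and k = 10, the step size η = 0.5 and the fixed initial threshold of 40 dB SPL are illustrative assumptions; the patent randomizes the initial threshold and fits the transfer-function parameters per word pair:

```python
import math

def estimate_thresholds(trials, W, W0=0.25, k=10.0, eta=0.5, steps=300):
    """Gradient-ascent maximization of Eq. (7),
    LL = sum_m [y_m*log(PC_m) + (1-y_m)*log(1-PC_m)],
    over the per-band hearing thresholds T_f.

    trials: list of (E, y) pairs; E is a tuple of per-band intensities (dB SPL),
            y = 1 if the subject answered correctly, else 0.
    W:      band weight function (one weight per band, summing to ~1).
    """
    n = len(W)
    T = [40.0] * n  # fixed initial guess for reproducibility (patent randomizes it)
    for _ in range(steps):
        grad = [0.0] * n
        for E, y in trials:
            snr = [E[f] - T[f] for f in range(n)]
            sii = sum(min(1.0, max(0.0, (s + 15.0) / 30.0)) * W[f]
                      for f, s in enumerate(snr))
            pc = 1.0 / (1.0 + math.exp(-k * (sii - W0)))
            for f in range(n):
                # audibility depends on T_f only in the linear region of Eq. (2)
                dA = -1.0 / 30.0 if -15.0 < snr[f] < 15.0 else 0.0
                grad[f] += (y - pc) * k * W[f] * dA  # Eq. (8), per trial
        T = [T[f] + eta * grad[f] for f in range(n)]  # Eq. (9)
    return T
```

Consistently with the likelihood, repeated correct responses at a given intensity push the estimated threshold below that intensity, while repeated incorrect responses push it above.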
The advantages of the invention are illustrated below with reference to specific embodiments.
The method is evaluated both under computer-simulation conditions and on real speech audiometry results of hearing-loss subjects, and its results are compared with those of pure-tone audiometry.
1. Hearing assessment experiment setup
In order to verify the accuracy and stability of the proposed hearing evaluation method under ideal conditions, a computer simulation experiment was carried out, in which the subject of a real experiment is replaced by a computational model. The recognition accuracy of each stimulus is predicted with the SII model, whose parameters are the BIF of the test material and the simulated subject's hearing threshold. The recognition accuracy predicted by the SII is taken as the probability of producing a correct answer, and a recognition result is generated randomly. From the experimental data predicted by the SII model, the proposed speech-intelligibility-index-based hearing evaluation method is used to evaluate the hearing of the simulated subject, which is compared with the simulated subject's hearing threshold, i.e., the threshold set in the SII model, to analyze the accuracy of the method. The simulation for each condition can be run multiple times, and the average taken as the final result of the simulated speech audiometry.
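The simulated subject described above amounts to a Bernoulli draw with the SII-predicted accuracy as success probability. A minimal sketch, again with illustrative placeholder logistic parameters W0 = 0.25 and k = 10:

```python
import math
import random

def predict_pc(E, T, W, W0=0.25, k=10.0):
    """SII-predicted recognition accuracy (noise-free audibility, logistic transfer)."""
    sii = sum(min(1.0, max(0.0, (e - t + 15.0) / 30.0)) * w
              for e, t, w in zip(E, T, W))
    return 1.0 / (1.0 + math.exp(-k * (sii - W0)))

def simulate_response(E, T_true, W, rng=random):
    """Simulated subject: answer correctly with probability equal to the
    SII-predicted accuracy for this stimulus."""
    return 1 if rng.random() < predict_pc(E, T_true, W) else 0
```

Stimuli well above the simulated threshold are answered mostly correctly, and stimuli well below it mostly incorrectly, mimicking a real listener's psychometric behavior.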
14 hearing impaired subjects participated in the hearing evaluation process, and all subjects completed speech audiometry and pure tone audiometry of both ears in a sound isolation booth environment. Under the computer simulation condition, the pure tone audiometry results of 14 hearing impairment test subjects (28 test ears) are used as the hearing threshold of the simulation test subjects, and the hearing of the simulation test subjects is evaluated by using the method provided by the invention.
2. Results of the Hearing assessment experiment
To evaluate the effectiveness of the invention, the method is used for hearing evaluation under computer-simulation conditions and on real hearing-impaired subjects, respectively. Figures 3 and 4 show the average results under computer-simulation conditions and of the real hearing-impairment tests.
Under computer-simulation conditions, the hearing threshold estimated from the speech audiometry results is highly similar to that estimated by pure-tone audiometry (r = 0.87, p < 0.01), with a mean square error of 13.65. For the real hearing-loss subjects, the method also correlates significantly with the pure-tone audiometry results (r = 0.54, p < 0.01), with a mean square error of 14.77. Fig. 5 shows the results of the hearing-impairment test using the present invention and pure-tone audiometry under sound-isolation-booth conditions.
The experimental result shows that the method can obtain the result close to pure-tone audiometry, and meanwhile, the speech audiometry does not need complex guidance, so that the method is convenient for the operation of a user. In addition, the requirements of speech audiometry on environment and equipment are relatively reduced compared with pure tone audiometry, so that the speech audiometry can be used as a feasible scheme for hearing evaluation of the mobile terminal.
Although specific embodiments and accompanying drawings of the invention have been disclosed for illustrative purposes to aid understanding of the invention, those skilled in the art will appreciate that various substitutions, changes and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosure of the preferred embodiments and the accompanying drawings.

Claims (9)

1. A hearing assessment method based on the speech intelligibility index, comprising the following steps:
1) establishing a functional relationship between the hearing threshold and speech recognition performance using the speech intelligibility index;
2) constructing a corpus of confusable double-syllable word pairs from selected confusable vowel pairs and consonant pairs as the test corpus for speech audiometry, and measuring the band weight function of the test corpus with a fast band-weight measurement method;
3) performing speech audiometry on the subject using the confusable double-syllable word pairs constructed in step 2), and then selecting the sound-intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs as the subject's final hearing threshold; the final hearing threshold is obtained as follows: M rounds of speech audiometry are performed, giving M test results; for the m-th test result, the sound intensity in frequency band f is E_f^m, the band weight function of the confusable double-syllable word pair played in band f is W_f, the speech intelligibility index (SII) value corresponding to the speech recognition threshold is SII_0, the subject's recognition result is y_m, and the hearing threshold in band f estimated from the M test results is T_f^M; PC_m denotes the speech recognition accuracy obtained from the m-th recognition result; the log-likelihood function of the SII model is LL = Σ_m [ y_m·log(PC_m) + (1 − y_m)·log(1 − PC_m) ], m = 1, …, M; the likelihood is then maximized by a gradient method: the derivative ∂LL/∂T_f of the log-likelihood with respect to the threshold is computed, an initial threshold T_f^(0) is chosen at random, and the threshold is updated by T_f^(t+1) = T_f^(t) + η·∂LL/∂T_f^(t), with T_f^(t) denoting the threshold obtained at step t and η the step size; when this process terminates after T iterations, the threshold T_f^(T) obtained at the T-th update is the final hearing threshold of frequency band f estimated from the M rounds of speech audiometry.
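The maximum-likelihood estimation of claim 1 can be sketched as follows. Because the patent's exact formulas exist only as unreproduced equation images, this illustration assumes the standard SII audibility clipping and a logistic SII-to-PC mapping; the band weights `W`, slope `K`, threshold `SII0`, step size `LR`, and every numeric value are hypothetical stand-ins, and the gradient is taken numerically for brevity.

```python
import math
import random

BANDS = range(4)
W = [0.25, 0.35, 0.25, 0.15]   # band weight function W_f (hypothetical)
K, SII0, LR = 8.0, 0.3, 2.0    # logistic slope, threshold SII, step size

def audibility(E, T):
    # A_f = clip((E_f - T_f + 15) / 30, 0, 1)  (assumed standard SII form)
    return min(1.0, max(0.0, (E - T + 15.0) / 30.0))

def pc(E, T):
    # Predicted recognition accuracy for per-band intensities E, thresholds T.
    sii = sum(W[f] * audibility(E[f], T[f]) for f in BANDS)
    return 1.0 / (1.0 + math.exp(-K * (sii - SII0)))

def log_likelihood(trials, T):
    # LL = sum_m [ y_m log PC_m + (1 - y_m) log(1 - PC_m) ]
    s = 0.0
    for E, y in trials:
        p = min(max(pc(E, T), 1e-9), 1.0 - 1e-9)
        s += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return s

def estimate_threshold(trials, steps=300, eps=1e-4):
    # Gradient ascent on LL (equivalently, the claim's gradient descent
    # on -LL), starting from a random initial threshold per band.
    T = [random.uniform(20.0, 60.0) for _ in BANDS]
    for _ in range(steps):
        base = log_likelihood(trials, T)
        grad = []
        for f in BANDS:
            Tp = list(T)
            Tp[f] += eps
            grad.append((log_likelihood(trials, Tp) - base) / eps)
        for f in BANDS:
            T[f] += LR * grad[f]   # move thresholds to raise the likelihood
    return T
```

Each trial is a pair `(E, y)` of per-band playback intensities and a correct/incorrect response; the loop moves the per-band thresholds in the direction that makes the observed responses most probable under the SII model.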
2. The method of claim 1, wherein the functional relationship is PC = f(SII; SII_0), where SII is the speech intelligibility index, SII_0 is the SII value corresponding to the speech recognition threshold, and PC denotes the speech recognition accuracy.
3. The method of claim 2, wherein the speech intelligibility index SII = Σ_f A_f·W_f, where A_f denotes the audibility of frequency band f and W_f denotes the band weight function.
4. The method of claim 3, wherein the audibility A_f is computed from E_f and T_f, where T_f is the pure-tone threshold at the center frequency of frequency band f and E_f is the played sound intensity of the band-f signal in the audiometry.
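The SII computation of claims 3 and 4 can be sketched directly. The clipped audibility form below is the standard one from the SII literature and is an assumption here, since the claim's exact formula is an unreproduced image; the two-band example values are likewise hypothetical.

```python
def audibility(E_f, T_f):
    # A_f: audibility of band f given played intensity E_f and pure-tone
    # threshold T_f; assumed standard SII clipping into [0, 1].
    return min(1.0, max(0.0, (E_f - T_f + 15.0) / 30.0))

def sii(E, T, W):
    # SII = sum over bands f of A_f * W_f.
    return sum(audibility(e, t) * w for e, t, w in zip(E, T, W))

# Hypothetical two-band example: one fully audible band, one barely audible.
value = sii([50.0, 50.0], [30.0, 60.0], [0.6, 0.4])
```

The clipping means a band contributes nothing once the signal falls 15 dB below threshold and contributes its full weight once it is 15 dB above.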
5. The method of claim 2 or 3, wherein the weight of each selected confusable double-syllable word pair in frequency band f is measured by the fast band-weight measurement method, obtaining the band weight function W_f.
6. The method of claim 1, wherein performing speech audiometry on the subject using the confusable double-syllable word pairs constructed in step 2) comprises: first selecting K confusable double-syllable word pairs for K rounds of speech audiometry, one selected pair being played in each round with its sound intensity chosen at random; then, for each subsequently played confusable double-syllable word pair, choosing the playing sound intensity of each round by an adaptive method.
7. The method of claim 6, wherein in the adaptive method a gradient method is used to select, as the subject's hearing threshold, the sound-intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs; the speech recognition rate of each candidate sound-intensity condition is predicted by the speech intelligibility index (SII) model, and the condition whose predicted recognition accuracy is closest to a set threshold is selected as the sound-intensity condition for playing the next confusable double-syllable word pair.
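The adaptive intensity rule of claim 7 can be sketched as: predict the recognition accuracy of each candidate intensity with the SII model and pick the one closest to a target accuracy. The logistic SII-to-PC mapping, its parameters, the band weights, and the 0.5 target below are all hypothetical stand-ins for the patent's unreproduced formulas.

```python
import math

W = [0.5, 0.5]       # band weights (hypothetical)
K, SII0 = 8.0, 0.3   # logistic slope and recognition-threshold SII (hypothetical)

def predicted_pc(level, T):
    # Predicted recognition accuracy when all bands are played at `level`
    # for a listener with per-band thresholds T (standard SII clipping).
    a = [min(1.0, max(0.0, (level - t + 15.0) / 30.0)) for t in T]
    sii = sum(wf * af for wf, af in zip(W, a))
    return 1.0 / (1.0 + math.exp(-K * (sii - SII0)))

def next_intensity(candidates, T_est, target=0.5):
    # Candidate whose predicted accuracy is nearest the set threshold.
    return min(candidates, key=lambda lv: abs(predicted_pc(lv, T_est) - target))
```

Targeting an intermediate accuracy keeps each trial near the steep part of the psychometric function, which is where a response carries the most information about the threshold.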
8. The method of claim 1 or 6, wherein one round of speech audiometry is performed each time a confusable double-syllable word pair is selected; within each round, the same confusable double-syllable word pair is tested repeatedly, and on each test one word of the pair is played at random.
9. A hearing assessment device based on the speech intelligibility index, comprising a speech intelligibility index model, test material, and a hearing evaluation unit, the hearing evaluation unit comprising a sound-intensity selection module and a parameter estimation module; wherein
the speech intelligibility index model is used to establish the functional relationship between the hearing threshold and speech recognition performance and to calculate the recognition accuracy of the confusable words of the test material at a given sound intensity;
the test material is a corpus of confusable double-syllable word pairs constructed from the selected confusable vowel pairs and consonant pairs;
the sound-intensity selection module is used to select the sound-intensity condition under which the confusable double-syllable word pairs are played;
the parameter estimation module is used to select, according to the test results, the sound-intensity condition that maximizes the likelihood of the test results for the confusable double-syllable word pairs as the subject's final hearing threshold; the final hearing threshold is obtained as follows: M rounds of speech audiometry are performed, giving M test results; for the m-th test result, the sound intensity in frequency band f is E_f^m, the band weight function of the confusable double-syllable word pair played in band f is W_f, the speech intelligibility index (SII) value corresponding to the speech recognition threshold is SII_0, the subject's recognition result is y_m, and the hearing threshold in band f estimated from the M test results is T_f^M; PC_m denotes the speech recognition accuracy obtained from the m-th recognition result; the log-likelihood function of the SII model is LL = Σ_m [ y_m·log(PC_m) + (1 − y_m)·log(1 − PC_m) ], m = 1, …, M; the likelihood is then maximized by a gradient method: the derivative ∂LL/∂T_f of the log-likelihood with respect to the threshold is computed, an initial threshold T_f^(0) is chosen at random, and the threshold is updated by T_f^(t+1) = T_f^(t) + η·∂LL/∂T_f^(t), with T_f^(t) denoting the threshold obtained at step t and η the step size; when this process terminates after T iterations, the threshold T_f^(T) obtained at the T-th update is the final hearing threshold of frequency band f estimated from the M rounds of speech audiometry.
CN202011077820.2A 2020-10-10 2020-10-10 Hearing assessment method and device based on speech intelligibility index Active CN112205981B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011077820.2A CN112205981B (en) 2020-10-10 2020-10-10 Hearing assessment method and device based on speech intelligibility index


Publications (2)

Publication Number Publication Date
CN112205981A CN112205981A (en) 2021-01-12
CN112205981B true CN112205981B (en) 2021-09-28

Family

ID=74053023

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011077820.2A Active CN112205981B (en) 2020-10-10 2020-10-10 Hearing assessment method and device based on speech intelligibility index

Country Status (1)

Country Link
CN (1) CN112205981B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113286242A (en) * 2021-04-29 2021-08-20 佛山博智医疗科技有限公司 Device for decomposing speech signal to modify syllable and improving definition of speech signal

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07124137A (en) * 1993-09-03 1995-05-16 Technol Res Assoc Of Medical & Welfare Apparatus Auditory function inspecting device
CN202179545U (en) * 2011-07-22 2012-04-04 华南理工大学 Auditory evoked potential audiometry apparatus based on oversampling multiple-frequency multiple-amplitude joint estimation
CN109327785A (en) * 2018-10-09 2019-02-12 北京大学 A kind of hearing aid gain adaptation method and apparatus based on speech audiometry
EP3718476A1 (en) * 2019-04-02 2020-10-07 Mimi Hearing Technologies GmbH Systems and methods for evaluating hearing health


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Analysis of the audiological characteristics of middle-aged and elderly people of different age groups in an area of Beijing; Liu Chenjing et al.; Chinese Journal of Otology; 2018-06-15 (No. 03); full text *
Research on objective evaluation methods of speech intelligibility; Xu Yuzhuo; China Master's Theses Full-text Database, Information Science and Technology; 2015-05-01 (No. 9); full text *


Similar Documents

Publication Publication Date Title
Holube et al. Speech intelligibility prediction in hearing‐impaired listeners based on a psychoacoustically motivated perception model
Chen et al. Predicting the intelligibility of reverberant speech for cochlear implant listeners with a non-intrusive intelligibility measure
Healy et al. A deep learning algorithm to increase intelligibility for hearing-impaired listeners in the presence of a competing talker and reverberation
Lai et al. Multi-objective learning based speech enhancement method to increase speech quality and intelligibility for hearing aid device users
Suter Speech recognition in noise by individuals with mild hearing impairments
CN109327785B (en) Hearing aid gain adaptation method and device based on speech audiometry
Beechey et al. Hearing aid amplification reduces communication effort of people with hearing impairment and their conversation partners
US8849391B2 (en) Speech sound intelligibility assessment system, and method and program therefor
CN102781322A (en) Evaluation system of speech sound hearing, method of same and program of same
Davies-Venn et al. The role of spectral resolution, working memory, and audibility in explaining variance in susceptibility to temporal envelope distortion
Hossain et al. Reference-free assessment of speech intelligibility using bispectrum of an auditory neurogram
Gaballah et al. Objective and subjective speech quality assessment of amplification devices for patients with Parkinson’s disease
Souza et al. Consequences of broad auditory filters for identification of multichannel-compressed vowels
CN112205981B (en) Hearing assessment method and device based on speech intelligibility index
Chiang et al. Hasa-net: A non-intrusive hearing-aid speech assessment network
US11322167B2 (en) Auditory communication devices and related methods
Jürgens et al. Prediction of consonant recognition in quiet for listeners with normal and impaired hearing using an auditory model
Healy et al. An ideal quantized mask to increase intelligibility and quality of speech in noise
Miller et al. The effects of static and moving spectral ripple sensitivity on unaided and aided speech perception in noise
Yamamoto et al. Effective data screening technique for crowdsourced speech intelligibility experiments: Evaluation with IRM-based speech enhancement
Brennan et al. Influence of audibility and distortion on recognition of reverberant speech for children and adults with hearing aid amplification
Liu et al. Psychometric functions of vowel detection and identification in long-term speech-shaped noise
Gaballah et al. Objective and subjective assessment of amplified parkinsonian speech quality
POLO et al. Development and evaluation of a novel adaptive staircase procedure for automated speech-in-noise testing
Shahidi et al. Objective intelligibility measurement of reverberant vocoded speech for normal-hearing listeners: Towards facilitating the development of speech enhancement algorithms for cochlear implants

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant