CN105469802A - Speech quality improving method and system and mobile terminal - Google Patents

Info

Publication number
CN105469802A
Authority
CN
China
Prior art keywords
mobile terminal
face
locus
audio frequency
frequency parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410428590.8A
Other languages
Chinese (zh)
Inventor
李闻
薛华
王进军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ZTE Corp
Original Assignee
ZTE Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ZTE Corp
Priority to CN201410428590.8A (patent CN105469802A)
Priority to PCT/CN2014/087707 (patent WO2015117343A1)
Publication of CN105469802A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6008 Substation equipment, e.g. for use by subscribers including speech amplifiers in the transmitter circuit
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6016 Substation equipment, e.g. for use by subscribers including speech amplifiers in the receiver circuit
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/60 Substation equipment, e.g. for use by subscribers including speech amplifiers
    • H04M1/6033 Substation equipment, e.g. for use by subscribers including speech amplifiers for providing handsfree use or a loudspeaker mode in telephone sets
    • H04M1/6041 Portable telephones adapted for handsfree use
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/12 Details of telephonic subscriber devices including a sensor for measuring a physical value, e.g. temperature or motion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/52 Details of telephonic subscriber devices including functional features of a camera

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a speech quality improving method and system and a mobile terminal. The method comprises: obtaining, by the mobile terminal, a spatial position from the face of a user to the mobile terminal; determining audio parameters corresponding to the spatial position from the face of the user to the mobile terminal; adjusting audio parameters of a speech processing module of the mobile terminal to the audio parameters corresponding to the spatial position from the face to the mobile terminal; and outputting a speech signal, which is received from a network side, after being processed by the speech processing module, the audio parameters of which are adjusted. By means of the method, the hands-free frequency response and sensitivity can basically be kept in a relatively good state and can be basically kept unchanged, thereby avoiding the case of low frequency or medium-high frequency loss of a frequency response curve due to change of the distance or the angle from the mobile terminal to the face, improving speech quality in video call, and improving the listening effect of the user.

Description

A method, system and mobile terminal for improving speech quality
Technical field
The present invention relates to the field of communications, and in particular to a method, a system and a mobile terminal for improving speech quality.
Background
Mobile terminals are now very widely used. Video calling, as a new function of mobile terminals, is seeing increasing use. Improving speech quality in video calls is also a problem that mobile terminal development continues to work on.
In terms of the auditory characteristics of the human ear, low frequencies are the foundation: if the sound pressure level of low-frequency sound is insufficient, the timbre seems thin and lacks force, and this strongly affects the listening impression. Mid frequencies are the region to which human hearing is most sensitive; boosting them appropriately strengthens the sense of presence in playback and improves clarity and layering. Boosting high frequencies makes the timbre seem lively. In general, sound quality can be judged from the frequency response curve; a good frequency response curve gives a good subjective listening impression.
During a video call, people usually talk in headset mode or hands-free mode. In hands-free mode, the spatial position of the mobile terminal relative to the user's head (for example, distance and angle) is not exactly the same for every user, whereas at present the speech quality of a video call is debugged and calibrated with the mobile terminal placed at a fixed position. In the prior art, the sensitivity and frequency response of the voice signal output by the low-pass filter used in hands-free speech processing during a video call are set according to tests at a standard position (for example, directly facing the mobile terminal camera at 20 cm). However, because users' habits differ, the angle and distance at which the mobile terminal is held relative to the face vary greatly, so the frequency response and sensitivity perceived by the user change at non-standard positions (other angles and distances). In actual use, when the mobile terminal is closer to the face than the standard value, treble is lacking, and the clarity and layering of the sound are weak; when the mobile terminal is farther from the face than the standard value, bass is lacking, and the sound is dull and not full. A loss of either bass or treble affects what the user hears.
It can thus be seen that, in the prior art, speech quality in the hands-free mode of a video call is debugged and calibrated with the mobile terminal placed at a fixed position; but as the angle and distance from the mobile terminal to the face change, the frequency response and sensitivity of the voice signal reaching the face also change, and this difference affects speech quality. A speech processing method is therefore needed to compensate for the differences introduced by human factors, so that speech quality in video calls achieves a better listening effect, thereby improving user experience and increasing the product's market competitiveness.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method, a system and a mobile terminal for improving speech quality that keep the hands-free frequency response and sensitivity substantially constant in a good state, compensating for the loss of low or mid-to-high frequencies in the frequency response curve caused by changes in the distance or angle from the mobile terminal to the face, thereby improving speech quality in video calls and the user's listening experience.
To solve the above technical problem, the present invention provides a method for improving speech quality, comprising:
The mobile terminal obtains the spatial position of the user's face relative to the mobile terminal;
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
Adjusting the audio parameters of the speech processing module of the mobile terminal to the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
Outputting the voice signal received from the network side after it has been processed by the speech processing module whose audio parameters have been adjusted.
Further, the spatial position of the face relative to the mobile terminal comprises the distance and the angle from the face to the mobile terminal, where the angle is the deviation to the left or right of the position directly facing the mobile terminal, and the angle is less than or equal to 90 degrees.
Further, before the spatial position of the user's face relative to the mobile terminal is obtained, the method also comprises:
Presetting the correspondence between the spatial position of the face relative to the mobile terminal and the collected face data.
Further, presetting the correspondence between the spatial position of the face relative to the mobile terminal and the face data comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, setting the maximum leftward angle to 90 degrees left of the position directly facing the mobile terminal, and setting the maximum rightward angle to 90 degrees right of that position;
According to a preset distance interval and angle interval, collecting face data at different distances and different angles from the face to the mobile terminal, starting from the minimum distance and 90 degrees left and proceeding rightward in turn;
Saving the face data corresponding to the different distances and different angles.
Further, obtaining the spatial position of the user's face relative to the mobile terminal comprises:
Collecting the face data of the current user and comparing it with the saved face data; when the difference between the current user's face data and a saved item of face data is less than a preset threshold, taking the spatial position corresponding to that saved face data as the spatial position of the current user's face relative to the mobile terminal.
Further, before the audio parameters corresponding to the spatial position of the face relative to the mobile terminal are determined, the method also comprises: preconfiguring the correspondence between the spatial position of the face relative to the mobile terminal and audio parameters;
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal comprises:
Determining those audio parameters according to the preconfigured correspondence between the spatial position of the face relative to the mobile terminal and audio parameters.
Further, preconfiguring the correspondence between the spatial position of the face relative to the mobile terminal and audio parameters comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, setting the maximum leftward angle to 90 degrees left of the position directly facing the mobile terminal, and setting the maximum rightward angle to 90 degrees right of that position;
According to a preset distance interval and angle interval, starting from the minimum distance and 90 degrees left and proceeding rightward in turn, measuring the sensitivity and frequency response of the voice signal output after the same voice signal is processed by the speech processing module, at different distances and different angles from the face to the mobile terminal;
Calculating, for the different distances and different angles, the audio parameters that bring the sensitivity and frequency response of the output voice signal within the standard range;
Saving the audio parameters corresponding to the different distances and different angles.
To solve the above technical problem, the present invention also provides a system for improving speech quality, comprising:
A spatial position identification module, configured to obtain the spatial position of the user's face relative to the mobile terminal;
An audio parameter determination module, configured to determine the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
A speech processing module, configured to adjust its own audio parameters to the audio parameters corresponding to the spatial position of the face relative to the mobile terminal, and then to output the voice signal received from the network side.
Further, the spatial position of the face relative to the mobile terminal comprises the distance and the angle from the face to the mobile terminal, where the angle is the deviation to the left or right of the position directly facing the mobile terminal, and the angle is less than or equal to 90 degrees.
Further, the system also comprises:
A configuration module, configured to preset, before the spatial position of the user's face relative to the mobile terminal is obtained, the correspondence between the spatial position of the face relative to the mobile terminal and the collected face data.
Further, the configuration module being configured to preset the correspondence between the spatial position of the face relative to the mobile terminal and the face data comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, setting the maximum leftward angle to 90 degrees left of the position directly facing the mobile terminal, and setting the maximum rightward angle to 90 degrees right of that position;
According to a preset distance interval and angle interval, collecting face data at different distances and different angles from the face to the mobile terminal, starting from the minimum distance and 90 degrees left and proceeding rightward in turn;
Saving the face data corresponding to the different distances and different angles.
Further, the spatial position identification module being configured to obtain the spatial position of the user's face relative to the mobile terminal comprises:
Collecting the face data of the current user and comparing it with the saved face data; when the difference between the current user's face data and a saved item of face data is less than a preset threshold, taking the spatial position corresponding to that saved face data as the spatial position of the current user's face relative to the mobile terminal.
Further, the configuration module is also configured to preconfigure, before the audio parameters corresponding to the spatial position of the face relative to the mobile terminal are determined, the correspondence between the spatial position of the face relative to the mobile terminal and audio parameters;
The audio parameter determination module being configured to determine the audio parameters corresponding to the spatial position of the face relative to the mobile terminal comprises:
Determining those audio parameters according to the preconfigured correspondence between the spatial position of the face relative to the mobile terminal and audio parameters.
Further, the configuration module being also configured to preconfigure the correspondence between the spatial position of the face relative to the mobile terminal and audio parameters comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, setting the maximum leftward angle to 90 degrees left of the position directly facing the mobile terminal, and setting the maximum rightward angle to 90 degrees right of that position;
According to a preset distance interval and angle interval, starting from the minimum distance and 90 degrees left and proceeding rightward in turn, measuring the sensitivity and frequency response of the voice signal output after the same voice signal is processed by the speech processing module, at different distances and different angles from the face to the mobile terminal;
Calculating, for the different distances and different angles, the audio parameters that bring the sensitivity and frequency response of the output voice signal within the standard range;
Saving the audio parameters corresponding to the different distances and different angles.
To solve the above technical problem, the present invention also provides a mobile terminal comprising the system for improving speech quality described above.
Compared with the prior art, with the method, system and mobile terminal for improving speech quality provided by the embodiments of the present invention, after the hands-free voice mode is turned on in a video call, the frequency response and sensitivity heard by the user are adjusted dynamically as the spatial position (angle and distance) of the user's face relative to the mobile terminal changes. The hands-free frequency response and sensitivity are thus kept substantially constant in a good state, compensating for the loss of low or mid-to-high frequencies in the frequency response curve caused by changes in the distance or angle from the mobile terminal to the face, thereby improving speech quality in video calls and the user's listening experience.
Brief description of the drawings
Fig. 1 is a flowchart of the method for improving speech quality in the embodiment;
Fig. 2 is a schematic diagram of the angle and distance from the face to the mobile terminal in the embodiment;
Fig. 3 is a flowchart of presetting the correspondence between the spatial position of the face relative to the mobile terminal and the collected face data in the embodiment;
Fig. 4 is a flowchart of presetting the correspondence between the spatial position of the face relative to the mobile terminal and audio parameters in the embodiment;
Fig. 5 is a structural diagram of the system for improving speech quality in the embodiment.
Detailed description
To make the objects, technical solutions and advantages of the present invention clearer, embodiments of the invention are described in detail below with reference to the accompanying drawings. It should be noted that, where no conflict arises, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
Embodiment:
As shown in Fig. 1, the present embodiment provides a method for improving speech quality, comprising the following steps:
S101: the mobile terminal obtains the spatial position of the user's face relative to the mobile terminal;
The spatial position of the face relative to the mobile terminal comprises the distance and the angle from the face to the mobile terminal, where the angle is the deviation to the left or right of the position directly facing the mobile terminal, and the angle is less than or equal to 90 degrees.
In actual use, the image captured by the camera is shown on the display screen of the mobile terminal. On every model the camera is generally at the top of the plane containing the display screen, but its left-right position varies: on some it is to the left, on some to the right, and on others in the middle.
When the face image seen on the display screen of the mobile terminal is exactly centered, the plane formed by the actual central axis of the face and the vertical central axis of the mobile terminal is not actually perpendicular to the plane of the display screen.
As shown in Fig. 2, when the face image seen on the display screen is exactly centered, so that the displayed central axis of the face coincides with the vertical central axis of the mobile terminal, the straight line AB between the actual mouth center, point A (on the central axis of the face), and its projection onto the plane of the display screen, point B, is taken as the reference. After the face moves left or right, the mouth center moves to point C, whose projection onto the plane of the display screen is point D. In the present embodiment, the angle from the face to the mobile terminal is the angle α between the straight lines AB and BC, and the distance from the face to the mobile terminal is the length d1 of line segment AB or the length d2 of line segment CD.
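The geometry of Fig. 2 can be illustrated with a short sketch. It assumes a hypothetical coordinate frame in which the display screen lies in the plane z = 0, so that B and D are simply A and C with z set to zero; the function name and the coordinate convention are illustrative assumptions and are not taken from the patent.

```python
import math

def angle_and_distances(a, c):
    """Return (alpha_deg, d1, d2) for mouth centers A (reference) and C (moved).

    Assumption: the display screen is the plane z = 0, so the projection of a
    point onto the screen is the same point with z = 0.
    """
    b = (a[0], a[1], 0.0)                      # projection of A onto the screen
    ba = [a[i] - b[i] for i in range(3)]       # vector from B to A
    bc = [c[i] - b[i] for i in range(3)]       # vector from B to C
    norm_ba = math.sqrt(sum(x * x for x in ba))
    norm_bc = math.sqrt(sum(x * x for x in bc))
    cos_alpha = sum(x * y for x, y in zip(ba, bc)) / (norm_ba * norm_bc)
    cos_alpha = max(-1.0, min(1.0, cos_alpha))  # guard against rounding error
    alpha_deg = math.degrees(math.acos(cos_alpha))
    d1 = norm_ba                               # length of AB
    d2 = abs(c[2])                             # length of CD (D is C with z = 0)
    return alpha_deg, d1, d2

# Face starts 20 cm in front of the screen, then moves 20 cm sideways.
alpha, d1, d2 = angle_and_distances((0.0, 0.0, 20.0), (20.0, 0.0, 20.0))
```

In this example the sideways move equals the distance to the screen, so the angle between AB and BC is 45 degrees while d1 and d2 both stay at 20 cm.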
Before step S101, the method also comprises presetting the correspondence between the spatial position of the face relative to the mobile terminal and the collected face data, as shown in Fig. 3, specifically comprising the following steps:
S201: setting the maximum and minimum values of the distance from the face to the mobile terminal, setting the maximum leftward angle to 90 degrees left of the position directly facing the mobile terminal, and setting the maximum rightward angle to 90 degrees right of that position;
S202: according to a preset distance interval and angle interval, collecting face data at different distances and different angles from the face to the mobile terminal, starting from the minimum distance and 90 degrees left and proceeding rightward in turn;
S203: saving the face data corresponding to the different distances and different angles.
For example, the distance from the face to the mobile terminal may be set from 10 cm to 50 cm with a distance interval of 10 cm, and the angle from 90 degrees left to 90 degrees right with an angle interval of 10 degrees. This yields spatial positions formed by combinations of the different distances and angles, and face data are collected at each of these spatial positions.
In a concrete measurement, the face data at the standard spatial position may be measured first: at 0 degrees and the standard distance of 20 cm (that is, directly facing the mobile terminal, with the face 20 cm from the mobile terminal), the camera captures one frame of face data, and this frame is taken as the standard value;
Then the distance is increased to 30 cm, 40 cm and 50 cm with the angle unchanged, and the camera captures one frame of face data at each distance. The face contour is now smaller overall than at the standard distance but still centered, with no angular deviation; these data are stored together with the corresponding distance and angle. The distance is then reduced to 10 cm with the angle unchanged, and the camera again captures one frame of face data; the face contour is now larger overall than at the standard distance, with no angular deviation, and these data are likewise stored with the corresponding distance and angle;
Then, at the standard distance of 20 cm, the angle is changed to 10 degrees left and the camera captures one frame of face data. Compared with the standard position, the face contour is unchanged in size; from the camera's point of view it is shifted to the right overall, but the image shown on the display screen of the mobile terminal is still shifted somewhat to the left. These data are stored with the corresponding distance and angle. The angle is then changed to 10 degrees right and the camera captures one frame of face data; compared with the standard position, the face contour is unchanged in size but shifted somewhat to the right overall, and these data are stored with the corresponding distance and angle;
Face data for the spatial positions corresponding to multiple distance and angle combinations are obtained in this way and stored, together with the corresponding distances and angles, in a memory in the mobile terminal.
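As a minimal sketch of the position grid used in steps S201 to S203, the following enumerates every distance and angle combination from the example values above (10 cm to 50 cm in 10 cm steps, 90 degrees left to 90 degrees right in 10 degree steps). The function name and the sign convention (negative angles for left, positive for right) are assumptions made for illustration.

```python
def calibration_grid(d_min_cm=10, d_max_cm=50, d_step_cm=10, angle_step_deg=10):
    """List (distance_cm, angle_deg) pairs, starting at the minimum distance
    and 90 degrees left (-90) and sweeping rightward to 90 degrees right (+90)."""
    positions = []
    for distance in range(d_min_cm, d_max_cm + 1, d_step_cm):
        for angle in range(-90, 90 + 1, angle_step_deg):
            positions.append((distance, angle))
    return positions

grid = calibration_grid()
```

With the example intervals this gives 5 distances times 19 angles, or 95 spatial positions, one frame of face data being captured and stored for each.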
Obtaining the spatial position of the user's face relative to the mobile terminal comprises:
Collecting the face data of the current user and comparing it with the saved face data; when the difference between the current user's face data and a saved item of face data is less than a preset threshold, taking the spatial position corresponding to that saved face data as the spatial position of the current user's face relative to the mobile terminal.
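The comparison step can be sketched as follows. The patent does not specify how face data are represented or how the difference is computed, so this sketch assumes a hypothetical feature tuple (face contour width and horizontal offset, both in pixels) and a Euclidean difference, and it picks the closest saved entry before applying the threshold.

```python
import math

def match_position(current, saved, threshold):
    """Return the saved (distance_cm, angle_deg) whose face data is closest to
    `current`, or None when even the closest entry differs by >= threshold.

    `saved` maps (distance_cm, angle_deg) to a feature tuple; the features
    (contour width, horizontal offset) are illustrative assumptions.
    """
    best_position, best_diff = None, float("inf")
    for position, reference in saved.items():
        diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(current, reference)))
        if diff < best_diff:
            best_position, best_diff = position, diff
    return best_position if best_diff < threshold else None

saved = {
    (20, 0): (200.0, 0.0),     # standard position
    (30, 0): (140.0, 0.0),     # farther away: smaller contour, still centered
    (20, -10): (200.0, -35.0), # 10 degrees left: same size, shifted
}
position = match_position((198.0, -2.0), saved, threshold=10.0)
```

Returning None when nothing is close enough leaves room for the default-parameter fallback described for step S103.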
S102: determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
Before the audio parameters corresponding to the spatial position of the face relative to the mobile terminal are determined, the method also comprises:
Preconfiguring the correspondence between the spatial position of the face relative to the mobile terminal and audio parameters. Step S102 then specifically comprises: determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal according to that correspondence.
The correspondence between the spatial position of the face relative to the mobile terminal and audio parameters is preset as shown in Fig. 4, specifically comprising:
Step S301 is identical to step S201;
S302: according to the preset distance interval and angle interval, starting from the minimum distance and 90 degrees left and proceeding rightward in turn, measuring the sensitivity and frequency response of the voice signal output after the same voice signal is processed by the speech processing module, at different distances and different angles from the face to the mobile terminal;
It should be noted here that, in the present embodiment, the spatial positions to which audio parameters correspond and the spatial positions to which face data correspond use the same division granularity, the same specific angles and distances, and the same combinations of angle and distance, so that face data collected during a video call can be matched to the corresponding audio parameters. The finer the distance interval or angle interval, the more distance and angle combinations there are, and the finer the adjustment of the audio parameters.
S303: calculating, for the different distances and different angles, the audio parameters that bring the sensitivity and frequency response of the output voice signal within the standard range;
The reference standard is the VDF standard for the receiving sensitivity and frequency response of hands-free audio terminals; VDF is the standard of the mobile voice operator Vodafone for hands-free audio terminal reception.
S304: saving the audio parameters corresponding to the different distances and different angles.
The audio parameters corresponding to these distance and angle values can be obtained from experimental data. Using an audio test system, and following the multiple sets of distance and angle values set in advance, the spatial position of the mobile terminal relative to a HATS (head and torso simulator, standing in for the user's face) is changed, and the sensitivity and frequency response of the voice signal output after the same voice signal passes through the low-pass filter are measured at the different distances and angles. The low-pass filter is then adjusted according to the measured sets of sensitivity and frequency response so that the frequency of the voice signal it outputs lies within the standard range, which is 300 to 3000 Hz, thereby obtaining the audio parameters corresponding to the different distance and angle values.
S103: adjusting the audio parameters of the speech processing module of the mobile terminal to the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
In the present embodiment, if in steps S101 to S102 the mobile terminal cannot obtain the spatial position of the face relative to the mobile terminal, for example because the face is farther from the mobile terminal than the preset maximum distance, or is behind the terminal and not in front of the camera, the audio parameters are set to default values, for example the audio parameters corresponding to 0 degrees and the standard distance of 20 cm (that is, directly facing the mobile terminal, with the face 20 cm from the mobile terminal).
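The lookup with its default fallback might be sketched as below. The table layout and the parameter names `cutoff_hz` and `gain_db` are hypothetical, and the numeric values are placeholders for illustration only; the patent does not list which audio parameters are stored or what their calibrated values are.

```python
DEFAULT_POSITION = (20, 0)  # standard position: 20 cm, directly facing the terminal

def select_audio_params(position, table):
    """Return the audio parameters for `position`, falling back to the
    parameters of the standard position when the position is unknown."""
    if position is None or position not in table:
        return table[DEFAULT_POSITION]
    return table[position]

# Placeholder values, not measured data.
table = {
    (20, 0): {"cutoff_hz": 3000.0, "gain_db": 0.0},
    (10, 0): {"cutoff_hz": 3000.0, "gain_db": -3.0},
    (50, 0): {"cutoff_hz": 3000.0, "gain_db": 6.0},
}
params_near = select_audio_params((10, 0), table)
params_unknown = select_audio_params(None, table)
```

Keying both the face-data table and the audio-parameter table by the same (distance, angle) pairs reflects the requirement that the two use the same division granularity.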
S104: the voice signal received from the network side is processed by the speech processing module whose audio parameters have been adjusted and is then output through the loudspeaker.
Preferably, the speech processing module is a low-pass filter.
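To illustrate how a speech processing module could have its parameters adjusted during a call, the sketch below uses a simple one-pole low-pass filter with an output gain. This is a stand-in chosen for brevity: the patent states only that the module is preferably a low-pass filter and does not describe its design, so the class, its parameters and the filter structure are all assumptions.

```python
import math

class OnePoleLowPass:
    """One-pole low-pass filter with adjustable cutoff and output gain."""

    def __init__(self, sample_rate_hz, cutoff_hz, gain_db=0.0):
        self.sample_rate_hz = sample_rate_hz
        self.state = 0.0
        self.set_params(cutoff_hz, gain_db)

    def set_params(self, cutoff_hz, gain_db=0.0):
        # Standard RC low-pass discretization: alpha = dt / (RC + dt).
        dt = 1.0 / self.sample_rate_hz
        rc = 1.0 / (2.0 * math.pi * cutoff_hz)
        self.alpha = dt / (rc + dt)
        self.gain = 10.0 ** (gain_db / 20.0)

    def process(self, samples):
        out = []
        for x in samples:
            self.state += self.alpha * (x - self.state)
            out.append(self.gain * self.state)
        return out

lp = OnePoleLowPass(sample_rate_hz=8000, cutoff_hz=3000.0)
first_block = lp.process([1.0] * 200)
lp.set_params(cutoff_hz=1000.0, gain_db=6.0)   # position changed mid-call
second_block = lp.process([1.0] * 200)
```

Keeping the filter state across `set_params` calls avoids a discontinuity in the output when the parameters change between blocks of received speech.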
As shown in Figure 5, present embodiments provide a kind of system improving speech quality, comprising:
Locus identification module, for obtaining the locus of user's face to mobile terminal;
Audio frequency parameter determination module, for determining the audio frequency parameter that described face is corresponding to the locus of described mobile terminal;
Speech processing module, for its audio frequency parameter being adjusted to the described face audio frequency parameter corresponding to the locus of described mobile terminal, then exports the voice signal received from network side.
Wherein, described face comprises to the locus of described mobile terminal: described face is to the Distance geometry angle of described mobile terminal, and the angle ranging from angle just to the left or to the right to mobile terminal, described angle is less than or equal to 90 degree.
Wherein, this system also comprises:
Configuration module, for described acquisition user face to mobile terminal locus before, pre-set described face to the locus of described mobile terminal and the corresponding relation of human face data that collects.
Wherein, described configuration module, for pre-setting the locus of described face to described mobile terminal and the corresponding relation of human face data, comprising:
Described face is set to the maximal value of the distance of described mobile terminal and minimum value, arrange angle maximal value to the left be just to mobile terminal left avertence 90 degree and angle maximal value to the right for just to mobile terminal right avertence 90 degree;
According to the distance interval of presetting and angle intervals, from distance minimum value and left avertence 90 degree, gather the different distance of face to mobile terminal and the human face data of different angles successively to the right;
Preserve described different distance and human face data corresponding to different angles.
Wherein, described locus identification module, for obtaining the locus of user's face to mobile terminal, comprising:
Gather the human face data of active user, the human face data of more described active user and the human face data of preservation, when the difference of the human face data of described active user and the human face data of described preservation is less than predetermined threshold value, then using locus corresponding for the human face data of described preservation as active user's face to the locus of mobile terminal.
Wherein, the configuration module is further configured to preconfigure, before the audio parameters corresponding to the spatial position of the face relative to the mobile terminal are determined, the correspondence between spatial positions of the face relative to the mobile terminal and audio parameters;
The audio parameter determination module determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal comprises:
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal according to the preconfigured correspondence between spatial positions of the face relative to the mobile terminal and audio parameters.
Wherein, the configuration module preconfiguring the correspondence between spatial positions of the face relative to the mobile terminal and audio parameters comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, and setting the maximum leftward angle to 90 degrees left of directly facing the mobile terminal and the maximum rightward angle to 90 degrees right of directly facing the mobile terminal;
According to a preset distance interval and a preset angle interval, starting from the minimum distance and 90 degrees to the left and proceeding rightward in sequence, measuring the sensitivity and frequency response of the voice signal that is output when the same voice signal is processed by the speech processing module, with the face at different distances and different angles relative to the mobile terminal;
Calculating, for each distance and angle, the audio parameters that bring the sensitivity and frequency response of the output voice signal within the standard range;
Saving the audio parameters corresponding to the different distances and different angles.
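The calculation step can be sketched for the sensitivity part alone. This is a simplified illustration, not the patent's procedure: it assumes the "standard range" is a target level plus or minus a tolerance in dB, that the only audio parameter is a gain offset, and it ignores frequency response entirely. The names `calibrate`, `target_db`, and `tolerance_db` are hypothetical.

```python
def calibrate(measurements, target_db, tolerance_db):
    """Build a (distance, angle) -> audio-parameter table.

    `measurements` maps (distance, angle) -> measured sensitivity in dB.
    A position already inside target_db +/- tolerance_db gets no correction;
    otherwise the gain offset moves it back to the target (assumed policy).
    """
    table = {}
    for pos, measured_db in measurements.items():
        error_db = target_db - measured_db
        gain_db = 0.0 if abs(error_db) <= tolerance_db else error_db
        table[pos] = {"gain_db": gain_db}
    return table

# Hypothetical measurements at two positions.
table = calibrate({(20, 0): -10.0, (60, 90): -18.0},
                  target_db=-10.0, tolerance_db=2.0)
```

At run time, the audio parameter determination step then reduces to a dictionary lookup keyed by the identified position.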
Wherein, preferably, the speech processing module is a low-pass filter.
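As an illustration of a low-pass filter whose parameter can be changed while a call is in progress, the sketch below uses a simple one-pole IIR filter. The patent does not specify the filter design or which parameters are adjusted; treating the cutoff frequency as the adjustable audio parameter, and the class name `OnePoleLowPass`, are assumptions made for this example.

```python
import math

class OnePoleLowPass:
    """One-pole IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""

    def __init__(self, cutoff_hz, sample_rate_hz):
        self.sample_rate_hz = sample_rate_hz
        self.state = 0.0
        self.set_cutoff(cutoff_hz)

    def set_cutoff(self, cutoff_hz):
        # Standard one-pole coefficient for the given cutoff frequency.
        self.alpha = 1.0 - math.exp(
            -2.0 * math.pi * cutoff_hz / self.sample_rate_hz)

    def process(self, samples):
        out = []
        for x in samples:
            self.state += self.alpha * (x - self.state)
            out.append(self.state)
        return out

lp = OnePoleLowPass(cutoff_hz=1000.0, sample_rate_hz=8000.0)
filtered = lp.process([1.0] * 50)   # a constant input passes through
lp.set_cutoff(2000.0)               # parameter updated when the position changes
```

Because the filter state is kept between calls to `process`, the cutoff can be changed between audio frames without resetting the output.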
This embodiment also provides a mobile terminal, comprising the system for improving speech quality described above.
Wherein, the application scenario of this embodiment is as follows: after the user enables hands-free mode during a video call, detecting a voice signal received from the network side indicates that the user is in a video call with the other party. At that point the spatial position of the user's face relative to the mobile terminal can be obtained, so that speech quality in hands-free mode is improved and, with it, the user experience.
As can be seen from the above embodiments, compared with the prior art, the method, system and mobile terminal for improving speech quality provided above adjust the audio parameters of the speech processing module after hands-free mode is enabled in a video call, so that the frequency of the voice signal output from the speech processing module is kept within the standard range. The frequency response and sensitivity that the user hears are therefore adjusted dynamically as the spatial position (angle and distance) of the user's face relative to the mobile terminal changes, and the hands-free frequency response and sensitivity remain essentially constant at a good state. This compensates for the loss of low-frequency or mid-to-high-frequency content in the frequency response curve caused by changes in the distance or angle between the mobile terminal and the face, thereby improving speech quality in video calls and the user's listening experience.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk or an optical disc. Alternatively, all or part of the steps of the above embodiments can be implemented with one or more integrated circuits. Correspondingly, each module/unit in the above embodiments can be implemented in the form of hardware or in the form of a software functional module. The present invention is not limited to any particular combination of hardware and software.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit its scope of protection. In accordance with the summary of the invention, various other embodiments are also possible. Without departing from the spirit and essence of the present invention, those of ordinary skill in the art may make various corresponding changes and variations according to the present invention; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (15)

1. A method for improving speech quality, comprising:
A mobile terminal obtaining the spatial position of a user's face relative to the mobile terminal;
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
Adjusting the audio parameters of the speech processing module of the mobile terminal to the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
Outputting the voice signal received from the network side through the speech processing module whose audio parameters have been adjusted.
2. the method for claim 1, is characterized in that:
Described face comprises to the locus of described mobile terminal: described face is to the Distance geometry angle of described mobile terminal, and the angle ranging from angle just to the left or to the right to mobile terminal, described angle is less than or equal to 90 degree.
3. The method as claimed in claim 2, characterized in that, before the spatial position of the user's face relative to the mobile terminal is obtained, the method further comprises:
Presetting the correspondence between spatial positions of the face relative to the mobile terminal and the collected face data.
4. The method as claimed in claim 3, characterized in that:
Presetting the correspondence between spatial positions of the face relative to the mobile terminal and face data comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, and setting the maximum leftward angle to 90 degrees left of directly facing the mobile terminal and the maximum rightward angle to 90 degrees right of directly facing the mobile terminal;
According to a preset distance interval and a preset angle interval, starting from the minimum distance and 90 degrees to the left and proceeding rightward in sequence, collecting face data at different distances and different angles of the face relative to the mobile terminal;
Saving the face data corresponding to the different distances and different angles.
5. The method as claimed in claim 4, characterized in that:
Obtaining the spatial position of the user's face relative to the mobile terminal comprises:
Collecting the face data of the current user and comparing it with the saved face data; when the difference between the current user's face data and a piece of saved face data is less than a preset threshold, taking the spatial position corresponding to that saved face data as the spatial position of the current user's face relative to the mobile terminal.
6. The method as claimed in claim 2, characterized in that:
Before determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal, the method further comprises: preconfiguring the correspondence between spatial positions of the face relative to the mobile terminal and audio parameters;
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal comprises:
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal according to the preconfigured correspondence between spatial positions of the face relative to the mobile terminal and audio parameters.
7. The method as claimed in claim 6, characterized in that:
Preconfiguring the correspondence between spatial positions of the face relative to the mobile terminal and audio parameters comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, and setting the maximum leftward angle to 90 degrees left of directly facing the mobile terminal and the maximum rightward angle to 90 degrees right of directly facing the mobile terminal;
According to a preset distance interval and a preset angle interval, starting from the minimum distance and 90 degrees to the left and proceeding rightward in sequence, measuring the sensitivity and frequency response of the voice signal that is output when the same voice signal is processed by the speech processing module, with the face at different distances and different angles relative to the mobile terminal;
Calculating, for each distance and angle, the audio parameters that bring the sensitivity and frequency response of the output voice signal within the standard range;
Saving the audio parameters corresponding to the different distances and different angles.
8. A system for improving speech quality, comprising:
A spatial position identification module, configured to obtain the spatial position of a user's face relative to a mobile terminal;
An audio parameter determination module, configured to determine the audio parameters corresponding to the spatial position of the face relative to the mobile terminal;
A speech processing module, configured to adjust its audio parameters to the audio parameters corresponding to the spatial position of the face relative to the mobile terminal, and then to output the voice signal received from the network side.
9. The system as claimed in claim 8, characterized in that:
The spatial position of the face relative to the mobile terminal comprises the distance and the angle of the face relative to the mobile terminal; the angle is the angle by which the face deviates to the left or to the right from directly facing the mobile terminal, and the angle is less than or equal to 90 degrees.
10. The system as claimed in claim 9, characterized in that it further comprises:
A configuration module, configured to preset, before the spatial position of the user's face relative to the mobile terminal is obtained, the correspondence between spatial positions of the face relative to the mobile terminal and the collected face data.
11. The system as claimed in claim 10, characterized in that:
The configuration module presetting the correspondence between spatial positions of the face relative to the mobile terminal and face data comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, and setting the maximum leftward angle to 90 degrees left of directly facing the mobile terminal and the maximum rightward angle to 90 degrees right of directly facing the mobile terminal;
According to a preset distance interval and a preset angle interval, starting from the minimum distance and 90 degrees to the left and proceeding rightward in sequence, collecting face data at different distances and different angles of the face relative to the mobile terminal;
Saving the face data corresponding to the different distances and different angles.
12. The system as claimed in claim 11, characterized in that:
The spatial position identification module obtaining the spatial position of the user's face relative to the mobile terminal comprises:
Collecting the face data of the current user and comparing it with the saved face data; when the difference between the current user's face data and a piece of saved face data is less than a preset threshold, taking the spatial position corresponding to that saved face data as the spatial position of the current user's face relative to the mobile terminal.
13. The system as claimed in claim 9, characterized in that:
The configuration module is further configured to preconfigure, before the audio parameters corresponding to the spatial position of the face relative to the mobile terminal are determined, the correspondence between spatial positions of the face relative to the mobile terminal and audio parameters;
The audio parameter determination module determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal comprises:
Determining the audio parameters corresponding to the spatial position of the face relative to the mobile terminal according to the preconfigured correspondence between spatial positions of the face relative to the mobile terminal and audio parameters.
14. The system as claimed in claim 13, characterized in that:
The configuration module preconfiguring the correspondence between spatial positions of the face relative to the mobile terminal and audio parameters comprises:
Setting the maximum and minimum values of the distance from the face to the mobile terminal, and setting the maximum leftward angle to 90 degrees left of directly facing the mobile terminal and the maximum rightward angle to 90 degrees right of directly facing the mobile terminal;
According to a preset distance interval and a preset angle interval, starting from the minimum distance and 90 degrees to the left and proceeding rightward in sequence, measuring the sensitivity and frequency response of the voice signal that is output when the same voice signal is processed by the speech processing module, with the face at different distances and different angles relative to the mobile terminal;
Calculating, for each distance and angle, the audio parameters that bring the sensitivity and frequency response of the output voice signal within the standard range;
Saving the audio parameters corresponding to the different distances and different angles.
15. A mobile terminal, comprising: the system for improving speech quality as claimed in any one of claims 8 to 14.
CN201410428590.8A 2014-08-26 2014-08-26 Speech quality improving method and system and mobile terminal Pending CN105469802A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410428590.8A CN105469802A (en) 2014-08-26 2014-08-26 Speech quality improving method and system and mobile terminal
PCT/CN2014/087707 WO2015117343A1 (en) 2014-08-26 2014-09-28 Method and system for improving tone quality of voice, and mobile terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410428590.8A CN105469802A (en) 2014-08-26 2014-08-26 Speech quality improving method and system and mobile terminal

Publications (1)

Publication Number Publication Date
CN105469802A (en) 2016-04-06

Family

ID=53777170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410428590.8A Pending CN105469802A (en) 2014-08-26 2014-08-26 Speech quality improving method and system and mobile terminal

Country Status (2)

Country Link
CN (1) CN105469802A (en)
WO (1) WO2015117343A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106060268A (en) * 2016-06-30 2016-10-26 维沃移动通信有限公司 Voice output method for mobile terminal and mobile terminal
CN106231046A (en) * 2016-08-03 2016-12-14 厦门傅里叶电子有限公司 Method according to grip Automatic Optimal receiver performance
CN106231498A (en) * 2016-09-27 2016-12-14 广东小天才科技有限公司 Method and device for adjusting microphone audio acquisition effect
WO2020062900A1 (en) * 2018-09-29 2020-04-02 华为技术有限公司 Sound processing method, apparatus and device
CN111741402A (en) * 2019-03-25 2020-10-02 比亚迪股份有限公司 Microphone noise reduction control method and device
CN113284504A (en) * 2020-02-20 2021-08-20 北京三星通信技术研究有限公司 Attitude detection method and apparatus, electronic device, and computer-readable storage medium
CN114727120A (en) * 2021-01-04 2022-07-08 腾讯科技(深圳)有限公司 Method and device for acquiring live broadcast audio stream, electronic equipment and storage medium
CN115834757A (en) * 2021-09-17 2023-03-21 北京小米移动软件有限公司 Data transmission method, electronic device, communication system and readable storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126168B (en) * 2016-06-16 2018-12-11 广东欧珀移动通信有限公司 A kind of sound effect treatment method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103270741A (en) * 2010-12-16 2013-08-28 摩托罗拉移动有限责任公司 System and method for adapting presentation attribute for a mobile communication device
US20130332156A1 (en) * 2012-06-11 2013-12-12 Apple Inc. Sensor Fusion to Improve Speech/Audio Processing in a Mobile Device
US20140011547A1 (en) * 2012-07-03 2014-01-09 Sony Mobile Communications Japan Inc. Terminal device, information processing method, program, and storage medium
CN103852066A (en) * 2012-11-28 2014-06-11 联想(北京)有限公司 Equipment positioning method, control method, electronic equipment and system
WO2014126991A1 (en) * 2013-02-13 2014-08-21 Vid Scale, Inc. User adaptive audio processing and applications

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9293151B2 (en) * 2011-10-17 2016-03-22 Nuance Communications, Inc. Speech signal enhancement using visual information
CN102611965A (en) * 2012-03-01 2012-07-25 广东步步高电子工业有限公司 Method for eliminating influence of distance between dual-microphone de-noising mobilephone and mouth on sending loudness of dual-microphone de-noising mobilephone
JP2013201525A (en) * 2012-03-23 2013-10-03 Mitsubishi Electric Corp Beam forming processing unit
CN103716437A (en) * 2012-09-28 2014-04-09 华为终端有限公司 Sound quality and volume control method and apparatus

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106060268A (en) * 2016-06-30 2016-10-26 维沃移动通信有限公司 Voice output method for mobile terminal and mobile terminal
CN106231046A (en) * 2016-08-03 2016-12-14 厦门傅里叶电子有限公司 Method according to grip Automatic Optimal receiver performance
CN106231046B (en) * 2016-08-03 2019-04-09 厦门傅里叶电子有限公司 According to the method for grip Automatic Optimal earpiece performance
CN106231498A (en) * 2016-09-27 2016-12-14 广东小天才科技有限公司 Method and device for adjusting microphone audio acquisition effect
WO2020062900A1 (en) * 2018-09-29 2020-04-02 华为技术有限公司 Sound processing method, apparatus and device
CN111741402A (en) * 2019-03-25 2020-10-02 比亚迪股份有限公司 Microphone noise reduction control method and device
CN111741402B (en) * 2019-03-25 2021-12-07 比亚迪股份有限公司 Microphone noise reduction control method and device
CN113284504A (en) * 2020-02-20 2021-08-20 北京三星通信技术研究有限公司 Attitude detection method and apparatus, electronic device, and computer-readable storage medium
CN114727120A (en) * 2021-01-04 2022-07-08 腾讯科技(深圳)有限公司 Method and device for acquiring live broadcast audio stream, electronic equipment and storage medium
CN114727120B (en) * 2021-01-04 2023-06-09 腾讯科技(深圳)有限公司 Live audio stream acquisition method and device, electronic equipment and storage medium
CN115834757A (en) * 2021-09-17 2023-03-21 北京小米移动软件有限公司 Data transmission method, electronic device, communication system and readable storage medium

Also Published As

Publication number Publication date
WO2015117343A1 (en) 2015-08-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160406