CN107995624A

CN107995624A - The system and method for carrying out voice data output is transmitted based on multi-path data

Info

Publication number: CN107995624A
Application number: CN201711282364.3A
Authority: CN
Inventors: 王梅
Original assignee: Individual
Current assignee: Beijing Momo Information Technology Co Ltd
Priority date: 2017-12-07
Filing date: 2017-12-07
Publication date: 2018-05-04
Anticipated expiration: 2037-12-07
Also published as: CN107995624B; CN112804683A

Abstract

The invention discloses a kind of method and system that progress voice data output is transmitted based on multi-path data, wherein method includes：Sound agent equipment receives the voice data output request from mobile terminal and determines target sound output unit；The sound agent equipment judges the Network Transmission Delays received from mobile terminal, if it is determined that the Network Transmission Delays are less than time delay threshold value, are sent to mobile terminal and be used to indicate that the mobile terminal transmits the instruction message for carrying out voice data output by multi-path data；In response to wireless communication establishment of connection, the mobile terminal enters multi-path data transmission mode.Each first output order transmitted based on sound agent equipment in other voice output units and the voice data Jing Guo noise reduction process carry out sound output, and the second output order for being transmitted according to mobile terminal of the target sound output unit and original sound data carry out sound output.

Description

The system and method for carrying out voice data output is transmitted based on multi-path data

Technical field

The present invention relates to such as network communication field, the Internet of Things communications field, data processing field, speech processes field Deng, and more particularly, to the system and method that progress voice data output is transmitted based on multi-path data.

Background technology

At present, in videoconference or on-the-spot meeting, it is desirable to which the user for carrying out voice output usually requires to utilize such as wheat The voice-input device of gram wind etc carries out phonetic entry and then passes through the voice-output device of such as loudspeaker carrying out language Sound exports.Nevertheless, it is often the case that the lazy weight of voice-input device.The lazy weight meeting of this voice-input device Certain customers are caused to wait other users to pass over voice-input device when being desired with voice output.In addition, work as Two users for it is identical the problem of carry out frequently alternately voice output when, it may be necessary between the two users frequently Ground exchanges voice-input device.

In this case, it on the one hand can cause delay during user's progress voice output, such as need waiting voice defeated Enter equipment, on the other hand will also result in user and carry out the constant of voice output, such as need to carry out the switching of voice-input device.

In addition, when also carrying out the output of multi-medium data while user wishes and carrying out voice output, the prior art In scheme can not realize this requirement.

The content of the invention

According to an aspect of the present invention, there is provided a kind of method that data transmission in network telephony is carried out based on more transmission paths, institute The method of stating includes：

User initiates sound output request via mobile terminal to sound agent equipment, defeated from sound by sonic transmissions Go out unit and receive sonic transmissions matching code and after sound agent equipment receives the response message for permitting carrying out sound output, meter Calculate network delay and the network delay is sent to sound agent equipment；

According to the instruction into feedback inhibition pattern of sound agent equipment, multipath is entered based on multi-path transmission protocol Transmission mode, is connected using first network and gives data transmission in network telephony input by user to sound agent equipment, while using Second network connection by data transmission in network telephony input by user give multiple voice output units in apart from mobile terminal locations most Near voice output unit,

Using sound agent equipment sample sound, mobile terminal position are extracted from the sound output request that mobile terminal receives Put with sonic transmissions matching code, determine whether that the user carries out sound output according to sample sound, and without considering It is described to determine whether that the user carries out to indicate multiple voice output unit middle-ranges in the case of the result of sound output Sonic transmissions matching code is sent to mobile terminal by the voice output unit nearest from mobile terminal locations by sonic transmissions；

If determining to allow the user to carry out sound output according to sample sound, response message is sent to mobile terminal And network delay will be received from mobile terminal, if the network delay is more than feedback threshold, into feedback inhibition pattern Carry out sound output；

In the feedback inhibition pattern, sound agent equipment instruction mobile terminal is entered more based on multi-path transmission protocol Path transmission pattern,

Use the sound in multiple voice output units in addition to the voice output unit nearest apart from mobile terminal locations Output unit carries out sound output based on the output order transmitted by sound agent equipment and voice data, and the distance is mobile eventually Output order and voice data of the nearest voice output unit of end position according to transmitted by mobile terminal carry out sound output.

According to an aspect of the present invention, there is provided a kind of that the side for carrying out voice data output is transmitted based on multi-path data Method, the described method includes：

User sends voice data output request, institute by the first wireless network using mobile terminal to sound agent equipment Stating voice data output request includes sample sound, the current location of mobile terminal and initial matching code；

The sound agent equipment receives the voice data output request from mobile terminal, defeated according to the voice data The sample sound gone out in request determines whether that the mobile terminal carries out voice data output, if it is determined that allows the shifting Dynamic terminal carries out voice data output, then is sent to mobile terminal and permit output message；

Current location of the sound agent equipment based on the mobile terminal in voice data output request exists The target sound output unit closest with the current location of the mobile terminal is detected in multiple voice output units, and And determine that the first sound wave passes based on the initial matching code in voice data output request and the matching random number generated at random Defeated matching code, the target sound output unit is sent to by the first sonic transmissions matching code；

Mobile terminal receives the allowance output message and determines net based on the timestamp in the allowance output message The Network Transmission Delays are sent to the sound agent equipment by network transmission delay；

The sound agent equipment judges the Network Transmission Delays received from mobile terminal, if it is determined that the net Network transmission delay is less than time delay threshold value, is sent to mobile terminal and is used to indicate that the mobile terminal is passed by multi-path data It is input into the instruction message of row voice data output；

In response to receiving instruction message, the mobile terminal is according in the initial defeated matching code and the instruction message Matching random number determine the second sonic transmissions matching code and be broadcast to the second sound wave transmission code based on acoustic communication The multiple voice output unit；

When the target sound output unit in the multiple voice output unit determines the second received sonic transmissions With code it is identical with the first sonic transmissions matching code when, wireless communication is established by second wireless network with the mobile terminal Connection；

In response to the wireless communication establishment of connection, the mobile terminal enters multi-path data transmission mode：Pass through The sound acquisition device of mobile terminal obtains original sound data input by user, and the original sound data is carried out noise reduction Processing is to generate the voice data by noise reduction process, using the first wireless network by the voice data by noise reduction process Be transferred to sound agent equipment, the sound agent equipment by cable network by the voice data by noise reduction process and First output order is transferred in multiple voice output units other sound output in addition to the target sound output unit Unit, while, original sound data and the second output order are transferred to by the mobile terminal using second wireless network The target sound output unit；And

Each first output order transmitted based on sound agent equipment in other voice output units and Voice data by noise reduction process carries out sound output, and the target sound output unit is transmitted according to mobile terminal The second output order and original sound data carry out sound output.

Voice data output request is sent to sound agent equipment by the first wireless network using mobile terminal in user Further include before：Sample sound input by user is obtained by the sound acquisition device of mobile terminal.The sample sound is to use One section of voice that family inputs in the site environment for carrying out voice output.The sample sound is for expressing the general of User Perspective The one section of voice stated.

The sample sound is for introducing the one of user identity section of voice.Pass through first using mobile terminal in user Wireless network further includes before sending voice data output request to sound agent equipment：Obtained using the positioning devices of mobile terminal Take the current location of mobile terminal.

The current location that the positioning devices using mobile terminal obtain mobile terminal includes：The positioning devices according to Indoor auxiliary positioning, outdoor auxiliary positioning and/or access point auxiliary positioning calibrate satellite location data, to obtain movement The current location of terminal.

Voice data output request is sent to sound agent equipment by the first wireless network using mobile terminal in user Further include before：MAC address generation initial matching code based on mobile terminal, or based on the hard of mobile terminal Part address generates initial matching code.

Sample sound in the output request according to the voice data determines whether that the mobile terminal carries out Voice data output includes：Speech recognition is carried out to the sample sound to generate text information, when the text information meets During the communicative habits of corresponding language, determine to allow the mobile terminal to carry out voice data output.

Wherein determine whether the text information meets the communicative habits of corresponding language based on semantics recognition.

The text information is divided into by least one sentence unit according to the punctuate symbol of text information, to each sentence Unit carries out independent semantic analysis to determine semantic fraction, and the weighted sum of the semantic fraction of each sentence unit is more than expression Determine that the text information meets the communicative habits of corresponding language during threshold value.

Sound agent equipment obtains and stores each voice output unit in the multiple voice output unit in advance Position.

Current location based on the mobile terminal and each voice output unit in the multiple voice output unit Air line distance between position determines the target sound output unit closest with the current location of the mobile terminal.

Current location based on the mobile terminal and each voice output unit in the multiple voice output unit Sonic transmissions distance between position determines the target sound output unit closest with the current location of the mobile terminal.

The matching random number for exporting the initial matching code in request based on the voice data and generating at random determines First sonic transmissions matching code includes：Initial matching code is carried out character string with the matching random number generated at random to connect to generate First sonic transmissions matching code.

First is determined based on the initial matching code in voice data output request and the matching random number generated at random Sonic transmissions matching code includes：Initial matching code and the matching random number generated at random are summed up and passed with generating the first sound wave Defeated matching code.

The matching random number for exporting the initial matching code in request based on the voice data and generating at random determines First sonic transmissions matching code includes：Based on the matching random number generated at random to the initial matching code carry out cyclic shift, Digitwise operation or step-by-step are spliced to generate the first sonic transmissions matching code.

The sample sound is used to indicate at least one in herein below：The speech intelligibility of user, the voice of user Involved language form and background sound noise intensity.

When user speech intelligibility be more than minimum requirements clarity threshold, user voice involved by language form energy In the case of being enough less than maximum allowable noise intensity by speech recognition server automatic translation and background sound noise intensity, determine The mobile terminal is allowed to carry out voice data output.

The timestamp based in the allowance output message determines that Network Transmission Delays include：Mobile terminal is based on using Network delay is determined in the timestamp of instruction sending time and the current time of mobile terminal.

If it is determined that the Network Transmission Delays are greater than or equal to time delay threshold value, are sent to mobile terminal and be used to indicate The mobile terminal can not transmit the instruction message for carrying out voice data output by multi-path data.

In response to receiving without the instruction message for transmitting progress voice data output by multi-path data to method, the movement Terminal carries out voice data output by single path data transfer.

Matching random number of the mobile terminal in the initial defeated matching code and the instruction message determines second Sonic transmissions matching code includes：Initial matching code is carried out character string with matching random number to connect to generate the second sonic transmissions With code.

Matching random number of the mobile terminal in the initial defeated matching code and the instruction message determines second Sonic transmissions matching code includes：Initial matching code and matching random number are summed up to generate the second sonic transmissions matching code.

Matching random number of the mobile terminal in the initial defeated matching code and the instruction message determines second Sonic transmissions matching code includes：Cyclic shift, digitwise operation or step-by-step are carried out to the initial matching code based on matching random number Splice to generate the second sonic transmissions matching code.

The target sound output unit is stored received at least one first sonic transmissions matching code.

In the multiple voice output unit it is each by received at least one first sonic transmissions matching code into Row storage.

It is each when receiving the second sound wave transmission code in the multiple voice output unit, second sound wave is passed Defeated code is compared with least one first sonic transmissions matching code stored.

In the multi-path data transmission mode, the mobile terminal passes through two different transmission paths or at least two A different transmission path carries out the transmission of voice data.

Wherein described original sound data is the original sound that user is inputted by the sound acquisition device of mobile terminal Data flow, and the voice data by noise reduction process is the audio data stream by noise reduction process.

It is described to give the data transmission in network telephony by noise reduction process to sound agent equipment bag using the first wireless network Include：The audio data stream by noise reduction process is transferred to sound agent equipment in real time using the first wireless network.

Original sound data and the second output order are transferred to the mesh by the mobile terminal using second wireless network Mark voice output unit includes：Original sound data stream is transferred to institute by the mobile terminal in real time using second wireless network Target sound output unit is stated, and second output order is transferred to the target sound output unit.It is wherein described First output order includes being used for the volume value for indicating output volume.Wherein described second output order includes：It is defeated for indicating Go out the volume value of volume.

The voice data by noise reduction process and first are exported by cable network in the sound agent equipment Instruction is transferred to before other voice output units in multiple voice output units in addition to the target sound output unit Further include：

The original frequency and Multisound of the voice data by noise reduction process are determined, based on pre-set benchmark The difference of frequency values and the frequency determines the frequency level of the voice data by noise reduction process, based on described frequency etc. The definite tone color weighted factor of level, determines the spectrum curve of the Multisound and according to the spectrum curve and default tone color The similarity of normal line determines the initial tone color fraction of the voice data by noise reduction process, based on the tone color weighting because Sub and initial tone color fraction determines the tone color grade of the voice data by noise reduction process；Determine described to pass through noise reduction process Voice data original sound volume, volume weighted factor is determined based on the tone color grade, and based on volume weighting because Son and original sound volume determine output volume.The pre-set reference frequency value is 100Hz, 120Hz or 150Hz.

Determine that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, calculate the difference divided by The integer, is determined as frequency of the voice data by noise reduction process etc. by the integer in the result of value (such as 10) Level, is determined as tone color weighted factor by the absolute value of the frequency level.

The initial tone color fraction includes：25 points, 26 points, 27 points, 28 points and 29 points.

It is described that the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction Tone color grade include：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor and passes through noise reduction as described The tone color grade of the voice data of processing, and by the tone color grade.Volume weighted factor is determined based on the tone color grade Including：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are determined as tone color weighted factor.

It is described to determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original sound volume × (volume weighted factor+1).First output order includes being used for the volume value for indicating the output volume.Described second Output order includes being used for the volume value for indicating original sound volume.

Each first output order transmitted based on sound agent equipment in other voice output units and Voice data by noise reduction process, which carries out sound output, to be included：Each in other voice output units is based on sound Volume value in the first output order that agent equipment is transmitted and the voice data Jing Guo noise reduction process carry out sound output.

The second output order and original sound data that the target sound output unit is transmitted according to mobile terminal into The output of row sound includes：Volume value in the second output order that the target sound output unit is transmitted according to mobile terminal Sound output is carried out with original sound data.

The sound agent equipment is referred to the voice data by noise reduction process and the first output by cable network Order is transferred to before other voice output units in multiple voice output units in addition to the target sound output unit also Including：

It is described to determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original sound volume × (1- volumes weighted factor).First output order includes being used for the volume value for indicating the output volume.Described second Output order includes being used for the volume value for indicating original sound volume.

Determine that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, calculate the difference divided by The integer, is determined as frequency of the voice data by noise reduction process etc. by the integer in the result of value (such as 10) Level, is determined as tone color weighted factor by the absolute value of the frequency level.The initial tone color fraction includes：25 points, 26 points, 27 Divide, 28 points and 29 points.It is described determined based on the tone color weighted factor and initial tone color fraction it is described by noise reduction process The tone color grade of voice data includes：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor as described in The tone color grade of voice data by noise reduction process, and by the tone color grade.Volume is determined based on the tone color grade Weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are determined as tone color weighting The factor.

It is described to determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original sound volume × (1+ volumes weighted factor).First output order includes：For the volume value for indicating the output volume and the shifting The current location of dynamic terminal.Second output order includes being used for the volume value for indicating original sound volume.

Each first output order transmitted based on sound agent equipment in other voice output units and Voice data by noise reduction process, which carries out sound output, to be included：Each in other voice output units is based on sound The current location of mobile terminal in the first output order that agent equipment is transmitted determine with the straight line of the mobile terminal away from From the percentages corresponding to numerical value for obtaining the air line distance divided by 1000 are determined as the distance weighted factor, are based on The volume value of the distance weighted factor and the first output order calculates actual volume value, and according to actual volume value and process The voice data of noise reduction process carries out sound output；Wherein actual volume value=volume value × (the distance weighted factors of 1+).

Using second wireless network original sound data and the second output order are transferred in the mobile terminal described Further included before target sound output unit：Second output order include be used for indicate output volume volume value and The Network Transmission Delays.

The second output order and original sound data that the target sound output unit is transmitted according to mobile terminal into The output of row sound includes：Volume value in the second output order that the target sound output unit is transmitted according to mobile terminal And the sound postponed with the time of Network Transmission Delays to the original sound data exports, so that the target sound Sound output unit and each in other voice output units when carrying out sound output can the retention time it is consistent.

When current location of the sound agent equipment based on the mobile terminal in voice data output request Determine that at least two sound output closest with the current location of the mobile terminal is single in multiple voice output units When first, a voice output unit is randomly choosed from least two voice output units as target sound output unit.

When current location of the sound agent equipment based on the mobile terminal in voice data output request At least two sound output closest with the current location of the mobile terminal is detected in multiple voice output units During unit, the description information of at least two voice output unit is sent to the mobile terminal and in response to user's Message is selected to determine target sound output unit from least two voice output units.

When current location of the sound agent equipment based on the mobile terminal in voice data output request is more At least two voice output units closest with the current location of the mobile terminal are detected in a voice output unit When, at least two voice output unit with sound agent equipment target sound will be determined apart from farthest voice output unit Sound output unit.

When current location of the sound agent equipment based on the mobile terminal in voice data output request is more At least two voice output units closest with the current location of the mobile terminal are detected in a voice output unit When, voice output unit closest with sound agent equipment at least two voice output unit is determined into target sound Sound output unit.

According to an aspect of the present invention, there is provided a kind of system that data transmission in network telephony is carried out based on more transmission paths, institute The system of stating includes：

Mobile terminal, sound output request is initiated to sound agent equipment, by sonic transmissions from voice output unit Receive sonic transmissions matching code and after sound agent equipment receives the response message for permitting carrying out sound output, calculating network Postpone and the network delay is sent to sound agent equipment；

In response to the instruction into feedback inhibition pattern of sound agent equipment, multichannel is entered based on multi-path transmission protocol Footpath transmission mode, is connected using first network and gives data transmission in network telephony input by user to sound agent equipment, while making With the second network connection by data transmission in network telephony input by user give multiple voice output units in apart from mobile terminal locations Nearest voice output unit,

Sound agent equipment, sample sound, mobile terminal locations are extracted from the sound output request that mobile terminal receives With sonic transmissions matching code, determine whether that the user carries out sound output according to sample sound, and without considering institute State and determine whether that the user carries out to indicate distance in multiple voice output units in the case of the result of sound output Sonic transmissions matching code is sent to mobile terminal by the nearest voice output unit of mobile terminal locations by sonic transmissions；

If determining to allow the user to carry out sound output according to sample sound, response message is sent to mobile terminal And network delay will be received from mobile terminal, if network delay is more than feedback threshold, will be carried out into feedback inhibition pattern Sound exports；

In feedback inhibition pattern, sound agent equipment instruction mobile terminal enters multipath based on multi-path transmission protocol Transmission mode,

Multiple voice output units, except the sound nearest apart from mobile terminal locations is defeated in the multiple voice output unit The voice output unit gone out outside unit is defeated based on the output order transmitted by sound agent equipment and voice data progress sound Go out, output order and sound number of the voice output unit nearest apart from mobile terminal locations according to transmitted by mobile terminal According to progress sound output.

According to an aspect of the present invention, there is provided a kind of to be based on what multi-path data transmitted progress voice data output System, the system comprises：

Mobile terminal, being sent by the first wireless network to sound agent equipment please by voice data input by user output Ask, the voice data output request includes sample sound, the current location of mobile terminal and initial matching code；In response to receiving To instruction message, matching random number of the mobile terminal in the initial defeated matching code and the instruction message determines the Two sonic transmissions matching codes and the second sound wave transmission code is broadcast to by multiple voice output units based on acoustic communication；

Sound agent equipment, receives the voice data output request from mobile terminal, is exported according to the voice data Sample sound in request determines whether that the mobile terminal carries out voice data output, if it is determined that allows the movement Terminal carries out voice data output, then is sent to mobile terminal and permit output message；Based in voice data output request The mobile terminal current location detected in multiple voice output units with the current location of the mobile terminal away from From nearest target sound output unit, and based on the initial matching code in voice data output request and random generation Matching random number determine the first sonic transmissions matching code, the first sonic transmissions matching code is sent to the target sound Output unit；

Wherein, the mobile terminal received the allowance output message and based on the time in the allowance output message Stamp determines Network Transmission Delays, and the Network Transmission Delays are sent to the sound agent equipment；Sound agent equipment to from The Network Transmission Delays that mobile terminal receives are judged, if it is determined that the Network Transmission Delays are less than time delay threshold value, Sent to mobile terminal and be used to indicate that the mobile terminal is disappeared by the instruction of multi-path data transmission progress voice data output Breath；

Multiple voice output units, when the target sound output unit in the multiple voice output unit determines to be received The second sonic transmissions matching code it is identical with the first sonic transmissions matching code when, it is wireless by second with the mobile terminal Network establishes wireless communication connection；

Wherein enter multi-path data transmission mode in response to the wireless communication establishment of connection, the mobile terminal： Original sound data input by user is obtained by the sound acquisition device of mobile terminal, the original sound data is carried out Noise reduction process is to generate the voice data by noise reduction process, using the first wireless network by the sound by noise reduction process Data are transferred to sound agent equipment, and the sound agent equipment is by cable network by the sound number by noise reduction process According to the other sound being transferred to the first output order in multiple voice output units in addition to the target sound output unit Output unit, while, the mobile terminal is passed original sound data and the second output order using second wireless network It is defeated by the target sound output unit；And

Further include：Sample sound input by user is obtained by the sound acquisition device of mobile terminal.The sample sound It is one section of voice that user inputs in the site environment for carrying out voice output.The sample sound is to be used to express User Perspective General introduction one section of voice.The sample sound is for introducing the one of user identity section of voice.Further include：Using movement eventually The positioning devices at end obtain the current location of mobile terminal.

Further include：MAC address generation initial matching code based on mobile terminal, or based on mobile terminal Hardware address generation initial matching code.The sound agent equipment carries out speech recognition to generate word to the sample sound Information, when the text information meets the communicative habits of corresponding language, determines to allow the mobile terminal to carry out voice data Output.

The sound agent equipment determines whether the text information meets the expression of corresponding language based on semantics recognition Custom.

The text information is divided at least one sentence by the sound agent equipment according to the punctuate symbol of text information Subelement, carries out each sentence unit independent semantic analysis to determine semantic fraction, by the semanteme of each sentence unit point Several weighted sums determines that the text information meets the communicative habits of corresponding language when being more than expression threshold value.The sound agency sets It is standby to obtain in advance and store the position of each voice output unit in the multiple voice output unit.The sound agency sets The position of the standby current location based on the mobile terminal and each voice output unit in the multiple voice output unit it Between air line distance determine the target sound output unit closest with the current location of the mobile terminal.The sound generation Manage current location of the equipment based on the mobile terminal and the position of each voice output unit in the multiple voice output unit Sonic transmissions distance between putting determines the target sound output unit closest with the current location of the mobile terminal.Institute Sound agent equipment is stated to connect initial matching code to generate the first sound wave with the matching random number progress character string generated at random Transmit matching code.The sound agent equipment sums up initial matching code and the matching random number generated at random to generate One sonic transmissions matching code.The sound agent equipment carries out the initial matching code based on the matching random number generated at random Cyclic shift, digitwise operation or step-by-step are spliced to generate the first sonic transmissions matching code.The sample sound is used to indicate following It is at least one in content：Language form and background sound noise involved by the speech intelligibility of user, the voice of user is strong Degree.

When user speech intelligibility be more than minimum requirements clarity threshold, user voice involved by language form energy It is described in the case of being enough less than maximum allowable noise intensity by speech recognition server automatic translation and background sound noise intensity Sound agent equipment determines to allow the mobile terminal to carry out voice data output.

The mobile terminal is based on for indicating that the timestamp of sending time and the current time of mobile terminal determine network Delay.

If it is determined that the Network Transmission Delays are greater than or equal to time delay threshold value, the sound agent equipment is to movement Terminal, which is sent, to be used to indicate that the mobile terminal can not transmit the instruction message for carrying out voice data output by multi-path data.

The instruction message for carrying out voice data output, the movement can not be transmitted by multi-path data in response to receiving Terminal carries out voice data output by single path data transfer.

Initial matching code is carried out character string with matching random number and connected to generate the second sonic transmissions by the mobile terminal Matching code.The mobile terminal sums up initial matching code and matching random number to generate the second sonic transmissions matching code. The mobile terminal is based on matching random number and cyclic shift, digitwise operation or step-by-step splicing is carried out to the initial matching code with life Into the second sonic transmissions matching code.The target sound output unit matches received at least one first sonic transmissions Code is stored.It is each by received at least one first sonic transmissions matching code in the multiple voice output unit Stored.It is each when receiving the second sound wave transmission code in the multiple voice output unit, by second sound wave Transmission code is compared with least one first sonic transmissions matching code stored.In the multi-path data transmission mode In, the mobile terminal carries out the biography of voice data by the different transmission path of two different transmission paths or at least two It is defeated.Wherein described original sound data is the original sound data that user is inputted by the sound acquisition device of mobile terminal Stream, and the voice data by noise reduction process is the audio data stream by noise reduction process.The mobile terminal uses The audio data stream by noise reduction process is transferred to sound agent equipment by the first wireless network in real time.

Original sound data stream is transferred to the target sound by the mobile terminal in real time using second wireless network Output unit, and second output order is transferred to the target sound output unit.Wherein described first output refers to Order includes being used for the volume value for indicating output volume.Wherein described second output order includes：For indicating the sound of output volume Value.

Further include the sound agent equipment and determine the original frequency of the voice data by noise reduction process and original Tone color, the difference based on pre-set reference frequency value and the frequency determine the voice data by noise reduction process Frequency level, tone color weighted factor is determined based on the frequency level, determine the Multisound spectrum curve and according to The similarity of the spectrum curve and default tone color normal line determines the initial sound of the voice data by noise reduction process Color fraction, the tone color of the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction Grade；Determine the original sound volume of the voice data by noise reduction process, based on the tone color grade determine volume weighting because Son, and output volume is determined based on the volume weighted factor and original sound volume.The pre-set reference frequency value is 100Hz, 120Hz or 150Hz.

The sound agent equipment determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, meter The integer in the difference divided by the result of spacing value (such as 10) is calculated, the integer is determined as described by noise reduction process The frequency level of voice data, is determined as tone color weighted factor by the absolute value of the frequency level.

It is described that the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction Tone color grade include：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor and passes through noise reduction as described The tone color grade of the voice data of processing, and by the tone color grade.

Determine that volume weighted factor includes based on the tone color grade：By the tone color grade divided by 100 obtained numerical value Corresponding percentages are determined as tone color weighted factor.

Further include the sound agent equipment and determine the original frequency of the voice data by noise reduction process and original Tone color, the difference based on pre-set reference frequency value and the frequency determine the voice data by noise reduction process Frequency level, tone color weighted factor is determined based on the frequency level, determine the Multisound spectrum curve and according to The similarity of the spectrum curve and default tone color normal line determines the initial sound of the voice data by noise reduction process Color fraction, the tone color of the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction Grade；Determine the original sound volume of the voice data by noise reduction process, based on the tone color grade determine volume weighting because Son, and output volume is determined based on the volume weighted factor and original sound volume.

The pre-set reference frequency value is 100Hz, 120Hz or 150Hz.

Each first output order transmitted based on sound agent equipment in other voice output units and Voice data by noise reduction process, which carries out sound output, to be included：Each in other voice output units is based on sound The current location of mobile terminal in the first output order that agent equipment is transmitted determine with the straight line of the mobile terminal away from From the percentages corresponding to numerical value for obtaining the air line distance divided by 1000 are determined as the distance weighted factor, are based on The volume value of the distance weighted factor and the first output order calculates actual volume value, and according to actual volume value and process The voice data of noise reduction process carries out sound output；

Wherein actual volume value=volume value × (the distance weighted factors of 1+).

The pre-set reference frequency value is 100Hz, 120Hz or 150Hz.

Further include, the mobile terminal makes to include being used for the volume value for indicating output volume in second output order With the Network Transmission Delays.

When current location of the sound agent equipment based on the mobile terminal in voice data output request At least two sound output closest with the current location of the mobile terminal is detected in multiple voice output units During unit, voice output unit closest with sound agent equipment at least two voice output unit is determined into mesh Mark voice output unit.

Brief description of the drawings

By reference to the following drawings, the illustrative embodiments of the present invention can be more fully understood by：

Fig. 1 a, 1b and 1c are the system that data transmission in network telephony is carried out based on more transmission paths according to embodiment of the present invention Structure diagram；

Fig. 2 is the flow according to the method that data transmission in network telephony is carried out based on more transmission paths of embodiment of the present invention Figure；

Fig. 3 is the flow chart according to the method for the definite output volume of embodiment of the present invention；

Fig. 4 is the flow chart according to the method for the definite output volume of another embodiment of the present invention；

Fig. 5 is the flow chart according to the method that delay output is carried out to voice data of embodiment of the present invention；And

Fig. 6 be set the goal really according to embodiment of the present invention voice output unit method flow chart.

Embodiment

The illustrative embodiments of the present invention are introduced referring now to attached drawing, however, the present invention can use many different shapes Formula is implemented, and is not limited to the embodiment described herein, there is provided these embodiments are to disclose at large and fully The present invention, and fully pass on the scope of the present invention to person of ordinary skill in the field.Show for what is be illustrated in the accompanying drawings Term in example property embodiment is not limitation of the invention.In the accompanying drawings, identical cells/elements use identical attached Icon is remembered.

Unless otherwise indicated, term (including scientific and technical terminology) used herein has person of ordinary skill in the field It is common to understand implication.Further it will be understood that the term limited with usually used dictionary, be appreciated that and its The linguistic context of association area has consistent implication, and is not construed as Utopian or overly formal meaning.

Fig. 1 a, 1b and 1c are the system that data transmission in network telephony is carried out based on more transmission paths according to embodiment of the present invention 100 structure diagram.System 100 includes：Sound agent equipment 101, voice output unit 102-1,102-2,102-3, 102-4,102-5,102-6,102-7,102-8,102-9,102-10,102-11,102-12, and mobile terminal 103.Such as figure Described in 1a, multiple voice output unit 102-1,102-2,102-3,102-4,102-5,102-6,102-7,102-8,102-9, 102-10,102-11 and 102-12 are disposed in meeting-place, and each voice output unit is used for according to the sound received Sound data (for example, audio data stream) carry out sound output.In general, voice output unit can be determined according to the area in meeting-place Quantity and the position of each voice output unit can be determined according to the general layout in meeting-place.Preferably, each sound output Unit can be communicated by cable network with sound agent equipment 101, and can pass through various types of wireless networks Communicate with user equipment (for example, mobile terminal 103).Each voice output unit has can be to voice data or sound Data flow is handled and the ability of sound output can be carried out according to the output order received.

As shown in Figure 1 b, (that is, the user of mobile terminal 103 can for the audience area that mobile terminal 103 can be in meeting-place Position to stand or be seated) in any suitable position at.It will be appreciated that can have multiple mobile terminals in meeting-place. For the sake of clarity, the application is illustrated by taking mobile terminal 103 as an example.As shown in Figure 1 b, mobile terminal 103 and output unit 102-6's is closest, and with the distances of other output units to varying degrees farther out.

As illustrated in figure 1 c, mobile terminal 103 and output unit 102-2,102-3 and 102-6's is closest and equal, And with the distances of other output units to varying degrees farther out.It will be appreciated that the application is with 103 distance three of mobile terminal Illustrated exemplified by the distance of a output unit is equal.In fact, the quantity with 103 equidistant output unit of mobile terminal It can be any reasonable value.

Mobile terminal 103 initiates sound output request to sound agent equipment 101.Exported by sonic transmissions from sound Unit 102-2,102-3 and 102-6, which receive sonic transmissions matching code and received from sound agent equipment 101, to be permitted carrying out sound After the response message of output, 103 calculating network of mobile terminal postpones and the network delay is sent to sound agent equipment 101.The instruction into feedback inhibition pattern sent in response to sound agent equipment 101, mobile terminal 103 are passed based on multipath Defeated agreement enters multi-path transmission pattern.Wherein multi-path transmission pattern refers to mobile terminal by least two paths to transmit Voice data or audio data stream etc. are in a manner of carrying out sound output.Mobile terminal 103 is connected user using first network The data transmission in network telephony of input is to sound agent equipment 101, and mobile terminal 103 uses the second network connection by user at the same time The data transmission in network telephony of input give multiple voice output units in the voice output unit (example nearest apart from mobile terminal locations Such as one in output unit 102-2,102-3 and 102-6 in output unit 102-6 or Fig. 1 c in Fig. 1 b).

Sound agent equipment 101 extracts sample sound, mobile terminal from the sound output request that mobile terminal 103 receives Position and sonic transmissions matching code, and determine whether that the user carries out sound output according to sample sound, and Without considering it is described determine whether using mobile terminal 103 user carry out sound output result in the case of will instruction away from Sonic transmissions matching code is sent to mobile terminal by the voice output unit nearest from mobile terminal locations by sonic transmissions 103。

If sound agent equipment 101 determines to allow the user to carry out sound output according to sample sound, to movement Terminal 103 sends response message and receives network delay from mobile terminal 103, if the network delay is more than feedback threshold It is worth, then sound agent equipment 101 enters the progress sound output of feedback inhibition pattern.In the feedback inhibition pattern, sound generation Reason equipment 101 indicates that mobile terminal 103 enters multi-path transmission pattern based on multi-path transmission protocol.

Voice output unit 102-1,102-2,102-3,102-4,102-5,102-6,102-7,102-8,102-9, Sound in 102-10,102-11 and 102-12 in addition to the voice output unit nearest apart from mobile terminal locations is defeated Go out unit and sound output is carried out based on the output order transmitted by sound agent equipment 101 and voice data.Apart from mobile terminal Output order and voice data of the nearest voice output unit in position according to transmitted by mobile terminal 103 carry out sound output.

According to the embodiment of the present invention, system 100 is based on multi-path data transmission progress voice data output and wraps Include：Sound agent equipment 101, voice output unit 102-1,102-2,102-3,102-4,102-5,102-6,102-7,102- 8th, 102-9,102-10,102-11,102-12, and mobile terminal 103.Mobile terminal 103 passes through the first wireless network (example Such as, wide area wireless communication network, 3G, 4G or 5G etc.) send to sound agent equipment 101 and exported by voice data input by user Request.Voice data output request includes sample sound, the current location of mobile terminal 103 and initial matching code.Wherein user The sample sound of user's input or typing can be obtained by the sound acquisition device (for example, microphone) of mobile terminal 103.Sound One section of voice that sound sample can be user to be inputted in the site environment for carrying out voice output, for expressing the general of User Perspective The one section of voice stated and one section of voice for introducing user identity etc..Sample sound be for determining whether permit user into The important evidence of row voice data output.

The application obtains the current location of mobile terminal 103 using the positioning devices of mobile terminal 103.Specifically, it is described Positioning devices carry out school according to indoor auxiliary positioning, outdoor auxiliary positioning and/or access point auxiliary positioning to satellite location data Standard, to obtain the current location of mobile terminal 103.In general, can by the GPS chip or Big Dipper chip of mobile terminal 103 come Obtain the positional information of user.Then, the application can be according to the outdoor auxiliary positioning of communication network (for example, meeting-place is outdoor Meeting-place), indoor auxiliary positioning (for example, meeting-place is indoor meeting-place) and/or access point auxiliary positioning be (for example, there are nothing in meeting-place The access point apparatus of gauze network) above-mentioned positional information is calibrated.

MAC address generation initial matching code of the application based on mobile terminal 103, or based on mobile whole The hardware address generation initial matching code at end 103.For example, the application can by the media access control MAC of mobile terminal 103 All or part of content (character string) of location is determined as initial matching code, or by the whole of the hardware address of mobile terminal 103 or Partly (character string) content determines initial matching code.

Sound agent equipment 101 receives the voice data output request from mobile terminal 103, and according to the sound Sample sound in data output request determines whether that the mobile terminal 103 carries out voice data output.Sound is acted on behalf of Equipment 101 carries out the sample sound speech recognition to generate text information, when the text information meets corresponding language During communicative habits (for example, expressed implication meets the requirement of language substance), determine to allow the mobile terminal 103 to carry out Voice data exports.The sound agent equipment 101 determines whether the text information meets corresponding language based on semantics recognition The communicative habits of speech.

The text information is divided at least one by the sound agent equipment 101 according to the punctuate symbol of text information Sentence unit, carries out each sentence unit independent semantic analysis to determine semantic fraction, by the semanteme of each sentence unit The weighted sum of fraction determines that the text information meets the communicative habits of corresponding language when being more than expression threshold value.By in sentence unit The quantity of word is determined as the weight of sentence unit.For example, text information includes sentence unit A and B.Sentence unit A includes 5 Chinese character and sentence unit B includes 10 Chinese characters.The semantic fraction of sentence unit A is 9 points (full marks 10 divide, minimum 0 point), and The semantic fraction of sentence unit B is 8 points.The weight of sentence unit A is 5/ (5+10)=1/3, and the weight of sentence unit B is 10/ (5+10)=2/3.The semantic fraction of text information be sentence unit A and B semantic fraction weighted sum, i.e. 9* (1/3)+ 8* (2/3)=8.33.It is any reasonable such as can be 7,7.5,8 or 8.5 to express threshold value (being more than 0 and less than or equal to 10) Numerical value.

The sample sound is used to indicate at least one in herein below：The speech intelligibility of user, the voice of user Involved language form and background sound noise intensity.When user speech intelligibility be more than minimum requirements clarity threshold, Language form involved by the voice of user can be less than by speech recognition server automatic translation and background sound noise intensity In the case of maximum allowable noise intensity, the sound agent equipment 101 determines to allow the mobile terminal 103 to carry out sound number According to output

If it is determined that allowing the mobile terminal 103 to carry out voice data output, then sound agent equipment 101 passes through first Wireless network is sent to mobile terminal 103 permits output message.In addition, sound agent equipment 101 is defeated based on the voice data The current location of the mobile terminal 103 gone out in request detects and the mobile terminal in multiple voice output units The closest target sound output unit in 103 current location (such as exported in output unit 102-6 or Fig. 1 c in Fig. 1 b One in unit 102-2,102-3 and 102-6).Sound agent equipment 101 obtains in advance and to store the multiple sound defeated Go out the position of each voice output unit in unit.Present bit of the sound agent equipment 101 based on the mobile terminal 103 The air line distance put between the position of each voice output unit in the multiple voice output unit determines and the movement The closest target sound output unit in the current location of terminal 103.Alternatively, the sound agent equipment 101 is based on In the current location of the mobile terminal 103 and the multiple voice output unit between the position of each voice output unit Sonic transmissions distance determines the target sound output unit closest with the current location of the mobile terminal 103.This In the case of, when between specific sound output unit and mobile terminal 103 there are when barrier (for example, pillar), with sonic transmissions Beeline as the distance between specific sound output unit and mobile terminal 103.

When the sound agent equipment 101 working as based on the mobile terminal 103 in voice data output request Front position determines at least two sound closest with the current location of the mobile terminal 103 in multiple voice output units During sound output unit, a voice output unit is randomly choosed from least two voice output units as target sound output Unit.When present bit of the sound agent equipment 101 based on the mobile terminal 103 in voice data output request Put and at least two sound closest with the current location of the mobile terminal 103 are detected in multiple voice output units During output unit, the description information of at least two voice output unit is sent to the mobile terminal 103 and is responded Target sound output unit is determined from least two voice output units in the selection message of user.When the sound, agency sets Standby 101 current locations based on the mobile terminal 103 in voice data output request are in multiple voice output units In when detecting at least two voice output unit closest with the current location of the mobile terminal 103, will described in extremely Determine that target sound output is single apart from farthest voice output unit with sound agent equipment 101 in few two voice output units Member.When current location of the sound agent equipment 101 based on the mobile terminal 103 in voice data output request Detect that at least two sound closest with the current location of the mobile terminal 103 are defeated in multiple voice output units When going out unit, by voice output unit closest with sound agent equipment 101 at least two voice output unit Determine target sound output unit.

Sound agent equipment 101 is based on the initial matching code in voice data output request and the matching generated at random Random number determines the first sonic transmissions matching code, and the first sonic transmissions matching code is sent to the target sound exports list Member.The target sound output unit is stored received at least one first sonic transmissions matching code.Due to meeting There are multiple mobile terminals in, there may be multiple first sonic transmissions matching codes and the multiple sound output list for this Each in member may receive at least one first sonic transmissions matching code.For this reason, in multiple voice output units Each received at least one first sonic transmissions matching code is stored.

Initial matching code is carried out character string with the matching random number generated at random and connected to generate by sound agent equipment 101 First sonic transmissions matching code.For example, initial matching code is 406188963D56 and matches random number 25, then the first sound wave Transmission matching code is 406188963D5625.Alternatively, sound agent equipment 101 by initial matching code and the matching generated at random with Machine number is summed up to generate the first sonic transmissions matching code.For example, initial matching code be 406188963D56 and match with Machine number 25, then the first sonic transmissions matching code is 406188963D81.Alternatively, sound agent equipment 101 is based on random The matching random number of generation carries out the initial matching code cyclic shift, digitwise operation or step-by-step splicing to generate the first sound wave Transmit matching code.For example, initial matching code is 1,001 1,010 1101 and matches random number 2, then the first sonic transmissions Can be initial matching code ring shift right 2 two with code, i.e., 0,110 0,110 1011.For example, initial matching code is 1,001 1010 1101 and random number is matched as 1,001 1,001 1001, then the first sonic transmissions matching code can be step-by-step or computing, i.e., 1001 1011 1101.For example, initial matching code is 1101 and matches random number as 0010, then the first sonic transmissions match Code can be step-by-step carry out alternative splicing, i.e., 10100110, wherein the 1st, 3,5 and 7 of the first sonic transmissions matching code comes from Initial matching code, and Self Matching random number is carried out in the 2nd, 4,6 and 8.

Wherein, the mobile terminal 103 receives the allowance output message and based in the allowance output message Timestamp determines Network Transmission Delays.The mobile terminal 103 is based on the timestamp and mobile terminal for indicating sending time 103 current time determines network delay.The Network Transmission Delays are sent to the sound agent equipment by mobile terminal 103 101.Sound agent equipment 101 judges the Network Transmission Delays received from mobile terminal 103, if it is determined that the network Transmission delay is less than time delay threshold value, is sent to mobile terminal 103 and is used to indicate that the mobile terminal 103 passes through multipath number The instruction message of voice data output is carried out according to transmission.If it is determined that the Network Transmission Delays are greater than or equal to time delay threshold Value, the sound agent equipment 101 is sent to mobile terminal 103 to be used to indicate that the mobile terminal 103 can not pass through multipath Data transfer carries out the instruction message of voice data output.Sound is carried out in response to receiving not transmitting by multi-path data The instruction message of data output, the mobile terminal 103 carry out voice data output by single path data transfer.It is i.e. mobile whole End 103 is only communicated by the first wireless network with sound agent equipment 101, for example, by voice data or audio data stream Send sound agent equipment 101 to.

In response to receiving instruction message, mobile terminal 103 is according in the initial defeated matching code and the instruction message Matching random number determine the second sonic transmissions matching code and be broadcast to the second sound wave transmission code based on acoustic communication Multiple voice output units.Initial matching code is carried out character string with matching random number and connected to generate by the mobile terminal 103 Second sonic transmissions matching code.Alternatively, mobile terminal 103 sums up initial matching code and matching random number to generate second Sonic transmissions matching code.Alternatively, the mobile terminal 103 is based on matching random number carries out circulation shifting to the initial matching code Position, digitwise operation or step-by-step are spliced to generate the second sonic transmissions matching code.The generating mode of second sonic transmissions matching code with The generating mode of first sonic transmissions matching code as described above is identical, therefore repeats no more.

Multiple voice output units, when the target sound output unit in the multiple voice output unit determines to be received The second sonic transmissions matching code it is identical with the first sonic transmissions matching code when, pass through second with the mobile terminal 103 Wireless network establishes wireless communication connection.In addition, in the multiple voice output unit it is each receiving the second sound wave pass During defeated code, by the second sound wave transmission code compared with least one first sonic transmissions matching code stored.

Wherein in response to the wireless communication establishment of connection, the mobile terminal 103 enters multi-path data transmission mould Formula.Wherein multi-path transmission pattern refers to mobile terminal by least two paths to transmit voice data or audio data stream etc. In a manner of carrying out sound output.I.e. in the multi-path data transmission mode, the mobile terminal 103 passes through two differences Transmission path or at least two different transmission paths carry out the transmission of voice datas.Obtained by the sound of mobile terminal 103 Device is taken to obtain original sound data input by user, the original sound data is subjected to noise reduction process to generate by drop Make an uproar the voice data of processing.Wherein described original sound data is that user is defeated by the sound acquisition device institute of mobile terminal 103 The original sound data stream entered, and the voice data by noise reduction process is the audio data stream by noise reduction process.

Using the first wireless network by the data transmission in network telephony by noise reduction process to sound agent equipment 101, institute Sound agent equipment 101 is stated to be transferred to the voice data by noise reduction process and the first output order by cable network Other voice output units in multiple voice output units in addition to the target sound output unit, while, it is described Original sound data and the second output order are transferred to the target sound using second wireless network and exported by mobile terminal 103 Unit.Wherein the mobile terminal 103 using the first wireless network by the audio data stream by noise reduction process in real time It is transferred to sound agent equipment 101.The mobile terminal 103 is passed original sound data stream using second wireless network in real time The target sound output unit is defeated by, and second output order is transferred to the target sound output unit.Its Described in the first output order include being used to indicate the volume value of output volume.Wherein described second output order includes：For Indicate the volume value of output volume.

Each first output order transmitted based on sound agent equipment 101 in other voice output units Sound output is carried out with the voice data Jing Guo noise reduction process, and the target sound output unit is according to mobile terminal 103 The second output order and original sound data transmitted carries out sound output.

Preferably, sound agent equipment 101 determines the original frequency of the voice data by noise reduction process and original Tone color, the difference based on pre-set reference frequency value and the frequency determine the voice data by noise reduction process Frequency level, tone color weighted factor is determined based on the frequency level, determine the Multisound spectrum curve and according to The similarity of the spectrum curve and default tone color normal line determines the initial sound of the voice data by noise reduction process Color fraction, the tone color of the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction Grade；Determine the original sound volume of the voice data by noise reduction process, based on the tone color grade determine volume weighting because Son, and output sound is determined based on the volume weighted factor and original sound volume.Wherein pre-set reference frequency value is 100Hz, 120Hz or 150Hz.

Sound agent equipment 101 determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, meter The integer in the difference divided by the result of spacing value (such as 10) is calculated, the integer is determined as described by noise reduction process The frequency level of voice data, is determined as tone color weighted factor by the absolute value of the frequency level.Wherein initial tone color fraction Including：25 points, 26 points, 27 points, 28 points and 29 points.The warp is determined based on the tone color weighted factor and initial tone color fraction Crossing the tone color grade of the voice data of noise reduction process includes：It is obtained that initial tone color fraction is subtracted into the tone color weighted factor Tone color grade of the difference as the voice data by noise reduction process, and by the tone color grade.Wherein based on described Tone color grade determines that volume weighted factor includes：By the percentage number corresponding to the tone color grade divided by 100 obtained numerical value Value is determined as tone color weighted factor and wherein determines that output volume includes based on the volume weighted factor and original sound volume：Export sound Amount=original sound volume × (volume weighted factor+1).

First output order includes being used for the volume value for indicating the output volume, and the second output order includes being used for Indicate the volume value of original sound volume.Each in other voice output units transmitted based on sound agent equipment 101 One output order and voice data Jing Guo noise reduction process, which carry out sound output, to be included：It is every in other voice output units Volume value in one the first output order transmitted based on sound agent equipment 101 and the voice data Jing Guo noise reduction process Carry out sound output.The second output order and original sound that the target sound output unit is transmitted according to mobile terminal 103 Sound data, which carry out sound output, to be included：The second output that the target sound output unit is transmitted according to mobile terminal 103 refers to Volume value and original sound data in order carry out sound output.

Sound agent equipment 101 determines the original frequency and Multisound of the voice data by noise reduction process, base Frequency of the voice data by noise reduction process etc. is determined in the difference of pre-set reference frequency value and the frequency Level, determines tone color weighted factor based on the frequency level, determines the spectrum curve of the Multisound and according to the frequency The similarity of spectral curve and default tone color normal line determines the initial tone color fraction of the voice data by noise reduction process, The tone color grade of the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction；Really The original sound volume of the fixed voice data by noise reduction process, volume weighted factor is determined based on the tone color grade, and Output volume is determined based on the volume weighted factor and original sound volume.The pre-set reference frequency value for 100Hz, 120Hz or 150Hz.

Sound agent equipment 101 determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, meter The integer in the difference divided by the result of spacing value (such as 10) is calculated, the integer is determined as described by noise reduction process The frequency level of voice data, is determined as tone color weighted factor by the absolute value of the frequency level.The initial tone color fraction Including：25 points, 26 points, 27 points, 28 points and 29 points.The warp is determined based on the tone color weighted factor and initial tone color fraction Crossing the tone color grade of the voice data of noise reduction process includes：It is obtained that initial tone color fraction is subtracted into the tone color weighted factor Tone color grade of the difference as the voice data by noise reduction process, and by the tone color grade.Based on the tone color Grade determines that volume weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are true It is set to tone color weighted factor.It is described to determine that output volume includes based on the volume weighted factor and original sound volume：Output volume =original sound volume × (1- volumes weighted factor).

Sound agent equipment 101 determines the original frequency and Multisound of the voice data by noise reduction process, base Frequency of the voice data by noise reduction process etc. is determined in the difference of pre-set reference frequency value and the frequency Level, determines tone color weighted factor based on the frequency level, determines the spectrum curve of the Multisound and according to the frequency The similarity of spectral curve and default tone color normal line determines the initial tone color fraction of the voice data by noise reduction process, The tone color grade of the voice data by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction；Really The original sound volume of the fixed voice data by noise reduction process, volume weighted factor is determined based on the tone color grade, and Output volume is determined based on the volume weighted factor and original sound volume.Pre-set reference frequency value is 100Hz, 120Hz Or 150Hz.

Sound agent equipment 101 determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, meter The integer in the difference divided by the result of spacing value (such as 10) is calculated, the integer is determined as described by noise reduction process The frequency level of voice data, is determined as tone color weighted factor by the absolute value of the frequency level.Initial tone color fraction includes： 25 points, 26 points, 27 points, 28 points and 29 points.Determined based on the tone color weighted factor and initial tone color fraction described by drop The tone color grade of voice data of processing of making an uproar includes：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor As the tone color grade of the voice data by noise reduction process, and by the tone color grade.Based on the tone color grade Determine that volume weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are determined as Tone color weighted factor.Determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original sound Amount × (1+ volumes weighted factor).

First output order includes：For indicate the output volume volume value and the mobile terminal 103 it is current Position, and the second output order includes being used for the volume value for indicating original sound volume.Each in other voice output units The first output order transmitted based on sound agent equipment 101 and the voice data Jing Guo noise reduction process carry out sound output bag Include：In each first output order transmitted based on sound agent equipment 101 in other voice output units The current location of mobile terminal 103 determines the air line distance with the mobile terminal 103, and the air line distance divided by 1000 are obtained To numerical value corresponding to percentages be determined as the distance weighted factor, based on the distance weighted factor and first output refer to The volume value of order calculates actual volume value, and the voice data progress sound according to actual volume value and Jing Guo noise reduction process is defeated Go out；Wherein actual volume value=volume value × (the distance weighted factors of 1+).The target sound output unit is according to mobile terminal 103 the second output orders transmitted and original sound data, which carry out sound output, to be included：The target sound output unit root Volume value and original sound data in the second output order transmitted according to mobile terminal 103 carry out sound output.

The sound agent equipment 101 determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value Value, calculates the integer in the difference divided by the result of spacing value (such as 10), and the integer is determined as described to pass through noise reduction The frequency level of the voice data of processing, is determined as tone color weighted factor by the absolute value of the frequency level.Initial tone color point Number includes：25 points, 26 points, 27 points, 28 points and 29 points.Determined based on the tone color weighted factor and initial tone color fraction described The tone color grade of voice data by noise reduction process includes：Obtained by initial tone color fraction subtracted the tone color weighted factor Tone color grade of the difference as the voice data by noise reduction process, and by the tone color grade.Based on the sound Colour gradation determines that volume weighted factor includes：By the percentages corresponding to the tone color grade divided by 100 obtained numerical value It is determined as tone color weighted factor.Determine that output volume includes based on the volume weighted factor and original sound volume：Output volume= Original sound volume × (1- volumes weighted factor).

First output order includes being used for the volume value for indicating the output volume, and second output order Including the volume value for indicating original sound volume.Each in other voice output units is based on sound agent equipment 101 the first output orders transmitted and voice data Jing Guo noise reduction process, which carry out sound output, to be included：Other sound The mobile terminal 103 in each first output order transmitted based on sound agent equipment 101 in output unit is worked as Front position determines the air line distance with the mobile terminal 103, corresponding to the numerical value that the air line distance divided by 1000 are obtained Percentages be determined as the distance weighted factor, based on the volume value of the distance weighted factor and the first output order calculate Actual volume value, and the voice data according to actual volume value and Jing Guo noise reduction process carries out sound output；Wherein actual sound Value=volume value × (the distance weighted factors of 1+).The target sound output unit transmitted according to mobile terminal 103 Two output orders and original sound data, which carry out sound output, to be included：The target sound output unit is according to mobile terminal 103 Volume value and original sound data in the second output order transmitted carry out sound output.

Preferably, the mobile terminal 103 makes to include being used for the sound for indicating output volume in second output order Value and the Network Transmission Delays.The second output order that target sound output unit is transmitted according to mobile terminal 103 and Original sound data, which carries out sound output, to be included：The target sound output unit transmitted according to mobile terminal 103 second Volume value in output order and defeated with the sound that the time of Network Transmission Delays postpones the original sound data Go out, so that each in the target sound output unit and other voice output units is when carrying out sound output Can the retention time it is consistent.

Preferably, the evaluation Main Basiss of tone color aesthetic feeling are spectrum curves, and are sentenced according to Italian optimal Western style of singing line It is disconnected, i.e., connect in alignment, the frequency spectrum of which tone color from the hot spot of fundamental tone to the 16th overtone point (on spectrum curve) Curve is more graceful, melodious, interesting to listen to closer to this line, which music.Overtone has an effect on the characteristic of tone color, some overtone and base Sound is complete harmonious sound relationship；Some overtone and fundamental tone are incomplete harmonious sound relationships；Some overtone and fundamental tone are dissonances Relation；The overtone of complete harmonic tone is plentiful, then tonequality stability is strong；Incomplete harmonic tone overtone is plentiful, then tone color is imbued with table Existing power；The overtone of dissonance is excessive, then tone color is strange unpleasant to hear (wolf sound).Some stringed musical instruments make in the human hand that will not be played With being just very easy to find wolf sound.

Fig. 2 is the stream according to the method 200 that data transmission in network telephony is carried out based on more transmission paths of embodiment of the present invention Cheng Tu.Method 200 is since step 201 place, in step 201, user using mobile terminal by the first wireless network (for example, Wide area wireless communication network, 3G, 4G or 5G etc.) asked to the transmission voice data output of sound agent equipment, the voice data Output request includes sample sound, the current location of mobile terminal and initial matching code.Wherein user can pass through mobile terminal Sound acquisition device (for example, microphone) obtain user input or typing sample sound.Sample sound can be that user exists The one section of voice inputted in the site environment of voice output, the one section of voice and use of the general introduction for expressing User Perspective In one section of voice for introducing user identity etc..Sample sound is for determining whether to permit the weight that user carries out voice data output Will foundation.

The application obtains the current location of mobile terminal using the positioning devices of mobile terminal.Specifically, the locator Part calibrates satellite location data according to indoor auxiliary positioning, outdoor auxiliary positioning and/or access point auxiliary positioning, to obtain Take the current location of mobile terminal.In general, the position of user can be obtained by the GPS chip or Big Dipper chip of mobile terminal Information.Then, outdoor auxiliary positioning (for example, meeting-place is outdoor meeting-place), the indoor auxiliary that the application can be according to communication network (for example, meeting-place is indoor meeting-place) and/or access point auxiliary positioning are positioned (for example, there are the access point of wireless network in meeting-place Equipment) above-mentioned positional information is calibrated.

MAC address generation initial matching code of the application based on mobile terminal, or based on mobile terminal Hardware address generation initial matching code.For example, the application can be by the whole of the MAC address of mobile terminal Or partial content (character string) is determined as initial matching code, or all or part (character string) by the hardware address of mobile terminal Content determines initial matching code.

In step 202, the sound agent equipment receives the voice data output request from mobile terminal, according to described Sample sound in voice data output request determines whether that the mobile terminal carries out voice data output.Sound is acted on behalf of Equipment carries out the sample sound speech recognition to generate text information, when the text information meets the expression of corresponding language When being accustomed to (for example, expressed implication meets the requirement of language substance), determine to allow the mobile terminal to carry out sound number According to output.The sound agent equipment determines whether the text information meets the expression habit of corresponding language based on semantics recognition It is used.

The text information is divided at least one sentence list by sound agent equipment according to the punctuate symbol of text information Member, carries out each sentence unit independent semantic analysis to determine semantic fraction, by the semantic fraction of each sentence unit Weighted sum determines that the text information meets the communicative habits of corresponding language when being more than expression threshold value.By word in sentence unit Quantity is determined as the weight of sentence unit.For example, text information includes sentence unit A and B.Sentence unit A includes 5 Chinese characters simultaneously And sentence unit B includes 10 Chinese characters.The semantic fraction of sentence unit A is 9 points (full marks 10 divide, minimum 0 point), and sentence list The semantic fraction of first B is 8 points.The weight of sentence unit A is 5/ (5+10)=1/3, and the weight of sentence unit B is 10/ (5+ 10)=2/3.The semantic fraction of text information be sentence unit A and B semantic fraction weighted sum, i.e. 9* (1/3)+8* (2/3) =8.33.Express threshold value (being more than 0 and less than or equal to 10) any reasonable value such as can be 7,7.5,8 or 8.5.

The sample sound is used to indicate at least one in herein below：The speech intelligibility of user, the voice of user Involved language form and background sound noise intensity.When user speech intelligibility be more than minimum requirements clarity threshold, Language form involved by the voice of user can be less than by speech recognition server automatic translation and background sound noise intensity In the case of maximum allowable noise intensity, the sound agent equipment determines to allow the mobile terminal progress voice data defeated Go out.If it is determined that allowing the mobile terminal to carry out voice data output, then sent by the first wireless network to mobile terminal Permit output message.

In step 203, the sound agent equipment is based on the mobile terminal in voice data output request Current location detects the target sound closest with the current location of the mobile terminal in multiple voice output units Output unit.In addition, current location of the sound agent equipment based on the mobile terminal in voice data output request The target sound output unit closest with the current location of the mobile terminal is detected in multiple voice output units (such as one in output unit 102-6 or Fig. 1 c in Fig. 1 b in output unit 102-2,102-3 and 102-6).Sound generation Reason equipment obtains and stores the position of each voice output unit in the multiple voice output unit in advance.The sound generation Manage current location of the equipment based on the mobile terminal and the position of each voice output unit in the multiple voice output unit Air line distance between putting determines the target sound output unit closest with the current location of the mobile terminal.Alternatively Ground, current location of the sound agent equipment based on the mobile terminal and each sound in the multiple voice output unit Sonic transmissions distance between the position of output unit determines the target sound closest with the current location of the mobile terminal Sound output unit.In this case, when there are barrier (for example, pillar) between specific sound output unit and mobile terminal When, the distance between specific sound output unit and mobile terminal are used as using the beeline of sonic transmissions.

First is determined based on the initial matching code in voice data output request and the matching random number generated at random Sonic transmissions matching code, the target sound output unit is sent to by the first sonic transmissions matching code.The target sound Sound output unit is stored received at least one first sonic transmissions matching code.Since there are multiple shiftings in meeting-place Dynamic terminal, for this can there may be each in multiple first sonic transmissions matching codes and the multiple voice output unit It can receive at least one first sonic transmissions matching code.It is for this reason, each by received by multiple voice output units At least one first sonic transmissions matching code stored.

Initial matching code and the matching random number that generates at random are carried out character string and connected to generate the by sound agent equipment One sonic transmissions matching code.For example, initial matching code is 406188963D56 and matches random number 25, then the first sound wave passes Defeated matching code is 406188963D5625.Alternatively, sound agent equipment is by initial matching code and the matching random number generated at random Sum up to generate the first sonic transmissions matching code.For example, initial matching code is 406188963D56 and matches random number 25, then the first sonic transmissions matching code is 406188963D81.Alternatively, sound agent equipment is based on generated at random Cyclic shift, digitwise operation or step-by-step splicing are carried out to the initial matching code with random number to generate the matching of the first sonic transmissions Code.For example, initial matching code is 1,001 1,010 1101 and matches random number as 2, then the first sonic transmissions matching code can To be initial matching code ring shift right 2, i.e., 0,110 0,110 1011.For example, initial matching code is 1,001 1,010 1101 simultaneously And matching random number is 1,001 1,001 1001, then the first sonic transmissions matching code can be step-by-step or computing, i.e., 1001 1011 1101.For example, initial matching code is 1101 and matches random number as 0010, then the first sonic transmissions matching code can Be step-by-step carry out alternative splicing, i.e., 10100110, wherein the 1st, 3,5 and 7 of the first sonic transmissions matching code from initially Matching code, and Self Matching random number is carried out in the 2nd, 4,6 and 8.

In step 204, mobile terminal receive it is described permit output message and based in the allowance output message when Between stab determine Network Transmission Delays, the Network Transmission Delays are sent to the sound agent equipment.The mobile terminal base In for indicating that the timestamp of sending time and the current time of mobile terminal determine network delay.Mobile terminal is by the network Transmission delay is sent to the sound agent equipment.

In step 205, the sound agent equipment judges the Network Transmission Delays received from mobile terminal, if Determine that the Network Transmission Delays are less than time delay threshold value, sent to mobile terminal more for indicating that the mobile terminal passes through Path data transmission carries out the instruction message of voice data output.Sound agent equipment is to the network transmission that is received from mobile terminal Delay is judged, if it is determined that the Network Transmission Delays are less than time delay threshold value, are sent to mobile terminal and are used to indicate The mobile terminal transmits the instruction message for carrying out voice data output by multi-path data.If it is determined that the network transmission Delay is greater than or equal to time delay threshold value, and the sound agent equipment is sent to mobile terminal to be used to indicate the mobile terminal The instruction message for carrying out voice data output can not be transmitted by multi-path data.Can not be by multipath number in response to receiving The instruction message of voice data output is carried out according to transmission, the mobile terminal is defeated by single path data transfer progress voice data Go out.I.e. mobile terminal is only communicated by the first wireless network with sound agent equipment, for example, by voice data or sound number Sound agent equipment is given according to spreading.

In step 206, in response to receiving instruction message, the mobile terminal is according to the initial defeated matching code and described Matching random number in instruction message is determined the second sonic transmissions matching code and is passed second sound wave based on acoustic communication Defeated code is broadcast to the multiple voice output unit.Initial matching code and matching random number are carried out character string by the mobile terminal Connect to generate the second sonic transmissions matching code.Alternatively, mobile terminal by initial matching code and matching random number sum up with Generate the second sonic transmissions matching code.The initial matching code is followed alternatively, the mobile terminal is based on matching random number Ring displacement, digitwise operation or step-by-step are spliced to generate the second sonic transmissions matching code.The generation side of second sonic transmissions matching code Formula is identical with the generating mode of the first sonic transmissions matching code as described above, therefore repeats no more.

In step 207, when the target sound output unit in the multiple voice output unit determines received second When sonic transmissions matching code is identical with the first sonic transmissions matching code, built with the mobile terminal by second wireless network Vertical wireless communication connection.In addition, it is each when receiving the second sound wave transmission code in the multiple voice output unit, will The second sound wave transmission code is compared with least one first sonic transmissions matching code stored.

In step 208, in response to the wireless communication establishment of connection, the mobile terminal is transmitted into multi-path data Pattern.Wherein multi-path transmission pattern refers to mobile terminal by least two paths to transmit voice data or audio data stream Deng in a manner of carrying out sound output.I.e. in the multi-path data transmission mode, the mobile terminal 103 by two not The different transmission path of same transmission path or at least two carries out the transmission of voice data.Obtained by the sound of mobile terminal Device obtains original sound data input by user, and the original sound data is carried out noise reduction process to generate by noise reduction The voice data of processing.Wherein described original sound data is the original that user is inputted by the sound acquisition device of mobile terminal Beginning audio data stream, and the voice data by noise reduction process is the audio data stream by noise reduction process.

The data transmission in network telephony by noise reduction process is given to sound agent equipment, the sound using the first wireless network The voice data by noise reduction process and the first output order are transferred to multiple sound by sound agent equipment by cable network Other voice output units in sound output unit in addition to the target sound output unit, while, it is described mobile whole Original sound data and the second output order are transferred to the target sound output unit by end using second wireless network.Wherein The audio data stream by noise reduction process is transferred to sound generation by the mobile terminal in real time using the first wireless network Manage equipment.It is defeated that original sound data stream using second wireless network is transferred to the target sound by the mobile terminal in real time Go out unit, and second output order is transferred to the target sound output unit.Wherein described first output order Including the volume value for indicating output volume.Wherein described second output order includes：For indicating the volume of output volume Value.

Each in step 209, other voice output units transmitted based on sound agent equipment first Output order and the voice data progress sound output Jing Guo noise reduction process, and the target sound output unit is according to movement The second output order and original sound data that terminal is transmitted carry out sound output.

Fig. 3 is the flow chart according to the method 300 of the definite output volume of embodiment of the present invention.Method 300 is from step Start at 301.

In step 301, the original frequency and Multisound of the voice data by noise reduction process are determined.In step 302, the difference based on pre-set reference frequency value Yu the frequency.Determine the voice data by noise reduction process Frequency level, tone color weighted factor is determined based on the frequency level.In step 303, determine that the frequency spectrum of the Multisound is bent Line and the sound number by noise reduction process is determined according to the similarity of the spectrum curve and default tone color normal line According to initial tone color fraction, the sound by noise reduction process is determined based on the tone color weighted factor and initial tone color fraction The tone color grade of data.In step 304, the original sound volume of the voice data by noise reduction process is determined, based on the sound Colour gradation determines volume weighted factor, and determines output volume based on the volume weighted factor and original sound volume.

Fig. 4 is the flow chart according to the method 400 of the definite output volume of another embodiment of the present invention.Method 400 from Step 401 place starts.Each in step 401, other voice output units transmitted based on sound agent equipment The current location of mobile terminal in one output order determines the air line distance with mobile terminal.In step 402, by the straight line Percentages corresponding to distance divided by 1000 obtained numerical value are determined as the distance weighted factor.In step 403, based on distance The volume value of weighted factor and the first output order calculates actual volume value.In step 404, according to actual volume value and pass through drop Make an uproar processing voice data carry out sound output.

Fig. 5 is the flow chart according to the method 500 that delay output is carried out to voice data of embodiment of the present invention.Method 500 since step 501 place.In step 501, include being used for the volume value for indicating output volume in second output order With the Network Transmission Delays.In step 502, mobile terminal is exported original sound data and second using second wireless network Instruction is transferred to the target sound output unit.In step 503, target sound output unit is transmitted according to mobile terminal Volume value in second output order and the sound postponed with the time of Network Transmission Delays to the original sound data Sound exports, so that each in the target sound output unit and other voice output units is defeated in progress sound When going out can the retention time it is consistent.

Fig. 6 be set the goal really according to embodiment of the present invention voice output unit method 600 flow chart.Method 600 since step 601 place.In step 601, the institute in the sound agent equipment is asked based on voice data output The current location for stating mobile terminal is detected with the current location of mobile terminal distance most in multiple voice output units During near at least two voice output unit, the description information of at least two voice output unit is sent to the movement Terminal.In step 602, determine that target sound output is single from least two voice output units in response to the selection message of user Member.

Method 300-600 is directed to, in summary：Sound agent equipment determines the voice data by noise reduction process Original frequency and Multisound, the difference based on pre-set reference frequency value and the frequency determines described to pass through noise reduction The frequency level of the voice data of processing, determines tone color weighted factor based on the frequency level, determines the Multisound Spectrum curve and determined according to the similarity of the spectrum curve and default tone color normal line described by noise reduction process The initial tone color fraction of voice data, determines described to pass through noise reduction process based on the tone color weighted factor and initial tone color fraction Voice data tone color grade；The original sound volume of the voice data by noise reduction process is determined, based on described tone color etc. The definite volume weighted factor of level, and output sound is determined based on the volume weighted factor and original sound volume.Wherein pre-set Reference frequency value be 100Hz, 120Hz or 150Hz.

Sound agent equipment determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, calculates institute The integer in difference divided by the result of spacing value (such as 10) is stated, the integer is determined as the sound by noise reduction process The frequency level of data, is determined as tone color weighted factor by the absolute value of the frequency level.Wherein initial tone color fraction includes： 25 points, 26 points, 27 points, 28 points and 29 points.Determined based on the tone color weighted factor and initial tone color fraction described by drop The tone color grade of voice data of processing of making an uproar includes：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor As the tone color grade of the voice data by noise reduction process, and by the tone color grade.Wherein it is based on the tone color Grade determines that volume weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are true It is set to tone color weighted factor and wherein determines that output volume includes based on the volume weighted factor and original sound volume：Output volume= Original sound volume × (volume weighted factor+1).

First output order includes being used for the volume value for indicating the output volume, and the second output order includes being used for Indicate the volume value of original sound volume.Each in other voice output units is transmitted first defeated based on sound agent equipment Go out instruction and the voice data Jing Guo noise reduction process carries out sound output and includes：Each in other voice output units Volume value in the first output order transmitted based on sound agent equipment and the voice data carry out sound Jing Guo noise reduction process Sound exports.The second output order and original sound data that the target sound output unit is transmitted according to mobile terminal carry out Sound output includes：Volume value in the second output order that the target sound output unit is transmitted according to mobile terminal and Original sound data carries out sound output.

Sound agent equipment determines the original frequency and Multisound of the voice data by noise reduction process, based on pre- The difference of the reference frequency value first set and the frequency determines the frequency level of the voice data by noise reduction process, base Tone color weighted factor is determined in the frequency level, determines the spectrum curve of the Multisound and according to the spectrum curve The initial tone color fraction of the voice data by noise reduction process is determined with the similarity of default tone color normal line, based on institute State tone color weighted factor and initial tone color fraction determines the tone color grade of the voice data by noise reduction process；Determine described The original sound volume of voice data by noise reduction process, volume weighted factor is determined based on the tone color grade, and is based on institute State volume weighted factor and original sound volume determines output volume.The pre-set reference frequency value for 100Hz, 120Hz or 150Hz。

Sound agent equipment determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, calculates institute The integer in difference divided by the result of spacing value (such as 10) is stated, the integer is determined as the sound by noise reduction process The frequency level of data, is determined as tone color weighted factor by the absolute value of the frequency level.The initial tone color fraction includes： 25 points, 26 points, 27 points, 28 points and 29 points.Determined based on the tone color weighted factor and initial tone color fraction described by drop The tone color grade of voice data of processing of making an uproar includes：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor As the tone color grade of the voice data by noise reduction process, and by the tone color grade.Based on the tone color grade Determine that volume weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are determined as Tone color weighted factor.It is described to determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original Beginning volume × (1- volumes weighted factor).

Sound agent equipment determines the original frequency and Multisound of the voice data by noise reduction process, based on pre- The difference of the reference frequency value first set and the frequency determines the frequency level of the voice data by noise reduction process, base Tone color weighted factor is determined in the frequency level, determines the spectrum curve of the Multisound and according to the spectrum curve The initial tone color fraction of the voice data by noise reduction process is determined with the similarity of default tone color normal line, based on institute State tone color weighted factor and initial tone color fraction determines the tone color grade of the voice data by noise reduction process；Determine described The original sound volume of voice data by noise reduction process, volume weighted factor is determined based on the tone color grade, and is based on institute State volume weighted factor and original sound volume determines output volume.Pre-set reference frequency value for 100Hz, 120Hz or 150Hz。

Sound agent equipment determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, calculates institute The integer in difference divided by the result of spacing value (such as 10) is stated, the integer is determined as the sound by noise reduction process The frequency level of data, is determined as tone color weighted factor by the absolute value of the frequency level.Initial tone color fraction includes：25 Divide, 26 points, 27 points, 28 points and 29 points.Determine described to pass through noise reduction based on the tone color weighted factor and initial tone color fraction The tone color grade of the voice data of processing includes：Initial tone color fraction is subtracted the obtained difference of the tone color weighted factor to make For the tone color grade of the voice data by noise reduction process, and by the tone color grade.It is true based on the tone color grade Accordatura amount weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are determined as sound Color weighted factor.Determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original sound volume × (1+ volumes weighted factor).

First output order includes：For indicating the volume value of the output volume and the present bit of the mobile terminal Put, and the second output order includes being used for the volume value for indicating original sound volume.Each base in other voice output units The first output order transmitted in sound agent equipment and the voice data Jing Guo noise reduction process, which carry out sound output, to be included：Institute State the mobile terminal in each first output order transmitted based on sound agent equipment in other voice output units Current location determine air line distance with the mobile terminal, corresponding to the numerical value that the air line distance divided by 1000 are obtained Percentages be determined as the distance weighted factor, based on the volume value of the distance weighted factor and the first output order calculate Actual volume value, and the voice data according to actual volume value and Jing Guo noise reduction process carries out sound output；Wherein actual sound Value=volume value × (the distance weighted factors of 1+).The target sound output unit is transmitted second defeated according to mobile terminal Go out instruction and original sound data carries out sound output and includes：The target sound output unit is passed according to mobile terminal 103 Volume value and original sound data in the second defeated output order carry out sound output.

The sound agent equipment determines that the original frequency (such as 60-300Hz) subtracts the difference of reference frequency value, meter The integer in the difference divided by the result of spacing value (such as 10) is calculated, the integer is determined as described by noise reduction process The frequency level of voice data, is determined as tone color weighted factor by the absolute value of the frequency level.Initial tone color fraction includes： 25 points, 26 points, 27 points, 28 points and 29 points.Determined based on the tone color weighted factor and initial tone color fraction described by drop The tone color grade of voice data of processing of making an uproar includes：Initial tone color fraction is subtracted into the obtained difference of tone color weighted factor As the tone color grade of the voice data by noise reduction process, and by the tone color grade.Based on the tone color grade Determine that volume weighted factor includes：Percentages corresponding to the tone color grade divided by 100 obtained numerical value are determined as Tone color weighted factor.Determine that output volume includes based on the volume weighted factor and original sound volume：Output volume=original sound Amount × (1- volumes weighted factor).

First output order includes being used for the volume value for indicating the output volume, and second output order Including the volume value for indicating original sound volume.Each in other voice output units is based on sound agent equipment institute First output order of transmission and the voice data Jing Guo noise reduction process, which carry out sound output, to be included：Other sound outputs are single The current location of the mobile terminal in each first output order transmitted based on sound agent equipment in member determine with The air line distance of the mobile terminal, the percentages corresponding to numerical value that the air line distance divided by 1000 are obtained determine For the distance weighted factor, actual volume value is calculated based on the volume value of the distance weighted factor and the first output order, and Voice data according to actual volume value and Jing Guo noise reduction process carries out sound output；Wherein actual volume value=volume value × (1 + distance weighted the factor).The second output order and original sound that the target sound output unit is transmitted according to mobile terminal Data, which carry out sound output, to be included：In the second output order that the target sound output unit is transmitted according to mobile terminal Volume value and original sound data carry out sound output.

Preferably, the mobile terminal makes to include being used for the volume value for indicating output volume in second output order With the Network Transmission Delays.The second output order and original sound that target sound output unit is transmitted according to mobile terminal Data, which carry out sound output, to be included：In the second output order that the target sound output unit is transmitted according to mobile terminal Volume value and the sound output postponed with the time of Network Transmission Delays to the original sound data, so that described Each in target sound output unit and other voice output units being capable of the retention time when carrying out sound output Unanimously.

In addition, the application further includes a kind of method that data transmission in network telephony is carried out based on more transmission paths, the method bag Include：

If determining to allow the user to carry out sound output according to sample sound, response message is sent to mobile terminal And network delay will be received from mobile terminal, if the network delay is more than feedback threshold, into feedback inhibition pattern Carry out sound output；In the feedback inhibition pattern, sound agent equipment instruction mobile terminal is based on multi-path transmission protocol Into multi-path transmission pattern, using removing the voice output unit nearest apart from mobile terminal locations in multiple voice output units Outside voice output unit sound output is carried out based on the output order transmitted by sound agent equipment and voice data, it is described Output order and voice data of the voice output unit nearest apart from mobile terminal locations according to transmitted by mobile terminal carry out Sound exports.

Claims

1. a kind of transmit the method for carrying out voice data output based on multi-path data, the described method includes：

User sends voice data output request, the sound by the first wireless network using mobile terminal to sound agent equipment The output request of sound data includes sample sound, the current location of mobile terminal and initial matching code；

The sound agent equipment receives the voice data output request from mobile terminal, please according to voice data output Sample sound in asking determines whether that the mobile terminal carries out voice data output, if it is determined that allows described mobile whole End carries out voice data output, then is sent to mobile terminal and permit output message；

Current location of the sound agent equipment based on the mobile terminal in voice data output request is multiple The target sound output unit closest with the current location of the mobile terminal, and base are detected in voice output unit The first sonic transmissions are determined in the matching random number that the voice data exports the initial matching code in request and generates at random With code, the first sonic transmissions matching code is sent to the target sound output unit；

Mobile terminal receives the allowance output message and determines that network passes based on the timestamp in the allowance output message The Network Transmission Delays are sent to the sound agent equipment by defeated delay；

The sound agent equipment judges the Network Transmission Delays received from mobile terminal, if it is determined that the network passes Defeated delay is less than time delay threshold value, to mobile terminal send be used for indicate the mobile terminal by multi-path data transmit into The instruction message of row voice data output；

In response to receiving instruction message, of the mobile terminal in the initial defeated matching code and the instruction message The second sonic transmissions matching code is determined with random number and is broadcast to the second sound wave transmission code based on acoustic communication described Multiple voice output units；

When the target sound output unit in the multiple voice output unit determines the second received sonic transmissions matching code When identical with the first sonic transmissions matching code, wireless communication is established by second wireless network with the mobile terminal and is connected Connect；

In response to the wireless communication establishment of connection, the mobile terminal enters multi-path data transmission mode：Pass through movement The sound acquisition device of terminal obtains original sound data input by user, and the original sound data is carried out noise reduction process To generate voice data Jing Guo noise reduction process, using the first wireless network by the data transmission in network telephony by noise reduction process Give sound agent equipment, the sound agent equipment is by cable network by the voice data by noise reduction process and first Output order is transferred to other voice output units in addition to the target sound output unit in multiple voice output units, While original sound data and the second output order are transferred to the mesh by the mobile terminal using second wireless network Mark voice output unit；And

Each first output order transmitted based on sound agent equipment and process in other voice output units The voice data of noise reduction process carries out sound output, and the target sound output unit transmitted according to mobile terminal the Two output orders and original sound data carry out sound output.

2. according to the method described in claim 1, set in user using mobile terminal by the first wireless network to sound agency Preparation is sent and is further included before voice data output request：Sound input by user is obtained by the sound acquisition device of mobile terminal Sample.

3. method according to claim 1 or 2, is acted on behalf of using mobile terminal in user by the first wireless network to sound Equipment further includes before sending voice data output request：The present bit of mobile terminal is obtained using the positioning devices of mobile terminal Put.

4. according to the method described in any one in claim 1-3, pass through the first wireless network using mobile terminal in user Further included before sending voice data output request to sound agent equipment：Media access control MAC based on mobile terminal Location generates initial matching code, or the generation initial matching code of the hardware address based on mobile terminal.

5. according to the method described in any one in claim 1-4, the sound in the output request according to the voice data Sound sample determines whether that the mobile terminal carries out voice data output and includes：Speech recognition is carried out to the sample sound To generate text information, when the text information meets the communicative habits of corresponding language, determine to allow the mobile terminal into Row voice data exports.

6. according to the method described in any one in claim 1-5, the sound agent equipment obtains and stores institute in advance State the position of each voice output unit in multiple voice output units.

7. according to the method described in claim 6, current location and the output of the multiple sound based on the mobile terminal are single Air line distance in member between the position of each voice output unit determines closest with the current location of the mobile terminal Target sound output unit.

8. according to the method described in claim 1, it is described based on the voice data output request in initial matching code and with The matching random number of machine generation determines that the first sonic transmissions matching code includes：Initial matching code and the matching generated at random is random Number carries out character string connection to generate the first sonic transmissions matching code.

9. according to the method described in claim 1, the timestamp based in the allowance output message determines network transmission Delay includes：Mobile terminal is based on for indicating that the timestamp of sending time and the current time of mobile terminal determine that network prolongs Late.

10. according to the method described in claim 1, in the multi-path data transmission mode, the mobile terminal passes through two A different transmission path or at least two different transmission paths carry out the transmission of voice data.