CN108833590A - Speech recognition service proxy server and proxy method - Google Patents
- Publication number: CN108833590A
- Application number: CN201810758656.8A
- Authority: CN (China)
- Prior art keywords: server, voice, proxy, request, media device
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/51—Discovery or management thereof, e.g. service location protocol [SLP] or web services
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses a speech recognition service proxy server and proxy method. The server comprises an MRCP proxy module, configured to receive a voice stream processing request sent by a media device, send the request to a service agent module for processing, receive the processing result from the service agent module, and send the processing result to the business system application corresponding to the voice service request; and a service agent module, configured to receive the voice stream processing request from the MRCP proxy module, interact with a voice service server to process the voice stream, and return the processing result to the MRCP proxy module. The media device and the ASR service are thereby decoupled, which facilitates service extension. Because the service agent service handles authentication, flow control, security, anti-cheating, and similar mechanisms for ASR service requests, the scheme is well suited to public clouds. Other business modules can be mounted under the service agent service to extend the ASR service, for example with search, translation, or intent recognition.
Description
【Technical field】
The present invention relates to computer application technology, and in particular to a speech recognition service proxy server and proxy method.
【Background technique】
As ASR (Automatic Speech Recognition) technology matures and is combined with other technologies, value-added services based on ASR continue to emerge and develop rapidly.
In the prior art, a media device can connect directly to an ASR service through MRCP (Media Resource Control Protocol). This approach has two disadvantages:
1. It supports private clouds well but public clouds poorly, because a public cloud needs mechanisms such as authentication, flow control, security, and anti-cheating. To support a public cloud, the native approach must either abandon these mechanisms or couple them to the ASR service;
2. The ASR service can only return the recognized text and cannot be extended easily; extending it requires coupling other business modules to the ASR service.
In addition, a media device can connect over HTTP to a shared cloud platform, which in turn connects to the ASR service. The disadvantages of this approach are as follows:
Media devices usually lack the ability to connect to ASR over HTTP and must be modified, so integration is inconvenient and incurs modification cost. Moreover, the ASR service can only return the recognized text and cannot be extended easily.
【Summary of the invention】
Aspects of the present application provide a speech recognition service proxy server, method, device, and storage medium that can decouple media devices from business systems, provide public cloud services such as authentication, flow control, billing, and security, and extend the ASR service.
According to one aspect of the present application, a speech recognition service proxy server is provided, the server comprising:
an MRCP proxy module, configured to receive a voice stream processing request sent by a media device; send the voice stream processing request to a service agent module for processing; receive a processing result from the service agent module; and send the processing result to the business system application corresponding to the voice service request;
a service agent module, configured to receive the voice stream processing request from the MRCP proxy module; interact with a voice service server to process the voice stream; and return the processing result to the MRCP proxy module.
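The division of labor between the two modules can be illustrated with a minimal Python sketch. It is not the patent's implementation: the class names, the `VoiceRequest` container, and the callables standing in for the voice service server and the business system application are all hypothetical, and no real MRCP signalling is performed.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class VoiceRequest:
    """Hypothetical container for a voice stream processing request."""
    device_id: str
    audio: bytes


class ServiceAgentModule:
    """Interacts with the voice service server and returns its result."""

    def __init__(self, voice_service: Callable[[bytes], str]):
        # voice_service is a stand-in for the ASR or extended service server.
        self._voice_service = voice_service

    def handle(self, request: VoiceRequest) -> str:
        return self._voice_service(request.audio)


class MrcpProxyModule:
    """Receives device requests and delivers results to the business application."""

    def __init__(self, agent: ServiceAgentModule,
                 business_app: Callable[[str, str], None]):
        self._agent = agent
        self._business_app = business_app

    def on_request(self, request: VoiceRequest) -> str:
        result = self._agent.handle(request)            # forward to the service agent
        self._business_app(request.device_id, result)   # deliver to the business app
        return result


def fake_asr(audio: bytes) -> str:
    # Stand-in recognizer: treats the audio bytes as UTF-8 text.
    return audio.decode("utf-8")


delivered: List[Tuple[str, str]] = []
proxy = MrcpProxyModule(ServiceAgentModule(fake_asr),
                        lambda device, text: delivered.append((device, text)))
proxy.on_request(VoiceRequest("device-1", b"hello"))
```

Because the media device only ever talks to `MrcpProxyModule`, the backend behind `ServiceAgentModule` can be swapped or extended without changing the device side, which is the decoupling the text describes.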
In the above aspect and any possible implementation, an implementation is further provided in which the voice service server is an ASR server and/or an extended service server.
In the above aspect and any possible implementation, an implementation is further provided in which the service agent module comprises:
an ASR processing submodule, configured to send an ASR request to the ASR server and receive the speech recognition result returned by the ASR server; and/or
an extended service processing submodule, configured to send an extended service request to the extended service server and receive the extended service processing result returned by the extended service server.
In the above aspect and any possible implementation, an implementation is further provided in which the service agent module is further configured to interact with a control server to control the media device.
In the above aspect and any possible implementation, an implementation is further provided in which the control server is an authentication, flow control, billing, and/or security server;
the service agent module further comprises:
an authentication submodule, configured to interact with the authentication server to authenticate the media device;
a flow control submodule, configured to interact with the flow control server to apply flow control to the media device;
a billing submodule, configured to interact with the billing server to bill the media device;
a security submodule, configured to interact with the security server to provide security services for the media device.
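The text does not say which flow control algorithm the flow control server applies. As one possible illustration, the sketch below uses a per-device token bucket; the algorithm choice, the class names, and the parameters `rate_per_sec` and `capacity` are assumptions made for this example, not details from the patent.

```python
from typing import Dict


class TokenBucket:
    """Token bucket: one token is spent per admitted request."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity   # start with a full bucket
        self.last = 0.0          # timestamp of the previous check

    def allow(self, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class FlowControlSubmodule:
    """Hypothetical flow control submodule keeping one bucket per media device."""

    def __init__(self, rate_per_sec: float, capacity: float):
        self._rate = rate_per_sec
        self._capacity = capacity
        self._buckets: Dict[str, TokenBucket] = {}

    def admit(self, device_id: str, now: float) -> bool:
        if device_id not in self._buckets:
            self._buckets[device_id] = TokenBucket(self._rate, self._capacity)
        return self._buckets[device_id].allow(now)
```

Passing the timestamp in explicitly (rather than reading a clock inside `allow`) keeps the sketch deterministic; a deployment would more likely pass `time.monotonic()`.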
According to another aspect of the present application, a speech recognition service proxy method based on the above speech recognition service proxy server is provided, the method comprising:
a proxy server receiving a voice stream processing request sent by a media device;
interacting with a voice service server to process the voice stream and obtain a processing result;
sending the processing result to the business system application corresponding to the voice service request.
In the above aspect and any possible implementation, an implementation is further provided in which the proxy server receiving the voice stream processing request sent by the media device comprises:
the MRCP proxy module of the proxy server receiving the voice stream processing request sent by the media device and sending the voice stream processing request to the service agent module for processing.
In the above aspect and any possible implementation, an implementation is further provided in which interacting with the voice service server to process the voice stream comprises:
the service agent module of the proxy server receiving the voice stream processing request, interacting with the voice service server to process the voice stream, and returning the processing result to the MRCP proxy module.
In the above aspect and any possible implementation, an implementation is further provided in which sending the processing result to the business system application corresponding to the voice service request comprises:
the MRCP proxy module sending the processing result to the business system application corresponding to the voice service request.
In the above aspect and any possible implementation, an implementation is further provided in which the voice service server is an ASR server and/or an extended service server.
In the above aspect and any possible implementation, an implementation is further provided in which the service agent module receiving the voice stream processing request and interacting with the voice service server to process the voice stream comprises:
sending an ASR request to the ASR server and receiving the speech recognition result returned by the ASR server; and/or
sending an extended service request to the extended service server and receiving the extended service processing result returned by the extended service server.
In the above aspect and any possible implementation, an implementation is further provided in which the method further comprises:
the service agent module interacting with a control server to control the media device.
In the above aspect and any possible implementation, an implementation is further provided in which the control server is an authentication, flow control, billing, and/or security server;
the service agent module interacting with the control server to control the media device further comprises:
interacting with the authentication server to authenticate the media device;
interacting with the flow control server to apply flow control to the media device;
interacting with the billing server to bill the media device;
interacting with the security server to provide security services for the media device.
According to another aspect of the present invention, a computer device is provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method described above when executing the program.
According to another aspect of the present invention, a computer-readable storage medium is provided, storing a computer program that implements the method described above when executed by a processor.
As can be seen from the above, the scheme of the present invention adds an MRCP proxy service and a service agent service between the media device and the ASR service. First, the media device and the ASR service are decoupled, which facilitates service extension. Second, the service agent service handles authentication, flow control, security, anti-cheating, and so on for ASR service requests, so the scheme is well suited to public clouds. Third, other business modules can be mounted under the service agent service to extend the ASR service (for example with search, translation, or intent recognition).
【Brief description of the drawings】
Fig. 1 is a schematic diagram of an implementation of the speech recognition service proxy server of the present invention;
Fig. 2 is a flowchart of the speech recognition service proxy method of the present invention;
Fig. 3 is a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention.
【Detailed description】
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present application.
Fig. 1 is a schematic diagram of an implementation of the speech recognition service proxy server of the present invention in a speech synthesis platform. As shown in Fig. 1, it depicts a media device, a speech recognition proxy server, a business system, an ASR server, an extended service server, and a control server.
The media device is connected to the speech recognition proxy server, and the speech recognition proxy server is connected to the business system application, the ASR server, the extended service server, and the control server respectively. Specifically, the speech recognition proxy server comprises an MRCP proxy module and a service agent module. The media device is connected to the MRCP proxy module, and the MRCP proxy module is connected to the service agent module. The MRCP proxy module is also connected to the business system. The service agent module is connected to the ASR server, the extended service server, and the control server respectively.
The speech recognition proxy server is configured to receive a voice stream processing request sent by the media device; interact with the voice service server to process the voice stream and obtain a processing result; and send the processing result to the business system application corresponding to the voice service request.
The speech recognition proxy server comprises:
an MRCP proxy module, configured to receive a voice stream processing request sent by the media device; send the voice stream processing request to the service agent module for processing; receive the processing result from the service agent module; and send the processing result to the business system application corresponding to the voice service request;
a service agent module, configured to receive the voice stream processing request from the MRCP proxy module; interact with the voice service server to process the voice stream; and return the processing result to the MRCP proxy module.
Preferably, the MRCP proxy module is connected to the media device. The media device sends a voice stream processing request, such as an ASR processing request, to the MRCP proxy module.
The MRCP proxy module sends the voice stream processing request to the service agent module for processing.
The service agent module comprises an ASR processing submodule, configured to generate the corresponding voice service request from the voice stream processing request and send it to the corresponding voice service server. For example, based on the ASR processing request, it generates an ASR request, sends it to the ASR server for processing, and receives the speech recognition result returned by the ASR server.
In this embodiment, the service agent module is also connected to an extended service server, so it can provide extended voice services such as translation. The service agent module comprises an extended service processing submodule. When the voice stream processing request sent by the media device and forwarded by the MRCP proxy module is a translation processing request, this submodule generates a translation request, sends it to the translation server for processing, and receives the translation result returned by the translation server.
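The routing between the ASR processing submodule and the extended service processing submodule can be sketched as a small dispatcher. This is an illustration under stated assumptions: the request kinds `"asr"` and `"translate"`, the function name `make_service_agent`, and the choice to run recognition before translation are all hypothetical, since the text does not specify how the translation request is built.

```python
from typing import Callable, Dict


def make_service_agent(asr: Callable[[bytes], str],
                       translate: Callable[[str], str]) -> Callable[[str, bytes], str]:
    """Build a request handler that routes by request kind.

    asr and translate are stand-ins for the ASR server and the translation
    (extended service) server.
    """

    def asr_submodule(audio: bytes) -> str:
        # ASR processing submodule: forward the audio, return the recognized text.
        return asr(audio)

    def extended_submodule(audio: bytes) -> str:
        # Extended service processing submodule. Assumption: the audio is
        # recognized first and the recognized text is then translated.
        return translate(asr(audio))

    handlers: Dict[str, Callable[[bytes], str]] = {
        "asr": asr_submodule,
        "translate": extended_submodule,
    }

    def handle(kind: str, audio: bytes) -> str:
        if kind not in handlers:
            raise ValueError(f"unsupported request kind: {kind}")
        return handlers[kind](audio)

    return handle


# Usage with stand-in backends: "recognition" decodes bytes, "translation" upper-cases.
handle = make_service_agent(lambda audio: audio.decode("utf-8"), str.upper)
recognized = handle("asr", b"hello")
translated = handle("translate", b"hello")
```

Adding another extended service (search or intent recognition, say) would mean registering one more entry in `handlers`, leaving the media device and the ASR backend untouched.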
After the service agent module receives the processing result from the ASR server and/or the extended service server, it sends the processing result to the MRCP proxy module, so that the MRCP proxy module can send the processing result to the business system application corresponding to the voice service request.
Preferably, the MRCP proxy module sends the processing result to the media device, and the media device sends the processing result to the business system application.
Because the proxy server is transparent to the media device, the user will not perceive any difference between the above process and the existing flow, in which the media device sends the voice stream processing request to the ASR server, receives the speech recognition result from the ASR server, and sends the speech recognition result to the business system application.
Preferably, in a preferred implementation of this embodiment, to provide the mechanisms a public cloud requires, such as authentication, flow control, billing, security, and anti-cheating, the service agent module is connected to control servers that provide authentication, flow control, billing, security, anti-cheating, and so on, and interacts with these control servers to control the media device.
Preferably, the voice stream processing request further includes an authentication request and the account and password of the user logged in to the media device.
In this embodiment, the service agent module further comprises an authentication submodule, configured to interact with the authentication server to authenticate the media device; a flow control submodule, configured to interact with the flow control server to apply flow control to the media device; a billing submodule, configured to interact with the billing server to bill the media device; and a security submodule, configured to interact with the security server to provide security services for the media device.
The authentication submodule sends an application authentication request to the authentication server. The request includes the account and password of the user logged in to the media device. The authentication server authenticates the user based on the account and password and, if they are valid, returns an authentication success signal to the authentication submodule. After successful authentication, the service agent module generates the corresponding voice service request from the voice stream processing request and sends it to the corresponding ASR server and/or extended service server. After receiving the processing result from the ASR server and/or extended service server, it sends the processing result to the MRCP proxy module, so that the MRCP proxy module can send the processing result to the business system application corresponding to the voice service request.
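The authenticate-then-forward sequence can be sketched as follows. The sketch is illustrative only: the class and function names are hypothetical, the authentication server is simulated in-process, and the use of salted PBKDF2 hashes is an assumption of this example, since the text says only that the server checks the account and password.

```python
import hashlib
import hmac
import os
from typing import Callable, Dict, Tuple


class AuthenticationServer:
    """Hypothetical in-process stand-in for the authentication server."""

    def __init__(self) -> None:
        self._accounts: Dict[str, Tuple[bytes, bytes]] = {}  # account -> (salt, key)

    @staticmethod
    def _derive(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

    def register(self, account: str, password: str) -> None:
        salt = os.urandom(16)
        self._accounts[account] = (salt, self._derive(password, salt))

    def authenticate(self, account: str, password: str) -> bool:
        record = self._accounts.get(account)
        if record is None:
            return False
        salt, expected = record
        # Constant-time comparison of the derived keys.
        return hmac.compare_digest(expected, self._derive(password, salt))


class AuthenticationSubmodule:
    """Asks the authentication server whether the credentials are valid."""

    def __init__(self, server: AuthenticationServer):
        self._server = server

    def check(self, account: str, password: str) -> bool:
        return self._server.authenticate(account, password)


def handle_request(auth: AuthenticationSubmodule,
                   voice_service: Callable[[bytes], str],
                   account: str, password: str, audio: bytes) -> str:
    # The voice service server is contacted only after authentication succeeds.
    if not auth.check(account, password):
        raise PermissionError("authentication failed")
    return voice_service(audio)
```

Rejecting the request before any voice service request is generated keeps unauthenticated traffic away from the ASR and extended service servers.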
Preferably, when the resources of the ASR server and/or extended service server are ready, a connection success message is returned to the service agent module. After receiving the connection success message, the service agent module establishes the connection with the ASR server and/or extended service server; at the same time, the billing submodule sends a billing start signal to the billing server, and the flow control submodule sends a flow control start signal to the flow control server. When the ASR server and/or extended service server has completed the user's voice service request, it returns a recognition complete message to the service agent module. After receiving the recognition complete message, the service agent module sends a request to release the resource link to the ASR server and/or extended service server; at the same time, the billing submodule sends a stop billing message to the billing server, and the flow control submodule sends a stop flow control message to the flow control server. Preferably, the billing server bills according to the duration or the traffic volume of the voice service request.
Preferably, in this embodiment, accessing a third-party service server over the public network can easily introduce security risks, including security of session creation, protection of the control session, protection of the media session, indirect access to content, and protection of stored media files. The security submodule of the service agent module is therefore connected to the security server, and the security server provides the security services.
Preferably, the authentication, flow control, billing, and security servers may also be integrated into the service agent module, which then provides authentication, flow control, billing, and security services directly.
With the proxy server of this embodiment, an MRCP proxy service and a service agent service are added between the media device and the ASR service. First, the media device and the ASR service are decoupled, which facilitates service extension. Second, the service agent service handles authentication, flow control, security, anti-cheating, and so on for ASR service requests, so the scheme is well suited to public clouds. Third, other business modules can be mounted under the service agent service to extend the ASR service (for example with search, translation, or intent recognition).
In the embodiments provided herein, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processor, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware plus software functional units.
Fig. 2 is a flowchart of the speech recognition service proxy method based on the speech recognition service proxy server of the present invention. As shown in Fig. 2, the method comprises:
Step S21: the proxy server receives a voice stream processing request sent by a media device;
Step S22: the proxy server interacts with a voice service server to process the voice stream and obtain a processing result;
Step S23: the processing result is sent to the business system application corresponding to the voice service request.
In a preferred implementation of step S21,
the proxy server receives the voice stream processing request sent by the media device.
Preferably, the MRCP proxy module of the proxy server receives the voice stream processing request sent by the media device.
Preferably, the MRCP proxy module is connected to the media device. The media device sends a voice stream processing request, such as an ASR processing request, to the MRCP proxy module. The MRCP proxy module sends the voice stream processing request, such as the ASR processing request, to the service agent module of the proxy server for processing.
In a preferred implementation of step S22,
the proxy server interacts with the voice service server to process the voice stream and obtain a processing result.
Preferably, the service agent module of the proxy server receives the voice stream processing request sent by the MRCP proxy module and interacts with the voice service server to process the voice stream.
Preferably, the service agent module comprises a processing submodule, configured to generate the corresponding voice service request from the voice stream processing request and send it to the corresponding voice service server. For example, the ASR processing submodule generates an ASR request based on the ASR processing request, sends it to the ASR server for processing, and receives the speech recognition result returned by the ASR server.
In this embodiment, the service agent module is also connected to an extended service server, so it can provide extended voice services such as translation. The service agent module comprises an extended service processing submodule. When the voice stream processing request sent by the media device and forwarded by the MRCP proxy module is a translation processing request, this submodule generates a translation request, sends it to the translation server for processing, and receives the translation result returned by the translation server.
After the service agent module receives the processing result from the ASR server and/or the extended service server, it sends the processing result to the MRCP proxy module, so that the MRCP proxy module can send the processing result to the business system application corresponding to the voice service request.
Preferably, in a preferred implementation of this embodiment, to provide the mechanisms a public cloud requires, such as authentication, flow control, billing, security, and anti-cheating, the service agent module is connected to control servers that provide authentication, flow control, billing, security, anti-cheating, and so on, and interacts with these control servers to control the media device.
Preferably, the voice stream processing request further includes an authentication request and the account and password of the user logged in to the media device.
In this embodiment, the service agent module further comprises an authentication submodule, configured to interact with the authentication server to authenticate the media device; a flow control submodule, configured to interact with the flow control server to apply flow control to the media device; a billing submodule, configured to interact with the billing server to bill the media device; and a security submodule, configured to interact with the security server to provide security services for the media device.
The authentication submodule sends an application authentication request to the authentication server. The request includes the account and password of the user logged in to the media device. The authentication server authenticates the user based on the account and password and, if they are valid, returns an authentication success signal to the authentication submodule. After successful authentication, the service agent module generates the corresponding voice service request from the voice stream processing request and sends it to the corresponding ASR server and/or extended service server. After receiving the processing result from the ASR server and/or extended service server, it sends the processing result to the MRCP proxy module, so that the MRCP proxy module can send the processing result to the business system application corresponding to the voice service request.
Preferably, when the resources of the ASR server and/or extended service server are ready, a connection success message is returned to the service agent module. After receiving the connection success message, the service agent module establishes the connection with the ASR server and/or extended service server; at the same time, the billing submodule sends a billing start signal to the billing server, and the flow control submodule sends a flow control start signal to the flow control server. When the ASR server and/or extended service server has completed the user's voice service request, it returns a recognition complete message to the service agent module. After receiving the recognition complete message, the service agent module sends a request to release the resource link to the ASR server and/or extended service server; at the same time, the billing submodule sends a stop billing message to the billing server, and the flow control submodule sends a stop flow control message to the flow control server. Preferably, the billing server bills according to the duration or the traffic volume of the voice service request.
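The start and stop signalling around a voice service session can be sketched as below. Everything here is hypothetical scaffolding: the `SignalRecorder` and `BillingServer` classes simulate the control servers in-process, billing is by duration only, and the clock is injected so the example is deterministic.

```python
from typing import Callable, Dict, List, Tuple


class SignalRecorder:
    """Stand-in control server that records the start/stop signals it receives."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, str, float]] = []

    def start(self, session_id: str, now: float) -> None:
        self.events.append(("start", session_id, now))

    def stop(self, session_id: str, now: float) -> None:
        self.events.append(("stop", session_id, now))


class BillingServer(SignalRecorder):
    """Bills by session duration (assumed: a flat price per second)."""

    def __init__(self, price_per_second: float) -> None:
        super().__init__()
        self.price = price_per_second
        self._started: Dict[str, float] = {}
        self.charges: Dict[str, float] = {}

    def start(self, session_id: str, now: float) -> None:
        super().start(session_id, now)
        self._started[session_id] = now

    def stop(self, session_id: str, now: float) -> None:
        super().stop(session_id, now)
        self.charges[session_id] = (now - self._started.pop(session_id)) * self.price


def run_session(session_id: str, audio: bytes,
                voice_service: Callable[[bytes], str],
                billing: BillingServer, flow_control: SignalRecorder,
                clock: Callable[[], float]) -> str:
    started = clock()
    billing.start(session_id, started)        # billing start signal
    flow_control.start(session_id, started)   # flow control start signal
    try:
        return voice_service(audio)
    finally:
        # Stop signals are sent even if the voice service raises.
        ended = clock()
        billing.stop(session_id, ended)
        flow_control.stop(session_id, ended)
```

Sending the stop signals from a `finally` block is a design choice of this sketch: it avoids a session being billed indefinitely when the backend fails partway through.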
Preferably, in this embodiment, accessing a third-party service server over the public network can easily introduce security risks, including security of session creation, protection of the control session, protection of the media session, indirect access to content, and protection of stored media files. The security submodule of the service agent module is therefore connected to the security server, and the security server provides the security services.
Preferably, the authentication, flow control, billing, and security servers may also be integrated into the service agent module, which then provides authentication, flow control, billing, and security services directly.
In a preferred implementation of step S23,
the processing result is sent to the business system application corresponding to the voice service request.
Preferably, the MRCP proxy server sends the processing result received from the service agent module to the business system application.
Preferably, the MRCP proxy server sends the processing result received from the service agent module to the media device, and the media device sends the processing result to the business system application.
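The two delivery paths above (directly to the business system application, or relayed through the media device) can be sketched as follows. This is a minimal sketch under assumed names: BusinessApp, MediaDevice, and deliver_result are hypothetical and do not appear in the embodiment.

```python
from typing import List, Optional


class BusinessApp:
    """Hypothetical business system application that collects results."""

    def __init__(self) -> None:
        self.received: List[str] = []

    def on_result(self, result: str) -> None:
        self.received.append(result)


class MediaDevice:
    """Hypothetical media device that relays results to the application."""

    def __init__(self, app: BusinessApp) -> None:
        self.app = app
        self.relayed = 0

    def relay(self, result: str) -> None:
        # The media device passes the result on unchanged.
        self.relayed += 1
        self.app.on_result(result)


def deliver_result(result: str, app: BusinessApp,
                   media_device: Optional[MediaDevice] = None) -> None:
    """Send the result directly, or through the media device if one is given."""
    if media_device is None:
        app.on_result(result)
    else:
        media_device.relay(result)
```

Either path ends with the same result at the business system application; only the intermediate hop differs.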
Because the proxy server is transparent to the media device, the user perceives no difference between the above process and the existing flow, in which the media device sends the voice stream processing request to the ASR server, receives the speech recognition result from the ASR server, and sends that speech recognition result to the business system application.
With the method of this embodiment, the service agent performs authentication, flow control, security, anti-cheating, and similar handling on ASR service requests, so the method is well suited to public clouds; other service modules can be mounted under the service agent to extend the ASR service (for example, search, translation, and intent recognition).
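One way to picture mounting service modules under the service agent is a registry of post-recognition handlers. The sketch below assumes a plain text-in, text-out interface for each extension; ServiceAgent, mount, handle, fake_recognize, and fake_intent are hypothetical names, and the recognizer is a stub rather than a real ASR call.

```python
from typing import Callable, Dict, List

# An extension takes recognized text and returns its own result (assumed interface).
Extension = Callable[[str], str]


class ServiceAgent:
    """Runs ASR first, then any mounted extensions the caller asks for."""

    def __init__(self, recognize: Callable[[bytes], str]) -> None:
        self._recognize = recognize
        self._extensions: Dict[str, Extension] = {}

    def mount(self, name: str, extension: Extension) -> None:
        # Register an extended service under a name such as "intent".
        self._extensions[name] = extension

    def handle(self, audio: bytes, wanted: List[str]) -> Dict[str, str]:
        text = self._recognize(audio)
        result = {"asr": text}
        for name in wanted:
            if name not in self._extensions:
                raise KeyError(f"no extension mounted under {name!r}")
            result[name] = self._extensions[name](text)
        return result


def fake_recognize(audio: bytes) -> str:
    # Stub recognizer: treats the audio bytes as UTF-8 text.
    return audio.decode("utf-8")


def fake_intent(text: str) -> str:
    # Stub intent recognizer with a single keyword rule.
    return "weather_query" if "weather" in text else "unknown"
```

New extensions (search, translation, and so on) would be added with further mount calls, without changing how the MRCP side submits requests.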
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present application.
The above describes the method embodiments; the solution of the present invention is further explained below through apparatus embodiments.
Fig. 3 shows a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention. The computer system/server 012 shown in Fig. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 3, the computer system/server 012 takes the form of a general-purpose computing device. The components of the computer system/server 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 connecting the different system components (including the system memory 028 and the processor 016).
Bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer system/server 012 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer system/server 012, including volatile and non-volatile media, and removable and non-removable media.
The system memory 028 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 030 and/or cache memory 032. The computer system/server 012 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 034 may be used to read from and write to a non-removable, non-volatile magnetic medium (not shown in Fig. 3, commonly called a "hard disk drive"). Although not shown in Fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disc drive for reading from and writing to a removable non-volatile optical disc (such as a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to the bus 018 through one or more data media interfaces. The memory 028 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present invention.
A program/utility 040 having a set of (at least one) program modules 042 may be stored in, for example, the memory 028. Such program modules 042 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. The program modules 042 generally perform the functions and/or methods of the embodiments described in the present invention.
The computer system/server 012 may also communicate with one or more external devices 014 (such as a keyboard, a pointing device, a display 024, and so on); in the present invention, the computer system/server 012 communicates with an external radar device. It may also communicate with one or more devices that enable a user to interact with the computer system/server 012, and/or with any device (such as a network card or a modem) that enables the computer system/server 012 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 022. In addition, the computer system/server 012 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 020. As shown in Fig. 3, the network adapter 020 communicates with the other modules of the computer system/server 012 through the bus 018. It should be understood that, although not shown in Fig. 3, other hardware and/or software modules may be used in conjunction with the computer system/server 012, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
The processor 016 runs the programs stored in the system memory 028, thereby performing the functions and/or methods of the embodiments described in the present invention.
The above computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program which, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above embodiments of the present invention.
With the passage of time and the development of technology, the meaning of "medium" has become increasingly broad, and the distribution of computer programs is no longer limited to tangible media; programs may also, for example, be downloaded directly from a network. Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection having one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any suitable medium, including, but not limited to, wireless, wire, optical cable, RF, or any suitable combination of the above.
Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In scenarios involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and apparatuses may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processor, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.
Claims (15)
1. A speech recognition service proxy server, characterized in that the server comprises:
an MRCP proxy module, configured to receive a voice stream processing request sent by a media device; send the voice stream processing request to a service agent module for processing; receive a processing result from the service agent module; and send the processing result to the business system application corresponding to the voice service request;
a service agent module, configured to receive the voice stream processing request from the MRCP proxy module; interact with a voice service server to process the voice stream; and return the processing result to the MRCP proxy module.
2. The proxy server according to claim 1, characterized in that the voice service server is an ASR server and/or an extended service server.
3. The proxy server according to claim 2, characterized in that the service agent module comprises:
an ASR processing submodule, configured to send an ASR request to the ASR server and receive the speech recognition result returned by the ASR server; and/or
an extended service processing submodule, configured to send an extended service request to the extended service server and receive the extended service processing result returned by the extended service server.
4. The proxy server according to claim 1, characterized in that the service agent module is further configured to interact with a control server to control the media device.
5. The proxy server according to claim 4, characterized in that
the control server is an authentication/flow control/charging/security server;
the service agent module further comprises:
an authentication submodule, configured to interact with the authentication server to authenticate the media device;
a flow control submodule, configured to interact with the flow control server to apply flow control to the media device;
a charging submodule, configured to interact with the accounting server to charge the media device;
a security submodule, configured to interact with the security server to provide security services for the media device.
6. A speech recognition service proxy method based on the speech recognition service proxy server according to claims 1-5, characterized in that the method comprises:
a proxy server receiving a voice stream processing request sent by a media device;
interacting with a voice service server to process the voice stream and obtain a processing result;
sending the processing result to the business system application corresponding to the voice service request.
7. The method according to claim 6, characterized in that the proxy server receiving the voice stream processing request sent by the media device comprises:
the MRCP proxy module of the proxy server receiving the voice stream processing request sent by the media device, and sending the voice stream processing request to the service agent module for processing.
8. The method according to claim 7, characterized in that interacting with the voice service server to process the voice stream comprises:
the service agent module of the proxy server receiving the voice stream processing request, interacting with the voice service server to process the voice stream, and returning the processing result to the MRCP proxy module.
9. The method according to claim 8, characterized in that sending the processing result to the business system application corresponding to the voice service request comprises:
the MRCP proxy module sending the processing result to the business system application corresponding to the voice service request.
10. The method according to claim 9, characterized in that the voice service server is an ASR server and/or an extended service server.
11. The method according to claim 10, characterized in that the service agent module receiving the voice stream processing request and interacting with the voice service server to process the voice stream comprises:
sending an ASR request to the ASR server, and receiving the speech recognition result returned by the ASR server; and/or
sending an extended service request to the extended service server, and receiving the extended service processing result returned by the extended service server.
12. The method according to claim 9, characterized in that the method further comprises:
the service agent module interacting with a control server to control the media device.
13. The method according to claim 12, characterized in that
the control server is an authentication/flow control/charging/security server;
the service agent module interacting with the control server to control the media device further comprises:
interacting with the authentication server to authenticate the media device;
interacting with the flow control server to apply flow control to the media device;
interacting with the accounting server to charge the media device;
interacting with the security server to provide security services for the media device.
14. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method according to any one of claims 6 to 13.
15. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 6 to 13.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810758656.8A CN108833590B (en) | 2018-07-11 | 2018-07-11 | Voice recognition service proxy server and proxy method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810758656.8A CN108833590B (en) | 2018-07-11 | 2018-07-11 | Voice recognition service proxy server and proxy method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108833590A true CN108833590A (en) | 2018-11-16 |
CN108833590B CN108833590B (en) | 2021-10-26 |
Family
ID=64136036
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810758656.8A Active CN108833590B (en) | 2018-07-11 | 2018-07-11 | Voice recognition service proxy server and proxy method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108833590B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101079885A (en) * | 2007-06-26 | 2007-11-28 | 中兴通讯股份有限公司 | A system and method for providing automatic voice identification integrated development platform |
CN101677329A (en) * | 2008-09-18 | 2010-03-24 | 中兴通讯股份有限公司 | Comprehensive voice resource platform proxy server and its data processing method |
CN102427465A (en) * | 2011-08-18 | 2012-04-25 | 青岛海信电器股份有限公司 | Voice service proxy method and device and system for integrating voice application through proxy |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112786022A (en) * | 2019-11-11 | 2021-05-11 | 青岛海信移动通信技术股份有限公司 | Terminal, first voice server, second voice server and voice recognition method |
CN112786022B (en) * | 2019-11-11 | 2023-04-07 | 青岛海信移动通信技术股份有限公司 | Terminal, first voice server, second voice server and voice recognition method |
CN111050002A (en) * | 2019-12-17 | 2020-04-21 | 北京鸿博信通科技有限公司 | Intelligent telephone exchange and working method and system thereof |
CN111128198A (en) * | 2019-12-25 | 2020-05-08 | 厦门快商通科技股份有限公司 | Voiceprint recognition method, voiceprint recognition device, storage medium, server and voiceprint recognition system |
CN114500128A (en) * | 2022-02-07 | 2022-05-13 | 北京百度网讯科技有限公司 | Flow control charging method, device, system, electronic equipment, medium and product |
Also Published As
Publication number | Publication date |
---|---|
CN108833590B (en) | 2021-10-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||