CN108833590A - Speech recognition service proxy server and proxy method - Google Patents

Publication number
CN108833590A
Authority
CN
China
Prior art keywords
server
voice
proxy
request
media device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810758656.8A
Other languages
Chinese (zh)
Other versions
CN108833590B (en
Inventor
戴俊
常月
黄国瑞
张伟冬
先永春
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810758656.8A priority Critical patent/CN108833590B/en
Publication of CN108833590A publication Critical patent/CN108833590A/en
Application granted granted Critical
Publication of CN108833590B publication Critical patent/CN108833590B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/51 Discovery or management thereof, e.g. service location protocol [SLP] or web services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a speech recognition service proxy server and proxy method. The server includes an MRCP proxy module, configured to receive a voice stream processing request sent by a media device, send the request to a service agent module for processing, receive the processing result from the service agent module, and send the processing result to the business system application corresponding to the voice service request; and a service agent module, configured to receive the voice stream processing request from the MRCP proxy module, interact with a voice service server to process the voice stream, and return the processing result to the MRCP proxy module. The media device and the ASR service are decoupled, which facilitates service extension; the service agent service applies authentication, flow control, security, anti-cheating, and similar mechanisms to ASR service requests, so the scheme is well suited to the public cloud; and other business modules, such as search, translation, and intent recognition, can be mounted under the service agent service to extend the ASR service.

Description

Speech recognition service proxy server and proxy method
【Technical field】
The present invention relates to computer application technology, and in particular to a speech recognition service proxy server and proxy method.
【Background technique】
As ASR (Automatic Speech Recognition) technologies mature and are combined with one another, value-added services based on ASR continue to emerge and grow rapidly.
In the prior art, a media device can connect directly to an ASR service through MRCP (Media Resource Control Protocol). This approach has two disadvantages:
1. It supports private clouds well but public clouds poorly, because a public cloud requires mechanisms such as authentication, flow control, security, and anti-cheating. To support a public cloud, the native approach must either abandon these mechanisms or couple them to the ASR service;
2. The ASR service can only return the recognized text and cannot be extended easily; extending it requires coupling other business modules to the ASR service.
In addition, a media device can connect over HTTP to a shared cloud platform, which in turn connects to the ASR service. The disadvantages of this approach are as follows:
A media device usually cannot connect to ASR over HTTP and must be modified, so integration is inconvenient and incurs a modification cost. Moreover, the ASR service can only return the recognized text and cannot be extended easily.
【Summary of the invention】
Various aspects of the present application provide a speech recognition service proxy server, method, device, and storage medium that can decouple the media device from the business system, provide public-cloud services such as authentication, flow control, billing, and security, and extend the ASR service.
In one aspect of the present application, a speech recognition service proxy server is provided, the server including:
an MRCP proxy module, configured to receive a voice stream processing request sent by a media device; send the voice stream processing request to a service agent module for processing; receive a processing result from the service agent module; and send the processing result to the business system application corresponding to the voice service request;
a service agent module, configured to receive the voice stream processing request from the MRCP proxy module; interact with a voice service server to process the voice stream; and return the processing result to the MRCP proxy module.
In the above aspect and any possible implementation, an implementation is further provided in which the voice service server is an ASR server and/or an extended-service server.
In the above aspect and any possible implementation, an implementation is further provided in which the service agent module includes:
an ASR processing submodule, configured to send an ASR request to the ASR server and receive the speech recognition result returned by the ASR server; and/or
an extended-service processing submodule, configured to send an extended-service request to the extended-service server and receive the extended-service processing result returned by the extended-service server.
In the above aspect and any possible implementation, an implementation is further provided in which the service agent module is further configured to interact with a control server to control the media device.
In the above aspect and any possible implementation, an implementation is further provided in which the control server is an authentication/flow control/billing/security server;
the service agent module further includes:
an authentication submodule, configured to interact with the authentication server to authenticate the media device;
a flow control submodule, configured to interact with the flow control server to apply flow control to the media device;
a billing submodule, configured to interact with the billing server to bill the media device;
a security submodule, configured to interact with the security server to provide security services for the media device.
According to another aspect of the present application, a speech recognition service proxy method based on the above speech recognition service proxy server is provided, the method including:
the proxy server receiving a voice stream processing request sent by a media device;
interacting with a voice service server to process the voice stream and obtain a processing result;
sending the processing result to the business system application corresponding to the voice service request.
In the above aspect and any possible implementation, an implementation is further provided in which the proxy server receiving the voice stream processing request sent by the media device includes:
the MRCP proxy module of the proxy server receiving the voice stream processing request sent by the media device and sending the voice stream processing request to the service agent module for processing.
In the above aspect and any possible implementation, an implementation is further provided in which interacting with the voice service server to process the voice stream includes:
the service agent module of the proxy server receiving the voice stream processing request, interacting with the voice service server to process the voice stream, and returning the processing result to the MRCP proxy module.
In the above aspect and any possible implementation, an implementation is further provided in which sending the processing result to the business system application corresponding to the voice service request includes:
the MRCP proxy module sending the processing result to the business system application corresponding to the voice service request.
In the above aspect and any possible implementation, an implementation is further provided in which the voice service server is an ASR server and/or an extended-service server.
In the above aspect and any possible implementation, an implementation is further provided in which the service agent module receiving the voice stream processing request and interacting with the voice service server to process the voice stream includes:
sending an ASR request to the ASR server and receiving the speech recognition result returned by the ASR server; and/or
sending an extended-service request to the extended-service server and receiving the extended-service processing result returned by the extended-service server.
In the above aspect and any possible implementation, an implementation is further provided in which the method further includes:
the service agent module interacting with a control server to control the media device.
In the above aspect and any possible implementation, an implementation is further provided in which the control server is an authentication/flow control/billing/security server;
the service agent module interacting with the control server to control the media device further includes:
interacting with the authentication server to authenticate the media device;
interacting with the flow control server to apply flow control to the media device;
interacting with the billing server to bill the media device;
interacting with the security server to provide security services for the media device.
In another aspect of the present invention, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method described above when executing the program.
In another aspect of the present invention, a computer-readable storage medium is provided, on which a computer program is stored, wherein the program implements the method described above when executed by a processor.
As can be seen from the above, the scheme of the present invention adds an MRCP proxy service and a service agent service between the media device and the ASR service. First, the media device and the ASR service are decoupled, which facilitates service extension. Second, the service agent service applies authentication, flow control, security, anti-cheating, and similar mechanisms to ASR service requests, so the scheme is well suited to the public cloud. Third, other business modules can be mounted under the service agent service to extend the ASR service (for example search, translation, and intent recognition).
【Description of the drawings】
Fig. 1 is a schematic diagram of an implementation of the speech recognition service proxy server of the present invention;
Fig. 2 is a flowchart of the speech recognition service proxy method of the present invention;
Fig. 3 is a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention.
【Specific embodiment】
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art from the embodiments in the present application without creative effort fall within the protection scope of the present application.
Fig. 1 is a schematic diagram of an implementation of the speech recognition service proxy server of the present invention in a speech synthesis platform. Fig. 1 shows a media device, a speech recognition proxy server, a business system, an ASR server, an extended-service server, and a control server.
The media device is connected to the speech recognition proxy server, and the speech recognition proxy server is connected to the business system application, the ASR server, the extended-service server, and the control server. Specifically, the speech recognition proxy server includes an MRCP proxy module and a service agent module. The media device is connected to the MRCP proxy module, and the MRCP proxy module is connected to the service agent module. The MRCP proxy module is also connected to the business system. The service agent module is connected to the ASR server, the extended-service server, and the control server.
The speech recognition proxy server is configured to receive a voice stream processing request sent by the media device; interact with a voice service server to process the voice stream and obtain a processing result; and send the processing result to the business system application corresponding to the voice service request.
The speech recognition proxy server includes:
an MRCP proxy module, configured to receive a voice stream processing request sent by the media device; send the voice stream processing request to the service agent module for processing; receive a processing result from the service agent module; and send the processing result to the business system application corresponding to the voice service request;
a service agent module, configured to receive the voice stream processing request from the MRCP proxy module; interact with a voice service server to process the voice stream; and return the processing result to the MRCP proxy module.
Preferably, the MRCP proxy module is connected to the media device. The media device sends a voice stream processing request, such as an ASR processing request, to the MRCP proxy module.
The MRCP proxy module sends the voice stream processing request to the service agent module for processing.
The service agent module includes an ASR processing submodule, configured to generate the corresponding voice service request from the voice stream processing request and send it to the corresponding voice service server. For example, from an ASR processing request it generates an ASR request, sends it to the ASR server for processing, and receives the speech recognition result returned by the ASR server.
In this embodiment, the service agent module is also connected to an extended-service server in order to provide extended voice services, such as a translation service. The service agent module includes an extended-service processing submodule. When the voice stream processing request sent by the media device and forwarded by the MRCP proxy module is a translation processing request, this submodule generates a translation request, sends it to the translation server for processing, and receives the translation result returned by the translation server.
After receiving the processing result from the ASR server and/or the extended-service server, the service agent module sends the processing result to the MRCP proxy module, so that the MRCP proxy module sends it to the business system application corresponding to the voice service request.
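The request path described above (media device, MRCP proxy module, service agent module, voice service server, and back to the business system application) can be sketched roughly as follows. This is a minimal illustration only: the class names, the `kind` and `app_id` fields, and the stub ASR and translation backends are all assumptions made for the example, and no real MRCP signalling is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VoiceStreamRequest:
    kind: str    # hypothetical request type: "asr" or "translate"
    audio: bytes # voice stream payload
    app_id: str  # hypothetical key for the target business system application


class ServiceAgentModule:
    """Routes a request to the matching voice service backend (assumed callables)."""

    def __init__(self, backends: Dict[str, Callable[[bytes], str]]) -> None:
        self._backends = backends

    def handle(self, request: VoiceStreamRequest) -> str:
        backend = self._backends.get(request.kind)
        if backend is None:
            raise ValueError(f"unsupported request kind: {request.kind}")
        return backend(request.audio)


class MrcpProxyModule:
    """Receives requests, delegates to the service agent, delivers the result."""

    def __init__(self, agent: ServiceAgentModule,
                 apps: Dict[str, Callable[[str], None]]) -> None:
        self._agent = agent
        self._apps = apps

    def on_request(self, request: VoiceStreamRequest) -> str:
        result = self._agent.handle(request)
        self._apps[request.app_id](result)  # deliver to the business application
        return result


def stub_asr(audio: bytes) -> str:
    # Stand-in for an ASR server: pretends the audio bytes are the transcript.
    return audio.decode("utf-8")


def stub_translate(audio: bytes) -> str:
    # Stand-in for a translation server: "translates" by upper-casing.
    return stub_asr(audio).upper()


delivered: List[str] = []
proxy = MrcpProxyModule(
    ServiceAgentModule({"asr": stub_asr, "translate": stub_translate}),
    {"demo-app": delivered.append},
)
proxy.on_request(VoiceStreamRequest("asr", b"hello", "demo-app"))
proxy.on_request(VoiceStreamRequest("translate", b"hello", "demo-app"))
```

Because the media device only ever talks to `MrcpProxyModule`, a new backend can be registered in the `backends` mapping without touching the device side, which is the decoupling the embodiment aims for.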
Preferably, the MRCP proxy module sends the processing result to the media device, and the media device sends the processing result to the business system application.
Because the proxy server is transparent to the media device, the user perceives no difference between the above process and the existing procedure in which the media device sends the voice stream processing request to the ASR server, receives the speech recognition result from the ASR server, and sends the speech recognition result to the business system application.
Preferably, in a preferred implementation of this embodiment, in order to provide the mechanisms a public cloud requires, such as authentication, flow control, billing, security, and anti-cheating, the service agent module is connected to control servers that provide authentication, flow control, billing, security, anti-cheating, and so on, and interacts with these control servers to control the media device.
Preferably, the voice stream processing request further includes an authentication request and the account and password of the user logged in to the media device.
In this embodiment, the service agent module further includes an authentication submodule, configured to interact with the authentication server to authenticate the media device; a flow control submodule, configured to interact with the flow control server to apply flow control to the media device; a billing submodule, configured to interact with the billing server to bill the media device; and a security submodule, configured to interact with the security server to provide security services for the media device.
The authentication submodule sends an application authentication request to the authentication server. The request includes the account and password of the user logged in to the media device. The authentication server authenticates the account and password and, if they are valid, returns an authentication-passed signal to the authentication submodule. After authentication succeeds, the service agent module generates the corresponding voice service request from the voice stream processing request and sends it to the corresponding ASR server and/or extended-service server. After receiving the processing result from the ASR server and/or extended-service server, it sends the processing result to the MRCP proxy module, so that the MRCP proxy module sends it to the business system application corresponding to the voice service request.
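The authenticate-then-forward order described above might look like the following sketch. The `StubAuthServer` class, the in-memory credential table, and the `process_with_auth` helper are hypothetical stand-ins for the authentication server and authentication submodule; a real deployment would not store or compare plaintext passwords this way.

```python
import hmac
from typing import Callable, Dict, Optional


class StubAuthServer:
    """Hypothetical authentication server backed by an in-memory table."""

    def __init__(self, credentials: Dict[str, str]) -> None:
        self._credentials = credentials

    def check(self, account: str, password: str) -> bool:
        expected = self._credentials.get(account)
        if expected is None:
            return False
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(expected.encode("utf-8"),
                                   password.encode("utf-8"))


def process_with_auth(auth_server: StubAuthServer,
                      account: str,
                      password: str,
                      audio: bytes,
                      recognize: Callable[[bytes], str]) -> Optional[str]:
    """Forward the audio to the recognizer only if authentication passes."""
    if not auth_server.check(account, password):
        return None  # request rejected; nothing reaches the ASR backend
    return recognize(audio)
```

Returning `None` on failure is just one possible convention for the sketch; the source does not say how a rejected request is reported back to the media device.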
Preferably, when the resources of the ASR server and/or the extended-service server are ready, a connection-success message is returned to the service agent module. After receiving the connection-success message, the service agent module establishes a connection with the ASR server and/or extended-service server; at the same time, the billing submodule sends a billing-start signal to the billing server, and the flow control submodule sends a flow-control-start signal to the flow control server. When the ASR server and/or extended-service server has completed the user's voice service request, it returns a recognition-complete message to the service agent module. After receiving the recognition-complete message, the service agent module sends a request to the ASR server and/or extended-service server to release the resource connection; at the same time, the billing submodule sends a stop-billing message to the billing server, and the flow control submodule sends a stop-flow-control message to the flow control server. Preferably, the billing server bills according to the duration or traffic volume of the voice service request.
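The start and stop signalling around a recognition session can be illustrated with a small sketch. Everything here is assumed for the example: the submodule classes, the per-second rate, the concurrent-session limit used as a simple form of flow control, and the injectable clock. Billing is by duration, one of the two options the text mentions.

```python
from typing import Callable, Tuple


class BillingSubmodule:
    """Bills by session duration using an injectable clock (assumption)."""

    def __init__(self, clock: Callable[[], float], rate_per_second: float) -> None:
        self._clock = clock
        self._rate = rate_per_second
        self._started_at = 0.0

    def start(self) -> None:
        self._started_at = self._clock()

    def stop(self) -> float:
        duration = self._clock() - self._started_at
        return duration * self._rate


class FlowControlSubmodule:
    """Caps the number of concurrent sessions (one simple flow-control policy)."""

    def __init__(self, max_sessions: int) -> None:
        self._max = max_sessions
        self.active = 0

    def start(self) -> None:
        if self.active >= self._max:
            raise RuntimeError("too many concurrent sessions")
        self.active += 1

    def stop(self) -> None:
        self.active -= 1


def run_session(billing: BillingSubmodule,
                flow: FlowControlSubmodule,
                recognize: Callable[[bytes], str],
                audio: bytes) -> Tuple[str, float]:
    """Start flow control and billing, run recognition, then stop both."""
    flow.start()
    billing.start()
    try:
        result = recognize(audio)
    finally:
        # Stop signals are sent even if recognition fails.
        fee = billing.stop()
        flow.stop()
    return result, fee
```

In production the clock would typically be `time.monotonic`; passing it in as a parameter keeps the duration calculation easy to check in isolation.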
Preferably, in this embodiment, accessing a third-party service server over the public network can easily introduce security risks, including the security of session creation, protection of the control session, protection of the media session, indirect access to content, and protection of stored media files. The security submodule of the service agent module is therefore connected to the security server, which provides the security service.
Preferably, the authentication, flow control, billing, and security servers may also be integrated into the service agent module, which then provides authentication, flow control, billing, and security services directly.
With the proxy server of this embodiment, an MRCP proxy service and a service agent service are added between the media device and the ASR service. First, the media device and the ASR service are decoupled, which facilitates service extension. Second, the service agent service applies authentication, flow control, security, anti-cheating, and similar mechanisms to ASR service requests, so the scheme is well suited to the public cloud. Third, other business modules can be mounted under the service agent service to extend the ASR service (for example search, translation, and intent recognition).
In the embodiments provided herein, it should be understood that the disclosed method and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. Multiple units or components may be combined or integrated into another system, and some features may be omitted or not executed. In addition, the couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processor, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware combined with software functional units.
Fig. 2 is a flowchart of the speech recognition service proxy method based on the speech recognition service proxy server of the present invention. As shown in Fig. 2, the method includes:
Step S21: the proxy server receives a voice stream processing request sent by a media device;
Step S22: the proxy server interacts with a voice service server to process the voice stream and obtain a processing result;
Step S23: the processing result is sent to the business system application corresponding to the voice service request.
In a preferred implementation of step S21,
the proxy server receives the voice stream processing request sent by the media device.
Preferably, the MRCP proxy module of the proxy server receives the voice stream processing request sent by the media device.
Preferably, the MRCP proxy module is connected to the media device. The media device sends a voice stream processing request, such as an ASR processing request, to the MRCP proxy module. The MRCP proxy module sends the voice stream processing request, for example the ASR processing request, to the service agent module of the proxy server for processing.
In a preferred implementation of step S22,
the proxy server interacts with the voice service server to process the voice stream and obtain a processing result.
Preferably, the service agent module of the proxy server receives the voice stream processing request sent by the MRCP proxy module and interacts with the voice service server to process the voice stream.
Preferably, the service agent module includes a processing submodule, configured to generate the corresponding voice service request from the voice stream processing request and send it to the corresponding voice service server. For example, the ASR processing submodule generates an ASR request from the ASR processing request, sends it to the ASR server for processing, and receives the speech recognition result returned by the ASR server.
In this embodiment, the service agent module is also connected to an extended-service server in order to provide extended voice services, such as a translation service. The service agent module includes an extended-service processing submodule. When the voice stream processing request sent by the media device and forwarded by the MRCP proxy module is a translation processing request, this submodule generates a translation request, sends it to the translation server for processing, and receives the translation result returned by the translation server.
After receiving the processing result from the ASR server and/or the extended-service server, the service agent module sends the processing result to the MRCP proxy module, so that the MRCP proxy module sends it to the business system application corresponding to the voice service request.
Preferably, in a preferred implementation of this embodiment, in order to provide the mechanisms a public cloud requires, such as authentication, flow control, billing, security, and anti-cheating, the service agent module is connected to control servers that provide authentication, flow control, billing, security, anti-cheating, and so on, and interacts with these control servers to control the media device.
Preferably, the voice stream processing request further includes an authentication request and the account and password of the user logged in to the media device.
In this embodiment, the service agent module further includes an authentication submodule, configured to interact with the authentication server to authenticate the media device; a flow control submodule, configured to interact with the flow control server to apply flow control to the media device; a billing submodule, configured to interact with the billing server to bill the media device; and a security submodule, configured to interact with the security server to provide security services for the media device.
The authentication submodule sends an application authentication request to the authentication server. The request includes the account and password of the user logged in to the media device. The authentication server authenticates the account and password and, if they are valid, returns an authentication-passed signal to the authentication submodule. After authentication succeeds, the service agent module generates the corresponding voice service request from the voice stream processing request and sends it to the corresponding ASR server and/or extended-service server. After receiving the processing result from the ASR server and/or extended-service server, it sends the processing result to the MRCP proxy module, so that the MRCP proxy module sends it to the business system application corresponding to the voice service request.
Preferably, when the resources of the ASR server and/or the extended-service server are ready, a connection-success message is returned to the service agent module. After receiving the connection-success message, the service agent module establishes a connection with the ASR server and/or extended-service server; at the same time, the billing submodule sends a billing-start signal to the billing server, and the flow control submodule sends a flow-control-start signal to the flow control server. When the ASR server and/or extended-service server has completed the user's voice service request, it returns a recognition-complete message to the service agent module. After receiving the recognition-complete message, the service agent module sends a request to the ASR server and/or extended-service server to release the resource connection; at the same time, the billing submodule sends a stop-billing message to the billing server, and the flow control submodule sends a stop-flow-control message to the flow control server. Preferably, the billing server bills according to the duration or traffic volume of the voice service request.
Preferably, in this embodiment, accessing a third-party service server over the public network can easily introduce security risks, including the security of session creation, protection of the control session, protection of the media session, indirect access to content, and protection of stored media files. The security submodule of the service agent module is therefore connected to the security server, which provides the security service.
Preferably, the authentication, flow control, billing, and security servers may also be integrated into the service agent module, which then provides authentication, flow control, billing, and security services directly.
In a kind of preferred implementation of step S23,
The processing result is sent to the corresponding operation system application of institute's voice traffic request.
Preferably, the processing result received from business proxy module is sent to business by the MRCP proxy server System application.
Preferably, the processing result received from business proxy module is sent to media and set by the MRCP proxy server It is standby, the processing result is sent to operation system application by media device.
Due to the proxy server be to the media device it is transparent, user will not perceive above-mentioned treatment process with The request of voice stream process is sent to ASR server by existing media device, receives the speech recognition result of ASR server, will The operating process that institute's speech recognition result is sent to operation system application has and difference.
With the method of this embodiment, the service agent handles authentication, flow control, security, anti-cheating, and so on for ASR service requests, so the method is well suited to public clouds; other service modules can be mounted under the service agent to extend the ASR service (for example, search, translation, and intent recognition).
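As a rough illustration of mounting extension services behind the service agent, the sketch below authenticates a request, runs a placeholder ASR step, and then applies the requested extensions in order. All names (`fake_asr`, `EXTENSIONS`, `process`) and the token-based check are assumptions made for this example; the placeholder functions do not perform real recognition, translation, or intent analysis.

```python
from typing import Callable, Dict, List, Set

Extension = Callable[[str], str]


def fake_asr(audio: bytes) -> str:
    """Placeholder ASR: pretends the audio bytes are already UTF-8 text."""
    return audio.decode("utf-8")


# Hypothetical extension services mounted under the service agent.
EXTENSIONS: Dict[str, Extension] = {
    "translate": lambda text: text.upper(),           # stand-in for translation
    "intent": lambda text: "intent:" + text.split()[0],  # stand-in for intent recognition
}


def process(audio: bytes, token: str, extensions: List[str],
            valid_tokens: Set[str]) -> str:
    # Authentication happens in the service agent, before any backend is used.
    if token not in valid_tokens:
        raise PermissionError("authentication failed")
    result = fake_asr(audio)
    # Each requested extension consumes the previous stage's output.
    for name in extensions:
        result = EXTENSIONS[name](result)
    return result
```

For example, `process(b"play music", "t1", ["intent"], {"t1"})` returns `"intent:play"`, while an unknown token raises `PermissionError` before the ASR step runs.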
It should be noted that, for simplicity of description, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the application is not limited by the described order of actions, because according to the application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the application.
The above describes the method embodiments. The scheme of the present invention is further explained below through apparatus embodiments.
Fig. 3 shows a block diagram of an exemplary computer system/server 012 suitable for implementing embodiments of the present invention. The computer system/server 012 shown in Fig. 3 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 3, computer system/server 012 takes the form of a general-purpose computing device. The components of computer system/server 012 may include, but are not limited to: one or more processors or processing units 016, a system memory 028, and a bus 018 that connects the different system components (including system memory 028 and processor 016).
Bus 018 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. For example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Computer system/server 012 typically includes a variety of computer-system-readable media. These media may be any available media that can be accessed by computer system/server 012, including volatile and non-volatile media, and removable and non-removable media.
System memory 028 may include computer-system-readable media in the form of volatile memory, such as random access memory (RAM) 030 and/or cache memory 032. Computer system/server 012 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage system 034 may be used to read from and write to non-removable, non-volatile magnetic media (not shown in Fig. 3, commonly referred to as a "hard disk drive"). Although not shown in Fig. 3, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (such as a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (such as a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 018 through one or more data media interfaces. Memory 028 may include at least one program product having a set of (for example, at least one) program modules configured to carry out the functions of the embodiments of the present invention.
Program/utility 040, having a set of (at least one) program modules 042, may be stored in, for example, memory 028. Such program modules 042 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination of them, may include an implementation of a network environment. Program modules 042 generally carry out the functions and/or methods of the embodiments described in the present invention.
Computer system/server 012 may also communicate with one or more external devices 014 (such as a keyboard, a pointing device, a display 024, etc.). In the present invention, computer system/server 012 communicates with an external radar device; it may also communicate with one or more devices that enable a user to interact with computer system/server 012, and/or with any device (such as a network card, a modem, etc.) that enables computer system/server 012 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interface 022. Computer system/server 012 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through network adapter 020. As shown in Fig. 3, network adapter 020 communicates with the other modules of computer system/server 012 through bus 018. It should be understood that, although not shown in Fig. 3, other hardware and/or software modules may be used in conjunction with computer system/server 012, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
Processor 016 runs the programs stored in system memory 028 and thereby carries out the functions and/or methods of the embodiments described in the present invention.
The above computer program may be provided in a computer storage medium; that is, the computer storage medium is encoded with a computer program that, when executed by one or more computers, causes the one or more computers to perform the method flows and/or apparatus operations shown in the above embodiments of the present invention.
With the passage of time and the development of technology, the meaning of "medium" has become increasingly broad: the distribution of computer programs is no longer limited to tangible media, and programs can also be downloaded directly from a network. Any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of computer-readable storage media include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any suitable medium, including, but not limited to, wireless, wire, optical cable, RF, etc., or any suitable combination of the above.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above may be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed methods and apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the scheme of this embodiment.
In addition, the functional units in the embodiments of the application may be integrated into one processor, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the application, not to limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the application.

Claims (15)

1. A speech recognition service proxy server, characterized in that the server comprises:
An MRCP proxy module, configured to receive a voice stream processing request sent by a media device; send the voice stream processing request to a service agent module for processing; receive a processing result from the service agent module; and send the processing result to the business system application corresponding to the voice service request;
A service agent module, configured to receive the voice stream processing request from the MRCP proxy module; interact with a voice service server to process the voice stream; and return the processing result to the MRCP proxy module.
2. The proxy server according to claim 1, characterized in that the voice service server is an ASR server and/or an extension service server.
3. The proxy server according to claim 2, characterized in that the service agent module comprises:
An ASR processing submodule, configured to send an ASR request to the ASR server and receive the speech recognition result returned by the ASR server; and/or
An extension service processing submodule, configured to send an extension service request to the extension service server and receive the extension service processing result returned by the extension service server.
4. The proxy server according to claim 1, characterized in that the service agent module is further configured to interact with a control server to control the media device.
5. The proxy server according to claim 4, characterized in that
The control server is an authentication/flow control/accounting/security server;
The service agent module further comprises:
An authentication submodule, configured to interact with the authentication server to authenticate the media device;
A flow control submodule, configured to interact with the flow control server to apply flow control to the media device;
A charging submodule, configured to interact with the accounting server to charge the media device;
A security submodule, configured to interact with the security server to provide security services for the media device.
6. A speech recognition service proxy method based on the speech recognition service proxy server according to any one of claims 1 to 5, characterized in that the method comprises:
Receiving, by the proxy server, a voice stream processing request sent by a media device;
Interacting with a voice service server to process the voice stream and obtain a processing result;
Sending the processing result to the business system application corresponding to the voice service request.
7. The method according to claim 6, characterized in that the proxy server receiving the voice stream processing request sent by the media device comprises:
The MRCP proxy module of the proxy server receiving the voice stream processing request sent by the media device, and sending the voice stream processing request to the service agent module for processing.
8. The method according to claim 7, characterized in that interacting with the voice service server to process the voice stream comprises:
The service agent module of the proxy server receiving the voice stream processing request, interacting with the voice service server to process the voice stream, and returning the processing result to the MRCP proxy module.
9. The method according to claim 8, characterized in that sending the processing result to the business system application corresponding to the voice service request comprises:
The MRCP proxy module sending the processing result to the business system application corresponding to the voice service request.
10. The method according to claim 9, characterized in that the voice service server is an ASR server and/or an extension service server.
11. The method according to claim 10, characterized in that the service agent module receiving the voice stream processing request and interacting with the voice service server to process the voice stream comprises:
Sending an ASR request to the ASR server and receiving the speech recognition result returned by the ASR server; and/or
Sending an extension service request to the extension service server and receiving the extension service processing result returned by the extension service server.
12. The method according to claim 9, characterized in that the method further comprises:
The service agent module interacting with a control server to control the media device.
13. The method according to claim 12, characterized in that
The control server is an authentication/flow control/accounting/security server;
The service agent module interacting with the control server to control the media device further comprises:
Interacting with the authentication server to authenticate the media device;
Interacting with the flow control server to apply flow control to the media device;
Interacting with the accounting server to charge the media device;
Interacting with the security server to provide security services for the media device.
14. A computer device, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method according to any one of claims 6 to 13.
15. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 6 to 13.
CN201810758656.8A 2018-07-11 2018-07-11 Voice recognition service proxy server and proxy method Active CN108833590B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810758656.8A CN108833590B (en) 2018-07-11 2018-07-11 Voice recognition service proxy server and proxy method

Publications (2)

Publication Number Publication Date
CN108833590A true CN108833590A (en) 2018-11-16
CN108833590B CN108833590B (en) 2021-10-26

Family

ID=64136036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810758656.8A Active CN108833590B (en) 2018-07-11 2018-07-11 Voice recognition service proxy server and proxy method

Country Status (1)

Country Link
CN (1) CN108833590B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101079885A (en) * 2007-06-26 2007-11-28 中兴通讯股份有限公司 A system and method for providing automatic voice identification integrated development platform
CN101677329A (en) * 2008-09-18 2010-03-24 中兴通讯股份有限公司 Comprehensive voice resource platform proxy server and its data processing method
CN102427465A (en) * 2011-08-18 2012-04-25 青岛海信电器股份有限公司 Voice service proxy method and device and system for integrating voice application through proxy

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112786022A (en) * 2019-11-11 2021-05-11 青岛海信移动通信技术股份有限公司 Terminal, first voice server, second voice server and voice recognition method
CN112786022B (en) * 2019-11-11 2023-04-07 青岛海信移动通信技术股份有限公司 Terminal, first voice server, second voice server and voice recognition method
CN111050002A (en) * 2019-12-17 2020-04-21 北京鸿博信通科技有限公司 Intelligent telephone exchange and working method and system thereof
CN111128198A (en) * 2019-12-25 2020-05-08 厦门快商通科技股份有限公司 Voiceprint recognition method, voiceprint recognition device, storage medium, server and voiceprint recognition system
CN114500128A (en) * 2022-02-07 2022-05-13 北京百度网讯科技有限公司 Flow control charging method, device, system, electronic equipment, medium and product

Also Published As

Publication number Publication date
CN108833590B (en) 2021-10-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant