CN113408736B - Processing method and device of voice semantic model - Google Patents


Info

Publication number
CN113408736B
CN113408736B (grant of application CN202110475912.4A)
Authority
CN
China
Prior art keywords
menu
file
semantic model
standard question
voice
Prior art date
Legal status (assumed, not a legal conclusion)
Active
Application number
CN202110475912.4A
Other languages
Chinese (zh)
Other versions
CN113408736A (en)
Inventor
张兰英
江黎枫
钟亮
李培
郭玉春
王永彬
许璐
文禄
张海宁
李蔷
Current Assignee (the listed assignee may be inaccurate)
Postal Savings Bank of China Ltd
Original Assignee
Postal Savings Bank of China Ltd
Priority date (assumed, not a legal conclusion)
Filing date
Publication date
Application filed by Postal Savings Bank of China Ltd
Priority to CN202110475912.4A
Publication of CN113408736A (application)
Application granted
Publication of CN113408736B (grant)
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Acoustics & Sound (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for processing a voice semantic model. The method comprises the following steps: deploying a voice semantic model; and synchronizing a menu standard question file to the application program of a target tenant, so that the application program matches its local menu file against the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model. The method and device solve the technical problem in the prior art that an APP tenant must manually trigger a new voice semantic model to take effect.

Description

Processing method and device of voice semantic model
Technical Field
The invention relates to the technical field of voice processing, in particular to a processing method and device of a voice semantic model.
Background
With the rise of deep learning in recent years, speech recognition and natural language processing have developed rapidly, fundamentally changing how users interact with financial products: beyond the necessary GUI, more and more financial systems and business products converse with users by voice or text through a CUI (Conversational User Interface). At present, every bank operates multiple customer-facing financial APPs, such as mobile banking, lifestyle applications, and online lending for customers, as well as mobile business, financial transaction handling, and comprehensive service applications for operators. The complexity of an application's GUI shows in its navigation and hierarchy: determined by the GUI structure, these applications have many functions, deep menu hierarchies, and complex interactive controls, so users must search and operate through many levels, at a very high learning cost. Voice is one of the most natural modes of human interaction, with the advantages of being direct, clear, and fast. Voice instructions aim to use speech recognition and semantic analysis to reach a menu directly, eliminating hierarchical navigation, shortening user operation paths, and improving convenience. In the era of intelligent finance, more and more foreground APPs need to provide voice instruction services to customers. From the perspective of the voice semantic middle platform, all these foreground APPs can be uniformly regarded as APP tenants of the voice semantic middle platform.
In providing voice instruction services to each APP tenant, the voice semantic middle platform faces one main problem: in the internet finance era, APP tenants constantly adjust and optimize their layouts to improve user experience, and their menus change accordingly, so the middle platform's services must adapt to frequent menu changes in every APP tenant. The current adaptation scheme works as follows. The APP tenant's knowledge team sorts out added menus, changed menus, deleted menus, and the corresponding extension questions, and submits them to the voice semantic middle platform's service team. The middle platform's model development team trains a model on the corpus provided by the tenant, brings the trained model online, and at the same time hands the menus and their standard question IDs to the APP tenant's application development team. The tenant's application development team then updates the mapping between standard question IDs and menu IDs, manually triggers the loading of the corresponding file, and activates the new configuration file so that the new model takes effect. The problem with this process is that the APP tenant's manual operations are tightly coupled to the training of the middle platform's model: only after receiving the new model's standard question IDs can the APP tenant upgrade its configuration and reload its files to bring the new model into effect.
In other words, the existing implementation suffers from tight coupling between the APP tenant's manual operations and the activation of the middle platform's model. The APP tenant provides its menus and the corresponding extension questions to the voice semantic middle platform; the middle platform trains a model on this corpus and returns the menus with their standard question IDs; and only after the APP tenant receives them, upgrades its configuration, and reloads its files can the new model take effect.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
The embodiments of the invention provide a method and device for processing a voice semantic model, which at least solve the technical problem in the prior art that an APP tenant must manually trigger a new voice semantic model to take effect.
According to an aspect of the embodiment of the present invention, there is provided a method for processing a speech semantic model, including: deploying a voice semantic model; synchronizing a menu standard question file to an application program of a target tenant, so that the application program of the target tenant matches a local menu file with the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model; wherein the local menu file at least comprises: menu name and menu identification.
Optionally, synchronizing the menu standard question file to the application program of the target tenant includes: generating menu standard question files for the application programs of different types of target tenants, wherein each menu standard question file comprises at least a menu name and a standard question identifier; and synchronizing the menu standard question file to the corresponding target tenant's application program according to a preset push mode.
Optionally, after enabling the voice semantic model, the method further comprises: acquiring the voice information received by the target tenant's application program; recognizing the voice information with the voice semantic model to obtain a real-time message, wherein the real-time message comprises at least the standard question identifier with its corresponding menu name and confidence; and sending the real-time message to the target tenant's application program, so that the application program matches the corresponding menu identifier from the local menu file according to the standard question identifier in the real-time message, and performs menu display or interface jump according to that menu identifier.
Optionally, recognizing the voice information with the voice semantic model to obtain the real-time message includes: inputting the voice information into the voice semantic model, which recognizes the corresponding standard question identifier; determining the menu name and confidence corresponding to the standard question identifier; and generating the real-time message from the standard question identifier, the corresponding menu name, and the confidence.
Optionally, before deploying the voice semantic model, the method further comprises: obtaining updated corpus tables, comprising at least an extended question table, a standard question table, and a menu name table; and training the voice semantic model with the corpora in the updated tables, wherein models are trained separately on the corpora of the extended question table, the standard question table, and the menu name table, and the trained models are cross-validated against the different corpus tables until the voice semantic model finally used for deployment is obtained.
Optionally, before obtaining the updated corpus tables, the method includes: acquiring the corpus file of the target tenant's application program; extracting the corpus information in the corpus file, comprising at least the menu names, the standard questions, and the extended questions corresponding to the standard questions; and updating the corpus tables according to the corpus information.
Optionally, before obtaining the updated corpus, the method further includes: setting a local menu file of an application program of the target tenant, wherein the local menu file at least comprises: menu name and menu identification.
According to another aspect of the embodiments of the present invention, there is also provided a processing device for a voice semantic model, comprising: a deployment module for deploying the voice semantic model; and a synchronization module for synchronizing the menu standard question file to the application program of the target tenant, so that the application program matches the local menu file against the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model; wherein the local menu file comprises at least a menu name and a menu identifier.
According to another aspect of the embodiment of the present invention, there is further provided a computer readable storage medium, where the computer readable storage medium includes a stored program, and when the program runs, the device in which the computer readable storage medium is controlled to execute the method for processing the speech semantic model according to any one of the above.
According to another aspect of the embodiment of the present invention, there is further provided a processor, where the processor is configured to run a program, and when the program runs, perform a method for processing a speech semantic model according to any one of the above.
In the embodiments of the invention, a voice semantic model is deployed and a menu standard question file is synchronized to the application program of a target tenant, so that the application program matches its local menu file against the menu standard question file, generates a menu association file, and enables the voice semantic model. Because enabling the model only requires synchronizing the menu standard question file to the target tenant's application program, the APP tenant no longer needs to trigger the new voice semantic model manually; the model takes effect in near real time after deployment, providing each APP tenant with a more efficient, unified, and automated voice instruction service, and thereby solving the technical problem in the prior art that an APP tenant must manually trigger a new voice semantic model to take effect.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiments of the invention and together with the description serve to explain the invention and do not constitute a limitation on the invention. In the drawings:
FIG. 1 is a flow chart of a method of processing a speech semantic model according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a speech semantic model correlation table according to an alternative embodiment of the present invention;
FIG. 3 is a flow chart of model training in synchronization with a standard questioning document in accordance with an alternative embodiment of the invention;
FIG. 4 is a flow chart of a speech semantic instruction service according to an alternative embodiment of the present invention;
fig. 5 is a schematic diagram of a processing apparatus of a speech semantic model according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort shall fall within the scope of the invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
According to an embodiment of the present invention, an embodiment of a processing method of a voice semantic model is provided. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps may be performed in an order different from that shown or described here.
FIG. 1 is a flowchart of a processing method of a speech semantic model according to an embodiment of the present invention, as shown in FIG. 1, the processing method of the speech semantic model includes the steps of:
step S102, deploying a voice semantic model;
step S104, synchronizing the menu standard question file to the application program of the target tenant, so that the application program of the target tenant matches the local menu file and the menu standard question file, generates a menu associated file, and loads the menu associated file to enable the voice semantic model;
wherein the local menu file comprises at least: menu name and menu identification.
The menu association file maps menu identifiers, standard question identifiers, and menu names to one another. Optionally, the application program of the target tenant is referred to as an APP tenant.
It should be noted that the above embodiment may be applied to a voice semantic middle platform; that is, the steps may be implemented on the voice semantic middle platform.
Through the above steps, the menu standard question file can be synchronized to the application program of the target tenant so that the application program can enable the voice semantic model on its own. This removes the need for the APP tenant to trigger the new voice semantic model manually, achieves the technical effect of providing each APP tenant with a more efficient, unified, and automated voice instruction service after model deployment, and solves the technical problem in the prior art that an APP tenant must manually trigger a new voice semantic model to take effect.
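The patent does not fix a concrete file format for the local menu file or the menu standard question file. As a minimal sketch, assuming each file has already been parsed into rows of dictionaries (the field names `menu_name`, `menu_id`, and `standard_question_id` are illustrative assumptions), the menu-name join that produces the menu association file could look like:

```python
def build_menu_association(local_menu, standard_questions):
    """Join the tenant's local menu file (menu name -> menu ID) with the
    synchronized menu standard question file (menu name -> standard question ID)
    on the menu name, producing the rows of the menu association file."""
    qid_by_name = {r["menu_name"]: r["standard_question_id"]
                   for r in standard_questions}
    return [
        {"menu_id": r["menu_id"],
         "standard_question_id": qid_by_name[r["menu_name"]],
         "menu_name": r["menu_name"]}
        for r in local_menu
        if r["menu_name"] in qid_by_name  # skip menus the new model does not cover
    ]
```

Because the join key is the menu name rather than an offline-transferred standard question ID, the tenant can regenerate this file automatically whenever a new menu standard question file arrives.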
Optionally, synchronizing the menu standard question file to the application program of the target tenant includes: generating menu standard question files for the application programs of different types of target tenants, wherein each menu standard question file comprises at least a menu name and a standard question identifier; and synchronizing the menu standard question file to the corresponding target tenant's application program according to a preset push mode.
In an alternative embodiment, the menu name table may be extracted and standard question identifiers added or modified, after which menu standard question files are generated for the application programs of the different types of target tenants. Each menu standard question file includes the menu name and the standard question identifier; in practice, it includes but is not limited to these two fields.
The preset push mode includes, but is not limited to, a day-end batch push, a near-real-time file push, and the like. For example, the menu standard question file may be synchronized to the corresponding target tenant's application program as a day-end batch, or as a near-real-time file. This embodiment allows the menu standard question file to be flexibly synchronized to the corresponding target tenant's application program.
Optionally, after the speech semantic model is enabled, the method further comprises: acquiring voice information received by an application program of a target tenant; identifying voice information according to the voice semantic model to obtain a real-time message; and sending the real-time message to the application program of the target tenant, so that the application program of the target tenant matches the corresponding menu identification from the local menu file according to the standard question identification in the real-time message, and performs menu display or interface jump according to the menu identification.
The real-time messages include, but are not limited to, standard question identification, menu names, corresponding confidence levels, and the like.
In an alternative embodiment, the voice information received by the target tenant's application program is first acquired; the voice information is then recognized by the voice semantic model to obtain a real-time message; and the real-time message is sent to the target tenant's application program. The application program matches the corresponding menu identifier from the local menu file according to the standard question identifier in the real-time message, and performs menu display or interface jump according to that menu identifier.
Through the embodiment, the voice information can be identified by using the voice semantic model after being started to obtain the real-time message, and the real-time message is sent to the application program of the target tenant, so that menu display or interface skip of the application program of the target tenant can be realized.
Optionally, recognizing the voice information by the voice semantic model to obtain the real-time message includes: inputting the voice information into the voice semantic model, which recognizes the standard question identifier corresponding to the voice information; determining the menu name and confidence corresponding to the standard question identifier; and generating the real-time message from the standard question identifier, the corresponding menu name, and the confidence.
The above-described speech semantic model includes, but is not limited to, an acoustic model, a language model, a semantic model, and the like.
In an alternative embodiment, after the voice information received by the target tenant's application program is acquired, the voice is first converted into text by an acoustic model and a language model, and the text is then converted into an intention, that is, a standard question identifier, by a semantic model. The three standard question identifiers with the highest confidence, together with their menu names and confidences, are returned to the target tenant's application program as a real-time message.
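The assembly of the real-time message from the model's candidates can be sketched as follows; the top-3 cutoff comes from the description above, while the field names and the tuple layout of the candidates are assumptions for illustration:

```python
def build_realtime_message(candidates, top_k=3):
    """candidates: (standard_question_id, menu_name, confidence) tuples produced
    by the semantic model for one utterance. Returns the top_k candidates by
    confidence, in the shape sent back to the APP tenant as a real-time message."""
    ranked = sorted(candidates, key=lambda c: c[2], reverse=True)[:top_k]
    return [{"standard_question_id": qid, "menu_name": name, "confidence": conf}
            for qid, name, conf in ranked]
```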
Through this implementation, the voice information can be converted into a real-time message in real time and delivered promptly to the target tenant's application program.
Optionally, before deploying the voice semantic model, the method further comprises: obtaining updated corpus tables, comprising at least an extended question table, a standard question table, and a menu name table; and training the voice semantic model with the corpora in the updated tables, wherein models are trained separately on the corpora of the extended question table, the standard question table, and the menu name table, and the trained models are cross-validated against the different corpus tables until the voice semantic model finally used for deployment is obtained.
The corpus tables include, but are not limited to, an extended question table, a standard question table, and a menu name table, and can be updated in real time; the updated corpus likewise includes these tables. It should be noted that the extended question table contains, among other fields, the standard question identifier and the extended question; the standard question table contains the standard question identifier, the tenant identifier, and the standard question; and the menu name table contains the menu name, the tenant identifier, and the standard question identifier.
In an alternative embodiment, training the voice semantic model with the corpora in the updated corpus tables comprises: training a model on the corpus of the extended question table to obtain a first training model; training a model on the corpus of the standard question table to obtain a second training model; training a model on the corpus of the menu name table to obtain a third training model; and cross-validating the first, second, and third training models to finally obtain the voice semantic model that can be deployed.
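The patent does not detail the cross-validation procedure. One plausible reading, with hypothetical `train` and `evaluate` callables standing in for the unspecified model pipeline, is to score each table-specific model on the other two corpus tables and keep the best-scoring model for deployment:

```python
def select_deployment_model(corpora, train, evaluate):
    """corpora: dict mapping a table name ("extended_questions",
    "standard_questions", "menu_names") to its training corpus.
    train(corpus) -> model; evaluate(model, corpus) -> score in [0, 1].
    Trains one model per corpus table, cross-validates each model on the
    other two tables, and returns the model with the best mean score."""
    best_model, best_score = None, -1.0
    for name, corpus in corpora.items():
        model = train(corpus)
        held_out = [c for n, c in corpora.items() if n != name]
        score = sum(evaluate(model, c) for c in held_out) / len(held_out)
        if score > best_score:
            best_model, best_score = model, score
    return best_model, best_score
```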
This embodiment can improve the recognition accuracy of the trained voice semantic model.
Optionally, before obtaining the updated corpus tables, the method includes: acquiring the corpus file of the target tenant's application program; extracting the corpus information in the corpus file, comprising at least the menu names, the standard questions, and the extended questions corresponding to the standard questions; and updating the corpus tables according to the corpus information.
The corpus file of the target tenant's application program contains corpus information, including but not limited to menu names, standard questions, and the corresponding extended questions. In the implementation, this corpus information is extracted from the corpus file and the corpus tables are updated accordingly. The corpus tables include, but are not limited to, the extended question table, the standard question table, and the menu name table.
Through this implementation, the corpus tables can be updated promptly with the corpus information extracted from the target tenant application's corpus file, so that a model trained on these tables can adapt to complex and changing application scenarios.
Optionally, before obtaining the updated corpus, the method further includes: setting a local menu file of an application program of a target tenant, wherein the local menu file at least comprises: menu name and menu identification.
In an alternative embodiment, before the updated corpus is obtained, a local menu file of the application program of the target tenant may be preset, where in a specific implementation process, the local menu file includes, but is not limited to, a menu name, a menu identifier, and the like.
An alternative embodiment of the present invention will be described in detail below.
This alternative implementation of the invention consists of three parts: voice semantic model training, data file synchronization, and the real-time voice semantic instruction service. Specifically, after the model is deployed as a voice semantic middle platform service, the menu names and the corresponding standard question IDs are synchronized to the APP tenant in a near-real-time manner; the APP tenant automatically matches its local menu file against the received menu standard question file and automatically generates the menu association file, thereby enabling the new model. In addition, the voice semantic middle platform can provide the semantic instruction service to multiple APP tenants simultaneously, with no impact on, and no awareness required from, any tenant: it generates a different menu standard question file for each APP tenant, synchronizes it to that tenant, and enables the new model for all tenants at the same time.
1. The training of the voice semantic model comprises the following two steps:
(1) The APP tenant's knowledge team provides the corpus. The corpus is organized into menu names, standard questions, and the corresponding extended questions. For example, for a first-level menu "account loss reporting", the standard question is "report a lost bank card", and the extended questions include "I want to report a card lost", "my bank card is gone, how do I report it lost", "my bank card cannot be found", "can I temporarily report a loss in mobile banking", and so on. The knowledge team organizes the menu names, standard questions, and extended questions into files and submits them to the voice semantic middle platform's knowledge team. The APP tenant's development team configures a local menu file from the sorted corpus; the menu file contains two columns, the menu name and the menu ID.
(2) The voice semantic middle platform further processes the corpus submitted by the foreground APP to update three tables, as shown in fig. 2, specifically: a. the standard question table, mapping standard question ID, standard question, and APP tenant ID; b. the extended question table, mapping standard question ID and extended question; c. the menu name table, mapping menu name, APP tenant ID, and standard question ID. The middle platform trains the model with the processed corpus and, after cross-validating it as shown in fig. 3, releases the model online.
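Assuming in-memory rows and auto-incremented standard question IDs (both assumptions; the patent only fixes the columns shown in fig. 2), building the three tables from a submitted corpus could be sketched as:

```python
from itertools import count

def build_corpus_tables(corpus, tenant_id, id_start=1):
    """corpus: rows of {"menu_name", "standard_question", "extended_questions"}
    as submitted by the APP tenant's knowledge team. Returns the three tables
    of fig. 2: standard question table, extended question table, menu name table."""
    ids = count(id_start)
    standard_q, extended_q, menu_names = [], [], []
    for row in corpus:
        qid = next(ids)  # assign one standard question ID per menu entry
        standard_q.append({"standard_question_id": qid,
                           "standard_question": row["standard_question"],
                           "tenant_id": tenant_id})
        extended_q.extend({"standard_question_id": qid, "extended_question": e}
                          for e in row["extended_questions"])
        menu_names.append({"menu_name": row["menu_name"],
                           "tenant_id": tenant_id,
                           "standard_question_id": qid})
    return standard_q, extended_q, menu_names
```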
2. The data file synchronization, as shown in fig. 3, can be divided into the following two steps:
(1) The database extracts the menu names from the menu name table, adds or modifies the standard question IDs, and generates a different menu file for each APP tenant; each menu file contains two columns, the menu name and the standard question ID. The files are pushed to the corresponding APP tenants through the enterprise data bus, either as a day-end batch or as near-real-time files.
(2) After receiving the menu standard question file sent by the speech semantic middle platform, the APP tenant enables the model based on the preconfigured local menu file: it matches the local menu file against the menu standard question file by joining on menu name, associates the menu ID, the standard question ID, and the menu name, and generates a menu association file.
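The menu-name join that produces the menu association file might be sketched like this (field names assumed, not prescribed):

```python
def build_association_file(local_menu_file, menu_sq_file):
    """Join the local menu file (menu name, menu ID) with the menu
    standard question file (menu name, standard question ID) on menu
    name, producing menu ID <-> standard question ID <-> menu name."""
    menu_ids = {r["menu_name"]: r["menu_id"] for r in local_menu_file}
    assoc = []
    for r in menu_sq_file:
        if r["menu_name"] in menu_ids:  # only menus the tenant configured
            assoc.append({"menu_id": menu_ids[r["menu_name"]],
                          "sq_id": r["sq_id"],
                          "menu_name": r["menu_name"]})
    return assoc
```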
3. The online application, as shown in fig. 4, can be divided into the following three steps:
(1) After receiving the user's voice, the APP tenant forwards it to the speech semantic middle platform.
(2) The speech semantic middle platform converts the user's voice into text through an acoustic model and a language model, and converts the text into an intent, namely a standard question ID, through the semantic model. It then returns the three standard question IDs with the highest confidence, together with their menu names and confidence scores, to the APP tenant as a real-time message.
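Assembling the top-3 real-time message from the semantic model's confidence scores could look like the following sketch; the score dictionary and field names are assumptions:

```python
def build_realtime_message(scores, menu_name_table):
    """scores: {standard question ID: confidence} from the semantic
    model. Returns the three highest-confidence intents, each with its
    menu name and confidence score."""
    names = {r["sq_id"]: r["menu_name"] for r in menu_name_table}
    top3 = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]
    return [{"sq_id": sq, "menu_name": names.get(sq), "confidence": c}
            for sq, c in top3]
```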
(3) After receiving the real-time message, the APP tenant looks up the menu ID corresponding to the received standard question ID in the menu association file and performs the menu display or interface jump accordingly.
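The tenant-side lookup in this step can be sketched as follows, again with illustrative field names:

```python
def handle_realtime_message(message, association_file):
    """Pick the highest-confidence intent from the real-time message and
    map its standard question ID to the menu ID used for the menu
    display or interface jump."""
    best = max(message, key=lambda m: m["confidence"])
    for row in association_file:
        if row["sq_id"] == best["sq_id"]:
            return row["menu_id"]
    return None  # no matching menu; fall back to default handling
```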
In the above embodiment, the menu name serves as the association tie between the speech semantic middle-platform service and the APP tenant, replacing the offline hand-off of standard question IDs between them. This not only decouples the APP tenant from the middle-platform service but also isolates the APP tenants from one another, so that changes made for one tenant neither affect nor are perceived by the others.
In addition, this implementation allows a newly deployed speech semantic model to take effect in near real time. It provides efficient, unified, and automated speech semantic instruction services to every APP tenant, decouples the tenants from the speech semantic instruction service, automates the roll-out of model updates, and offers each tenant an isolated, personalized service, truly achieving uninterrupted service that is imperceptible to tenants.
In another alternative embodiment, the menu ID of each APP tenant may be used directly as the standard question ID of the speech semantic middle platform, instead of using the menu name as the association tie between the middle platform and the tenant. With this approach, once the APP tenant submits its corpus, no further configuration work is needed: as soon as the speech semantic model is deployed, the tenant forwards the client's input, the middle platform returns the menu ID directly, and the tenant performs no conversion at all.
It should be noted that with this approach, whenever a menu ID changes, the table keys of the speech semantic middle platform change along with those of the tenant, and a full regression test is required, which increases the regression-testing workload of the middle platform.
Example 2
According to another aspect of the embodiment of the present invention, there is further provided a processing apparatus for a speech semantic model. Fig. 5 is a schematic diagram of the processing apparatus for a speech semantic model according to an embodiment of the present invention. As shown in fig. 5, the apparatus includes a deployment module 52 and a synchronization module 54, which are described in detail below.
The deployment module 52 is configured to deploy the speech semantic model. The synchronization module 54, connected to the deployment module 52, is configured to synchronize the menu standard question file to the application program of the target tenant, so that the application program matches the local menu file against the menu standard question file, generates a menu association file, and loads the menu association file to enable the speech semantic model; the local menu file comprises at least a menu name and a menu identifier.
It should be noted that each of the above modules may be implemented by software or by hardware. In the latter case, for example, the above modules may all be located in the same processor, or they may be distributed across different processors in any combination.
In the above embodiment, the processing apparatus of the speech semantic model synchronizes the menu standard question file to the application program of the target tenant so that the application program can enable the speech semantic model. This avoids requiring the APP tenant to trigger the new speech semantic model to take effect through manual operation, achieves the technical effect of providing more efficient, unified, and automated voice instruction services to every APP tenant once the model is deployed, and thus solves the technical problem in the prior art that the APP tenant must trigger the new speech semantic model manually.
It should be noted that the deployment module 52 and the synchronization module 54 correspond to steps S102 to S104 in embodiment 1; the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to what is disclosed in embodiment 1.
Optionally, the synchronization module 54 includes: a first generation unit, configured to generate menu standard question files according to the application programs of different types of target tenants, wherein each menu standard question file comprises at least a menu name and a standard question identifier; and a synchronization unit, configured to synchronize the menu standard question file to the application program of the corresponding target tenant according to a preset push mode.
Optionally, the apparatus further includes: the first acquisition module is used for acquiring voice information received by an application program of a target tenant after the voice semantic model is started; the recognition module is used for recognizing the voice information according to the voice semantic model to obtain a real-time message, wherein the real-time message at least comprises: the standard question mark, the menu name corresponding to the standard question mark and the confidence level; and the sending module is used for sending the real-time message to the application program of the target tenant, so that the application program of the target tenant can be matched with the corresponding menu identification from the local menu file according to the standard question identification in the real-time message, and menu display or interface skip can be carried out according to the menu identification.
Optionally, the recognition module includes: a recognition unit, configured to input the voice information into the speech semantic model, which recognizes the standard question identifier corresponding to the voice information; a determining unit, configured to determine the menu name and confidence level corresponding to the standard question identifier; and a second generation unit, configured to generate the real-time message from the standard question identifier, the corresponding menu name, and the confidence level.
Optionally, the apparatus further includes: the second obtaining module is configured to obtain an updated corpus before deploying the speech semantic model, where the updated corpus at least includes: expanding a question table, a standard question table and a menu name table; the training module is used for training the voice semantic model by using the linguistic data in the updated linguistic data table, wherein the voice semantic model is respectively trained by using the linguistic data corresponding to the extended question table, the standard question table and the menu name table, and the voice semantic model obtained by training based on different linguistic data tables is subjected to cross verification until the voice semantic model which is finally used for deployment is obtained.
Optionally, the apparatus further includes: the third acquisition module is used for acquiring the corpus file of the application program of the target tenant before acquiring the updated corpus list; the extracting module is used for extracting the corpus information in the corpus file, wherein the corpus information at least comprises: menu name, standard question and corresponding extension question of standard question; and the updating module is used for updating the corpus table according to the corpus information.
Optionally, the apparatus further includes: the setting module is used for setting a local menu file of an application program of the target tenant before the updated corpus is acquired, wherein the local menu file at least comprises: menu name and menu identification.
Example 3
According to another aspect of the embodiment of the present invention, there is further provided a computer-readable storage medium comprising a stored program, wherein when the program runs, the device in which the computer-readable storage medium is located is controlled to execute the method for processing a speech semantic model according to any one of the above.
Alternatively, in this embodiment, the above-mentioned computer-readable storage medium may be located in any one of the computer terminals in the computer terminal group in the computer network and/or in any one of the mobile terminals in the mobile terminal group, and the above-mentioned computer-readable storage medium includes a stored program.
Optionally, when the program runs, the device in which the computer-readable storage medium is located is controlled to perform the following functions: deploying a voice semantic model; and synchronizing the menu standard question file to the application program of the target tenant so that the application program of the target tenant matches the local menu file with the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model.
Example 4
According to another aspect of the embodiment of the present invention, there is further provided a processor, where the processor is configured to run a program, and when the program runs, perform a method for processing a speech semantic model according to any one of the above.
The embodiment of the invention provides equipment, which comprises a processor, a memory and a program stored in the memory and capable of running on the processor, wherein the processor realizes the following steps when executing the program: deploying a voice semantic model; and synchronizing the menu standard question file to the application program of the target tenant so that the application program of the target tenant matches the local menu file with the menu standard question file, generating a menu association file, and loading the menu association file to enable the voice semantic model.
The invention also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: deploying a voice semantic model; and synchronizing the menu standard question file to the application program of the target tenant so that the application program of the target tenant matches the local menu file with the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
In the foregoing embodiments of the present invention, the descriptions of the embodiments are emphasized, and for a portion of this disclosure that is not described in detail in this embodiment, reference is made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology content may be implemented in other manners. The above-described embodiments of the apparatus are merely exemplary, and the division of the units, for example, may be a logic function division, and may be implemented in another manner, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be through some interfaces, units or modules, or may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications and adaptations are also intended to fall within the scope of the present invention.

Claims (7)

1. A method for processing a speech semantic model, comprising:
deploying a voice semantic model;
synchronizing a menu standard question file to an application program of a target tenant, so that the application program of the target tenant matches a local menu file with the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model;
wherein the local menu file at least comprises: menu name and menu identification;
synchronizing the menu standard questioning file to the application program of the target tenant comprises:
generating a menu standard question file according to application programs of different types of target tenants, wherein the menu standard question file at least comprises: a menu name and a standard question identifier;
synchronizing the menu standard question file to the corresponding application program of the target tenant according to a preset pushing mode;
after enabling the speech semantic model, further comprising:
acquiring voice information received by an application program of the target tenant;
identifying the voice information according to the voice semantic model to obtain a real-time message, wherein the real-time message at least comprises: the menu name and the confidence level corresponding to a standard question identifier;
and sending the real-time message to the application program of the target tenant, so that the application program of the target tenant matches a corresponding menu identifier from the local menu file according to the standard question identifier in the real-time message, and performs menu display or interface jump according to the menu identifier.
2. The method of claim 1, wherein identifying the speech information from the speech semantic model to obtain a real-time message comprises:
inputting the voice information into the voice semantic model, and recognizing, by the voice semantic model, a standard question identifier corresponding to the voice information;
determining the menu name and confidence level corresponding to the standard question identifier;
and generating the real-time message according to the standard question identifier, the menu name corresponding to the standard question identifier, and the confidence level.
3. The method of claim 1, further comprising, prior to deploying the speech semantic model:
obtaining an updated corpus, wherein the updated corpus at least comprises: expanding a question table, a standard question table and a menu name table;
training the voice semantic model by using the linguistic data in the updated linguistic data table, wherein the voice semantic model is trained by using the linguistic data corresponding to the extended question table, the standard question table and the menu name table respectively, and cross-verifying the voice semantic model obtained by training based on different linguistic data tables until the voice semantic model finally used for deployment is obtained.
4. A method according to claim 3, comprising, prior to obtaining the updated corpus:
acquiring a corpus file of an application program of the target tenant;
extracting corpus information in the corpus file, wherein the corpus information at least comprises: menu name, standard question and corresponding extension question of said standard question;
and updating a corpus table according to the corpus information.
5. The method of claim 3, further comprising, prior to obtaining the updated corpus:
setting a local menu file of an application program of the target tenant, wherein the local menu file at least comprises: menu name and menu identification.
6. A processing apparatus for a speech semantic model, comprising:
the deployment module is used for deploying the voice semantic model;
the synchronization module is used for synchronizing the menu standard question file to the application program of the target tenant so that the application program of the target tenant matches the local menu file with the menu standard question file, generates a menu association file, and loads the menu association file to enable the voice semantic model;
wherein the local menu file at least comprises: menu name and menu identification;
the synchronization module includes: a first generation unit, used for generating menu standard question files according to application programs of different types of target tenants, wherein each menu standard question file at least comprises: a menu name and a standard question identifier; and a synchronization unit, used for synchronizing the menu standard question file to the application program of the corresponding target tenant according to a preset push mode;
the apparatus further comprises: a first acquisition module, used for acquiring voice information received by an application program of a target tenant after the speech semantic model is enabled; a recognition module, used for recognizing the voice information according to the speech semantic model to obtain a real-time message, wherein the real-time message at least comprises: a standard question identifier, the menu name corresponding to the standard question identifier, and a confidence level; and a sending module, used for sending the real-time message to the application program of the target tenant, so that the application program of the target tenant matches the corresponding menu identifier from the local menu file according to the standard question identifier in the real-time message, and performs menu display or interface jump according to the menu identifier.
7. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein the program, when run, controls a device in which the computer-readable storage medium is located to perform the method of processing a speech semantic model according to any one of claims 1 to 5.
CN202110475912.4A 2021-04-29 2021-04-29 Processing method and device of voice semantic model Active CN113408736B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110475912.4A CN113408736B (en) 2021-04-29 2021-04-29 Processing method and device of voice semantic model


Publications (2)

Publication Number Publication Date
CN113408736A CN113408736A (en) 2021-09-17
CN113408736B true CN113408736B (en) 2024-04-12

Family

ID=77677716

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110475912.4A Active CN113408736B (en) 2021-04-29 2021-04-29 Processing method and device of voice semantic model

Country Status (1)

Country Link
CN (1) CN113408736B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116933062B (en) * 2023-09-18 2023-12-15 中孚安全技术有限公司 Intelligent file judgment system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107257373A (en) * 2017-06-15 2017-10-17 国电南瑞科技股份有限公司 A kind of grid model data maintenance management method based on CIM/E files
CN108597522A (en) * 2018-05-10 2018-09-28 北京奇艺世纪科技有限公司 A kind of method of speech processing and device
CN109119067A (en) * 2018-11-19 2019-01-01 苏州思必驰信息科技有限公司 Phoneme synthesizing method and device
CN111933118A (en) * 2020-08-17 2020-11-13 苏州思必驰信息科技有限公司 Method and device for optimizing voice recognition and intelligent voice dialogue system applying same
CN112580359A (en) * 2019-09-11 2021-03-30 甲骨文国际公司 Computer implemented method, training system and computer program product




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant