CN111402906B - Speech decoding method, device, engine and storage medium - Google Patents


Info

Publication number
CN111402906B
CN111402906B
Authority
CN
China
Prior art keywords
decoding
thread
level
channel
voice
Prior art date
Legal status
Active
Application number
CN202010155132.7A
Other languages
Chinese (zh)
Other versions
CN111402906A (en)
Inventor
赵伟伟
Current Assignee
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202010155132.7A
Publication of CN111402906A
Application granted
Publication of CN111402906B

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 - Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 - Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a speech decoding method, device, engine and storage medium. The method is applied to a speech decoding engine: when a plurality of speech decoding requests are received, a plurality of thread-level decoding channels are applied for, the requests being in one-to-one correspondence with the channels; the thread-level decoding channels each invoke a general model to decode the speech stream data in the requests in parallel, obtaining decoding results, and the requests are answered based on those results. Since multiple speech decoding requests are processed in parallel by multiple thread-level decoding channels, and all the channels share one common model, thread-level parallelism of speech decoding is achieved, hardware cost is reduced, and the concurrency and decoding efficiency of speech decoding are improved.

Description

Speech decoding method, device, engine and storage medium
Technical Field
The present invention relates to the field of speech recognition technologies, and in particular, to a speech decoding method, device, engine and storage medium.
Background
With the development of computer technology, more and more technologies (big data, distributed computing, blockchain, artificial intelligence, etc.) are being applied in the financial field, and the traditional financial industry is gradually shifting to financial technology (Fintech); however, because of the security and real-time requirements of the financial industry, higher demands are also placed on these technologies.
Speech decoding is an important component of speech recognition. Currently, the text corresponding to voice stream data is generally obtained by decoding the voice stream data with a general model. If parallel processing is required, it can currently only be achieved by deploying more copies of the general model at the process level; because the general model is large, this greatly increases hardware cost.
Disclosure of Invention
The invention provides a voice decoding method, device, engine and storage medium, which aim to let a plurality of decoding channels share one common model, realize thread-level parallel processing, reduce hardware cost, and improve the concurrency and decoding efficiency of voice decoding.
To achieve the above object, the present invention provides a speech decoding method, the method comprising:
When a plurality of voice decoding requests are received, a plurality of thread-level decoding channels are applied, and the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels;
and respectively calling a general model by using the thread-level decoding channels, performing parallel decoding processing on voice stream data in the voice decoding requests to obtain decoding results, and responding to the voice decoding requests based on the decoding results.
Preferably, the thread-level decoding channel comprises a channel decoding unit, a data cache area and a callback interface unit;
The step of using the plurality of thread-level decoding channels to respectively call a general model, performing parallel decoding processing on the voice stream data in the plurality of voice decoding requests to obtain decoding results, and responding to the plurality of voice decoding requests based on the decoding results comprises the steps of:
respectively caching voice stream data in the voice decoding requests by utilizing the data buffer areas of the thread-level decoding channels;
respectively calling a general model by utilizing channel decoding units of the thread-level decoding channels, and performing parallel decoding processing on voice stream data in the voice decoding requests to obtain decoding results;
and respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels.
Preferably, the step of respectively buffering the voice stream data in the plurality of voice decoding requests by using the data buffer areas of the plurality of thread-level decoding channels includes:
for any specific thread level decoding channel in the plurality of thread level decoding channels, checking the data state of a data buffer area of the specific thread level decoding channel;
If the data state of the data buffer area of the specific thread level decoding channel is waiting data, directly temporarily storing the voice stream data corresponding to the specific thread level decoding channel in the data buffer area of the specific thread level decoding channel;
And if the data state of the data buffer area of the specific thread-level decoding channel indicates that data already exists, temporarily storing the voice stream data corresponding to the specific thread-level decoding channel at the tail end of the data buffer area of the specific thread-level decoding channel.
Preferably, the step of respectively invoking a common model by using the channel decoding units of the plurality of thread-level decoding channels and performing parallel decoding processing on the voice stream data in the plurality of voice decoding requests to obtain decoding results includes:
Respectively calling a general model by using channel decoding units of the plurality of thread-level decoding channels;
and based on the general model, converting the voice stream data into a feature vector set in each channel decoding unit in parallel, and converting the feature vector set into a decoding result.
Preferably, the thread-level decoding channel further comprises a state control unit,
The method further comprises the steps of:
updating the running state of the thread-level decoding channel, the data state of the data buffer area and the registration state of the callback interface unit in real time through a state control unit of the thread-level decoding channel so as to execute corresponding steps according to the running state, the data state and the registration state; and/or
And receiving an external control signal by a state control unit of the thread-level decoding channel, and adjusting the running state of the thread-level decoding channel according to the external control signal.
Preferably, the thread-level decoding channel comprises a reclamation unit;
The step of respectively calling a general model by using the plurality of thread-level decoding channels, performing parallel decoding processing on the voice stream data in the plurality of voice decoding requests, and responding to the plurality of voice decoding requests based on the decoding results further comprises:
clearing the voice stream data in the data buffer area of the thread-level decoding channel through the reclamation unit of the thread-level decoding channel, so that the data buffer area can be used to store voice stream data again; and/or
clearing the state information recorded by the state control unit of the thread-level decoding channel through the reclamation unit of the thread-level decoding channel, so that the state control unit can record the state of the thread-level decoding channel again.
Preferably, the registration state of the callback interface unit is registered or unregistered;
Before the step of respectively responding to the plurality of voice decoding requests based on the decoding results by using the callback interface units of the plurality of thread-level decoding channels, the method further comprises:
Checking the registration state of a callback interface of the thread-level decoding channel recorded by a state control unit of each thread-level decoding channel;
If the registration state of the callback interface unit of the thread-level decoding channel is registered, executing the steps: respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels;
and if the registration state of the callback interface unit of the thread-level decoding channel is unregistered, clearing the voice stream data in the data buffer area through the reclamation unit of the thread-level decoding channel, and clearing the state information updated by the state control unit.
In addition, in order to achieve the above object, the present invention also provides a speech decoding apparatus including:
The application module is used for applying a plurality of thread-level decoding channels when a plurality of voice decoding requests are received, wherein the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels;
and the decoding module is used for respectively calling a general model by utilizing the plurality of thread-level decoding channels, carrying out parallel decoding processing on voice stream data in the plurality of voice decoding requests, obtaining a decoding result, and responding to the plurality of voice decoding requests based on the decoding result.
In addition, in order to achieve the above object, the present invention also provides a speech decoding engine including a processor, a memory, and a speech decoding program stored in the memory, which when executed by the processor, implements the steps of the speech decoding method as described above.
In addition, in order to achieve the above object, the present invention also provides a computer storage medium having stored thereon a speech decoding program which, when executed by a processor, implements the steps of the speech decoding method as described above.
Compared with the prior art, the invention provides a speech decoding method, device, engine and storage medium. The method is applied to a speech decoding engine: when a plurality of speech decoding requests are received, a plurality of thread-level decoding channels are applied for, the requests being in one-to-one correspondence with the channels; the thread-level decoding channels each invoke a general model to decode the speech stream data in the requests in parallel, obtaining decoding results, and the requests are answered based on those results. Since multiple speech decoding requests are processed in parallel by multiple thread-level decoding channels, and all the channels share one common model, thread-level parallelism of speech decoding is achieved, hardware cost is reduced, and the concurrency and decoding efficiency of speech decoding are improved.
Drawings
FIG. 1 is a schematic diagram of the hardware architecture of a speech decoding engine according to various embodiments of the present invention;
FIG. 2 is a schematic diagram of the components of a speech decoding engine of the present invention;
FIG. 3 is a schematic diagram of the components of the speech decoding channel of the present invention;
FIG. 4 is a flow chart of a first embodiment of the speech decoding method of the present invention;
FIG. 5 is a schematic diagram of a speech stream data processing flow according to an embodiment of the speech decoding method of the present invention;
fig. 6 is a schematic functional block diagram of a first embodiment of the speech decoding apparatus according to the present invention.
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The voice decoding engine according to the embodiments of the present invention refers to an engine capable of establishing network connections; the voice decoding engine may be a server, a cloud platform, or the like.
Referring to fig. 1, fig. 1 is a schematic diagram of a hardware configuration of a speech decoding engine according to various embodiments of the present invention. In an embodiment of the present invention, the speech decoding engine may include a processor 1001 (e.g. a Central Processing Unit, CPU), a communication bus 1002, an input port 1003, an output port 1004, and a memory 1005. The communication bus 1002 is used to enable communication between these components; the input port 1003 is used for data input; the output port 1004 is used for data output. The memory 1005 may be a high-speed RAM memory or a stable (non-volatile) memory such as disk storage, and may optionally be a storage device independent of the processor 1001. Those skilled in the art will appreciate that the hardware configuration shown in fig. 1 is not limiting of the invention and may include more or fewer components than shown, a combination of certain components, or a different arrangement of components.
With continued reference to fig. 1, the memory 1005 of fig. 1, which is a readable storage medium, may include an operating system, a network communication module, an application program module, and a speech decoding program. In fig. 1, the network communication module is mainly used for connecting with a server and performing data communication with the server; and the processor 1001 may call a voice decoding program stored in the memory 1005 and execute the voice decoding method provided by the embodiment of the present invention.
Further, referring to fig. 2, fig. 2 is a schematic diagram of the components of the speech decoding engine of the present invention. The decoding engine is an important component of speech recognition technology: it interfaces with the network communication protocol, manages the speech stream data and decoding algorithms, returns decoding results, and responds to various control signals. By application device, decoding engines can be classified into terminal decoding engines (e.g. handheld devices, embedded devices), server decoding engines (e.g. cloud services), etc. The decoder must be designed for its scenario: a terminal decoding engine only needs to decode a single channel while minimizing power consumption, whereas a cloud service needs as many parallel channels as possible to reduce deployment cost.
The speech decoding engine includes a generic model and a plurality of thread-level decoding channels. The generic model is a speech recognition model comprising an acoustic model, a language model, and the like. It is a speech decoding model trained on a huge amount of data; it can also serve as a training base for vertical-domain recognition, being combined with domain data for transfer-learning training to obtain a high-precision model. It will be appreciated that the generic model includes a decoding algorithm, which may be a beam search algorithm, the Viterbi algorithm, or the like. Audio stream data is converted into a set of speech feature vectors by methods such as Mel frequency cepstral coefficients (MFCC), linear predictive cepstral coefficients (LPCC) or linear prediction coefficients (LPC); the acoustic model provides the probability that a speech feature vector converts into a given semantic unit, the language model provides the probabilities of transitions between semantics, and the decoding result is finally obtained. In this embodiment, the training process of the generic model does not differ from that of existing speech decoding models and is not described again here.
The generic model provides support for the plurality of decoding channels to decode speech streams. With continued reference to FIG. 2, the decoding engine includes a plurality of thread-level decoding channels: thread-level decoding channel 1, thread-level decoding channel 2, ..., thread-level decoding channel n. The individual thread-level decoding channels can process speech streams in parallel, and each channel corresponds to one speech decoding request, providing a timely and fast response for every request. Because the plurality of thread-level decoding channels is built on top of a single generic model, the model's footprint is not multiplied; compared with process-level parallelism, this greatly reduces hardware cost. Each request uses one thread-level decoding channel; the channels are isolated from one another but use the same generic model, so one process can comprise n thread-level decoding channels, i.e. n-way parallel decoding capability. Since the decoding channels operate at the thread level, a process can contain arbitrarily many parallel decoding channels; compared with the one-decoding-channel-per-process approach, processing efficiency can be multiplied and the utilization of memory and GPU memory greatly improved.
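As an illustrative sketch of this sharing scheme (Python is chosen for brevity; the patent does not prescribe a language, and `GenericModel` is a hypothetical stand-in for the real acoustic and language models), each request-handling thread decodes through the same model instance instead of a per-process copy:

```python
import threading

class GenericModel:
    """Hypothetical stand-in for the shared acoustic/language model,
    loaded once per process."""
    def decode(self, speech_stream):
        # A real model would run feature extraction and graph search here.
        return "decoded:" + speech_stream

def run_channels(model, requests):
    """Give each speech decoding request its own thread-level decoding
    channel; every channel calls the same shared model."""
    results = {}
    lock = threading.Lock()

    def channel(req_id, stream):
        text = model.decode(stream)      # shared model, no per-channel copy
        with lock:                       # only the result map is synchronized
            results[req_id] = text

    threads = [threading.Thread(target=channel, args=(i, s))
               for i, s in enumerate(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because only one copy of the model exists per process, memory cost grows with the number of threads rather than the number of model copies.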
Further, referring to fig. 3, fig. 3 is a schematic diagram of the composition of the speech decoding channel of the present invention. The thread-level decoding channel comprises a data buffer area, a channel decoding unit, a callback interface unit and a state control unit.
The data buffer area is used for storing voice stream data for the channel decoding unit to read; the buffer capacity of the data buffer area can be specifically set according to actual needs.
And the channel decoding unit extracts and reads the voice stream data from the data cache region, and calls the universal model to decode the voice stream data after obtaining the voice stream data, so as to obtain a decoding result. Typically, the decoding result is an optimal text corresponding to the voice stream data;
the callback interface unit is used for returning the decoding result to the client;
The state control unit is used for updating the states of the data buffer area, the channel decoding unit and the callback interface unit in real time; the state control unit is also used for receiving external control signals and adjusting the running state of the channel according to the external control signals.
Further, the thread-level decoding channel further comprises a reclamation unit, wherein the reclamation unit is used for emptying the data buffer area, clearing state information updated by the state control unit and emptying data format information of the voice stream data recorded by the state control unit. Thus, the thread-level decoding channel can be reused after completing a task.
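The channel components described above can be sketched as a single class; this is a minimal illustration under assumed names (`StubModel`, `ThreadLevelChannel`, the state strings), not the patent's actual implementation:

```python
from collections import deque

class StubModel:
    """Hypothetical stand-in for the shared generic model."""
    def decode(self, chunk):
        return chunk.upper()

class ThreadLevelChannel:
    """One decoding channel: data buffer area, channel decoding unit,
    callback interface unit, state control and reclamation."""
    def __init__(self, model, callback=None):
        self.model = model                  # shared generic model
        self.buffer = deque()               # data buffer area
        self.callback = callback            # callback interface unit
        self.state = {"running": "idle", "data": "waiting",
                      "callback": "registered" if callback else "unregistered"}

    def feed(self, chunk):
        """Store incoming speech stream data at the tail of the buffer."""
        self.buffer.append(chunk)
        self.state["data"] = "has_data"

    def decode_all(self):
        """Channel decoding unit: drain the buffer through the shared model."""
        self.state["running"] = "decoding"
        result = " ".join(self.model.decode(c) for c in self._drain())
        if self.state["callback"] == "registered":
            self.callback(result)           # respond via the callback interface
        self.state["running"] = "idle"
        return result

    def _drain(self):
        while self.buffer:
            yield self.buffer.popleft()

    def reclaim(self):
        """Reclamation unit: clear buffer and recorded state for reuse."""
        self.buffer.clear()
        self.state = {"running": "idle", "data": "waiting",
                      "callback": "unregistered"}
```

After `reclaim()`, the same channel object can serve a new speech decoding request, matching the reuse behavior the description attributes to the reclamation unit.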
The embodiment of the invention provides a voice decoding method.
Referring to fig. 4, fig. 4 is a flowchart illustrating a first embodiment of the speech decoding method according to the present invention.
Step S101: when a plurality of voice decoding requests are received, a plurality of thread-level decoding channels are applied, and the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels;
After the voice decoding engine establishes network connection with the client, the voice decoding request sent by the client can be received. The speech decoding engine comprises a plurality of thread-level decoding channels for decoding, so that network connection can be established with a plurality of clients simultaneously, and a corresponding plurality of speech decoding requests are received.
And the voice decoding engine establishes network connection with a plurality of clients, and applies for a corresponding thread-level decoding channel for each client after the connection is successful. The plurality of speech decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels.
Step S102: and respectively calling a general model by using the thread-level decoding channels, performing parallel decoding processing on voice stream data in the voice decoding requests to obtain decoding results, and responding to the voice decoding requests based on the decoding results.
The thread-level decoding channel comprises a channel decoding unit, a data buffer area and a callback interface unit, wherein the channel decoding unit, the data buffer area and the callback interface unit cooperatively operate to store voice stream data, decode the voice stream data to obtain a decoding result, and respond the decoding result to a voice decoding request.
Specifically, the step S102: the step of respectively calling a general model by utilizing the plurality of thread-level decoding channels, performing parallel decoding processing on voice stream data in the plurality of voice decoding requests to obtain decoding results, and responding to the plurality of voice decoding requests based on the decoding results comprises the following steps:
Step S102a: respectively caching the voice stream data in the plurality of voice decoding requests by utilizing the data buffer areas of the plurality of thread-level decoding channels;
The data buffer area is used for buffering the voice stream data received with the voice decoding request; the data buffer areas of the plurality of thread-level decoding channels respectively store the voice stream data of different voice decoding requests. The voice stream data corresponding to each voice decoding request is temporarily stored in its own data buffer area, and the different voice streams are isolated from one another, which ensures the integrity of the voice stream data: it is stored in order and is not easily lost.
Specifically, the step S102a: the step of buffering the voice stream data in the plurality of voice decoding requests by using the data buffers of the plurality of thread-level decoding channels respectively includes:
Step S102a1: for any specific thread level decoding channel in the plurality of thread level decoding channels, checking the data state of a data buffer area of the specific thread level decoding channel;
Each voice decoding request corresponds to one thread-level decoding channel; the thread-level decoding channel corresponding to the voice decoding request under consideration is denoted the specific thread-level decoding channel.
And storing the voice stream data into the data buffer area of the specific thread level decoding channel, determining the data state of the data buffer area of the specific thread level decoding channel, and selecting a corresponding voice stream data storage mode according to the data state.
Step S102a2: if the data state of the data buffer area of the specific thread level decoding channel is waiting data, directly temporarily storing the voice stream data corresponding to the specific thread level decoding channel in the data buffer area of the specific thread level decoding channel;
In this embodiment, if the data state of the data buffer area of the specific thread level decoding channel is waiting data, the voice stream data corresponding to the specific thread level decoding channel is directly buffered in the data buffer area of the specific thread level decoding channel.
Step S102a3: and if the data state of the data buffer area of the specific thread-level decoding channel indicates that data already exists, temporarily storing the voice stream data corresponding to the specific thread-level decoding channel at the tail end of the data buffer area of the specific thread-level decoding channel.
Therefore, the channel decoding unit can extract the voice stream data from the data buffer area in order, which prevents voice stream data from being lost and ensures that no undecoded voice stream data is missed during decoding.
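A minimal sketch of steps S102a1 to S102a3, with illustrative names and a Python `deque` standing in for the data buffer area:

```python
from collections import deque

def store_chunk(buffer, state, chunk):
    """Temporarily store one chunk of speech stream data according to
    the buffer's data state (function and state names are illustrative,
    not taken from the patent)."""
    if state["data"] == "waiting":
        # Step S102a2: empty buffer, store the chunk directly.
        buffer.append(chunk)
        state["data"] = "has_data"
    else:
        # Step S102a3: buffer already holds data, append at the tail
        # so the channel decoding unit reads chunks in arrival order.
        buffer.append(chunk)
```

The state check keeps the buffer's recorded data state accurate for the state control unit, while tail insertion preserves the arrival order that the decoder relies on.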
Step S102b: respectively calling a general model by utilizing channel decoding units of the thread-level decoding channels, and performing parallel decoding processing on voice stream data in the voice decoding requests to obtain decoding results;
In this embodiment, the generic model provides support for the multiple thread-level decoding channels, and the multiple thread-level decoding channels decode the respective voice stream data by using the generic model to obtain a decoding result.
Specifically, the step S102b includes:
Step S102b1: respectively calling a general model by using channel decoding units of the plurality of thread-level decoding channels;
and after the channel decoding unit extracts the voice stream data from the data cache region, a general model is called, and the general model decodes the voice stream data.
Step S102b2: and based on the general model, converting the voice stream data into a feature vector set in each channel decoding unit in parallel, and converting the feature vector set into a decoding result.
The generic model is a speech recognition model comprising an acoustic model, a language model, and the like. It is a speech decoding model trained on a huge amount of data, can serve as a training base for vertical-domain recognition, and can be combined with domain data for transfer-learning training to obtain a high-precision model. The generic model includes a decoding algorithm, which may be a beam search algorithm, the Viterbi algorithm, or the like. In this embodiment, after the generic model converts the audio stream data into a set of speech feature vectors by methods such as Mel frequency cepstral coefficients (MFCC), linear predictive cepstral coefficients (LPCC) or linear prediction coefficients (LPC), the acoustic model provides the probability that a speech feature vector converts into a given semantic unit; the language model provides the probabilities of transitions between semantics, and the decoding result is finally obtained. The decoding result is text data.
In this embodiment, decoding may be performed based on the Viterbi algorithm. The Viterbi algorithm is a general decoding algorithm: a dynamic-programming method for finding the shortest path through a sequence. It finds the hidden state sequence (the Viterbi path) most likely to have produced the observed sequence of events, and is used especially in the context of Markov information sources and hidden Markov models. In this embodiment, the most probable hidden state sequence for the voice stream data is obtained with the Viterbi algorithm and recorded as the decoding result; typically, the decoding result is the text corresponding to the voice stream data. Alternatively, the voice stream data may be decoded with a beam search algorithm. Beam search is a heuristic graph search algorithm, generally used when the solution space of the graph is large: to reduce the space and time taken by the search, at each step of depth expansion the lower-quality nodes are pruned and only the higher-quality nodes are kept. This reduces space consumption and improves time efficiency.
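For concreteness, a textbook Viterbi implementation is sketched below; the toy states and observations used to exercise it are illustrative and unrelated to the patent's actual acoustic or language models:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Dynamic-programming search for the most likely hidden-state path."""
    # V[t][s] = (best probability of any path ending in state s at time t,
    #            predecessor state on that path)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states)
            V[t][s] = (prob, prev)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, V[t][path[0]][1])
    return path
```

In speech decoding the hidden states would be semantic units and the observations speech feature vectors; here any small hidden Markov model with known start, transition and emission probabilities suffices to exercise the routine.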
Step S102c: and respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels.
After a connection is successfully established with the client corresponding to the voice decoding request and the corresponding decoding channel is obtained, the callback interface unit is registered so that the callback interface unit can return the decoding result to the corresponding client.
After the corresponding channel decoding unit and callback interface unit are obtained, the corresponding data buffer area, channel decoding unit and callback interface unit are initialized, ready to receive the voice stream corresponding to the voice decoding request; the channel decoding unit then decodes the voice stream, and the callback interface unit responds to the voice decoding request with the decoding result.
In this embodiment, network connections may be established with multiple clients simultaneously or sequentially. After establishing a network connection, the voice stream data uploaded by the client is received in real time over a communication protocol determined by the client; the communication protocol may be TCP (Transmission Control Protocol), HTTP (HyperText Transfer Protocol), WebSocket (a full-duplex communication protocol over TCP), MRCP (Media Resource Control Protocol), or the like. Clients include web pages, microphones, mobile terminals, and the like.
Each client corresponds to one decoding channel; the decoding channels are isolated from one another, each carries out its own voice decoding flow, and the decoding flows of different channels do not interfere with one another. For example, three voice decoding requests from three clients, namely client A, client B and client C, apply for three decoding channels: decoding channel A, decoding channel B and decoding channel C. It can be understood that if the number of voice decoding requests, that is, the number of corresponding clients, exceeds the preset maximum number of decoding channels, the excess voice decoding requests are added to a queuing sequence according to a queuing rule and marked as queued voice decoding requests; after one or more decoding channels complete their current voice decoding tasks, a corresponding number of queued voice decoding requests are admitted. After the decoding channel is determined, the channel decoding unit in the decoding channel is initialized and a callback interface unit is registered so that the corresponding decoding result can be returned through the callback interface unit.
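The channel-allocation and queuing rule above can be sketched as follows. This is a minimal Python illustration; the class name, channel names and FIFO queuing policy are assumptions, not taken from the patent.

```python
import queue

class ChannelPool:
    """Toy pool of decoding channels: each request acquires its own channel;
    requests beyond the preset maximum wait in a FIFO queue until a channel
    finishes its current decoding task and is released."""

    def __init__(self, max_channels):
        self._free = queue.Queue()
        for i in range(max_channels):
            self._free.put(f"channel-{i}")

    def acquire(self):
        # Blocks: excess requests queue here, mirroring the queuing sequence.
        return self._free.get()

    def release(self, channel):
        # A finished channel becomes available to the next queued request.
        self._free.put(channel)

pool = ChannelPool(2)
a = pool.acquire()
b = pool.acquire()
pool.release(a)          # channel done -> next queued request would get it
print(pool.acquire())    # channel-0
```

In the engine each "channel" would be a full thread-level decoding channel object rather than a string, but the admission logic is the same.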
Further, the thread-level decoding channel further comprises a state control unit, wherein the state control unit is used for updating the states of the data buffer area, the channel decoding unit and the callback interface unit in real time; the state control unit is also used for receiving external control signals and adjusting the running state of the channel according to the external control signals.
The method further comprises the steps of:
step S200: recording the running state of the thread-level decoding channel, the data state of the data buffer area and the registration state of the callback interface unit in real time through a state control unit of the thread-level decoding channel, so as to execute corresponding steps according to the running state, the data state and the registration state;
It will be appreciated that as the speech decoding engine processes the voice stream data, the running state of the decoding channels in the speech decoding engine, the data state of the data buffers, and the registration state of the callback interface units all change. In order to better monitor the speech decoding flow, this embodiment updates the running state, the data state, and the registration state in real time through the state control unit, so that the corresponding steps can be executed according to the running state, the data state, and the registration state. The running state includes running and stopped; the data state includes 'waiting data', 'data exists' and 'ending transmitting data'; the registration state includes registered and unregistered.
Step S300: and receiving an external control signal by a state control unit of the thread-level decoding channel, and adjusting the running state of the thread-level decoding channel according to the external control signal.
Further, the state control unit of the thread-level decoding channel may also receive an external control signal sent by the client that sent the voice decoding request. The external control signal includes a normal data stream signal and a network connection error signal, where the normal data stream signal includes start, pause, end, and the like. If the external control signal is start, the running state of the thread-level decoding channel is updated to running; if the external control signal is end, the running state of the thread-level decoding channel is updated to stopped.
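A minimal sketch of how a state control unit might track the three states and react to the start/end signals described above. The class, attribute and signal names are illustrative assumptions; the patent does not prescribe them.

```python
from enum import Enum

class RunState(Enum):
    RUNNING = "running"
    STOPPED = "stopped"

class StateControl:
    """Tracks the three states of a thread-level decoding channel."""
    def __init__(self):
        self.run_state = RunState.STOPPED
        self.data_state = "waiting data"      # / "data exists" / "ending transmitting data"
        self.reg_state = "unregistered"       # / "registered"

    def on_signal(self, signal):
        # Normal data-stream signals per the text above: start / pause / end.
        if signal == "start":
            self.run_state = RunState.RUNNING
        elif signal == "end":
            self.run_state = RunState.STOPPED

ctrl = StateControl()
ctrl.on_signal("start")
print(ctrl.run_state)   # RunState.RUNNING
```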
Further, the thread-level decode channel includes a reclamation unit; the reclaiming unit is used for clearing the voice stream data in the data buffer area of the thread-level decoding channel so as to store the voice stream data again by utilizing the data buffer area. The reclaiming unit is also used for clearing the state information recorded by the state control unit of the thread-level decoding channel so that the state control unit can update the state of the thread-level decoding channel again.
Further, after the steps of respectively calling the general model by using the plurality of thread-level decoding channels, performing parallel decoding processing on the voice stream data in the plurality of voice decoding requests, and responding to the plurality of voice decoding requests based on the decoding results, the method further comprises:
Step 2011a, emptying the voice stream data in the data buffer area of the thread level decoding channel through the recovery unit of the thread level decoding channel so as to store the voice stream data again by utilizing the data buffer area;
In the process of decoding the voice stream data, if the network connection is interrupted, an external stop instruction is received, or the like, part or all of the voice stream data stored in the data buffer may not yet have been extracted and may thus still remain in the data buffer. At this time, the recovery unit needs to empty the voice stream data in the data buffer of the thread-level decoding channel, so that when the thread-level decoding channel needs to execute a voice decoding flow again, the data buffer can be used to store voice stream data anew.
And step 2011b, clearing the state information recorded by the state control unit of the thread-level decoding channel through the recovery unit of the thread-level decoding channel so that the state control unit can record the state of the thread-level decoding channel again.
During the voice decoding flow of the thread-level decoding channel, the state control unit records the state information of the thread-level decoding channel, the data buffer area and the callback interface unit. It will be appreciated that when a voice decoding flow is completed, the state information recorded by the state control unit of the thread-level decoding channel needs to be cleared by the recovery unit, so that the state control unit can record the state of the thread-level decoding channel anew.
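Sketched in Python (a hypothetical structure; the patent does not give field names), the reclaiming step simply empties the data buffer and the recorded state so the channel can be reused:

```python
class DecodingChannel:
    """Toy stand-in for a thread-level decoding channel."""
    def __init__(self):
        self.buffer = bytearray()   # data buffer area
        self.state_info = {}        # state recorded by the state control unit

def reclaim(channel):
    # Empty un-extracted voice stream data so the buffer can store new data,
    # and clear the recorded state so it can be recorded afresh.
    channel.buffer.clear()
    channel.state_info.clear()

ch = DecodingChannel()
ch.buffer.extend(b"\x00\x01")       # leftover, un-decoded audio
ch.state_info["run"] = "running"
reclaim(ch)
print(len(ch.buffer), ch.state_info)  # 0 {}
```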
Further, when the thread-level decoding channel receives the voice stream data, the state control unit records and updates the data format information of the voice stream data, where the data format information includes the data stream format, the data stream sampling rate, the data stream encoding, and the like. When the corresponding voice decoding flow ends, the recovery unit also empties the data format information of the voice stream data in the state control unit. In this way, storage space of the speech decoding engine can be saved.
Further, the registration state of the callback interface unit is registered or unregistered; if the client corresponding to the voice decoding request registers a callback interface, the state control unit updates the state of the corresponding callback interface to registered; otherwise, the registration state is unregistered.
Before step S102c, that is, before the step of respectively responding to the plurality of voice decoding requests based on the decoding results by using the callback interface units of the plurality of thread-level decoding channels, the method further includes:
step S102c0: checking the registration state of a callback interface of the thread-level decoding channel recorded by a state control unit of each thread-level decoding channel;
If the registration state of the callback interface unit of the thread-level decoding channel is registered, executing the steps: respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels;
and if the registration state of the callback interface unit of the thread-level decoding channel is unregistered, clearing the voice stream data of the data cache region through the recovery unit of the thread-level decoding channel, and clearing the state information updated by the state control unit.
It may be appreciated that, if the registration state is registered, the decoding result may be directly returned to the multiple clients corresponding to the multiple voice decoding requests through the multiple registered callback interface units. If the registration state is unregistered, it indicates that there is no valid callback interface unit, and the recovery unit of the thread-level decoding channel needs to empty the voice stream data in the data buffer area and clear the state information updated by the state control unit.
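The registration check above amounts to a guard before invoking the callback. A sketch follows; the attribute names (`callback`, `buffer`, `state_info`) are assumptions made for illustration:

```python
from types import SimpleNamespace

def respond(channel, result):
    """Return the result via the registered callback, or reclaim the
    channel when no valid callback interface unit is registered."""
    if channel.callback is not None:        # registration state: registered
        channel.callback(result)
        return True
    channel.buffer.clear()                  # unregistered: empty the buffer
    channel.state_info.clear()              # ...and the recorded state
    return False

replies = []
ch = SimpleNamespace(callback=replies.append, buffer=[], state_info={})
respond(ch, "hello world")
print(replies)  # ['hello world']
```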
Referring to fig. 5, fig. 5 is a schematic diagram of a voice stream data processing flow according to an embodiment of the present invention, which takes a single thread-level decoding channel as an example to schematically illustrate the overall flow of the voice decoding method of the present invention.
As shown in fig. 5, first, a voice decoding request is received, a thread-level decoding channel is applied for, the decoding unit is initialized, and a callback interface unit is registered; through the state control unit, the running state of the thread-level decoding channel is updated to running, the state of the callback interface unit is updated to registered, and the state of the data buffer area is updated to 'waiting data'. In addition, the data format information of the voice stream data may be updated by the state control unit, where the data format information includes the data stream format, the data stream sampling rate, the data stream encoding, and the like.
Then, the voice stream data is sent to the thread-level decoding channel through a communication protocol, and the thread-level decoding channel temporarily stores the voice stream data in the data buffer area; if the data buffer area still holds un-decoded data, the new data is appended at the end of the buffer area, and the state of the data buffer area is updated to 'data exists' through the state control unit.
Further, extracting voice stream data stored in the data buffer area through the channel decoding unit, and emptying the data buffer area; updating the data state to "waiting data" by the state control unit; and decoding the voice stream data to generate a decoding result.
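The buffer behaviour in the last two paragraphs (new audio appended at the end while un-decoded data remains, then everything drained at once by the channel decoding unit) can be sketched as follows; the class and method names are invented for illustration:

```python
class VoiceBuffer:
    """Minimal sketch of the data buffer area (state labels follow the text)."""
    def __init__(self):
        self.data = bytearray()
        self.state = "waiting data"

    def put(self, chunk):
        # New audio is appended at the end if un-decoded data remains.
        self.data += chunk
        self.state = "data exists"

    def drain(self):
        # The channel decoding unit extracts everything and empties the buffer.
        chunk = bytes(self.data)
        self.data.clear()
        self.state = "waiting data"
        return chunk

buf = VoiceBuffer()
buf.put(b"\x01\x02")
buf.put(b"\x03")                 # appended after the un-decoded bytes
print(buf.drain(), buf.state)    # b'\x01\x02\x03' waiting data
```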
After obtaining the decoding result, checking the registration state of the callback interface unit: and if the registration state is unregistered, clearing the state information recorded by the channel state control unit of the thread-level decoding channel through a recovery unit.
Otherwise, if the registration state is registered, returning a decoding result through a callback interface;
Further, it is judged whether an end-of-data-stream signal exists; the end-of-data-stream signal may be an external control signal. If no end-of-data-stream signal exists, the following steps are executed: the voice stream data is sent to the thread-level decoding channel through a communication protocol, and the thread-level decoding channel temporarily stores the voice stream data in the data buffer area; if the data buffer area still holds un-decoded data, the new data is appended at the end of the buffer area, and the state of the data buffer area is updated to 'data exists' through the state control unit.
If the end-of-data-stream signal exists, the data state is updated to 'ending transmitting data' through the state control unit, and the decoding unit reads all the data in the data buffer area to obtain a decoding result;
Then the registration state of the callback interface is checked: if the registration state is registered, the decoding result is returned through the callback interface unit, and the running state of the thread-level decoding channel is updated to stopped through the state control unit; if the registration state is unregistered, the state information recorded by the state control unit of the thread-level decoding channel is cleared through the recovery unit.
According to the scheme, when a plurality of voice decoding requests are received, a plurality of thread-level decoding channels are applied for, and the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels; the thread-level decoding channels each call a general model, perform parallel decoding processing on the voice stream data in the plurality of voice decoding requests to obtain decoding results, and respond to the plurality of voice decoding requests based on the decoding results. In this way, the plurality of voice decoding requests are processed in parallel through the plurality of thread-level decoding channels, and the plurality of thread-level decoding channels share a common model, so that thread-level parallel processing of voice decoding is realized, the hardware cost is reduced, and the concurrency capacity and decoding efficiency of voice decoding are improved.
In addition, the embodiment also provides a voice decoding device. Referring to fig. 6, fig. 6 is a schematic functional block diagram of a speech decoding apparatus according to a first embodiment of the present invention.
In this embodiment, the speech decoding apparatus is a virtual apparatus, and is stored in the memory 1005 of the speech decoding engine shown in fig. 1, so as to implement all the functions of the speech decoding program: when receiving a plurality of voice decoding requests, applying for a plurality of thread-level decoding channels, wherein the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels; and the method is used for respectively calling a general model by utilizing the thread-level decoding channels, carrying out parallel decoding processing on the voice stream data in the voice decoding requests to obtain decoding results, and responding to the voice decoding requests based on the decoding results.
Specifically, the speech decoding apparatus includes:
The application module 10 is configured to apply for a plurality of thread-level decoding channels when receiving a plurality of voice decoding requests, where the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels;
And the decoding module 20 is configured to invoke a common model by using the multiple thread-level decoding channels, perform parallel decoding processing on the voice stream data in the multiple voice decoding requests, obtain a decoding result, and respond to the multiple voice decoding requests based on the decoding result.
Further, the decoding module includes:
A buffer unit, configured to buffer voice stream data in the plurality of voice decoding requests by using data buffer units of the plurality of thread-level decoding channels, respectively;
The calling unit is used for calling the universal model respectively by utilizing the channel decoding units of the plurality of thread-level decoding channels and carrying out parallel decoding processing on the voice stream data in the plurality of voice decoding requests to obtain decoding results;
And the response unit is used for respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels.
Further, the buffer unit includes:
A checking subunit, configured to check, for any specific thread level decoding channel of the plurality of thread level decoding channels, a data state of a data buffer area of the specific thread level decoding channel;
The first temporary storage subunit is used for directly temporarily storing the voice stream data corresponding to the specific thread level decoding channel in the data buffer area of the specific thread level decoding channel if the data state of the data buffer area of the specific thread level decoding channel is waiting data;
And the second temporary storage subunit is used for temporarily storing the voice stream data corresponding to the specific thread level decoding channel at the end of the data buffer area of the specific thread level decoding channel if the data state of the data buffer area of the specific thread level decoding channel is 'data exists'.
Further, the calling unit includes:
A calling subunit, configured to respectively call the generic models by using the channel decoding units of the multiple thread-level decoding channels;
and the decoding subunit is used for converting the voice stream data into a feature vector set in each channel decoding unit in parallel based on the general model, and converting the feature vector set into a decoding result.
Further, the speech decoding apparatus further includes:
The updating module is used for updating the running state of the thread-level decoding channel, the data state of the data cache area and the registration state of the callback interface unit in real time through the state control unit of the thread-level decoding channel so as to execute corresponding steps according to the running state, the data state and the registration state; and/or
The control module is used for receiving an external control signal through the state control unit of the thread-level decoding channel and adjusting the running state of the thread-level decoding channel according to the external control signal.
Further, the decoding module further includes:
The first emptying unit is used for emptying the voice stream data in the data cache area of the thread-level decoding channel through the recovery unit of the thread-level decoding channel so as to store the voice stream data again by utilizing the data cache area; and/or
And the second clearing unit is used for clearing the state information recorded by the state control unit of the thread-level decoding channel through the recovery unit of the thread-level decoding channel so that the state control unit can record the state of the thread-level decoding channel again.
Further, the response unit includes:
a checking subunit, configured to check the registration states of callback interfaces of the thread-level decoding channels recorded by the state control unit of each thread-level decoding channel;
And the execution subunit is configured to execute the steps if the registration state of the callback interface unit of the thread-level decoding channel is registered: respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels;
and the clearing subunit is used for clearing the voice stream data of the data cache area through the recovery unit of the thread-level decoding channel and clearing the state information updated by the state control unit if the registration state of the callback interface unit of the thread-level decoding channel is unregistered.
In addition, the embodiment of the present invention further provides a computer storage medium, where a speech decoding program is stored, where the speech decoding program is executed by a processor to implement the steps of the speech decoding method described above, which is not described herein again.
Compared with the prior art, the present invention provides a voice decoding method, device, engine and storage medium, wherein the method is applied to a voice decoding engine: when a plurality of voice decoding requests are received, a plurality of thread-level decoding channels are applied for, and the plurality of voice decoding requests are in one-to-one correspondence with the plurality of thread-level decoding channels; the thread-level decoding channels each call a general model, perform parallel decoding processing on the voice stream data in the plurality of voice decoding requests to obtain decoding results, and respond to the plurality of voice decoding requests based on the decoding results. In this way, the plurality of voice decoding requests are processed in parallel through the plurality of thread-level decoding channels, and the plurality of thread-level decoding channels share a common model, so that thread-level parallel processing of voice decoding is realized, the hardware cost is reduced, and the concurrency capacity and decoding efficiency of voice decoding are improved.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment methods may be implemented by means of software plus the necessary general hardware platform, and of course may also be implemented by means of hardware, but in many cases the former is the preferred implementation. Based on such understanding, the technical solution of the present invention may be embodied, essentially or in the part contributing to the prior art, in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising several instructions for causing a terminal device to perform the methods according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the invention, and all equivalent structures or modifications in the structures or processes described in the specification and drawings, or the direct or indirect application of the present invention to other related technical fields, are included in the scope of the present invention.

Claims (9)

1. A method of speech decoding, the method being applied to a speech decoding engine, comprising:
When a plurality of voice decoding requests are received, a plurality of thread-level decoding channels are applied, wherein the voice decoding requests are in one-to-one correspondence with the thread-level decoding channels, and the thread-level decoding channels comprise a channel decoding unit, a data cache area and a callback interface unit;
respectively caching voice stream data in the voice decoding requests by utilizing data cache areas of the thread-level decoding channels;
the channel decoding units of the thread-level decoding channels are utilized to respectively call the same general model, and the voice stream data in the voice decoding requests are subjected to parallel decoding processing to obtain decoding results, wherein the general model is used in parallel by the plurality of thread-level decoding channels;
and respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels.
2. The method of claim 1, wherein buffering the voice stream data in the plurality of voice decoding requests with the data buffers of the plurality of thread-level decoding channels, respectively, comprises:
for any specific thread level decoding channel in the plurality of thread level decoding channels, checking the data state of a data buffer area of the specific thread level decoding channel;
If the data state of the data buffer area of the specific thread level decoding channel is waiting data, directly temporarily storing the voice stream data corresponding to the specific thread level decoding channel in the data buffer area of the specific thread level decoding channel;
And if the data state of the data buffer area of the specific thread level decoding channel is 'data exists', temporarily storing the voice stream data corresponding to the specific thread level decoding channel at the end of the data buffer area of the specific thread level decoding channel.
3. The method according to claim 1, wherein the step of using the channel decoding units of the plurality of thread-level decoding channels to respectively call a common model to perform parallel decoding processing on the voice stream data in the plurality of voice decoding requests, and obtaining a decoding result includes:
Respectively calling a general model by using channel decoding units of the plurality of thread-level decoding channels;
and based on the general model, converting the voice stream data into a feature vector set in each channel decoding unit in parallel, and converting the feature vector set into a decoding result.
4. The method of claim 1, wherein the thread-level decode channel further comprises a state control unit,
The method further comprises the steps of:
updating the running state of the thread-level decoding channel, the data state of the data buffer area and the registration state of the callback interface unit in real time through a state control unit of the thread-level decoding channel so as to execute corresponding steps according to the running state, the data state and the registration state; and/or
And receiving an external control signal by a state control unit of the thread-level decoding channel, and adjusting the running state of the thread-level decoding channel according to the external control signal.
5. The method of claim 1, wherein the thread-level decode channel comprises a reclamation unit;
The step of using the decoding channels of the multiple thread levels to respectively call a general model, performing parallel decoding processing on the voice stream data in the multiple voice decoding requests, and responding to the multiple voice decoding requests based on decoding results further comprises:
clearing the voice stream data in the data buffer area of the thread level decoding channel through the recovery unit of the thread level decoding channel so as to store the voice stream data again by utilizing the data buffer area; and/or
And clearing the state information updated by the state control unit of the thread-level decoding channel through the recovery unit of the thread-level decoding channel so as to enable the state control unit to update the state of the thread-level decoding channel again.
6. The method of claim 1, wherein the registration status of the callback interface element is registered or unregistered;
Before the step of respectively responding to the plurality of voice decoding requests based on the decoding results by using the callback interface units of the plurality of thread-level decoding channels, the method further comprises:
Checking the registration state of a callback interface of the thread-level decoding channel recorded by a state control unit of each thread-level decoding channel;
If the registration state of the callback interface unit of the thread-level decoding channel is registered, executing the steps: respectively responding to the voice decoding requests based on the decoding results by utilizing callback interface units of the thread-level decoding channels;
and if the registration state of the callback interface unit of the thread-level decoding channel is unregistered, clearing the voice stream data of the data cache region through the recovery unit of the thread-level decoding channel, and clearing the state information updated by the state control unit.
7. A speech decoding apparatus, characterized in that the speech decoding apparatus comprises:
The application module is used for applying a plurality of thread-level decoding channels when a plurality of voice decoding requests are received, wherein the voice decoding requests are in one-to-one correspondence with the thread-level decoding channels, and the thread-level decoding channels comprise a channel decoding unit, a data cache area and a callback interface unit;
The decoding module is used for respectively caching the voice stream data in the voice decoding requests by utilizing the data cache areas of the thread-level decoding channels; the channel decoding units of the thread-level decoding channels are utilized to respectively call the same general model, and the voice stream data in the voice decoding requests are subjected to parallel decoding processing to obtain decoding results, wherein the general model is used in parallel by the plurality of thread-level decoding channels; and the callback interface units of the thread-level decoding channels are utilized to respectively respond to the voice decoding requests based on the decoding results.
8. A speech decoding engine, characterized in that it comprises a processor, a memory and a speech decoding program stored in said memory, which, when being executed by said processor, implements the steps of the speech decoding method according to any of claims 1-7.
9. A computer storage medium, characterized in that the computer storage medium has stored thereon a speech decoding program which, when executed by a processor, implements the steps of the speech decoding method according to any of claims 1-7.
CN202010155132.7A 2020-03-06 2020-03-06 Speech decoding method, device, engine and storage medium Active CN111402906B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010155132.7A CN111402906B (en) 2020-03-06 2020-03-06 Speech decoding method, device, engine and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010155132.7A CN111402906B (en) 2020-03-06 2020-03-06 Speech decoding method, device, engine and storage medium

Publications (2)

Publication Number Publication Date
CN111402906A CN111402906A (en) 2020-07-10
CN111402906B true CN111402906B (en) 2024-05-14

Family

ID=71430610

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010155132.7A Active CN111402906B (en) 2020-03-06 2020-03-06 Speech decoding method, device, engine and storage medium

Country Status (1)

Country Link
CN (1) CN111402906B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112822183B (en) * 2020-12-30 2023-08-22 北京捷通华声科技股份有限公司 Speech processing method, device, computer readable storage medium and processor

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000026902A1 (en) * 1998-11-04 2000-05-11 Syvox Corporation Apparatus and method for improved memory and resource management in a single-user or multi-user speech recognition system
CN1486044A (en) * 2002-09-28 2004-03-31 ��Ϊ�������޹�˾ Method for scheduling multi-channel coding-decoding task in VOIP network
CN101281749A (en) * 2008-05-22 2008-10-08 上海交通大学 Apparatus for encoding and decoding hierarchical voice and musical sound together
CN102177542A (en) * 2008-10-10 2011-09-07 艾利森电话股份有限公司 Energy conservative multi-channel audio coding
CN104683860A (en) * 2015-02-02 2015-06-03 北京神州天脉网络计算机有限公司 Multipath audio and video concurrent decoding acceleration card and decoding acceleration method for same
CN107710323A (en) * 2016-01-22 2018-02-16 弗劳恩霍夫应用研究促进协会 Resampled using spectrum domain to encode or decode the device and method of audio multichannel signal
CN110570838A (en) * 2019-08-02 2019-12-13 北京葡萄智学科技有限公司 Voice stream processing method and device
CN110689876A (en) * 2019-10-14 2020-01-14 腾讯科技(深圳)有限公司 Voice recognition method and device, electronic equipment and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8515052B2 (en) * 2007-12-17 2013-08-20 Wai Wu Parallel signal processing system and method
CN101739242B (en) * 2009-11-27 2013-07-31 深圳中微电科技有限公司 Stream data processing method and stream processor
US11443227B2 (en) * 2018-03-30 2022-09-13 International Business Machines Corporation System and method for cognitive multilingual speech training and recognition

Also Published As

Publication number Publication date
CN111402906A (en) 2020-07-10

Similar Documents

Publication Publication Date Title
EP2932501B1 (en) Speech model retrieval in distributed speech recognition systems
US20200312329A1 (en) Performing speech recognition using a local language context including a set of words with descriptions in terms of components smaller than the words
US8868425B2 (en) System and method for providing network coordinated conversational services
CA2345660C (en) System and method for providing network coordinated conversational services
US8099284B2 (en) System and method for speech recognition system
KR101649771B1 (en) Markup language-based selection and utilization of recognizers for utterance processing
US11443169B2 (en) Adaptation of model for recognition processing
CN110310657B (en) Audio data processing method and device
US11562735B1 (en) Multi-modal spoken language understanding systems
CN103514882A (en) Voice identification method and system
CN110995943B (en) Multi-user streaming voice recognition method, system, device and medium
CN110956955A (en) Voice interaction method and device
JP2023162265A (en) Text echo cancellation
CN111402906B (en) Speech decoding method, device, engine and storage medium
CN110942764B (en) Stream type voice recognition method
US20170140751A1 (en) Method and device of speech recognition
CN116075888A (en) System and method for reducing latency in cloud services
Lojka et al. Multi-thread parallel speech recognition for mobile applications
CN108986792B (en) Training and scheduling method and system for voice recognition model of voice conversation platform
CN111414748A (en) Traffic data processing method and device
US10923122B1 (en) Pausing automatic speech recognition
CN112466283B (en) Cooperative software voice recognition system
JP4224305B2 (en) Dialog information processing system
KR102365611B1 (en) Meeting management system using automatic speech recognition(ASR)
KR102364935B1 (en) A method and apparatus for data transmission for improving 5G-based speech recognition response speed

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant