WO2024124913A1 - Entity information determination method, apparatus and device - Google Patents

Entity information determination method, apparatus and device

Info

Publication number
WO2024124913A1
Authority
WO
WIPO (PCT)
Prior art keywords
character
feature
score
features
speech text
Prior art date
Application number
PCT/CN2023/109425
Other languages
English (en)
French (fr)
Inventor
李渊
Original Assignee
浙江极氪智能科技有限公司
浙江吉利控股集团有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江极氪智能科技有限公司, 浙江吉利控股集团有限公司
Publication of WO2024124913A1

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • the present application relates to the field of intelligent vehicles, and in particular to a method, device and equipment for determining entity information.
  • natural language understanding in an in-vehicle voice system includes a stage that recognizes the entities contained in the speech text, and the quality of this recognition determines whether the in-vehicle function can be executed successfully. It is therefore very important to improve the entity recognition accuracy of the in-vehicle voice system.
  • in the prior art, entity information is usually recognized by extracting the semantic features of the text and associating those features across characters. However, recognition accuracy for non-continuous named entities in complex scenarios is low.
  • the present application provides a method, apparatus and device for determining entity information, so as to solve the problem of low recognition accuracy of non-continuous named entities.
  • the present application provides a method for determining entity information, the method comprising:
  • obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters, and N is a positive integer greater than 1;
  • determining the position feature of each character according to the semantic features of each character, wherein the position feature represents the position of the corresponding character relative to the other characters in the speech text;
  • determining the character score corresponding to each character according to the semantic features and position features of each character, wherein the character score represents the entity recognition category score of the corresponding character;
  • determining the entity information of the speech text according to the character score of each character.
  • determining the position feature of each character according to the semantic feature of each character includes:
  • feature mapping is performed on the semantic features of each character according to a preset first weight matrix to determine the position feature of each character.
  • the first weight matrix is used to represent the correlation between the semantic feature of each character and the character position feature.
  • determining the character score of each character according to the semantic features and position features of each character includes:
  • determining the final feature of each character according to the semantic features and position features of each character, and determining the character score of each character according to the final feature of each character.
  • determining the final feature of each character according to the semantic feature and position feature of each character includes:
  • the feature ratio of the position feature of each character within the final feature of the corresponding character is determined according to a preset second weight matrix, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text; the final feature of each character is then determined according to the feature ratio, position feature and semantic features of each character.
  • determining the character score of each character according to the final feature of each character includes:
  • a matrix transformation is performed on the final features of each character to determine the character score of each character in each entity classification.
  • determining the entity information of the speech text according to the character score of each character includes:
  • if the character score is greater than a preset threshold, it is determined that the entity information of the speech text includes the entity classification corresponding to the character score;
  • if the character score is less than or equal to the preset threshold, it is determined that the entity information of the speech text does not include the entity classification corresponding to the character score.
  • the present application provides an entity information determination device, the device comprising:
  • a first processing unit is used to obtain a speech text and extract semantic features of each character in the speech text, wherein the speech text includes N characters, and N is a positive integer greater than 1;
  • a first determining unit is used to determine the position feature of each character according to the semantic feature of each character, wherein the position feature represents the relative position feature of the corresponding character relative to other characters in the speech text;
  • a second determination unit is used to determine a character score corresponding to each character according to the semantic features and position features of each character, wherein the character score represents an entity recognition category score of the corresponding character;
  • the second processing unit is used to determine the entity information of the speech text according to the character score of each character.
  • the present application provides an electronic device, the electronic device comprising a memory and a processor;
  • the memory is used to store a computer program;
  • the processor is configured to read the computer program stored in the memory and execute the entity information determination method as described in the first aspect according to the computer program in the memory.
  • the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores computer-executable instructions; when a processor executes the computer-executable instructions, the entity information determination method as described in the first aspect is implemented.
  • the present application provides a computer program product, including a computer program, which, when executed by a processor, implements the entity information determination method as described in the first aspect.
  • the entity information determination method, apparatus and device provided by the present application proceed through the following steps: obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters, and N is a positive integer greater than 1; determining the position feature of each character according to the semantic features of each character, wherein the position feature represents the position of the corresponding character relative to the other characters in the speech text; determining the character score corresponding to each character according to the semantic features and position features of each character; and determining the entity information of the speech text according to the character scores of the characters.
  • high-level position feature information of each character in the speech text is extracted, thereby improving the accuracy of entity information recognition.
  • FIG1 is a flow chart of a method for determining entity information provided in an embodiment of the present application.
  • FIG2 is a flow chart of another entity information determination method provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of the structure of an entity information determination device provided in an embodiment of the present application.
  • FIG4 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application.
  • FIG5 is a block diagram of an electronic device provided in an embodiment of the present application.
  • natural language understanding in an in-vehicle voice system includes a stage that recognizes the entities contained in the speech text, and the quality of this recognition determines whether the in-vehicle function can be executed successfully. It is therefore very important to improve the entity recognition accuracy of the in-vehicle voice system.
  • in one example, the semantic features of the characters in the speech text are extracted by a pre-trained language representation model (Bidirectional Encoder Representations from Transformers, BERT), and the features are then associated across characters by a conditional random field (CRF), thereby recognizing and determining the entity information in the speech text.
  • the entity information determination method provided in this application is intended to solve the above technical problems in the prior art.
  • FIG. 1 is a flow chart of a method for determining entity information provided by an embodiment of the present application. As shown in FIG. 1 , the method includes:
  • a speech text is obtained, for example an in-vehicle speech text; semantic feature information is extracted from it by the pre-trained model BERT, and the semantic features of each character in the speech text are output.
  • for example, the feature of the i-th character in the speech text is output as a feature vector h_i, so the features of the entire speech text can be represented as a feature vector sequence h, where the speech text includes N characters, and N is a positive integer greater than 1.
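  • as an illustration of this step, the sketch below extracts one feature vector per character with a BERT encoder; the HuggingFace transformers package, the public bert-base-chinese checkpoint and the sample utterance are assumptions made for the sketch, not choices specified by the patent.

```python
# Sketch of step 101: per-character semantic features from a BERT encoder.
# Assumed (not from the patent): HuggingFace `transformers`, the public
# `bert-base-chinese` checkpoint, and the sample text.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

text = "打开车窗并播放音乐"  # hypothetical in-vehicle utterance, N = 9 characters
inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False)

with torch.no_grad():
    out = encoder(**inputs)

# One semantic feature vector h_i per character; h is the sequence h_1..h_N.
h = out.last_hidden_state.squeeze(0)
print(h.shape)  # torch.Size([9, 768])
```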
  • based on a feature extraction network, for example a capsule layer in a capsule network, the feature vector of each character is encoded for spatial relationships: the correlation between each character's low-level and high-level features, i.e., between its semantic features and its position features, is mapped to determine the position feature of each character, wherein the position feature represents the position of the corresponding character relative to the other characters in the speech text.
  • the character score represents the entity recognition category score of the corresponding character.
  • a complete speech text may include the keyword characters that express the user's intent as well as modal particles, adjectives and other characters that merely assist in expressing it.
  • the position features of the characters can therefore be boosted according to their importance; entity recognition is then performed on the importance-weighted character features, and the character score corresponding to each character is calculated.
  • each character has a corresponding character score for each entity category, and the character scores can be aggregated, calculated and analyzed, for example to determine which character scores indicate entity information that can be ignored and which cannot, after which the entity information contained in the speech text is summarized and determined.
  • the entity information determination method includes the following steps: obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters, and N is a positive integer greater than 1; determining the position features of each character according to the semantic features of each character, wherein the position features represent the relative position features of the corresponding character relative to other characters in the speech text; determining the character score corresponding to each character according to the semantic features and position features of each character; and determining the entity information of the speech text according to the character scores of each character.
  • high-level position feature information of each character in the speech text is extracted, thereby improving the accuracy of entity information recognition.
  • FIG. 2 is a flow chart of another entity information determination method provided in an embodiment of the present application. As shown in FIG. 2 , the method includes:
  • for this step, refer to step 101; details are not repeated here.
  • spatial relationship encoding is performed on the feature vector of each character, and a trainable position weight matrix, i.e., the preset first weight matrix, maps the correlation between each character's low-level and high-level features, i.e., between its semantic features and its position features, to determine the position feature of each character, wherein the position feature represents the position of the corresponding character relative to the other characters in the speech text.
  • the trainable position weight matrix represents the positional information among the characters.
  • the mapping from low-level features to high-level features can be expressed as follows:
  • S_j|i = W_ij h_i
  • where W_ij is a trainable position weight matrix, h_i is the semantic feature vector of character i, and S_j|i is the position feature information of character i.
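  • the following minimal sketch implements this mapping; all dimensions and the exact shape of the trainable weight tensor are illustrative assumptions, since the patent does not fix them.

```python
# Sketch of S_{j|i} = W_{ij} h_i: each pair (i, j) gets a trainable matrix
# that maps the semantic feature of character i to a position feature
# "seen from" character j. All sizes below are illustrative assumptions.
import torch

N, d, p = 9, 768, 16                # characters, semantic dim, position dim
h = torch.randn(N, d)               # semantic features h_i (e.g. from BERT)
W = torch.randn(N, N, p, d) * 0.01  # trainable position weight matrices W_ij
W.requires_grad_()

# S[j, i] = W[i, j] @ h[i], i.e. the position feature S_{j|i} of character i.
S = torch.einsum("ijpd,id->jip", W, h)
print(S.shape)  # torch.Size([9, 9, 16])
```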
  • the feature ratios of the semantic features and position features of the characters are determined, and then integrated to determine the final features of each character.
  • step 203 includes the following steps:
  • the final features of each character are determined based on the feature proportion, position features and semantic features of each character.
  • according to a preset second weight matrix, i.e., a connection weight matrix, the feature ratio of the position feature of each character within the final feature of the corresponding character is determined, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text and includes the connection weights between each character and the other characters in the speech text; the final feature of each character is then determined based on the feature ratio, position feature and semantic features of each character.
  • the preset second weight matrix can be trained and adjusted: each capsule output vector U_i can be calculated by dynamic routing in the capsule network and used for training and adjustment.
  • the calculation formula is as follows:
  • U_j = Σ_{i=1}^{n} C_i S_j|i
  • where n is the number of characters, C_i is the connection weight between character i and the other characters in the speech text, and S_j|i is the position feature information of character i.
  • U_i is passed through the squash nonlinear activation function to obtain the capsule output V_i, and the weight C_i is dynamically adjusted according to the capsule output V_i: the more important a character is to the text, the larger the weight C_i becomes.
  • after the position-importance weight C_i of each character is obtained, the weights can be multiplied in so that each character incorporates relative position information and importance: L_i = C_i S_i, where L_i is the final feature vector of character i.
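  • one way to realize the routing and weighting just described is sketched below; the squash function and the agreement-based update of the connection weights follow the standard capsule-network formulation, which the patent names but does not spell out, and the collapse of the routing weights to one scalar C_i per character is likewise an assumption.

```python
# Sketch of U_j = sum_i C_i * S_{j|i}, V_j = squash(U_j), and the dynamic
# adjustment of the connection weights C. The routing-by-agreement update
# is the standard capsule-network one (an assumption; the patent only says
# C is adjusted according to the capsule output V).
import torch
import torch.nn.functional as F

def squash(u, dim=-1):
    # Keeps the direction of u, compresses its norm into [0, 1).
    sq = (u ** 2).sum(dim=dim, keepdim=True)
    return (sq / (1.0 + sq)) * u / torch.sqrt(sq + 1e-9)

def dynamic_routing(S, iters=3):
    # S: (N, N, p) with S[j, i] = S_{j|i}, from the previous sketch.
    b = torch.zeros(S.shape[0], S.shape[1])       # routing logits
    for _ in range(iters):
        C = F.softmax(b, dim=0)                   # connection weights
        U = torch.einsum("ji,jip->jp", C, S)      # U_j = sum_i C[j,i] S_{j|i}
        V = squash(U)                             # capsule outputs V_j
        b = b + torch.einsum("jp,jip->ji", V, S)  # larger agreement -> larger C
    return C, V

C, V = dynamic_routing(S)
# Final features L_i = C_i * S_i: collapse C to one weight per character by
# taking its self-routing entry, and take S_i as the self-predicted position
# feature S_{i|i} (both are assumptions for this sketch).
c_char = C.diagonal()                                   # (N,)
L = c_char.unsqueeze(-1) * S.diagonal(dim1=0, dim2=1).T  # (N, p)
```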
  • step 204 includes: performing a matrix transformation on the final features of each character to determine the character score of each character in each entity category.
  • a matrix transformation is performed on the final features of each character, for example a multi-head matrix calculation, to determine the character score of each character, such as a score for the first class (a product-category class) and a score for the second class (a non-category class).
  • through the multi-head matrix calculation, the entity recognition category score of character i for the t-th entity class is obtained: for entity category t, the feature vector of character i is evaluated against the feature vector of every other character j in the speech text (a two-dimensional matrix calculation), as follows:
  • f(L_i, L_j, t) = W_t L_ij + b_t
  • where L_ij = tanh(W_L[L_i, L_j]) + b_h, W_t is the weight parameter corresponding to entity category t, W_L is the weight applied to the concatenation of feature vectors L_i and L_j, b_h is the offset for calculating L_ij, L_ij is the joint feature vector of characters i and j, and b_t is the offset for calculating the category score.
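  • a sketch of this pairwise multi-head scoring follows; the number of entity classes T, the hidden width dh and the random parameters are illustrative assumptions.

```python
# Sketch of L_ij = tanh(W_L [L_i, L_j]) + b_h and f(L_i, L_j, t) = W_t L_ij + b_t.
# N and p come from the sketches above; T (entity classes) and dh (hidden
# width) are illustrative assumptions.
import torch

N, p, dh, T = 9, 16, 32, 4
L = torch.randn(N, p)               # final character features L_i
W_L = torch.randn(dh, 2 * p) * 0.1  # weight for the concatenated pair [L_i, L_j]
b_h = torch.zeros(dh)
W_t = torch.randn(T, dh) * 0.1      # one weight row per entity category t
b_t = torch.zeros(T)

pair = torch.cat([L.unsqueeze(1).expand(N, N, p),
                  L.unsqueeze(0).expand(N, N, p)], dim=-1)  # [L_i, L_j], (N, N, 2p)
L_ij = torch.tanh(pair @ W_L.T) + b_h                       # (N, N, dh)
scores = L_ij @ W_t.T + b_t                                 # f(L_i, L_j, t), (N, N, T)
print(scores.shape)  # torch.Size([9, 9, 4])
```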
  • step 205 includes the following steps:
  • if the character score is greater than a preset threshold, it is determined that the entity information of the speech text includes the entity classification corresponding to the character score;
  • if the character score is less than or equal to the preset threshold, it is determined that the entity information of the speech text does not include the entity classification corresponding to the character score.
  • the character scores of the characters can be aggregated, calculated and analyzed: for example, if a character score is greater than the preset threshold, it is determined that the entity information of the speech text includes the entity classification corresponding to that score; if it is less than or equal to the preset threshold, it is determined that the entity information does not include that classification.
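  • continuing the previous sketches, the decision itself is a simple comparison; the threshold value, the sigmoid squashing and the reduction of the pairwise scores to one score per character (here a maximum over the partner character j) are assumptions, since the patent only requires some preset threshold.

```python
# Sketch of step 205 on the `scores` tensor from the previous sketch.
# The sigmoid, the max over partner characters j, and the 0.5 threshold
# are illustrative assumptions.
import torch

threshold = 0.5
char_scores = torch.sigmoid(scores).amax(dim=1)  # (N, T): one score per character
keep = char_scores > threshold                   # True where class t is included

# The speech text contains entity class t if any character's score for t
# exceeds the threshold; scores at or below the threshold are discarded.
present_classes = keep.any(dim=0).nonzero().flatten().tolist()
print(present_classes)
```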
  • non-continuous entity information in the speech text is thus extracted and determined according to the entity score of each character in the speech text.
  • the entity information determination method extracts the semantic features and relative position features of each character, adds the positional importance, and then calculates the corresponding character scores through the multi-head matrix calculation. Based on the character scores of the characters, the entity information of the speech text is then determined, thereby improving the accuracy of entity information recognition.
  • FIG3 is a schematic diagram of the structure of an entity information determination device provided in an embodiment of the present application. As shown in FIG3 , the device includes:
  • the first processing unit 31 is used to obtain a speech text and extract semantic features of each character in the speech text, wherein the speech text includes N characters, and N is a positive integer greater than 1.
  • the first determination unit 32 is used to determine the position feature of each character according to the semantic feature of each character, wherein the position feature represents the relative position feature of the corresponding character with respect to other characters in the speech text.
  • the second determining unit 33 is used to determine a character score corresponding to each character according to the semantic features and position features of each character, wherein the character score represents an entity recognition category score of the corresponding character.
  • the second processing unit 34 is used to determine entity information of the speech text according to the character score of each character.
  • the first determining unit 32 is specifically configured to:
  • a preset first weight matrix feature mapping is performed on the semantic features of each character to determine the position feature information of each character, wherein the first weight matrix is used to characterize the correlation between the semantic features of each character and the character position features.
  • the second determining unit 33 includes:
  • the first determination subunit is used to determine the final feature of each character according to the semantic feature and position feature of each character.
  • the second determination subunit is used to determine the character score of each character according to the final feature of each character.
  • the second determining subunit includes:
  • the first processing module is used to determine the feature ratio of the position feature of each character to the final feature of the corresponding character according to a preset second weight matrix, wherein the second weight matrix is used to characterize the position importance of the corresponding character in the speech text.
  • the second processing module is used to determine the final features of each character according to the feature ratio, position feature and semantic feature of each character.
  • the second determining subunit is specifically used for:
  • a matrix transformation is performed on the final features of each character to determine the character score of each character in each entity classification.
  • the second processing unit 34 includes:
  • the first processing subunit is used to determine that the entity information of the speech text includes the entity classification corresponding to the character score if the character score is greater than a preset threshold.
  • the second processing subunit is configured to determine that the entity information of the speech text does not include the entity classification corresponding to the character score if the character score is less than or equal to the preset threshold.
  • FIG4 is a schematic diagram of the structure of an electronic device provided in an embodiment of the present application. As shown in FIG4 , the electronic device includes: a memory 41 and a processor 42 .
  • the memory is used to store a computer program.
  • the processor is used to read the computer program stored in the memory and execute the method of any of the above embodiments according to the computer program in the memory.
  • FIG. 5 is a block diagram of an electronic device provided in an embodiment of the present application.
  • the device may be a mobile phone, a computer, a digital broadcast terminal, a message transceiver, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.
  • the device 800 may include one or more of the following components: a processing component 802 , a memory 804 , a power component 806 , a multimedia component 808 , an audio component 810 , an input/output (I/O) interface 812 , a sensor component 814 , and a communication component 816 .
  • the processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above-mentioned method.
  • the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components.
  • the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.
  • the memory 804 is configured to store various types of data to support operations on the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, etc.
  • the memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
  • the power supply component 806 provides power to the various components of the device 800.
  • the power supply component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.
  • the multimedia component 808 includes a screen that provides an output interface between the device 800 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundaries of the touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
  • the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.
  • the audio component 810 is configured to output and/or input audio signals.
  • the audio component 810 includes a microphone (MIC), and when the device 800 is in an operation mode, such as a call mode, a recording mode, and a speech recognition mode, the microphone is configured to receive an external audio signal.
  • the received audio signal may be further stored in the memory 804 or sent via the communication component 816.
  • the audio component 810 also includes a speaker for outputting audio signals.
  • I/O interface 812 provides an interface between processing component 802 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include but are not limited to: home button, volume button, start button, and lock button.
  • the sensor assembly 814 includes one or more sensors for providing various aspects of status assessment for the device 800.
  • the sensor assembly 814 can detect the open/closed state of the device 800, the relative positioning of components, such as the display and keypad of the device 800, the sensor assembly 814 can also detect the position change of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800 and the temperature change of the device 800.
  • the sensor assembly 814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • the sensor assembly 814 can also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 814 can also include an accelerometer, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
  • the communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices.
  • the device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel.
  • the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • the apparatus 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components to perform the above method.
  • a non-transitory computer-readable storage medium including instructions is also provided, such as a memory 804 including instructions, and the instructions can be executed by the processor 820 of the device 800 to perform the above method.
  • the non-transitory computer-readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.
  • An embodiment of the present application also provides a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the method provided by the above embodiments.
  • the present application also provides a computer program product, comprising a computer program stored in a readable storage medium; at least one processor of the electronic device can read the computer program from the readable storage medium, and the at least one processor executes the computer program so that the electronic device executes the solution provided by any of the above embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Machine Translation (AREA)

Abstract

The present application provides an entity information determination method, apparatus and device. The method includes: obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1; determining the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text; determining a character score for each character according to the semantic features and position features of each character; and determining the entity information of the speech text according to the character scores of the characters. Because this process extracts high-level position feature information for every character in the speech text, the accuracy of entity information recognition is improved.

Description

Entity information determination method, apparatus and device
This application claims priority to Chinese patent application No. 202211622292.3, entitled "Entity information determination method, apparatus and device" and filed with the Chinese Patent Office on December 16, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of intelligent vehicles, and in particular to an entity information determination method, apparatus and device.
Background Art
Natural language understanding in an in-vehicle voice system includes a stage that recognizes the entities contained in the speech text, and the quality of this recognition determines whether the in-vehicle function can be executed successfully. It is therefore very important to improve the entity recognition accuracy of the in-vehicle voice system.
In the prior art, entity information is usually recognized by extracting the semantic features of the characters and associating those features.
In the prior art, however, recognition accuracy is low for non-continuous named entities in complex scenarios.
Summary of the Invention
The present application provides an entity information determination method, apparatus and device to solve the problem of low recognition accuracy for non-continuous named entities.
In a first aspect, the present application provides an entity information determination method, the method comprising:
obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1;
determining the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text;
determining a character score for each character according to the semantic features and position features of each character, wherein the character score characterizes the entity recognition category score of the corresponding character;
determining the entity information of the speech text according to the character scores of the characters.
In an optional implementation, determining the position feature of each character according to the semantic features of each character includes:
performing feature mapping on the semantic features of each character according to a preset first weight matrix to determine the position feature information of each character, wherein the first weight matrix is used to characterize the correlation between the semantic features of each character and that character's position features.
In an optional implementation, determining the character score of each character according to the semantic features and position features of each character includes:
determining the final feature of each character according to the semantic features and position features of each character;
determining the character score of each character according to the final feature of each character.
In an optional implementation, determining the final feature of each character according to the semantic features and position features of each character includes:
determining, according to a preset second weight matrix, the proportion that the position feature of each character contributes to the final feature of the corresponding character, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text;
determining the final feature of each character according to the feature proportion, position feature and semantic features of each character.
In an optional implementation, determining the character score of each character according to the final feature of each character includes:
performing a matrix transformation on the final feature of each character to determine the character score of each character for each entity classification.
In an optional implementation, determining the entity information of the speech text according to the character scores of the characters includes:
if a character score is greater than a preset threshold, determining that the entity information of the speech text includes the entity classification corresponding to that character score;
if a character score is less than or equal to the preset threshold, determining that the entity information of the speech text does not include the entity classification corresponding to that character score.
In a second aspect, the present application provides an entity information determination apparatus, the apparatus comprising:
a first processing unit, configured to obtain a speech text and extract the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1;
a first determination unit, configured to determine the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text;
a second determination unit, configured to determine a character score for each character according to the semantic features and position features of each character, wherein the character score characterizes the entity recognition category score of the corresponding character;
a second processing unit, configured to determine the entity information of the speech text according to the character scores of the characters.
In a third aspect, the present application provides an electronic device comprising a memory and a processor;
the memory is configured to store a computer program;
the processor is configured to read the computer program stored in the memory and execute the entity information determination method of the first aspect according to the computer program in the memory.
In a fourth aspect, the present application provides a computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the entity information determination method of the first aspect.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, implements the entity information determination method of the first aspect.
The entity information determination method, apparatus and device provided by the present application proceed through the following steps: obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1; determining the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text; determining a character score for each character according to the semantic features and position features of each character; and determining the entity information of the speech text according to the character scores of the characters. Because this process extracts high-level position feature information for every character in the speech text, the accuracy of entity information recognition is improved.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the present application.
FIG. 1 is a flow chart of an entity information determination method provided in an embodiment of the present application;
FIG. 2 is a flow chart of another entity information determination method provided in an embodiment of the present application;
FIG. 3 is a schematic structural diagram of an entity information determination apparatus provided in an embodiment of the present application;
FIG. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
FIG. 5 is a block diagram of an electronic device provided in an embodiment of the present application.
The above drawings show specific embodiments of the present application, which are described in more detail hereinafter. These drawings and the textual description are not intended to limit the scope of the inventive concept in any way, but rather to illustrate the concept of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Exemplary embodiments are described in detail here, examples of which are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
Natural language understanding in an in-vehicle voice system includes a stage that recognizes the entities contained in the speech text, and the quality of this recognition determines whether the in-vehicle function can be executed successfully. It is therefore very important to improve the entity recognition accuracy of the in-vehicle voice system.
In one example, a pre-trained language representation model (Bidirectional Encoder Representations from Transformers, BERT) extracts the semantic features of the characters in the speech text, and a conditional random field (CRF) then associates the features across characters, thereby recognizing and determining the entity information in the speech text.
However, existing entity recognition methods have low accuracy in complex scenarios in which the named entities contained in the speech text are scattered and non-continuous.
The entity information determination method provided in the present application is intended to solve the above technical problems in the prior art.
The technical solution of the present application, and how it solves the above technical problems, are described in detail below through specific embodiments. The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application are described below with reference to the drawings.
FIG. 1 is a flow chart of an entity information determination method provided in an embodiment of the present application. As shown in FIG. 1, the method includes:
101. Obtain a speech text and extract the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1.
Exemplarily, a speech text, for example an in-vehicle speech text, is obtained; semantic feature information is extracted from it by the pre-trained model BERT, and the semantic features of each character in the speech text are output. For example, the feature of the i-th character in the speech text is output as a feature vector h_i, so the features of the entire speech text can be represented as a feature vector sequence h, where the speech text includes N characters and N is a positive integer greater than 1.
102. Determine the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text.
Exemplarily, based on a feature extraction network, for example a capsule layer in a capsule network, the feature vector of each character is encoded for spatial relationships: the correlation between each character's low-level and high-level features, i.e., between its semantic features and its position features, is mapped to determine the position feature of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text.
103. Determine a character score for each character according to the semantic features and position features of each character, wherein the character score characterizes the entity recognition category score of the corresponding character.
Exemplarily, different characters carry different degrees of importance within the speech text. For example, a complete speech text may contain the keyword characters that express the user's intent as well as modal particles, adjectives and other characters that merely assist in expressing it. The position features of the characters can be boosted according to their importance, entity recognition can then be performed on the importance-weighted character features, and the character score of each character can be calculated.
104. Determine the entity information of the speech text according to the character scores of the characters.
Exemplarily, each character has a character score for each entity category. The character scores can be aggregated, calculated and analyzed, for example to determine which character scores indicate entity information that can be ignored and which cannot, after which the entity information contained in the speech text is summarized and determined.
In summary, the entity information determination method provided in this embodiment proceeds through the following steps: obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1; determining the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text; determining a character score for each character according to the semantic features and position features of each character; and determining the entity information of the speech text according to the character scores of the characters. Because this process extracts high-level position feature information for every character in the speech text, the accuracy of entity information recognition is improved.
FIG. 2 is a flow chart of another entity information determination method provided in an embodiment of the present application. As shown in FIG. 2, the method includes:
201. Obtain a speech text and extract the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1.
Exemplarily, refer to step 101 for this step; details are not repeated here.
202. Perform feature mapping on the semantic features of each character according to a preset first weight matrix to determine the position feature information of each character, wherein the first weight matrix is used to characterize the correlation between the semantic features of each character and that character's position features, and the position feature characterizes the position of the corresponding character relative to the other characters in the speech text.
Exemplarily, based on the capsule layer of a capsule network, the feature vector of each character is encoded for spatial relationships: a trainable position weight matrix, i.e., the preset first weight matrix, maps the correlation between each character's low-level and high-level features, i.e., between its semantic features and its position features, to determine the position feature of each character. The trainable position weight matrix represents the positional information among the characters.
In one example, the mapping from low-level features to high-level features can be expressed as:
S_j|i = W_ij h_i
where W_ij is a trainable position weight matrix, h_i is the semantic feature vector of character i, and S_j|i characterizes the position feature information of character i.
203. Determine the final feature of each character according to the semantic features and position features of each character.
Exemplarily, the proportions of a character's semantic features and position features are determined and then combined to obtain the final feature of each character.
In one example, step 203 includes the following steps:
determining, according to a preset second weight matrix, the proportion that the position feature of each character contributes to the final feature of the corresponding character, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text;
determining the final feature of each character according to the feature proportion, position feature and semantic features of each character.
Exemplarily, according to a preset second weight matrix, i.e., a connection weight matrix, the proportion that the position feature of each character contributes to the final feature of the corresponding character is determined, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text and includes the connection weights between each character and the other characters in the speech text; the final feature of each character is then determined according to the feature proportion, position feature and semantic features of each character.
In one example, the preset second weight matrix can be trained and adjusted: each capsule output vector U_i can be calculated by dynamic routing in the capsule network and used for training and adjustment. The calculation formula is as follows:
U_j = Σ_{i=1}^{n} C_i S_j|i
where n is the number of characters, C_i is the connection weight between character i and the other characters in the speech text, and S_j|i characterizes the position feature information of character i.
U_i is passed through the squash nonlinear activation function to obtain the capsule output V_i, and the weight C_i is dynamically adjusted according to V_i: the more important a character is to the text, the larger the weight C_i.
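The squash function is not reproduced in the text; its standard capsule-network form (assuming that is the variant intended here) is
V_i = (‖U_i‖^2 / (1 + ‖U_i‖^2)) · (U_i / ‖U_i‖)
which preserves the direction of U_i while compressing its norm into [0, 1), so the norm of V_i can serve as the importance signal that drives the adjustment of C_i.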
In one example, after the position-importance weight C_i of each character has been obtained, the weights can be multiplied in, so that each character incorporates relative position information and importance:
L_i = C_i S_i
where L_i is the final feature vector of character i.
204. Determine the character score of each character according to the final feature of each character, wherein the character score characterizes the entity recognition category score of the corresponding character.
In one example, step 204 includes: performing a matrix transformation on the final feature of each character to determine the character score of each character for each entity classification.
Exemplarily, a matrix transformation, for example a multi-head matrix calculation, is performed on the final feature of each character to determine its character score, such as a score for the first class (a product-category class) and a score for the second class (a non-category class).
In one example, the entity recognition category score of character i for the t-th entity class is obtained through a multi-head matrix calculation: for entity category t, the feature vector of character i is evaluated against the feature vector of every other character j in the speech text (a two-dimensional matrix calculation), yielding the entity recognition category score of character i for the t-th entity class, as follows:
f(L_i, L_j, t) = W_t L_ij + b_t
where L_ij = tanh(W_L[L_i, L_j]) + b_h, and W_t is the weight parameter corresponding to entity category t;
W_L is the weight applied to the concatenation of feature vectors L_i and L_j, b_h is the offset for calculating L_ij; L_ij is the joint feature vector of characters i and j, and b_t is the offset for calculating the category score.
205. Determine the entity information of the speech text according to the character scores of the characters.
In one example, step 205 includes the following steps:
if a character score is greater than a preset threshold, determining that the entity information of the speech text includes the entity classification corresponding to that character score;
if a character score is less than or equal to the preset threshold, determining that the entity information of the speech text does not include the entity classification corresponding to that character score.
Exemplarily, the character scores of the characters can be aggregated, calculated and analyzed: if a character score is greater than the preset threshold, the entity information of the speech text is determined to include the corresponding entity classification; if it is less than or equal to the preset threshold, the entity information is determined not to include that classification. Non-continuous entity information in the speech text is thus extracted and determined according to the entity score of each character in the speech text.
In summary, the entity information determination method provided in this embodiment extracts the semantic features and relative position features of each character, adds positional importance, calculates the corresponding character scores through multi-head matrix computation, and then determines the entity information of the speech text from the character scores of the characters, thereby improving the accuracy of entity information recognition.
FIG. 3 is a schematic structural diagram of an entity information determination apparatus provided in an embodiment of the present application. As shown in FIG. 3, the apparatus includes:
a first processing unit 31, configured to obtain a speech text and extract the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1;
a first determination unit 32, configured to determine the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text;
a second determination unit 33, configured to determine a character score for each character according to the semantic features and position features of each character, wherein the character score characterizes the entity recognition category score of the corresponding character;
a second processing unit 34, configured to determine the entity information of the speech text according to the character scores of the characters.
In one example, the first determination unit 32 is specifically configured to:
perform feature mapping on the semantic features of each character according to a preset first weight matrix to determine the position feature information of each character, wherein the first weight matrix is used to characterize the correlation between the semantic features of each character and that character's position features.
In one example, the second determination unit 33 includes:
a first determination subunit, configured to determine the final feature of each character according to the semantic features and position features of each character;
a second determination subunit, configured to determine the character score of each character according to the final feature of each character.
In one example, the second determination subunit includes:
a first processing module, configured to determine, according to a preset second weight matrix, the proportion that the position feature of each character contributes to the final feature of the corresponding character, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text;
a second processing module, configured to determine the final feature of each character according to the feature proportion, position feature and semantic features of each character.
In one example, the second determination subunit is specifically configured to:
perform a matrix transformation on the final feature of each character to determine the character score of each character for each entity classification.
In one example, the second processing unit 34 includes:
a first processing subunit, configured to determine, if a character score is greater than a preset threshold, that the entity information of the speech text includes the entity classification corresponding to that character score;
a second processing subunit, configured to determine, if a character score is less than or equal to the preset threshold, that the entity information of the speech text does not include the entity classification corresponding to that character score.
FIG. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. As shown in FIG. 4, the electronic device includes a memory 41 and a processor 42.
The memory is configured to store a computer program.
The processor is configured to read the computer program stored in the memory and execute the method of any of the above embodiments according to the computer program in the memory.
FIG. 5 is a block diagram of an electronic device provided in an embodiment of the present application. The device may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
The device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814 and a communication component 816.
The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, phone calls, data communication, camera operation and recording. The processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above method. In addition, the processing component 802 may include one or more modules to facilitate interaction between the processing component 802 and the other components; for example, it may include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation on the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos and so on. The memory 804 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.
The power component 806 provides power to the various components of the device 800, and may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 800.
The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes and gestures on the panel; the touch sensors may sense not only the boundary of a touch or swipe action but also the duration and pressure associated with it. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera, which may receive external multimedia data when the device 800 is in an operating mode such as a shooting mode or a video mode. Each front or rear camera may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC) configured to receive external audio signals when the device 800 is in an operating mode such as a call mode, a recording mode or a speech recognition mode. The received audio signal may be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 also includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules such as a keyboard, a click wheel or buttons. These buttons may include, but are not limited to, a home button, volume buttons, a start button and a lock button.
The sensor component 814 includes one or more sensors for providing status assessments of various aspects of the device 800. For example, the sensor component 814 can detect the on/off state of the device 800 and the relative positioning of components, such as its display and keypad; it can also detect a change in the position of the device 800 or one of its components, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and changes in its temperature. The sensor component 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact, and an optical sensor, such as a CMOS or CCD image sensor, for imaging applications. In some embodiments, the sensor component 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication; the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
In an exemplary embodiment, the device 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components to perform the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 804 including instructions, which can be executed by the processor 820 of the device 800 to perform the above method. The non-transitory computer-readable storage medium may be, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk or an optical data storage device.
An embodiment of the present application also provides a non-transitory computer-readable storage medium; when the instructions in the storage medium are executed by a processor of an electronic device, the electronic device can execute the method provided by the above embodiments.
An embodiment of the present application also provides a computer program product comprising a computer program stored in a readable storage medium; at least one processor of the electronic device can read the computer program from the readable storage medium, and the at least one processor executes the computer program so that the electronic device executes the solution provided by any of the above embodiments.
Other embodiments of the present application will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. The present application is intended to cover any variations, uses or adaptations that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the application indicated by the following claims.
It should be understood that the present application is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present application is limited only by the appended claims.

Claims (10)

  1. An entity information determination method, characterized in that the method comprises:
    obtaining a speech text and extracting the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1;
    determining the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text;
    determining a character score for each character according to the semantic features and position features of each character, wherein the character score characterizes the entity recognition category score of the corresponding character;
    determining the entity information of the speech text according to the character scores of the characters.
  2. The method according to claim 1, characterized in that determining the position feature of each character according to the semantic features of each character comprises:
    performing feature mapping on the semantic features of each character according to a preset first weight matrix to determine the position feature information of each character, wherein the first weight matrix is used to characterize the correlation between the semantic features of each character and that character's position features.
  3. The method according to claim 1, characterized in that determining the character score corresponding to each character according to the semantic features and position features of each character comprises:
    determining the final feature of each character according to the semantic features and position features of each character;
    determining the character score of each character according to the final feature of each character.
  4. The method according to claim 3, characterized in that determining the final feature of each character according to the semantic features and position features of each character comprises:
    determining, according to a preset second weight matrix, the proportion that the position feature of each character contributes to the final feature of the corresponding character, wherein the second weight matrix is used to characterize the positional importance of the corresponding character in the speech text;
    determining the final feature of each character according to the feature proportion, position feature and semantic features of each character.
  5. The method according to claim 3, characterized in that determining the character score of each character according to the final feature of each character comprises:
    performing a matrix transformation on the final feature of each character to determine the character score of each character for each entity classification.
  6. The method according to claim 5, characterized in that determining the entity information of the speech text according to the character scores of the characters comprises:
    if a character score is greater than a preset threshold, determining that the entity information of the speech text includes the entity classification corresponding to that character score;
    if a character score is less than or equal to the preset threshold, determining that the entity information of the speech text does not include the entity classification corresponding to that character score.
  7. An entity information determination apparatus, characterized in that the apparatus comprises:
    a first processing unit, configured to obtain a speech text and extract the semantic features of each character in the speech text, wherein the speech text includes N characters and N is a positive integer greater than 1;
    a first determination unit, configured to determine the position feature of each character according to the semantic features of each character, wherein the position feature characterizes the position of the corresponding character relative to the other characters in the speech text;
    a second determination unit, configured to determine a character score for each character according to the semantic features and position features of each character, wherein the character score characterizes the entity recognition category score of the corresponding character;
    a second processing unit, configured to determine the entity information of the speech text according to the character scores of the characters.
  8. An electronic device, characterized in that it comprises a memory and a processor;
    the memory is configured to store a computer program;
    the processor is configured to read the computer program stored in the memory and execute the entity information determination method of any one of claims 1 to 6 according to the computer program in the memory.
  9. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-executable instructions which, when executed by a processor, implement the entity information determination method of any one of claims 1 to 6.
  10. A computer program product comprising a computer program, characterized in that when the computer program is executed by a processor, the entity information determination method of any one of claims 1 to 6 is implemented.
PCT/CN2023/109425 2022-12-16 2023-07-26 Entity information determination method, apparatus and device WO2024124913A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211622292.3 2022-12-16
CN202211622292.3A CN115906853A (zh) 2022-12-16 2022-12-16 Entity information determination method, apparatus and device

Publications (1)

Publication Number Publication Date
WO2024124913A1 true WO2024124913A1 (zh) 2024-06-20

Family

ID=86480587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/109425 WO2024124913A1 (zh) 2022-12-16 2023-07-26 实体信息确定方法、装置和设备

Country Status (2)

Country Link
CN (1) CN115906853A (zh)
WO (1) WO2024124913A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115906853A (zh) * 2022-12-16 2023-04-04 浙江极氪智能科技有限公司 实体信息确定方法、装置和设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190347298A1 (en) * 2018-05-11 2019-11-14 The Regents Of The University Of California Speech based structured querying
CN112287100A (zh) * 2019-07-12 2021-01-29 阿里巴巴集团控股有限公司 Text recognition method, spelling error correction method and speech recognition method
CN112466288A (zh) * 2020-12-18 2021-03-09 北京百度网讯科技有限公司 Speech recognition method and apparatus, electronic device and storage medium
CN113284499A (zh) * 2021-05-24 2021-08-20 湖北亿咖通科技有限公司 Voice command recognition method and electronic device
CN115099242A (zh) * 2022-08-29 2022-09-23 江西电信信息产业有限公司 Intent recognition method, system, computer and readable storage medium
CN115906853A (zh) * 2022-12-16 2023-04-04 浙江极氪智能科技有限公司 Entity information determination method, apparatus and device


Also Published As

Publication number Publication date
CN115906853A (zh) 2023-04-04

Similar Documents

Publication Publication Date Title
US11430427B2 (en) Method and electronic device for separating mixed sound signal
WO2020134556A1 (zh) Image style transfer method and apparatus, electronic device and storage medium
WO2020107813A1 (zh) Method and apparatus for locating image description statements, electronic device and storage medium
WO2021244457A1 (zh) Video generation method and related apparatus
CN109919829B (zh) Image style transfer method and apparatus, and computer-readable storage medium
US11244228B2 (en) Method and device for recommending video, and computer readable storage medium
CN107133354B (zh) Method and apparatus for obtaining image description information
CN109961791B (zh) Speech information processing method and apparatus, and electronic device
WO2024124913A1 (zh) Entity information determination method, apparatus and device
US11335348B2 (en) Input method, device, apparatus, and storage medium
CN106547850B (zh) Expression annotation method and apparatus
CN110674246A (zh) Question-answering model training method, automatic question-answering method and apparatus
CN110931028A (zh) Speech processing method and apparatus, and electronic device
CN110111795B (zh) Speech processing method and terminal device
CN109686359B (zh) Speech output method, terminal and computer-readable storage medium
CN113656557A (zh) Message reply method and apparatus, storage medium and electronic device
CN111046780A (zh) Neural network training and image recognition method, apparatus, device and storage medium
CN111984765A (zh) Method and apparatus for relation detection in knowledge-base question answering
CN112863499B (zh) Speech recognition method and apparatus, and storage medium
CN114095817B (zh) Noise reduction method and apparatus for earphone, earphone and storage medium
CN114090738A (zh) Method, apparatus, device and storage medium for determining scene data information
CN117642817A (zh) Method and apparatus for identifying audio data category, and storage medium
CN115374256A (zh) Question-and-answer data processing method, apparatus, device, storage medium and product
CN112434714A (zh) Multimedia recognition method and apparatus, storage medium and electronic device
CN115809323A (zh) Intent recognition method, apparatus and device