WO2022198750A1 - Semantic recognition method - Google Patents

Semantic recognition method (Download PDF)

Info

Publication number
WO2022198750A1
Authority
WO
WIPO (PCT)
Prior art keywords
intent
model
text
semantic
training
Prior art date
Application number
PCT/CN2021/091024
Other languages
English (en)
French (fr)
Inventor
张晖
李吉媛
赵海涛
孙雁飞
朱洪波
Original Assignee
南京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京邮电大学
Priority to JP2022512826A (patent JP7370033B2)
Publication of WO2022198750A1

Links

Images

Classifications

    • G06F 40/194: Handling natural language data; text processing; calculation of difference between files
    • G06F 18/23213: Pattern recognition; clustering techniques; non-hierarchical techniques using statistics or function optimisation with a fixed number of clusters, e.g. K-means clustering
    • G06F 40/30: Handling natural language data; semantic analysis
    • G06N 3/047: Neural networks; architecture; probabilistic or stochastic networks
    • G06N 3/048: Neural networks; architecture; activation functions
    • G06N 3/084: Neural networks; learning methods; backpropagation, e.g. using gradient descent

Definitions

  • the present application relates to the field of natural language processing, and in particular, to a natural language semantic analysis method in a human-machine dialogue system.
  • semantic analysis is divided into two basic subtasks: intent recognition and semantic slot filling. For these two subtasks, the traditional research method is to treat the two tasks as two independent problems to solve, and then connect the results of the two tasks.
  • Each exemplary embodiment of the present application provides a semantic recognition method, including:
  • S102: construct a multi-intent recognition model based on clustering pre-analysis, and recognize the user's multiple intents according to the intent text vector;
  • the step of constructing the multi-intent recognition model based on clustering pre-analysis and recognizing the user's multiple intents includes:
  • the first stage: use the K-means clustering algorithm to divide the input intent text vectors into intent text vectors of a single-intent category and intent text vectors of a multi-intent category;
  • the second stage: classify the intent text vectors of the single-intent category with a softmax classifier, and classify the intent text vectors of the multi-intent category with a sigmoid classifier, to recognize the multiple intents.
  • the distance function in the K-means clustering algorithm is:
  • f_Sim(x_i, x_j) represents the distance between intent text vector x_i and intent text vector x_j, f_1(x_i, x_j) represents the cosine similarity between x_i and x_j, and f_2(x_i, x_j) represents the Euclidean distance between x_i and x_j.
  • performing optimization training on the joint model in step S104 includes:
  • the loss function Loss_intent of the multi-intent recognition model satisfies the following formula:
  • Loss_intent = (Loss_multi)^k (Loss_single)^{1-k}
  • k represents the category of the intent text; k is 1 when the intent text contains multiple intents, and k is 0 when the intent text is single-intent; Loss_multi is the cross-entropy loss for multi-intent recognition, Loss_single is the cross-entropy loss for single-intent recognition, y_I is the predicted intent output, y_intent is the real intent, and T is the number of training texts.
  • the loss function Loss_slot of the semantic slot filling model satisfies the formulas given in the detailed description below.
  • This application fully considers the connection between intent recognition and semantic slot filling, constructs a joint recognition model, combines two semantic analysis subtasks into one task, and shares the underlying semantic features of BERT. Then, the Slot-Gated correlation gate is used to generate the intent-semantic slot joint feature vector, which is then used for the semantic slot filling task.
  • BiLSTM is used to capture the word order features of the text to obtain contextual semantic information
  • CRF is used as a decoder to consider the dependencies before and after the label, so that the semantic slot labeling is more reasonable.
  • to address the uncertainty in the number of intents in user input, an algorithm based on clustering pre-analysis is proposed to determine the number of intents.
  • the traditional measurement method of semantic similarity is improved, and a new measurement method is proposed.
  • the new measurement method can measure the similarity between intent texts more effectively and improve the accuracy of the judgment of the number of intents.
  • improve the robustness of the algorithm. In order to improve the capability of semantic analysis and make full use of intent semantic information to guide the filling of semantic slots, a step-by-step iterative training method based on the idea of iteration is proposed; it makes full use of the relationship between intents and semantic slots, improving the accuracy of the multi-intent recognition model while improving the accuracy of semantic slot filling, thereby improving the effect of semantic analysis.
  • FIG. 1 is a block diagram of the overall structure of a joint modeling method according to an embodiment of the application.
  • FIG. 3 is a structural diagram of a semantic slot recognition model according to an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a step-by-step iterative training method of a joint identification model according to an embodiment of the present application.
  • the traditional research method is to treat the two tasks of intent recognition and semantic slot filling as two independent problems to solve, and then connect the results of the two tasks.
  • intent recognition is a judgment on the type of user needs, while semantic slot filling is to concretize user needs. Therefore, user intent and the slot to be identified are strongly correlated, and the purpose of intent identification is to better fill the semantic slot.
  • the traditional separate modeling method does not fully consider the connection between the two tasks, so that the semantic information cannot be effectively utilized.
  • the problem of multi-intent recognition is often faced in human-machine dialogue systems, that is, the intention text input by the user may not only contain one kind of intention, but may also have multiple intentions.
  • the research on the problem of intent recognition mainly focuses on the recognition of single-intent. Compared with single-intent recognition, multi-intent recognition is not only more complex to recognize, but also requires a higher degree of semantic understanding.
  • the inventors found that, for the semantic analysis problem in human-machine dialogue systems, how to build on existing technology a joint modeling method that effectively solves multi-intent recognition and semantic slot filling is one of the problems urgently awaiting a solution by those skilled in the art.
  • an embodiment of the present application discloses a method for joint identification of multi-intent and semantic slots based on clustering pre-analysis, including:
  • Step S101: obtaining, in real time, the multi-intent text input by the current user and preprocessing it;
  • Step S102: constructing a multi-intent recognition model based on clustering pre-analysis, which is used to recognize the user's multiple intents;
  • Step S103: constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition results to guide the filling of the semantic slots; and
  • Step S104: optimizing the constructed joint model of multi-intent recognition and semantic slot filling.
  • the preprocessing of the multi-intent text input by the current user is to represent the multi-intent text in a vectorized manner so as to be input into the neural network model for semantic feature extraction.
  • the vectorized representation method in the embodiment of the present application is to first train a BERT (Bidirectional Encoder Representations from Transformers) model on a massive unsupervised corpus of text in the same field (e.g., Chinese, English, or text in other languages), and then use the obtained BERT pre-trained model to vectorize the multi-intent text.
  • the purpose of constructing the multi-intent recognition model based on clustering pre-analysis in the above step S102 is to fill the semantic slot.
  • the accuracy of multi-intent recognition will directly affect the filling of semantic slots.
  • the embodiment of the present application proposes a method based on clustering pre-analysis, that is, the intent text is analyzed before intent recognition to determine whether it contains a single intent or multiple intents.
  • the intent recognition of the cluster pre-analysis-based method includes the following steps.
  • the first stage uses the K-means clustering algorithm to determine the type of the input intent text.
  • the input intent texts are classified according to the number of judged intents.
  • a multi-intent classifier is used for classification. That is, a fully connected layer is added after the BERT pre-training model. Each node of the fully connected layer is connected to all nodes of the previous layer, which is used to fuse the previously extracted semantic features. Then the intent text vector output by the BERT model is input into the sigmoid classifier, and the classifier is used to perform binary classification on each label to output multiple intent labels.
  • the calculation formula of label prediction is as follows: y_I = sigmoid(W_I C + b_I), where y_I is the predicted probability, W_I is the intent recognition weight, C is the intent text vector, and b_I is the intent recognition bias.
  • when the intent text is judged to be single-intent, the softmax classifier is used: the sentence vector C that BERT outputs at the first token ([CLS]) is input directly into the classifier for classification, and the predicted intent label is obtained according to the formula y_I = softmax(W_I C + b_I), where y_I is the predicted probability, W_I is the intent recognition weight, C is the intent text vector, and b_I is the intent recognition bias.
  • a BiLSTM-CRF semantic slot filling model is constructed based on the Slot-Gated correlation gate mechanism, and the result of intention recognition is fully utilized to guide the filling of the semantic slot.
  • the Slot-Gated association gate mechanism is shown in Figure 3; it links the intent recognition task with the semantic slot filling task. The intent vector from intent recognition and the intent text vector used for semantic slot filling are summed with weights, and the intent-semantic slot joint feature vector g is then obtained through the tanh activation function.
  • the calculation method of the intent-semantic slot joint feature vector g is as follows:
  • c_I represents the intent vector; the semantic slot vector has the same dimensions as c_I; v and W are a trainable vector and a trainable matrix, respectively.
  • the intent-semantic slot joint feature vector g is input into a BiLSTM (Bi-directional Long Short-Term Memory) neural network to extract the word-order features of the text and capture deep contextual semantic information. A linear layer is then added after the BiLSTM network to map the dimension of the neural network's output vectors for semantic slot decoding. Finally, a CRF (Conditional Random Field) decoding layer is used as the decoding unit to output the slot label corresponding to each word in the sequence.
  • the calculation method is as follows:
  • in step S104, the constructed joint model of multi-intent recognition and semantic slot filling is optimized.
  • the performance of the joint recognition model is jointly determined by the two subtasks.
  • the joint probability of multi-intent recognition and semantic slot filling is as follows:
  • the joint conditional probability of multi-intent recognition and semantic slot filling given the input text sequence, where T is the length of the input text sequence and t is the t-th character in the text sequence
  • the training goal is to maximize the joint probability of outputting multi-intent recognition and semantic slot filling.
  • the joint recognition model is optimized by making full use of intent semantic information for filling semantic slots.
  • the traditional method of simply adding multiple task loss functions is changed.
  • a step-by-step iterative training method combining multi-intent recognition and semantic slot filling is proposed.
  • the training data is input into the joint recognition model.
  • a multi-intent recognition model is trained first.
  • the multi-intent recognition model parameters and the underlying BERT model parameters are updated through backpropagation.
  • the updated model is used to transfer the semantic features of the multi-intent recognition results to the Slot-Gated correlation gate.
  • the semantic features of the intent are fused with the semantic slot features generated by using the updated BERT model to generate an intent-semantic slot joint feature vector.
  • the generated intent-semantic slot joint feature vector is used to train the semantic slot filling model.
  • the semantic slot filling model parameters and the underlying BERT model parameters are updated through backpropagation. Repeat the training until the optimum is reached.
  • the two tasks of multi-intent recognition and semantic slot filling share the underlying parameters of the BERT model during training, that is, when training one model, the training results of another model are used to initialize the underlying model.
  • the upstream tasks are trained separately, while the result of intent recognition is passed to the semantic slot filling task, improving the accuracy of the multi-intent recognition model while also improving the accuracy of semantic slot filling.
  • the loss function is very important for model parameter update. If the loss function is chosen unreasonably, no matter how powerful the model is, the final result will not be good.
  • the multi-intent recognition loss function Loss_intent in the joint recognition model is as follows:
  • Loss_intent = (Loss_multi)^k (Loss_single)^{1-k}
  • k represents the category of the intent text
  • k is 1 when the intent text contains multiple intents
  • k is 0 when the intent text is single-intent.
  • Loss_multi is the cross-entropy of multi-intent recognition
  • Loss_single is the cross-entropy of single-intent recognition. The specific calculation is as follows:
  • y_I is the predicted output of the intent
  • y_intent is the real intent
  • j indexes a text in the training set
  • T_1 represents the number of training texts for multi-intent recognition.
  • the loss function Loss_slot of the semantic slot filling task in the joint recognition model is calculated as shown in the detailed description below.
  • in Figure 4, W11 and W12 represent the weights of multi-intent recognition
  • and Ws1 and Ws2 represent the weights of semantic slot filling
  • although the steps in the flowcharts of FIGS. 1-4 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated in the embodiments of the present application, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 1-4 may include multiple sub-steps or stages; these sub-steps or stages are not necessarily executed and completed at the same moment but may be executed at different times, and their order of execution is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
  • the present application also provides a multi-intent and semantic slot joint recognition system based on clustering pre-analysis, comprising a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the above multi-intent and semantic slot joint recognition method.
  • the present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above-mentioned method for joint identification of multiple intents and semantic slots are implemented.
  • the computer-readable storage medium may include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
  • Nonvolatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A joint modeling method for multi-intent recognition and semantic slot filling based on clustering pre-analysis: acquiring, in real time, the multi-intent text input by the current user and preprocessing it (S101); constructing a multi-intent recognition model based on clustering pre-analysis to recognize the user's multiple intents (S102); constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition results to guide semantic slot filling (S103); and optimizing the constructed joint model of multi-intent recognition and semantic slot filling (S104).

Description

SEMANTIC RECOGNITION METHOD
RELATED APPLICATION
This application claims priority to Chinese Patent Application No. 202110325369X, entitled "A Joint Multi-Intent and Semantic Slot Recognition Method Based on Clustering Pre-Analysis", filed with the Chinese Patent Office on March 26, 2021, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present application relates to the field of natural language processing, and in particular to a natural language semantic analysis method in human-machine dialogue systems.
BACKGROUND
With the rapid development of artificial intelligence, the demand for device intelligence in many application scenarios is increasing. To meet this demand, good human-machine interaction is indispensable. At present, human-machine interaction takes many forms, of which the most convenient is undoubtedly natural language. The call for realizing human-machine dialogue through natural language is therefore growing ever louder. As a result, human-machine dialogue systems have attracted wide attention from academia and industry and have very broad application scenarios.
Realizing a human-machine dialogue system is inseparable from natural language semantic analysis technology, and the quality of semantic analysis directly affects the effect of human-machine interaction. The complexity and abstractness of natural language and the polysemy of words all increase the difficulty of natural language semantic analysis. At present, semantic analysis is divided into two basic subtasks: intent recognition and semantic slot filling. The traditional research approach treats these two subtasks as two independent problems to be solved separately, after which the results of the two tasks are connected.
SUMMARY
Exemplary embodiments of the present application provide a semantic recognition method, including:
S101: acquiring, in real time, the intent text input by the current user, and vectorizing the intent text with a BERT model to obtain an intent text vector;
S102: constructing a multi-intent recognition model based on clustering pre-analysis, and recognizing the user's multiple intents according to the intent text vector;
S103: constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, and filling the semantic slots of the slot filling model using the recognized intents; and
S104: performing optimization training on the joint model composed of the BERT model, the multi-intent recognition model, and the semantic slot filling model, and using the joint model obtained by the optimization training to recognize text input into the joint model.
In an embodiment, the step of constructing the multi-intent recognition model based on the clustering pre-analysis and recognizing the user's multiple intents includes:
Stage 1: using the K-means clustering algorithm to divide the input intent text vectors into intent text vectors of a single-intent category and intent text vectors of a multi-intent category; and
Stage 2: classifying the intent text vectors of the single-intent category with a softmax classifier to recognize the intents, and classifying the intent text vectors of the multi-intent category with a sigmoid classifier to recognize the multiple intents.
In an embodiment, the distance function in the K-means clustering algorithm is:
Figure PCTCN2021091024-appb-000001
where f_Sim(x_i, x_j) represents the distance between intent text vector x_i and intent text vector x_j, f_1(x_i, x_j) represents the cosine similarity between x_i and x_j, and f_2(x_i, x_j) represents the Euclidean distance between x_i and x_j.
In an embodiment, the optimization training of the joint model in step S104 includes:
① training the BERT model and the multi-intent recognition model with training text, and updating the parameters of the BERT model and the multi-intent recognition model;
② passing the output of the multi-intent recognition model in ① to the Slot-Gated gate, training the BERT model updated in ① and the semantic slot filling model with the same training text as in ①, and updating the parameters of the BERT model and the semantic slot filling model; and
③ iterating ① and ② until the training goal is reached.
In an embodiment, the loss function Loss_intent of the multi-intent recognition model satisfies the following formula:
Loss_intent = (Loss_multi)^k (Loss_single)^{1-k}
where k represents the category of the intent text; k is 1 when the intent text contains multiple intents, and k is 0 when the intent text is single-intent;
Figure PCTCN2021091024-appb-000002
is the cross-entropy loss for multi-intent recognition,
Figure PCTCN2021091024-appb-000003
is the cross-entropy loss for single-intent recognition, y_I is the predicted output of the intent, y_intent is the real intent, and T is the number of training texts.
In an embodiment, the loss function Loss_slot of the semantic slot filling model satisfies the following formulas:
Figure PCTCN2021091024-appb-000004
Figure PCTCN2021091024-appb-000005
where
Figure PCTCN2021091024-appb-000006
represents the predicted semantic slot output of the i-th character in the training text sequence,
Figure PCTCN2021091024-appb-000007
represents the real semantic slot of the i-th character in the training text sequence, T is the number of training texts, and M represents the length of the training text sequence.
Compared with the prior art, the above technical solution of the present application has the following technical effects:
The present application fully considers the connection between intent recognition and semantic slot filling and constructs a joint recognition model that merges the two semantic analysis subtasks into one task and shares the underlying BERT semantic features. The Slot-Gated association gate is then used to generate an intent-semantic slot joint feature vector, which is used for the semantic slot filling task. In the slot filling task, BiLSTM is used to capture the word-order features of the text and obtain contextual semantic information, and CRF is used as the decoder to take the dependencies between adjacent labels into account, making the slot labeling more reasonable. In addition, to improve the overall performance of the joint model, and in view of the uncertainty in the user's input intents, an algorithm based on clustering pre-analysis is proposed in the multi-intent recognition process to judge the number of intents. In this algorithm, the traditional measure of semantic similarity is improved and a new measure is proposed; the new measure gauges the similarity between intent texts more effectively, improves the accuracy of the judgment of the number of intents, and improves the robustness of the algorithm. To improve the capability of semantic analysis and make full use of intent semantic information to guide the filling of semantic slots, a step-by-step iterative training method based on the idea of iteration is proposed; it makes full use of the mutual relationship between intents and semantic slots, improving the accuracy of the multi-intent recognition model while improving the accuracy of semantic slot filling, thereby improving the effect of semantic analysis.
BRIEF DESCRIPTION OF THE DRAWINGS
To make the purpose, technical solution, and technical effects of the present application clearer, the following drawings are provided:
FIG. 1 is a block diagram of the overall structure of a joint modeling method according to an embodiment of the present application;
FIG. 2 is a flowchart of multi-intent recognition based on clustering pre-analysis according to an embodiment of the present application;
FIG. 3 is a structural diagram of a semantic slot recognition model according to an embodiment of the present application; and
FIG. 4 is a schematic diagram of the step-by-step iterative training of a joint recognition model according to an embodiment of the present application.
DETAILED DESCRIPTION
To make the purpose, technical solution, and advantages of the present application clearer, the present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are intended only to explain the present application, not to limit it.
As noted above, the traditional research approach treats the two tasks of intent recognition and semantic slot filling as two independent problems to be solved separately, after which the results of the two tasks are connected.
However, the inventors found that intent recognition judges the type of the user's need, while semantic slot filling concretizes that need. The user's intent and the slots to be recognized are therefore strongly correlated, and the purpose of intent recognition is to fill the semantic slots better. The traditional separate-modeling approach does not fully consider the connection between the two tasks, so the semantic information is not used effectively. Moreover, human-machine dialogue systems often face the problem of multi-intent recognition: the intent text input by the user may contain not just one intent but several. Current research on intent recognition focuses mainly on single-intent recognition; compared with single-intent recognition, multi-intent recognition is not only more complex but also demands a higher degree of semantic understanding.
Therefore, for the semantic analysis problem in human-machine dialogue systems, how to build on existing technology a joint modeling method that effectively solves multi-intent recognition and semantic slot filling is one of the problems urgently awaiting a solution by those skilled in the art.
As shown in FIG. 1, an embodiment of the present application discloses a joint multi-intent and semantic slot recognition method based on clustering pre-analysis, including:
Step S101: acquiring, in real time, the multi-intent text input by the current user and preprocessing it;
Step S102: constructing a multi-intent recognition model based on clustering pre-analysis to recognize the user's multiple intents;
Step S103: constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, making full use of the intent recognition results to guide semantic slot filling; and
Step S104: optimizing the constructed joint model of multi-intent recognition and semantic slot filling.
Preprocessing the multi-intent text input by the current user means vectorizing the text so that it can be input into a neural network model for semantic feature extraction. In an embodiment of the present application, the vectorized representation is obtained by first training a BERT (Bidirectional Encoder Representations from Transformers) model on a massive unsupervised corpus of text in the same field (e.g., Chinese, English, or text in other languages), and then using the resulting pre-trained BERT model to vectorize the multi-intent text.
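For illustration, a minimal sketch of this vectorization step, assuming the Hugging Face transformers API and the publicly available bert-base-chinese checkpoint (the patent trains its own in-domain BERT):

```python
# Hypothetical sketch of step S101: vectorizing intent text with a
# pre-trained BERT model. The checkpoint name is an illustrative assumption.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

def vectorize(text: str) -> torch.Tensor:
    """Return the [CLS] sentence vector C used by the intent classifiers."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = bert(**inputs)
    # last_hidden_state: (1, seq_len, hidden); position 0 is the [CLS] token.
    return outputs.last_hidden_state[0, 0]

C = vectorize("订一张明天去北京的机票，再查一下天气")  # a two-intent utterance
```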
The purpose of constructing the multi-intent recognition model based on clustering pre-analysis in step S102 is to serve the filling of the semantic slots: the accuracy of multi-intent recognition directly affects the filling of the semantic slots.
To improve the accuracy of multi-intent recognition in view of the uncertainty in the user's input intents, the embodiment of the present application proposes a method based on clustering pre-analysis: the intent text is analyzed before intent recognition to judge whether it contains a single intent or multiple intents. As shown in FIG. 2, intent recognition based on clustering pre-analysis includes the following steps.
The whole intent recognition process is divided into two stages.
In the first stage, the K-means clustering algorithm is used to judge the type of the input intent text.
In general, intents mainly fall into two classes, single-intent and multi-intent; therefore, the K-means clustering algorithm uses two cluster centers (K = 2).
In the second stage, the input intent texts are classified according to the judged number of intents.
When the intent text is judged to contain multiple intents, a multi-intent classifier is used for classification. A fully connected layer is added after the BERT pre-trained model; each node of the fully connected layer is connected to all nodes of the previous layer and fuses the semantic features extracted earlier. The intent text vector output by the BERT model is then input into a sigmoid classifier, which performs binary classification on each label and outputs multiple intent labels. The label prediction is calculated as follows:
y_I = sigmoid(W_I C + b_I),
where y_I is the predicted probability, W_I is the intent recognition weight, C is the intent text vector, and b_I is the intent recognition bias.
When the intent text is judged to be single-intent, a softmax classifier is used: the sentence vector C that BERT outputs at the first token ([CLS]) is input directly into the classifier for classification, and the predicted intent label is obtained according to the following formula:
y_I = softmax(W_I C + b_I),
where y_I is the predicted probability, W_I is the intent recognition weight, C is the intent text vector, and b_I is the intent recognition bias.
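A minimal sketch of the two second-stage heads described by the formulas above; the hidden size (768) and the number of intent labels (10) are illustrative assumptions:

```python
# Sketch of the second-stage classifiers: y_I = sigmoid(W_I C + b_I) for
# multi-intent text and y_I = softmax(W_I C + b_I) for single-intent text.
import torch
import torch.nn as nn

HIDDEN, NUM_INTENTS = 768, 10  # illustrative sizes

class IntentHeads(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc_multi = nn.Linear(HIDDEN, NUM_INTENTS)   # W_I C + b_I
        self.fc_single = nn.Linear(HIDDEN, NUM_INTENTS)

    def forward(self, C: torch.Tensor, is_multi: bool) -> torch.Tensor:
        if is_multi:
            # an independent binary decision for each intent label
            return torch.sigmoid(self.fc_multi(C))
        # mutually exclusive labels: a distribution over the intents
        return torch.softmax(self.fc_single(C), dim=-1)

heads = IntentHeads()
probs = heads(torch.randn(HIDDEN), is_multi=True)
predicted_intents = (probs > 0.5).nonzero().flatten()  # ids of active labels
```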
When the K-means clustering algorithm is used to pre-analyze multi-intent text, the semantic similarity between intent texts must be judged, and the similarity measure is critical to the accuracy of the clustering result. The common way to measure the semantic similarity of texts is to compute cosine similarity, which reflects the difference in direction between two vectors in space; however, cosine similarity is insensitive to absolute magnitudes and cannot measure differences along the same direction. The Euclidean distance (Euclidean metric), by contrast, is sensitive to absolute magnitudes when computing similarity and measures differences along the same direction well. The present application therefore combines the characteristics of cosine similarity and Euclidean distance and proposes a new measure, as follows:
Figure PCTCN2021091024-appb-000008
where f_1 denotes cosine similarity, f_2 denotes Euclidean distance, x_i is the i-th intent text vector, x_j is the j-th intent text vector, and e is the natural constant. The larger the computed f_Sim value, the greater the similarity between the data objects; the smaller the value, the smaller the similarity. This method measures the similarity between texts better.
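The exact way f_1 and f_2 are combined is given in the formula image above; the sketch below assumes one plausible form, f_Sim = f_1 * e^(-f_2), consistent with the stated behavior (e appears, and a larger f_Sim means greater similarity), and uses it for the K = 2 pre-analysis clustering:

```python
# Sketch of the improved similarity measure and a K=2 clustering pass.
# The combination f1 * exp(-f2) is an assumption; the patent's exact formula
# is in the image above.
import numpy as np

def f_sim(xi: np.ndarray, xj: np.ndarray) -> float:
    f1 = xi @ xj / (np.linalg.norm(xi) * np.linalg.norm(xj))  # cosine similarity
    f2 = np.linalg.norm(xi - xj)                              # Euclidean distance
    return f1 * np.exp(-f2)

def kmeans_pre_analysis(X: np.ndarray, iters: int = 20) -> np.ndarray:
    """Similarity-based K-means with K=2 (single- vs multi-intent texts)."""
    rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(iters):
        # assign each intent text vector to the MOST similar center
        labels = np.array([np.argmax([f_sim(x, c) for c in centers]) for x in X])
        for k in range(2):
            if (labels == k).any():
                centers[k] = X[labels == k].mean(axis=0)
    return labels  # cluster ids: one cluster per intent-count category
```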
In the filling process of step S103, a BiLSTM-CRF semantic slot filling model is constructed based on the Slot-Gated association gate mechanism, making full use of the intent recognition results to guide the filling of the semantic slots.
FIG. 3 shows the Slot-Gated association gate mechanism, which links the intent recognition task with the semantic slot filling task: the intent vector from intent recognition and the intent text vector used for semantic slot filling are summed with weights, and the intent-semantic slot joint feature vector g is then obtained through the tanh activation function. The joint feature vector g is calculated as follows:
Figure PCTCN2021091024-appb-000009
where
Figure PCTCN2021091024-appb-000010
represents the semantic slot vector, c_I represents the intent vector,
Figure PCTCN2021091024-appb-000011
has the same dimensions as c_I, and v and W are a trainable vector and a trainable matrix, respectively.
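The gate formula itself is an image above; the sketch below follows the general form of the Slot-Gated mechanism (g built from v, tanh, and W applied to the slot and intent vectors), which matches the description of a weighted combination passed through tanh. The dimensions and the exact fusion are assumptions:

```python
# Sketch of a Slot-Gated gate: fuse per-token slot context c_slot with the
# sentence-level intent vector c_intent through trainable v and W.
import torch
import torch.nn as nn

class SlotGate(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.v = nn.Parameter(torch.randn(dim))   # trainable vector v
        self.W = nn.Linear(dim, dim, bias=False)  # trainable matrix W

    def forward(self, c_slot: torch.Tensor, c_intent: torch.Tensor):
        # c_slot: (seq_len, dim); c_intent: (dim,)
        g = (self.v * torch.tanh(c_slot + self.W(c_intent))).sum(-1)
        # g: (seq_len,) gate values controlling how much intent information
        # flows into each token's slot features
        return c_slot * g.unsqueeze(-1)  # gated intent-slot joint features
```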
After the intent-semantic slot joint feature vector g is computed, it is input into a BiLSTM (Bi-directional Long Short-Term Memory) neural network to extract the word-order features of the text and capture deep contextual semantic information. A linear layer is then added after the BiLSTM network to map the dimension of the neural network's output vectors for semantic slot decoding. Finally, a CRF (Conditional Random Field) decoding layer is used as the decoding unit to output the slot label corresponding to each word in the sequence. The calculation is as follows:
Figure PCTCN2021091024-appb-000012
where
Figure PCTCN2021091024-appb-000013
represents the predicted semantic slot output of the i-th character in the input text sequence,
Figure PCTCN2021091024-appb-000014
is the weight matrix, and h_i is the hidden state vector. In FIG. 3, B-time is the beginning tag of the time slot label and I-time is the continuation tag of the time slot label.
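A minimal sketch of this decoder, assuming the third-party pytorch-crf package for the CRF layer (any CRF implementation with Viterbi decoding would serve):

```python
# Sketch of the BiLSTM + linear + CRF slot decoder described above.
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf (an assumed dependency)

class SlotFiller(nn.Module):
    def __init__(self, in_dim: int, hidden: int, num_slot_labels: int):
        super().__init__()
        self.bilstm = nn.LSTM(in_dim, hidden, batch_first=True,
                              bidirectional=True)            # word-order features
        self.linear = nn.Linear(2 * hidden, num_slot_labels) # map to slot labels
        self.crf = CRF(num_slot_labels, batch_first=True)    # label dependencies

    def forward(self, g: torch.Tensor, tags: torch.Tensor = None):
        h, _ = self.bilstm(g)         # g: (batch, seq_len, in_dim)
        emissions = self.linear(h)    # (batch, seq_len, num_slot_labels)
        if tags is not None:          # training: negative log-likelihood
            return -self.crf(emissions, tags)
        return self.crf.decode(emissions)  # inference: best B-/I- tag sequence
```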
In step S104, the constructed joint model of multi-intent recognition and semantic slot filling is optimized.
As shown in FIG. 4, the performance of the joint recognition model is determined jointly by the two subtasks. The joint probability of multi-intent recognition and semantic slot filling is as follows:
Figure PCTCN2021091024-appb-000015
where
Figure PCTCN2021091024-appb-000016
represents, given the input multi-intent text sequence x (comprising x_1, x_2, ..., x_T), the joint conditional probability of the multi-intent recognition y_I and the semantic slot filling
Figure PCTCN2021091024-appb-000017
; T is the length of the input text sequence, t is the t-th character in the text sequence, and
Figure PCTCN2021091024-appb-000018
is the predicted semantic slot output of the t-th character in the input sequence.
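The joint probability formula itself is an image; under the usual assumption that, given the shared encoding of x, the intent output and the per-character slot outputs are conditionally independent, it would take a form like the following (a plausible reading, not the patent's verbatim formula):

```latex
p\bigl(y^{I}, y^{S} \mid x\bigr)
  = p\bigl(y^{I} \mid x\bigr)\prod_{t=1}^{T} p\bigl(y^{S}_{t} \mid x\bigr)
```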
In the joint model training shown in FIG. 4, the training goal is to maximize the joint probability of the multi-intent recognition and semantic slot filling outputs. To improve the capability of semantic analysis and make full use of intent semantic information for the filling of semantic slots, the joint recognition model is optimized. During model training, the traditional approach of simply adding the loss functions of the multiple tasks is changed: based on the idea of iteration, a step-by-step iterative training scheme joining multi-intent recognition and semantic slot filling is proposed. As shown in FIG. 4, the training data is first input into the joint recognition model. During training, one round of the multi-intent recognition model is trained first, and the parameters of the multi-intent recognition model and of the underlying BERT model are updated through backpropagation. The updated model then passes the semantic features of the multi-intent recognition result to the Slot-Gated association gate, where the intent semantic features are fused with the semantic slot features generated by the updated BERT model to produce the intent-semantic slot joint feature vector. The generated joint feature vector is used to train the semantic slot filling model, and the parameters of the slot filling model and of the underlying BERT model are updated through backpropagation. Training is repeated until the optimum is reached.
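A sketch of this training loop, reusing the components sketched above (bert, IntentHeads, SlotGate, SlotFiller); the optimizers, learning rate, batch structure, and helper names are illustrative assumptions, and batch-dimension handling is elided for brevity:

```python
# Hypothetical step-by-step iterative training loop (FIG. 4): alternate one
# intent step and one slot step, both updating the shared BERT encoder.
import torch

opt_intent = torch.optim.Adam(
    list(bert.parameters()) + list(intent_heads.parameters()), lr=2e-5)
opt_slot = torch.optim.Adam(
    list(bert.parameters()) + list(slot_gate.parameters())
    + list(slot_filler.parameters()), lr=2e-5)

for epoch in range(num_epochs):
    for batch in train_loader:  # texts, intent labels, slot tag sequences
        # (1) one round of the multi-intent model; BERT is updated too
        enc = bert(**batch.texts).last_hidden_state
        loss_i = intent_loss(intent_heads.fc_multi(enc[:, 0]),
                             batch.intent_labels, is_multi=True)
        opt_intent.zero_grad(); loss_i.backward(); opt_intent.step()

        # (2) re-encode with the updated BERT, fuse intent features through
        # the Slot-Gated gate, then one round of the slot filling model
        enc = bert(**batch.texts).last_hidden_state
        g = slot_gate(enc, enc[:, 0])             # intent features gate the slots
        loss_s = slot_filler(g, batch.slot_tags)  # CRF negative log-likelihood
        opt_slot.zero_grad(); loss_s.backward(); opt_slot.step()
    # (3) repeat (1)-(2) each epoch until performance stops improving
```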
The two tasks of multi-intent recognition and semantic slot filling share the underlying BERT parameters during training; that is, when one model is trained, the underlying model is initialized with the training result of the other. The upstream tasks are trained separately, while the result of intent recognition is passed to the semantic slot filling task, improving the accuracy of the multi-intent recognition model while improving the accuracy of semantic slot filling.
The loss function is very important for model parameter updates: if the loss function is chosen unreasonably, the final result will be poor no matter how powerful the model is.
The multi-intent recognition loss function Loss_intent in the joint recognition model is calculated as follows:
Loss_intent = (Loss_multi)^k (Loss_single)^{1-k}
where k represents the category of the intent text; k is 1 when the intent text contains multiple intents, and k is 0 when the intent text is single-intent. Loss_multi is the cross-entropy of multi-intent recognition and Loss_single is the cross-entropy of single-intent recognition, specifically calculated as follows:
Figure PCTCN2021091024-appb-000019
Figure PCTCN2021091024-appb-000020
where y_I is the predicted output of the intent, y_intent is the real intent, j indexes a text in the training set, and T_1 represents the number of training texts for multi-intent recognition.
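A minimal sketch of this selection between the two cross-entropies: since k takes only the values 0 and 1 and a factor raised to the power 0 equals 1, the exponent form simply picks which loss is active for a given text:

```python
# Loss_intent = (Loss_multi)^k * (Loss_single)^(1-k) with k in {0, 1}:
# per text, exactly one of the two cross-entropy terms is in effect.
import torch
import torch.nn.functional as F

def intent_loss(logits: torch.Tensor, target: torch.Tensor,
                is_multi: bool) -> torch.Tensor:
    if is_multi:  # k = 1: independent binary cross-entropy per intent label
        return F.binary_cross_entropy_with_logits(logits, target.float())
    # k = 0: standard cross-entropy over mutually exclusive intent classes
    return F.cross_entropy(logits, target)
```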
The loss function Loss_slot of the semantic slot filling task in the joint recognition model is calculated as follows:
Figure PCTCN2021091024-appb-000021
Figure PCTCN2021091024-appb-000022
where
Figure PCTCN2021091024-appb-000023
represents the predicted semantic slot output of the i-th character in the training text sequence,
Figure PCTCN2021091024-appb-000024
represents the real semantic slot of the i-th character in the training text sequence, T_2 represents the number of slot filling training texts, and M represents the length of the training text sequence.
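The two formulas above are images; a standard token-level cross-entropy consistent with the variables just described (predicted slot output, true slot, T_2 training texts, M characters per sequence) would read as follows, offered only as an assumed reconstruction:

```latex
\mathrm{Loss}_{slot}
  = -\frac{1}{T_2}\sum_{j=1}^{T_2}\sum_{i=1}^{M}
      y^{i}_{slot}\,\log \hat{y}^{i}_{slot}
```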
In FIG. 4, W11 and W12 represent the weights of multi-intent recognition, and Ws1 and Ws2 represent the weights of semantic slot filling.
It should be understood that although the steps in the flowcharts of FIGS. 1-4 are displayed in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated in the embodiments of the present application, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in FIGS. 1-4 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their order of execution is likewise not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
The present application also provides a multi-intent and semantic slot joint recognition system based on clustering pre-analysis, comprising a memory and a processor; the memory stores a computer program which, when executed by the processor, implements the above multi-intent and semantic slot joint recognition method.
The present application also provides a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the above multi-intent and semantic slot joint recognition method are implemented. The computer-readable storage medium may include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disk.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be accomplished by instructing the relevant hardware through a computer program, which may be stored in a non-volatile computer-readable storage medium; when executed, the program may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features of the above embodiments have been described; nevertheless, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that, for those of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. The protection scope of this patent shall therefore be subject to the appended claims.

Claims (8)

  1. A semantic recognition method, comprising:
    S101: acquiring, in real time, the intent text input by the current user, and vectorizing the intent text with a BERT model to obtain an intent text vector;
    S102: constructing a multi-intent recognition model based on clustering pre-analysis, and recognizing multiple intents of the user according to the intent text vector;
    S103: constructing a BiLSTM-CRF semantic slot filling model based on the Slot-Gated association gate mechanism, and filling the semantic slots of the slot filling model using the recognized multiple intents; and
    S104: performing optimization training on the joint model composed of the BERT model, the multi-intent recognition model, and the semantic slot filling model, and using the joint model obtained by the optimization training to recognize text input into the joint model.
  2. The method according to claim 1, wherein the step of constructing the multi-intent recognition model based on the clustering pre-analysis and recognizing the multiple intents of the user comprises:
    a first stage: using the K-means clustering algorithm to divide the input intent text vectors into intent text vectors of a single-intent category and intent text vectors of a multi-intent category; and
    a second stage: classifying the intent text vectors of the single-intent category with a softmax classifier to recognize the intents, and classifying the intent text vectors of the multi-intent category with a sigmoid classifier to recognize the multiple intents.
  3. The method according to claim 2, wherein the distance function in the K-means clustering algorithm is:
    Figure PCTCN2021091024-appb-100001
    where f_Sim(x_i, x_j) represents the distance between intent text vector x_i and intent text vector x_j, f_1(x_i, x_j) represents the cosine similarity between x_i and x_j, and f_2(x_i, x_j) represents the Euclidean distance between x_i and x_j.
  4. The method according to claim 1, wherein the optimization training of the joint model in step S104 comprises:
    ① training the BERT model and the multi-intent recognition model with training text, and updating the parameters of the BERT model and the multi-intent recognition model;
    ② passing the output of the multi-intent recognition model in ① to the Slot-Gated gate, training the BERT model updated in ① and the semantic slot filling model with the same training text as in ①, and updating the parameters of the BERT model and the semantic slot filling model; and
    ③ iterating ① and ② until the training goal is reached.
  5. The method according to claim 4, wherein the loss function Loss_intent of the multi-intent recognition model satisfies the following formula:
    Loss_intent = (Loss_multi)^k (Loss_single)^{1-k}
    where k represents the category of the intent text; k is 1 when the intent text contains multiple intents, and k is 0 when the intent text is single-intent;
    Figure PCTCN2021091024-appb-100002
    is the cross-entropy loss for multi-intent recognition,
    Figure PCTCN2021091024-appb-100003
    is the cross-entropy loss for single-intent recognition, y_I is the predicted output of the intent, y_intent is the real intent, and T is the number of training texts.
  6. The method according to claim 4, wherein the loss function Loss_slot of the semantic slot filling model satisfies the following formulas:
    Figure PCTCN2021091024-appb-100004
    Figure PCTCN2021091024-appb-100005
    where
    Figure PCTCN2021091024-appb-100006
    represents the predicted semantic slot output of the i-th character in the training text sequence,
    Figure PCTCN2021091024-appb-100007
    represents the real semantic slot of the i-th character in the training text sequence, T is the number of training texts, and M represents the length of the training text sequence.
  7. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1 to 6.
  8. A semantic recognition system, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the method according to any one of claims 1 to 6.
PCT/CN2021/091024 2021-03-26 2021-04-29 Semantic recognition method WO2022198750A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2022512826A JP7370033B2 (ja) 2021-03-26 2021-04-29 Semantic recognition method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110325369.XA CN113204952B (zh) 2021-03-26 2021-03-26 A joint multi-intent and semantic slot recognition method based on clustering pre-analysis
CN202110325369.X 2021-03-26

Publications (1)

Publication Number Publication Date
WO2022198750A1 true WO2022198750A1 (zh) 2022-09-29

Family

ID=77025737

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/091024 WO2022198750A1 (zh) 2021-03-26 2021-04-29 Semantic recognition method

Country Status (3)

Country Link
JP (1) JP7370033B2 (zh)
CN (1) CN113204952B (zh)
WO (1) WO2022198750A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116795886A (zh) * 2023-07-13 2023-09-22 杭州逍邦网络科技有限公司 Data analysis engine and method for sales data
CN117435716A (zh) * 2023-12-20 2024-01-23 国网浙江省电力有限公司宁波供电公司 Data processing method and system for a power grid human-machine interaction terminal
CN117765949A (zh) * 2024-02-22 2024-03-26 青岛海尔科技有限公司 Sentence multi-intent recognition method and apparatus based on semantic dependency analysis
CN117909508A (zh) * 2024-03-20 2024-04-19 成都赛力斯科技有限公司 Intent recognition method, model training method, apparatus, device, and storage medium

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115292463B (zh) * 2022-08-08 2023-05-12 云南大学 A joint multi-intent detection and overlapping slot filling method based on information extraction
CN115273849B (zh) * 2022-09-27 2022-12-27 北京宝兰德软件股份有限公司 An intent recognition method and apparatus for audio data
CN117435738B (zh) * 2023-12-19 2024-04-16 中国人民解放军国防科技大学 A deep-learning-based text multi-intent analysis method and system
CN118037362B (zh) * 2024-04-12 2024-07-05 中国传媒大学 A sequential recommendation method and system based on contrast of users' multiple intents

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008476A (zh) * 2019-04-10 2019-07-12 出门问问信息科技有限公司 Semantic parsing method, apparatus, device, and storage medium
CN110321418A (zh) * 2019-06-06 2019-10-11 华中师范大学 A deep-learning-based domain and intent recognition and slot filling method
US20200257856A1 (en) * 2019-02-07 2020-08-13 Clinc, Inc. Systems and methods for machine learning based multi intent segmentation and classification
CN112035626A (zh) * 2020-07-06 2020-12-04 北海淇诚信息科技有限公司 A rapid recognition method and apparatus for large-scale intents, and an electronic device
CN112183062A (zh) * 2020-09-28 2021-01-05 云知声智能科技股份有限公司 A spoken language understanding method based on alternating decoding, an electronic device, and a storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767408B (zh) * 2020-05-27 2023-06-09 青岛大学 A causal event graph construction method based on the integration of multiple neural networks

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200257856A1 (en) * 2019-02-07 2020-08-13 Clinc, Inc. Systems and methods for machine learning based multi intent segmentation and classification
CN110008476A (zh) * 2019-04-10 2019-07-12 出门问问信息科技有限公司 Semantic parsing method, apparatus, device, and storage medium
CN110321418A (zh) * 2019-06-06 2019-10-11 华中师范大学 A deep-learning-based domain and intent recognition and slot filling method
CN112035626A (zh) * 2020-07-06 2020-12-04 北海淇诚信息科技有限公司 A rapid recognition method and apparatus for large-scale intents, and an electronic device
CN112183062A (zh) * 2020-09-28 2021-01-05 云知声智能科技股份有限公司 A spoken language understanding method based on alternating decoding, an electronic device, and a storage medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116795886A (zh) * 2023-07-13 2023-09-22 杭州逍邦网络科技有限公司 Data analysis engine and method for sales data
CN116795886B (zh) * 2023-07-13 2024-03-08 杭州逍邦网络科技有限公司 Data analysis engine and method for sales data
CN117435716A (zh) * 2023-12-20 2024-01-23 国网浙江省电力有限公司宁波供电公司 Data processing method and system for a power grid human-machine interaction terminal
CN117435716B (zh) * 2023-12-20 2024-06-11 国网浙江省电力有限公司宁波供电公司 Data processing method and system for a power grid human-machine interaction terminal
CN117765949A (zh) * 2024-02-22 2024-03-26 青岛海尔科技有限公司 Sentence multi-intent recognition method and apparatus based on semantic dependency analysis
CN117765949B (zh) * 2024-02-22 2024-05-24 青岛海尔科技有限公司 Sentence multi-intent recognition method and apparatus based on semantic dependency analysis
CN117909508A (zh) * 2024-03-20 2024-04-19 成都赛力斯科技有限公司 Intent recognition method, model training method, apparatus, device, and storage medium

Also Published As

Publication number Publication date
JP2023522502A (ja) 2023-05-31
CN113204952A (zh) 2021-08-03
CN113204952B (zh) 2023-09-15
JP7370033B2 (ja) 2023-10-27

Similar Documents

Publication Publication Date Title
WO2022198750A1 (zh) Semantic recognition method
WO2022037256A1 (zh) Text sentence processing method and apparatus, computer device, and storage medium
US11941366B2 (en) Context-based multi-turn dialogue method and storage medium
Zhou et al. A C-LSTM neural network for text classification
CN111783462A (zh) Chinese named entity recognition model and method based on dual neural network fusion
CN110263325B (zh) Chinese word segmentation system
CN113255320A (zh) Entity relation extraction method and apparatus based on syntax trees and a graph attention mechanism
CN115081437B (zh) Machine-generated text detection method and system based on contrastive learning of linguistic features
CN114398855A (zh) Text extraction method, system, and medium based on fused pre-training
CN113190656A (zh) A Chinese named entity extraction method based on multiple annotation frameworks and fused features
CN114398881A (zh) Transaction information recognition method, system, and medium based on graph neural networks
CN115600597A (zh) Named entity recognition method, apparatus, system, and storage medium based on an attention mechanism and intra-word semantic fusion
CN116304748A (zh) A text similarity calculation method, system, device, and medium
CN112699685A (zh) Named entity recognition method based on label-guided character-word fusion
CN112988970A (zh) A text matching algorithm serving intelligent question-answering systems
CN113239694B (zh) An argument role recognition method based on argument phrases
CN111145914A (zh) A method and apparatus for determining text entities in a lung cancer clinical disease database
CN113178189A (zh) An information classification method and apparatus, and an information classification model training method and apparatus
CN115186670B (zh) A domain named entity recognition method and system based on active learning
CN116432755A (zh) A weight network inference method based on dynamic entity prototypes
CN116362242A (zh) A few-shot slot value extraction method, apparatus, device, and storage medium
CN115964497A (zh) An event extraction method fusing an attention mechanism and convolutional neural networks
Wu et al. A Text Emotion Analysis Method Using the Dual‐Channel Convolution Neural Network in Social Networks
CN113255342B (zh) A 5G mobile service product name recognition method and system
VEENA et al. DETECTION OF SARCASTIC SENTIMENT ANALYSIS IN TWEETS USING LSTM WITH IMPROVED ATTENTION BASED FEATURE EXTRACTION (IATEN)

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2022512826

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21932367

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21932367

Country of ref document: EP

Kind code of ref document: A1