CN118193693A - Word slot filling method, device, equipment and medium

Word slot filling method, device, equipment and medium

Publication number: CN118193693A
Authority: CN (China)
Prior art keywords: word slot, entity, word, type, slot
Legal status: Pending
Application number: CN202410295234.7A
Other languages: Chinese (zh)
Inventors: 霍庆源, 向宇波
Current and original assignee: Beijing Baidu Netcom Science and Technology Co Ltd

Abstract

The present disclosure provides a word slot filling method, device, apparatus and medium, and relates to the field of artificial intelligence, in particular to the fields of intelligent dialogue, speech technology, deep learning and large language models. A specific implementation is as follows: using a large language model, word slot extraction is performed on information to be identified according to a constructed prompt word, obtaining a word slot initial value of a target word slot; entity recognition is performed on the word slot initial value to obtain an entity type and an entity identification value; the word slot type of the target word slot is obtained from the prompt word, and whether the word slot extraction succeeded is verified according to the entity relationship between the entity type and the word slot type; if the word slot extraction succeeded, the target word slot is filled with the entity identification value as the target word slot value, where the target word slot value is a normalized result of the word slot initial value. The present disclosure can improve the accuracy of word slot filling.

Description

Word slot filling method, device, equipment and medium
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to the field of intelligent conversations, speech technology, deep learning, and large language models, and more particularly, to a word slot filling method, apparatus, device, and medium.
Background
In the field of intelligent dialogue, word slot filling (entity collection) is a very efficient means of collecting user information: an intelligent dialogue robot recognizes and extracts key information from user utterances and then makes subsequent decisions based on that key information.
However, in some dialog scenarios, current word slot filling techniques still suffer from word slot filling inaccuracy.
Disclosure of Invention
The present disclosure provides a word slot filling method, apparatus, device and medium.
According to an aspect of the present disclosure, there is provided a word slot filling method including:
Extracting word slots of the information to be identified according to the constructed prompt words by using the large language model to obtain word slot initial values of target word slots;
Performing entity identification according to the word slot initial value to obtain an entity type and an entity identification value;
Acquiring the word slot type of the target word slot from the prompt word, and checking whether the word slot extraction is successful or not according to the entity relationship between the entity type and the word slot type;
And if the word slot extraction is successful, filling the target word slot by taking the entity identification value as a target word slot value, wherein the target word slot value is a normalization result of the word slot initial value.
According to another aspect of the present disclosure, there is provided a word slot filling apparatus including:
the extraction module is used for extracting word slots of the information to be identified according to the constructed prompt words by utilizing the large language model to obtain word slot initial values of target word slots;
The entity identification module is used for carrying out entity identification according to the word slot initial value to obtain an entity type and an entity identification value;
The verification module is used for acquiring the word slot type of the target word slot from the prompt word and verifying whether the word slot extraction is successful or not according to the entity relationship between the entity type and the word slot type;
and the filling module is used for filling the target word slot by taking the entity identification value as a target word slot value if the word slot extraction is successful, wherein the target word slot value is a normalization result of the word slot initial value.
According to another aspect of the present disclosure, there is provided an electronic device including:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the word slot filling method of any embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the word slot filling method according to any embodiment of the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program/instructions which, when executed by a processor, implement the word slot filling method of any embodiment of the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow diagram of a word slot filling method according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram of another word slot filling method according to an embodiment of the present disclosure;
FIG. 3 is a first example diagram of a word slot filling method according to an embodiment of the present disclosure;
FIG. 4 is a second example diagram of a word slot filling method according to an embodiment of the present disclosure;
FIG. 5 is a third example diagram of a word slot filling method according to an embodiment of the present disclosure;
FIG. 6 is a schematic structural view of a word slot filling apparatus according to an embodiment of the present disclosure;
fig. 7 is a block diagram of an electronic device for implementing a word slot filling method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a flow chart of a word slot filling method according to an embodiment of the present disclosure, which is applicable to a case of word slot filling based on a user dialogue in an intelligent dialogue process, and relates to the field of artificial intelligence, in particular to the fields of intelligent dialogue, speech technology and deep learning. The method may be performed by a word slot filling device implemented in software and/or hardware, preferably configured in an electronic device, such as a smart terminal, smart conversation robot or computer device. As shown in fig. 1, the method specifically includes the following steps:
S101, extracting word slots of information to be identified according to the constructed prompt words by using a large language model, and obtaining word slot initial values of target word slots.
In an intelligent dialogue scenario, an intelligent dialogue robot, or other hardware device hosting an intelligent dialogue system, may identify and extract key information from what the user says and then make subsequent dialogue decisions based on that key information in order to complete the dialogue objective. The user utterance is therefore the information to be identified.
In order to achieve a preset goal, such as booking a ticket, booking a hotel, or purchasing goods, it is generally necessary to collect the word slot information required to achieve the goal through multiple rounds of dialogue with the user. For example, in a ticket-booking scene, the required word slot information may be configured as ticket type, seat/berth type, departure/arrival time, departure place/destination, and the like. The required word slot information is the information of the target word slots. The number of target word slots may be one or more and may be configured according to the requirements of the dialogue scene, which is not limited in any way by the embodiments of the present disclosure.
The large language model refers to a deep learning model trained on massive text data. In the embodiments of the present disclosure, the large language model is specifically used to perform semantic recognition on the user utterance and, through semantic recognition, extract the value of the target word slot from the user utterance as the word slot initial value. Specifically, extraction of the target word slot value by the large language model can be achieved by dynamically constructing the prompt (prompt word) of the large language model. In constructing the prompt, information such as the dialogue context between the system and the user, the word slot type to be filled, the candidate word slot values to be filled (if any), the word slots already collected, and the current user utterance is assembled, and the large language model is driven to extract the target word slot value according to this information.
In one embodiment, in each round of dialogue, the intelligent dialogue system may construct a prompt word based on the information to be identified and the target word slot currently to be filled by the user. The prompt word indicates the specific extraction task to the large language model. Taking hotel booking as an example, if the target word slot to be filled is "check-in time" and the user utterance is "tomorrow, please", the constructed prompt word may be: the word slot to be filled is "check-in time", the user input is "tomorrow, please", please output the check-in time according to the user input. The large language model then performs extraction according to the prompt word, carries out semantic recognition on the user utterance "tomorrow, please", and extracts "tomorrow", so that the word slot initial value of "check-in time" is obtained as "tomorrow".
It should be noted that the construction of the prompt word is a dynamic process: as the dialogue proceeds to a given step, the prompt word is constructed based on the target word slot to be filled at that step and the user input. In addition, the dialogue context and the word slots that have already been filled can also be combined when constructing the prompt word.
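For ease of understanding only, a minimal illustrative sketch of such dynamic prompt construction is given below in Python; it is not part of the claimed solution, and the function name build_prompt, the field names and the template wording are assumptions introduced purely for illustration.

def build_prompt(dialog_context, target_slot, slot_type, candidates, filled_slots, user_utterance):
    """Assemble a prompt that drives the large language model to extract the target slot value.
    All field names and the template wording are illustrative assumptions; the disclosure only
    requires that context, slot type, candidate values (if any), collected slots and the
    current user utterance be included."""
    lines = []
    if dialog_context:
        lines.append("Dialogue so far: " + " / ".join(dialog_context))
    if filled_slots:
        lines.append("Slots already collected: " +
                     ", ".join(f"{k}={v}" for k, v in filled_slots.items()))
    lines.append(f"The slot to be filled is '{target_slot}' (type: {slot_type}).")
    if candidates:
        lines.append("Candidate values: " + ", ".join(candidates))
    lines.append(f"The user said: '{user_utterance}'.")
    lines.append("Please output only the value of the slot; output 'none' if it is not mentioned.")
    return "\n".join(lines)

# Example turn from the hotel-booking description above.
prompt = build_prompt(
    dialog_context=["System: Which hotel would you like to book?"],
    target_slot="check-in time", slot_type="time",
    candidates=[], filled_slots={"hotel name": "A Hotel"},
    user_utterance="tomorrow, please",
)

In each round such a builder would be called again with the slot to be filled at the current step, so the prompt changes as the dialogue flows.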
S102, entity identification is carried out according to the initial value of the word slot, and the entity type and the entity identification value are obtained.
S103, acquiring the word slot type of the target word slot from the prompt word, and checking whether the word slot extraction is successful or not according to the entity relationship between the entity type and the word slot type.
And S104, if the word slot extraction is successful, filling the target word slot by taking the entity identification value as a target word slot value, wherein the target word slot value is a normalization result of the word slot initial value.
Taking hotel booking as an example, the check-in time belongs to a time entity, the check-in place belongs to a hotel entity, and the guest name belongs to a person-name entity; different fields divide entities in different ways. After the large language model extracts the word slot initial value, in order to further ensure its accuracy, in the embodiments of the present disclosure entity recognition is performed on the word slot initial value, that is, the word slot initial value is used as the text to be recognized, and the entity type contained in it and the corresponding entity identification value are recognized from it. When the text to be recognized includes an entity together with other text besides the entity, the entity can still be recognized and the entity category to which it belongs determined, and the entity identification value is the entity name determined from the text to be recognized. The definition of an entity in this disclosure is the same as in the prior art and is not repeated here; meanwhile, the entities contained in different fields differ, and the disclosure does not limit the content of entities. For example, if the text to be recognized is "from Beijing", which includes the entity "Beijing", the entity category may be recognized as "place" and the entity identification value is "Beijing"; if the text to be recognized is simply "Beijing", the entity category may likewise be recognized as "place" with the entity identification value "Beijing". Therefore, performing entity recognition with the word slot initial value as the text to be recognized yields a high-precision recognition result, providing an accurate data basis for the subsequent verification.
For example, if the large language model obtains the word slot initial value "A Hotel" for the word slot "hotel name", entity recognition is then performed on "A Hotel". If it can be recognized that "A Hotel" is a hotel name, that is, the entity type is "hotel name", it belongs to the same entity as the word slot type of the target word slot currently to be filled, namely "hotel name", which indicates that the current word slot initial value matches the word slot type of the target word slot. If the word slot initial value extracted by the large language model is instead some string "A′" that is not the name of any hotel, then when entity recognition is performed on "A′" the entity type "hotel name" cannot be obtained, which indicates that the current word slot initial value does not match the word slot type of the target word slot, that is, the word slot extraction by the large language model has failed. As another example, if the target word slot currently to be extracted by the large language model is a luxury hotel but the word slot initial value is "B Hotel", which does not belong to the luxury hotels (for example, "B Hotel" is actually an economy hotel), an extraction error has occurred. In this case, entity recognition is still performed on "B Hotel", the corresponding entity type is recognized as "economy hotel", which does not match the word slot type of the target word slot, so the verification fails and the extraction by the large language model fails. It should further be noted that, when performing entity recognition, the entities may be configured in advance according to the needs of the scene, and fine-grained subdivision may be performed on the basis of general entities; for example, hotels may be further divided into luxury, economy, chain and any other known entity types, so as to improve the accuracy of entity recognition in different scenes. The present disclosure is not limited to a specific recognition algorithm; any algorithm known in the art may be used.
In this way, by performing entity recognition on the word slot initial value, whether the word slot extraction by the large language model succeeded can be verified according to the entity relationship between the entity type and the word slot type, thereby further ensuring the accuracy of word slot filling.
In addition, since the large language model may not be able to handle large-scale normalization when filling word slots, the extracted word slot initial value may be inexact. In the embodiments of the present disclosure, because entity recognition is further performed on the word slot initial value, a more accurately normalized entity identification value can be obtained through entity recognition; filling the target word slot with the entity identification value as the target word slot value thus completes the normalization of the word slot filling result and yields a more definite and accurate result. For example, when the word slot initial value is "tomorrow", the finally filled target word slot value may be "2024-02-29", a normalized, exact time expression; when the word slot initial value is "Ganmaoling" (a loosely expressed medicine name), the finally filled target word slot value may be "Ganmaoling Granules", a more exact drug name; when the word slot initial value is "Haidian District", the finally filled target word slot value may be "Haidian District, Beijing", a more standard administrative division name.
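As a purely illustrative sketch of the normalization behaviour described above (not the disclosed implementation), the following Python fragment maps a relative day expression such as "tomorrow" to a normalized ISO date; the lookup table and the function name normalize_date are assumptions.

from datetime import date, timedelta

# Illustrative assumption: a tiny normalizer for relative day expressions, standing in
# for the normalization that the entity recognition step performs on the slot initial value.
RELATIVE_DAYS = {"today": 0, "tomorrow": 1, "the day after tomorrow": 2}

def normalize_date(text, today=None):
    today = today or date.today()
    offset = RELATIVE_DAYS.get(text.strip().lower())
    return (today + timedelta(days=offset)).isoformat() if offset is not None else None

# "tomorrow" relative to 2024-02-28 becomes the normalized value "2024-02-29".
print(normalize_date("tomorrow", today=date(2024, 2, 28)))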
Thus, according to the technical solutions of the embodiments of the present disclosure, word slot extraction is performed on the information to be identified according to the prompt word by the large language model to obtain the word slot initial value of the target word slot, and entity recognition is then performed on the word slot initial value to obtain the entity type and the entity identification value. Whether the word slot extraction succeeded is verified according to the entity relationship between the entity type and the word slot type of the target word slot. The verification improves the accuracy of word slot filling, and the entity identification value can at the same time be used as the final filling result to obtain a normalized, accurate result. Based on accurate word slot filling results, the experience of intelligent dialogue can ultimately be improved, so that the intelligent robot is closer to a human customer-service agent and the dialogue effect is improved.
In one embodiment, verifying whether the word slot extraction is successful based on the entity relationship between the entity type and the word slot type comprises:
If the entity type is the same as the word slot type, determining that the word slot extraction is successful; or
If the word slot type belongs to the entity type, the word slot extraction is determined to be successful.
For example, when the word slot initial value is "C Hotel", entity recognition yields the entity type "hotel name" and the entity identification value "C Hotel"; the word slot type corresponding to "C Hotel" is also "hotel name", so the entity type and the word slot type are the same and the word slot extraction is determined to be successful. As another example, when the target word slot is "check-in time" and the extracted word slot initial value is "tomorrow", entity recognition yields the entity type "time"; "check-in time" is subordinate to the "time" entity type, the two have a subordination relationship, and the word slot extraction is therefore determined to be successful. In this way, verification can be accomplished based on the association between entities without a complicated approach. At the same time, the entity associations that are allowed to pass verification can be configured according to the needs of different scenes, giving greater flexibility.
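A minimal sketch of this verification rule is given below for illustration only; the subordination table SLOT_SUBORDINATION is an assumption, since in practice the permissible entity associations would be configured per dialogue scene.

# Illustrative assumption: which broader entity types each slot type is subordinate to.
SLOT_SUBORDINATION = {
    "check-in time": {"time", "date"},
    "hotel name": {"hotel name"},
}

def extraction_succeeded(entity_type, slot_type):
    # Rule 1: the entity type is the same as the word slot type.
    if entity_type == slot_type:
        return True
    # Rule 2: the word slot type is subordinate to the entity type.
    return entity_type in SLOT_SUBORDINATION.get(slot_type, set())

print(extraction_succeeded("hotel name", "hotel name"))       # True
print(extraction_succeeded("time", "check-in time"))          # True
print(extraction_succeeded("economy hotel", "luxury hotel"))  # False: extraction failed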
In some embodiments, entity recognition may return multiple recognition results, that is, multiple groups of entity types and entity identification values. In this case, the verification operation may be performed for each entity type, and the word slot extraction is determined to be successful if the entity types include an entity type that is the same as the word slot type, or an entity type to which the word slot type is subordinate. The entity identification value corresponding to the entity type satisfying this condition is then used as the target word slot value for filling. In this way, even if the entity recognition result is not unique, verification can still proceed smoothly, so that the solution of the embodiments of the present disclosure is applicable to more complex scenes.
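Continuing the illustration, the case of multiple recognition results could be handled as sketched below; the helper reuses extraction_succeeded from the previous sketch, which is an assumption rather than part of the disclosure.

def pick_slot_value(recognitions, slot_type):
    """Given several (entity_type, entity_value) pairs from entity recognition, return the
    value whose type passes the verification rule, or None if none passes."""
    for entity_type, entity_value in recognitions:
        if extraction_succeeded(entity_type, slot_type):
            return entity_value
    return None

# Example from the hotel scenario: two recognized groups, only one matches the slot type.
print(pick_slot_value([("hotel name", "A Hotel"), ("bath center", "A Bath Center")], "hotel name"))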
In some embodiments, if no entity type and entity identification value corresponding to the word slot initial value are recognized, a synonym of the word slot initial value may be determined based on configuration information; the entity type corresponding to the synonym is used as the entity type corresponding to the word slot initial value, and the synonym is used as the entity identification value corresponding to the word slot initial value. That is, during entity recognition, if no corresponding entity type is recognized at first, a synonym of the word slot initial value is further looked up; if a synonym exists, the entity type of the synonym is used as the entity type of the word slot initial value, and the synonym is used as the corresponding entity identification value. For example, suppose the word slot initial value identified by the large language model is "Ganmaoling", for which entity recognition finds no corresponding entity type; according to the preset configuration information, however, "Ganmaoling" is synonymous with "Ganmaoling Granules", and the entity type of "Ganmaoling Granules" is a drug name, so the drug name is taken as the entity type of "Ganmaoling" and "Ganmaoling Granules" is taken as the corresponding entity identification value.
It can be seen that, although the large language model has good semantic recognition capability, in some professional fields it still cannot understand everything well, and its extraction result cannot always be normalized. In the "Ganmaoling" example above, the correct drug name should be "Ganmaoling Granules"; with the technical solution of the embodiments of the present disclosure, even though the initial extraction is not exact, the final accurate filling result "Ganmaoling Granules" can still be obtained through entity recognition and verification, thereby improving the accuracy of word slot filling.
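The synonym fallback described above could, purely for illustration, be organized as follows; the configuration table, the recognize callable and its signature are assumptions introduced for the sketch.

# Illustrative assumption: pre-configured synonym table mapping a loosely expressed name
# to its normalized form and entity type.
SYNONYM_CONFIG = {
    "Ganmaoling": ("Ganmaoling Granules", "drug name"),
}

def recognize_with_synonym_fallback(initial_value, recognize):
    """recognize(text) -> (entity_type, entity_value) or None; the signature is an assumption."""
    result = recognize(initial_value)
    if result is not None:
        return result
    fallback = SYNONYM_CONFIG.get(initial_value)
    if fallback is not None:
        entity_value, entity_type = fallback
        return entity_type, entity_value
    return None  # no entity and no synonym: the extraction fails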
In one embodiment, entity recognition may specifically be performed by an entity recognition model, that is: the word slot initial value is input into an entity recognition model, and the entity recognition model outputs the entity type and the entity identification value corresponding to the word slot initial value. The entity recognition model may be, for example, a sequence labeling model such as a conditional random field (CRF), or a machine reading comprehension (MRC) model. Specifically, the CRF is a statistical model for entity recognition that can accurately predict the labels of the various parts (e.g., words or phrases) of a text. When entities such as person names and places are to be recognized from a sentence, the CRF helps the model understand the connections between words and ensures that entity recognition across the whole sentence is accurate and consistent. Therefore, in the technical solution of the embodiments of the present disclosure, the input word slot initial value can be effectively processed and analyzed by the CRF model, an accurate label is provided for each element of the word slot initial value, and the type and identification value of the entity are determined from those labels. An MRC model, given a piece of text and a question posed about that text, gives an answer to the question after reading the text; the process may include several parts such as embedding and encoding, feature extraction, interaction between the text and the question, and answer prediction. In the technical solution of the embodiments of the present disclosure, the word slot initial value is the given text input, and the posed question may be to extract the entity in the text and determine its type, so that the entity type and the entity identification value can be recognized from the word slot initial value based on MRC.
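For illustration only, the following sketch shows how per-token labels from a sequence-labeling (e.g. CRF) model could be decoded into (entity type, entity identification value) pairs; the BIO tag scheme, the example tag set and the token joining rule are assumptions, and the labeling model itself is not shown.

def decode_bio(tokens, tags):
    """Turn BIO tags produced by a sequence-labeling model into (entity_type, entity_value)
    pairs. Tokens of one entity are joined without separators, as is typical for
    character-level Chinese tokenization (an assumption)."""
    entities, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current_type:
                entities.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            if current_type:
                entities.append((current_type, "".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_type:
        entities.append((current_type, "".join(current_tokens)))
    return entities

# "from Beijing" with a place entity on the last token.
print(decode_bio(["from", "Beijing"], ["O", "B-place"]))  # [('place', 'Beijing')]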
In this embodiment, combining the large language model with the entity recognition model solves both the problem that, when the large language model is used alone for word slot filling, it may not understand certain professional fields well and its extraction result cannot be normalized, and the problem that, when the entity recognition model is used alone for word slot filling, its weak intent-understanding ability prevents it from accurately extracting the entity the user really intends, for example when the user mentions several entities.
According to the above technical solution, word slots are extracted based on the strong semantic understanding capability of the large language model to obtain the word slot initial value, and entity recognition is performed on the word slot initial value by the entity recognition model, so that the extraction result of the large language model is verified based on the entity recognition result and a more accurate word slot filling value is finally obtained. With the technical solution of the embodiments of the present disclosure, dialogue understanding in multi-turn dialogue scenes is stronger: even if the user utterance is complex, the large language model can extract from it, the extracted value is easier for the entity recognition model to recognize, and verification is thereby achieved. The technical solution can support complex dialogue scenes such as word slot value replacement, handling of excess information and word slot value reference, can greatly improve the accuracy of word slot filling, makes the intelligent dialogue experience smoother and improves user satisfaction.
FIG. 2 is a flow diagram of another word slot filling method according to an embodiment of the present disclosure. As shown in FIG. 2, the intelligent dialogue system/device first obtains the user utterance and then uses the understanding capability of the large language model to perform preliminary entity extraction on it; at this point the utterance can be understood and identified in combination with the current utterance and the dialogue history. If the word slot initial value is extracted successfully, normalization and validity verification are performed by the entity recognition model (i.e., the small model); if these succeed, the word slot filling succeeds, otherwise it fails. If the large language model fails to extract, that is, no word slot initial value is obtained, the extraction process is skipped and other business processes are executed, for example, re-obtaining the user utterance by continuing the conversation with the user in order to collect the word slot.
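A compact illustrative sketch of the overall flow of FIG. 2 is given below; llm_extract and entity_recognize are assumed callables standing in for the large language model and the small model, and the sketch reuses build_prompt and pick_slot_value from the earlier sketches, all of which are assumptions rather than the disclosed implementation.

def fill_slot(user_utterance, dialog_context, target_slot, slot_type,
              llm_extract, entity_recognize):
    """End-to-end sketch of the flow in FIG. 2 under the stated assumptions."""
    prompt = build_prompt(dialog_context, target_slot, slot_type,
                          candidates=[], filled_slots={}, user_utterance=user_utterance)
    initial_value = llm_extract(prompt)
    if not initial_value:
        return None  # LLM extraction failed: skip and fall back to other business flows
    recognitions = entity_recognize(initial_value)  # [(entity_type, entity_value), ...]
    normalized = pick_slot_value(recognitions, slot_type)
    if normalized is None:
        return None  # verification failed: word slot filling fails
    return {target_slot: normalized}  # verification passed: fill with the normalized value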
Next, a hotel booking scenario will be described as an example. Fig. 3 to 5 are a first example diagram, a second example diagram, and a third example diagram of a word slot filling method according to an embodiment of the present disclosure, respectively.
As shown in FIG. 3, the slot type currently to be filled by the system (i.e., the word slot type) is "hotel name", and the system asks the user (Actor) according to the prompt: "Which hotel would you like to book: ① A Hotel ② B Hotel?". In this example, the candidate values may be preconfigured as A Hotel and B Hotel. The user answers: "Not B Hotel." The large language model performs word slot extraction on the user answer and obtains the initial value of the word slot "hotel name" as "A Hotel". "A Hotel" is then input as text into the small model (i.e., the entity recognition model) for entity recognition, and the small model returns the recognition results "hotel name: A Hotel" and "bath center: A Bath Center". Although the recognition result contains two recognition types, hotel name and bath center, one of them is the same as the current slot type, so the verification succeeds and "A Hotel" is used as the final word slot value. It should be noted that in this example, if word slot filling were performed directly with the small model, the entity "B Hotel" would very likely be recognized from the user utterance "Not B Hotel" and used as the slot filling result, affecting the slot filling accuracy. With the technical solution of the embodiments of the present disclosure, the better semantic understanding of the large language model recognizes that the user's true intention is to book A Hotel rather than B Hotel; the slot is then verified with the small model, further improving the quality of slot filling.
As shown in FIG. 4, the slot type currently to be filled by the system is "check-in time", and the system asks the user: "What day would you like to check in?" The user answers: "Next Tuesday." The prompt word constructed at this point is, for example: the slot type currently to be collected is "check-in time", the user input is "next Tuesday", please output the check-in time according to the user input, and fill in "irrelevant" if there is none. The large language model extracts the check-in time as "March 3". "March 3" is then input as text into the small model for entity recognition; the small model recognizes the entity types "date" and "time", and the "date" entity is associated with the slot type "check-in time", so the verification succeeds, and the result "2024-03-03" recognized by the small model is used as the final slot filling result. It should be noted that in this example the word slot initial value "March 3" extracted by the large language model is not a normalized result, while the small model can express it more accurately; thus the technical solution of the embodiments of the present disclosure solves the problem that the result cannot be normalized when the large language model is used alone, while offering high development efficiency, low cost and broader application scenarios.
As shown in FIG. 5, the slot type currently to be filled by the system is "departure time", and the slots already filled based on the previous rounds of dialogue are "hotel name" and "check-in time". The system asks the user: "What time will you leave the hotel?" The user answers: "Wait, change the hotel to B Hotel first; then I will check in tomorrow and leave the day after tomorrow." The user's answer therefore involves multiple entities and a word slot modification. The constructed prompt may be, for example: the slot "hotel name" currently collected is "A Hotel", the slot "check-in time" is "next Tuesday", and the slot currently to be collected is "departure time"; the user says: "Wait, change the hotel to B Hotel first; then I will check in tomorrow and leave the day after tomorrow." Based on this prompt, the large language model extracts the slot "hotel name" as "B Hotel", the slot "check-in time" as "tomorrow", and the slot "departure time" as "the day after tomorrow". Thus, in complex scenes such as multiple entities and word slot value replacement, the large language model can accurately extract the required word slot values based on its good understanding capability. To further verify their accuracy, the small model may then be called three times in a loop. The first time, "B Hotel" is input as text into the small model, which recognizes the entity type "hotel name" and the entity identification value "B Hotel"; the entity type matches the slot type "hotel name", so the verification succeeds and "B Hotel" is used as the slot filling result. The second time, "tomorrow" is input as text into the small model, which recognizes the entity type "date" and the entity identification value "2024-02-27"; the corresponding slot type "check-in time" is associated with the date entity, so the verification succeeds and "2024-02-27" is used as the slot filling result. The third time, "the day after tomorrow" is input as text into the small model, which similarly recognizes the entity type "date" and the entity identification value "2024-02-28", so the verification succeeds and "2024-02-28" is used as the slot filling result.
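For illustration, the three-call loop in the FIG. 5 example could be generalized as follows; pick_slot_value is reused from the earlier sketch and the data layout is an assumption.

def fill_multiple_slots(extracted, slot_types, entity_recognize):
    """Loop over every slot initial value returned by the large language model, call the
    entity recognition model once per value, and keep only the values that pass
    verification (sketch of the three calls in the FIG. 5 example)."""
    filled = {}
    for slot_name, initial_value in extracted.items():
        recognitions = entity_recognize(initial_value)
        value = pick_slot_value(recognitions, slot_types[slot_name])
        if value is not None:
            filled[slot_name] = value
    return filled

# FIG. 5 scenario: three slots extracted in one turn, verified one by one.
# extracted = {"hotel name": "B Hotel", "check-in time": "tomorrow",
#              "departure time": "the day after tomorrow"}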
Fig. 6 is a schematic structural diagram of a word slot filling device according to an embodiment of the present disclosure, which is applicable to a case of word slot filling based on a user dialogue in an intelligent dialogue process, and relates to the field of artificial intelligence, in particular to the fields of intelligent dialogue, speech technology and deep learning. The device can realize the word slot filling method according to any embodiment of the disclosure. As shown in fig. 6, the apparatus 600 specifically includes:
The extraction module 601 is configured to perform word slot extraction on information to be identified according to the constructed prompt word by using the large language model, so as to obtain a word slot initial value of the target word slot;
the entity recognition module 602 is configured to perform entity recognition according to the word slot initial value to obtain an entity type and an entity recognition value;
The verification module 603 is configured to obtain a word slot type of the target word slot from the prompt word, and verify whether the word slot extraction is successful according to an entity relationship between the entity type and the word slot type;
and a filling module 604, configured to fill the target word slot with the entity identification value as a target word slot value if the word slot extraction is successful, where the target word slot value is a normalization result of the word slot initial value.
Optionally, the verification module 603 includes:
The first verification unit is used for determining that the word slot extraction is successful if the entity type is the same as the word slot type;
and the second verification unit is used for determining that the word slot extraction is successful if the word slot type belongs to the entity type.
Optionally, the verification module 603 further includes:
a third checking unit, configured to determine that the word slot extraction is successful if the number of entity types is plural, and the entity types include: the entity type is the same as the word slot type, or the word slot type is subordinate to the entity type.
Optionally, the entity identification module 602 includes:
a synonym determining unit, configured to determine, if no entity type and entity identification value corresponding to the word slot initial value are identified, a synonym of the word slot initial value based on configuration information;
And the entity identification unit is used for taking the entity type corresponding to the synonym as the entity type corresponding to the word slot initial value and taking the synonym as the entity identification value corresponding to the word slot initial value.
Optionally, the entity identification module 602 is specifically configured to:
and inputting the initial value of the word slot into an entity recognition model, and outputting the entity type and the entity recognition value corresponding to the initial value of the word slot by utilizing the entity recognition model.
Optionally, the number of the target word slots is at least two;
the entity identification module 602 is specifically configured to:
Sequentially calling the entity recognition model, respectively inputting the word slot initial value of each target word slot into the entity recognition model, and outputting the entity type and the entity recognition value corresponding to each word slot initial value by utilizing the entity recognition model;
The verification module 603 is specifically configured to:
Respectively acquiring the word slot type of each target word slot from the prompt words, and respectively checking whether the word slot extraction is successful or not according to the entity type corresponding to the initial value of each word slot and the entity relationship between the word slot types;
The filling module 604 is specifically configured to:
And if the word slot extraction is successful for each target word slot, respectively filling each target word slot by taking the entity identification value corresponding to the initial value of each word slot as a target word slot value.
Optionally, the apparatus further includes:
The obtaining module is configured to obtain the input information to be identified and the target word slot corresponding to the information to be identified before the extraction module 601 performs word slot extraction;
and the construction module is configured to construct the prompt word based on the information to be identified and the target word slot.
The product can execute the method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of executing the method.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of the user personal information involved all comply with the provisions of the relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 7 illustrates a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the apparatus 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the respective methods and processes described above, such as the word slot filling method. For example, in some embodiments, the word slot filling method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto device 700 via ROM 702 and/or communication unit 709. When a computer program is loaded into RAM 703 and executed by computing unit 701, one or more steps of the word slot filling method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the word slot filling method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described herein above can be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability in traditional physical hosts and VPS (Virtual Private Server) services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
Artificial intelligence is the discipline that studies how to make a computer mimic certain human thought processes and intelligent behaviors (e.g., learning, reasoning, thinking, planning, etc.), and it involves both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage and big data processing; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, knowledge graph technology and the like.
Cloud computing refers to a technical system in which an elastically extensible pool of shared physical or virtual resources is accessed through a network, where the resources may include servers, operating systems, networks, software, applications, storage devices and the like, and can be deployed and managed in an on-demand, self-service manner. Cloud computing technology can provide efficient and powerful data processing capability for technical applications and model training in fields such as artificial intelligence and blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions provided by the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (17)

1. A word slot filling method, comprising:
Extracting word slots of the information to be identified according to the constructed prompt words by using the large language model to obtain word slot initial values of target word slots;
Performing entity identification according to the word slot initial value to obtain an entity type and an entity identification value;
Acquiring the word slot type of the target word slot from the prompt word, and checking whether the word slot extraction is successful or not according to the entity relationship between the entity type and the word slot type;
And if the word slot extraction is successful, filling the target word slot by taking the entity identification value as a target word slot value, wherein the target word slot value is a normalization result of the word slot initial value.
2. The method of claim 1, wherein the verifying whether the word slot extraction was successful based on the entity relationship between the entity type and the word slot type comprises:
if the entity type is the same as the word slot type, determining that the word slot extraction is successful; or
And if the word slot type belongs to the entity type, determining that the word slot extraction is successful.
3. The method of claim 2, wherein the verifying whether the word slot extraction was successful based on the entity relationship between the entity type and the word slot type, further comprises:
if the number of entity types is plural and the entity types include the following, determining that the word slot extraction is successful: an entity type that is the same as the word slot type, or an entity type to which the word slot type is subordinate.
4. The method of claim 1, wherein the performing entity recognition according to the initial value of the word slot to obtain an entity type and an entity recognition value comprises:
If the entity type and the entity identification value corresponding to the word slot initial value are not identified, determining synonyms of the word slot initial value based on configuration information;
And taking the entity type corresponding to the synonym as the entity type corresponding to the initial value of the word slot, and taking the synonym as the entity identification value corresponding to the initial value of the word slot.
5. The method of claim 1, wherein the performing entity recognition according to the initial value of the word slot to obtain an entity type and an entity recognition value comprises:
and inputting the initial value of the word slot into an entity recognition model, and outputting the entity type and the entity recognition value corresponding to the initial value of the word slot by utilizing the entity recognition model.
6. The method of claim 5, wherein the target word slots are at least two;
performing entity identification according to the initial value of the word slot to obtain an entity type and an entity identification value, including:
Sequentially calling the entity recognition model, respectively inputting the word slot initial value of each target word slot into the entity recognition model, and outputting the entity type and the entity recognition value corresponding to each word slot initial value by utilizing the entity recognition model;
The step of obtaining the word slot type of the target word slot from the prompt word and checking whether the word slot extraction is successful according to the entity relationship between the entity type and the word slot type comprises the following steps:
Respectively acquiring the word slot type of each target word slot from the prompt words, and respectively checking whether the word slot extraction is successful or not according to the entity type corresponding to the initial value of each word slot and the entity relationship between the word slot types;
And if the word slot extraction is successful, filling the target word slot by taking the entity identification value as a target word slot value, wherein the method comprises the following steps:
And if the word slot extraction is successful for each target word slot, respectively filling each target word slot by taking the entity identification value corresponding to the initial value of each word slot as a target word slot value.
7. The method of claim 1, wherein before the word slot extraction is performed on the information to be identified according to the constructed hint word by using the large language model to obtain the word slot initial value of the target word slot, the method further comprises:
Acquiring the input information to be identified and the target word slot corresponding to the information to be identified;
And constructing a prompt word based on the information to be identified and the target word slot.
8. A word slot filling apparatus comprising:
the extraction module is used for extracting word slots of the information to be identified according to the constructed prompt words by utilizing the large language model to obtain word slot initial values of the target word slots;
The entity identification module is used for carrying out entity identification according to the word slot initial value to obtain an entity type and an entity identification value;
The verification module is used for acquiring the word slot type of the target word slot from the prompt word and verifying whether the word slot extraction is successful or not according to the entity relationship between the entity type and the word slot type;
and the filling module is used for filling the target word slot by taking the entity identification value as a target word slot value if the word slot extraction is successful, wherein the target word slot value is a normalization result of the word slot initial value.
9. The apparatus of claim 8, wherein the verification module comprises:
The first verification unit is used for determining that the word slot extraction is successful if the entity type is the same as the word slot type;
and the second verification unit is used for determining that the word slot extraction is successful if the word slot type belongs to the entity type.
10. The apparatus of claim 9, wherein the verification module further comprises:
a third verification unit, configured to determine that the word slot extraction is successful if there are a plurality of entity types and the entity types include an entity type that is the same as the word slot type, or an entity type to which the word slot type belongs.
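The three verification units of claims 9 and 10 can be pictured as a single check, as in the sketch below. The type_hierarchy mapping used to model the "belongs to" relationship is an assumption.

    from typing import Dict, Iterable, Union

    def extraction_succeeded(
        entity_types: Union[str, Iterable[str]],
        slot_type: str,
        type_hierarchy: Dict[str, str],
    ) -> bool:
        """Illustrative combination of the verification rules: success if any
        recognized entity type equals the word slot type, or is the parent type
        to which the word slot type belongs."""
        if isinstance(entity_types, str):
            entity_types = [entity_types]
        for entity_type in entity_types:
            if entity_type == slot_type:                      # same type
                return True
            if type_hierarchy.get(slot_type) == entity_type:  # slot type belongs to entity type
                return True
        return False  # with several entity types, any single match counts as success

    # Example: a "train_station" slot is accepted when the model recognizes a "location".
    # extraction_succeeded(["location", "organization"], "train_station",
    #                      {"train_station": "location"})  -> True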
11. The apparatus of claim 8, wherein the entity identification module comprises:
a synonym determining unit, configured to determine a synonym of the word slot initial value based on configuration information if no entity type and no entity identification value corresponding to the word slot initial value are identified;
and an entity identification unit, configured to take the entity type corresponding to the synonym as the entity type corresponding to the word slot initial value, and to take the synonym as the entity identification value corresponding to the word slot initial value.
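The synonym fallback of claim 11 might be sketched as follows; the shape of synonym_config and the function name are assumptions.

    from typing import Callable, Dict, Optional, Tuple

    def recognize_with_synonyms(
        slot_initial_value: str,
        recognize: Callable[[str], Tuple[Optional[str], Optional[str]]],
        synonym_config: Dict[str, Tuple[str, str]],
    ) -> Tuple[Optional[str], Optional[str]]:
        """Illustrative synonym fallback: synonym_config is an assumed mapping
        from a raw phrase to (synonym, entity_type), e.g. loaded from a
        configuration file."""
        entity_type, entity_value = recognize(slot_initial_value)
        if entity_type is not None:
            return entity_type, entity_value
        # Nothing recognized: look the phrase up in the configured synonym table.
        if slot_initial_value in synonym_config:
            synonym, synonym_type = synonym_config[slot_initial_value]
            # The synonym's entity type and the synonym itself stand in for the
            # entity type and the entity identification value of the word slot.
            return synonym_type, synonym
        return None, None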
12. The apparatus of claim 8, wherein the entity identification module is specifically configured to:
input the word slot initial value into an entity recognition model, and output, by the entity recognition model, the entity type and the entity identification value corresponding to the word slot initial value.
13. The apparatus of claim 12, wherein the target word slots are at least two;
the entity identification module is specifically configured to:
sequentially call the entity recognition model, input the word slot initial value of each target word slot into the entity recognition model, and output, by the entity recognition model, the entity type and the entity identification value corresponding to each word slot initial value;
the verification module is specifically configured to:
respectively acquire the word slot type of each target word slot from the prompt word, and check, for each word slot initial value, whether the word slot extraction is successful according to the entity relationship between the corresponding entity type and word slot type;
the filling module is specifically configured to:
if the word slot extraction is successful for each target word slot, fill each target word slot with the entity identification value corresponding to its word slot initial value as the target word slot value.
14. The apparatus of claim 8, further comprising:
an acquisition module, configured to acquire the input information to be identified and the target word slot corresponding to the information to be identified before the extraction module performs the word slot extraction;
and a construction module, configured to construct the prompt word based on the information to be identified and the target word slot.
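Read together, the apparatus of claims 8 and 14 can be pictured as a thin pipeline that wires the modules above in sequence. Every class and callback name below is hypothetical, and the sketch reuses the build_prompt and fill_slots examples shown earlier.

    from typing import Callable, Dict, List

    class WordSlotFillingPipeline:
        """Hypothetical wiring of the construction, extraction, entity
        identification, verification and filling modules."""

        def __init__(self, llm_extract: Callable[[str], Dict[str, str]],
                     recognize, relation_check):
            self.llm_extract = llm_extract        # extraction module: one LLM call
            self.recognize = recognize            # entity identification module
            self.relation_check = relation_check  # verification module

        def run(self, info_to_identify: str,
                target_slots: List[Dict[str, str]]) -> Dict[str, str]:
            # Construction module: build the prompt word from the input and slots.
            prompt = build_prompt(info_to_identify, target_slots)
            # Extraction module: the large language model returns an initial value per slot.
            initial_values = self.llm_extract(prompt)
            slot_types = {slot["name"]: slot["type"] for slot in target_slots}
            # Entity identification, verification and filling are applied per slot.
            return fill_slots(initial_values, slot_types, self.recognize, self.relation_check)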
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the word slot filling method of any one of claims 1-7.
16. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the word slot filling method according to any one of claims 1-7.
17. A computer program product comprising a computer program/instructions which, when executed by a processor, implement the word slot filling method according to any one of claims 1-7.
CN202410295234.7A 2024-03-14 2024-03-14 Word slot filling method, device, equipment and medium Pending CN118193693A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410295234.7A CN118193693A (en) 2024-03-14 2024-03-14 Word slot filling method, device, equipment and medium

Publications (1)

Publication Number Publication Date
CN118193693A true CN118193693A (en) 2024-06-14

Family

ID=91406141

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410295234.7A Pending CN118193693A (en) 2024-03-14 2024-03-14 Word slot filling method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN118193693A (en)

Similar Documents

Publication Publication Date Title
CN111666380A (en) Intelligent calling method, device, equipment and medium
CN113553412B (en) Question-answering processing method, question-answering processing device, electronic equipment and storage medium
CN113392253B (en) Visual question-answering model training and visual question-answering method, device, equipment and medium
CN113836925B (en) Training method and device for pre-training language model, electronic equipment and storage medium
CN114416943B (en) Training method and device for dialogue model, electronic equipment and storage medium
US20220358292A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN110223134B (en) Product recommendation method based on voice recognition and related equipment
CN115062718A (en) Language model training method and device, electronic equipment and storage medium
CN115481229A (en) Method and device for pushing answer call, electronic equipment and storage medium
CN114490985A (en) Dialog generation method and device, electronic equipment and storage medium
CN114120166A (en) Video question and answer method and device, electronic equipment and storage medium
CN114970666B (en) Spoken language processing method and device, electronic equipment and storage medium
CN114758649B (en) Voice recognition method, device, equipment and medium
CN116010916A (en) User identity information identification method and device, electronic equipment and storage medium
CN114141236B (en) Language model updating method and device, electronic equipment and storage medium
CN114461665B (en) Method, apparatus and computer program product for generating a statement transformation model
CN113743127B (en) Task type dialogue method, device, electronic equipment and storage medium
CN116187301A (en) Model generation method, entity identification device, electronic equipment and storage medium
CN115905497A (en) Method, device, electronic equipment and storage medium for determining reply sentence
CN113033179B (en) Knowledge acquisition method, knowledge acquisition device, electronic equipment and readable storage medium
CN112541557B (en) Training method and device for generating countermeasure network and electronic equipment
CN118193693A (en) Word slot filling method, device, equipment and medium
CN113806541A (en) Emotion classification method and emotion classification model training method and device
CN113705206B (en) Emotion prediction model training method, device, equipment and storage medium
CN115131709B (en) Video category prediction method, training method and device for video category prediction model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination