CN118113823A - Query response generation using structured and unstructured data for conversational AI systems and applications - Google Patents


Info

Publication number
CN118113823A
Authority
CN
China
Prior art keywords
data
text
response
information
neural networks
Legal status
Pending
Application number
CN202311622222.2A
Other languages
Chinese (zh)
Inventor
S. Das
S. K. Bhattacharya
O. Olabiyi
Current Assignee
Nvidia Corp
Original Assignee
Nvidia Corp
Priority claimed from U.S. application No. 18/172,571 (published as US 2024/0176808 A1)
Application filed by Nvidia Corp
Publication of CN118113823A


Abstract

Query response generation using structured and unstructured data for conversational AI systems and applications is disclosed. In various examples, context data may be generated using structured and unstructured data for conversational AI systems and applications. Systems and methods are disclosed for generating context data using structured data (converted to unstructured form) and unstructured data (such as from a knowledge database). For example, the context data may represent text (e.g., a narrative), where a first portion of the text is generated using the structured data and a second portion of the text is generated using the unstructured data. The systems and methods may then process input data representing a request (e.g., a query) and the context data using a neural network, such as a neural network associated with a dialog manager, to generate a response to the request. For example, if the request includes a query for information associated with a topic, the neural network may generate a response that includes the requested information.

Description

Query response generation using structured and unstructured data for conversational AI systems and applications
Cross Reference to Related Applications
The application claims the benefit of U.S. provisional application No. 63/428,843, filed on November 30, 2022, which is incorporated herein by reference in its entirety.
Background
Dialog systems are used in many different applications, such as applications for requesting information (e.g., information about objects, features, etc.), scheduling travel plans (e.g., booking transportation and accommodations, etc.), planning activities (e.g., making reservations, etc.), communicating with others (e.g., making telephone calls, starting video conferences, etc.), purchasing items (e.g., purchasing items from an online marketplace, ordering food from a local restaurant, etc.), and/or the like. Some dialog systems operate by receiving text (such as text including one or more letters, words, numbers, and/or symbols) that is generated using an input device and/or that is generated as a transcript of spoken language. In some cases, such as in a restaurant or ordering scenario, the text may represent a request, such as a request querying which food items a restaurant provides and/or a request to order one or more food items provided by the restaurant. The dialog system then processes the text using a dialog manager trained to interpret the text. For example, based on the interpreted text, the dialog manager can generate a response, such as a response to a query associated with a food item.
For example, the dialog manager may analyze the request to determine an intent associated with the request and a slot associated with the intent. The dialog manager may then use a knowledge database to determine information associated with the request based on the intent and the slot. In some cases, the knowledge database may include structured data, such as structured data representing fields that associate (e.g., pair) a particular identifier with information. Additionally or alternatively, in some cases, the knowledge database may include unstructured data, such as unstructured data representing fields that describe a subject using plain-text descriptions and/or narratives. However, problems may occur when the same knowledge database includes both structured and unstructured data. For example, it may be difficult for a dialog manager to identify the information required for a request, because structured data and unstructured data represent the information differently.
Furthermore, training a neural network used by a dialog manager to generate a response may require a large amount of training data, such as when using a knowledge database that includes structured data. For example, and as described above, structured data can represent fields that associate particular identifiers with information. As such, to train the neural network, training data representing a sample of each identifier may be required so that the neural network can then interpret the request associated with the identifier. This may increase the amount of computing resources and/or the amount of time required to train the neural network.
Disclosure of Invention
Embodiments of the present disclosure relate to generating query responses using combined structured and unstructured data for conversational AI systems and applications. Systems and methods are disclosed for generating context data using both structured data (converted to unstructured form, in embodiments) and unstructured data (e.g., from one or more knowledge databases). For example, the context data may represent text (e.g., a narrative), where at least a first portion of the text (e.g., in unstructured form, in embodiments) is generated using the structured data and at least a second portion of the text is generated using the unstructured data. The systems and methods may then process input data representing a request (e.g., a query) and the context data using neural network(s) (e.g., neural network(s) associated with a dialog manager) to generate a response to the request. For example, if the request includes a query for information associated with a topic, the neural network(s) may generate a response that includes the requested information.
In contrast to conventional systems (such as those described above), in some embodiments, the present systems are able to generate a response to a request using both structured and unstructured data. As described herein, the present systems are capable of generating the response by generating, using both the structured data and the unstructured data, context data representing text that includes at least a portion represented by the structured data and at least a portion represented by the unstructured data. Furthermore, in contrast to conventional systems, in some embodiments, the present systems are able to generate responses using neural network(s) that may not have been trained on each field represented by the structured data. Instead, the neural network(s) may be trained using unstructured training data similar to the context data later processed by the neural network(s) (e.g., generated in unstructured form using both the unstructured data and the structured data), which may require less training data, fewer computational resources, and/or less training time.
Drawings
The present system and method for generating a query response using structured and unstructured data for a conversational AI system and applications is described in detail below with reference to the accompanying drawings, wherein:
FIG. 1 is an example data flow diagram of a process of processing context data generated using structured and unstructured data in order to determine a response to a request or query, according to some embodiments of the present disclosure;
FIG. 2 illustrates an example of structured data and unstructured data according to some embodiments of the present disclosure;
FIGS. 3A-3B illustrate examples of generating context data using structured data and unstructured data according to some embodiments of the present disclosure;
FIG. 4 illustrates an example of using context data to extract information associated with a request, according to some embodiments of the present disclosure;
FIG. 5 illustrates an example of generating a response to a request using extracted information in accordance with some embodiments of the present disclosure;
FIG. 6 is a data flow diagram illustrating a process for training neural network(s) to extract information associated with a request, according to some embodiments of the present disclosure;
FIG. 7 is a flow chart illustrating a method for processing context data generated using structured and unstructured data to determine a response to a request or query in accordance with some embodiments of the present disclosure;
FIG. 8 is a flow chart illustrating a method for generating context data using structured data and unstructured data, according to some embodiments of the present disclosure;
FIG. 9 is a flow chart illustrating a method for generating a response associated with a request according to some embodiments of the present disclosure;
FIG. 10 is a block diagram of an example computing device suitable for use in implementing some embodiments of the present disclosure; and
FIG. 11 is a block diagram of an example data center suitable for use in implementing some embodiments of the present disclosure.
Detailed Description
Systems and methods are disclosed that relate to generating context data using structured and unstructured data for conversational AI systems and applications. For example, one or more systems store (such as in one or more knowledge databases) structured data representing structured text associated with an intent, subject, action, etc., and unstructured data representing unstructured text associated with an intent, subject, action, etc. As described herein, the structured text may include fields that associate (e.g., pair) an identifier with information (e.g., key:value pairs). For example, and for a food item, the structured text may include a first field that associates a name identifier (e.g., a key) with information (e.g., a value) describing the name, a second field that associates a size identifier with the size of the food item, a third field that associates a calorie identifier with the number of calories associated with the food item, a fourth field that associates a price identifier with the price of the food item, and so forth. Further, the unstructured data may represent one or more fields that include one or more plain-text descriptions (such as information that is not associated with a particular identifier). For example, and again for a food item, the unstructured text may include a description of how the food item is prepared (e.g., mixed, cooked, baked, left to rest, etc.).
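To make the distinction concrete, consider the following minimal Python sketch of the two representations, mirroring the food-item example above; the exact wording of the unstructured description, as well as the size and calorie values, are illustrative assumptions, not text taken from the disclosure's figures.

```python
# Structured data: fields that associate (e.g., pair) an identifier (key)
# with information (value).
structured_record = {
    "name": "hamburger",
    "size": "quarter pound",      # assumed value for the size identifier
    "calories": "550 calories",   # assumed value for the calorie identifier
    "price": "$1.00",
}

# Unstructured data: a plain-text description that carries information
# without pairing it to any particular identifier (wording assumed).
unstructured_record = (
    "The hamburger is made fresh when ordered and comes with your "
    "preference of French fries or salad."
)
```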
The system(s) may then use the structured data and the unstructured data to generate context data representing text associated with the intent, subject, action, and the like. For example, in some examples, the system(s) may use the structured data to generate one or more narratives associated with the one or more fields, such that each narrative is generated as a plain-text description including at least the identifier and the information associated with the identifier. The system(s) may also use the unstructured data to generate one or more narratives associated with the one or more fields, such that each narrative is generated as the plain-text description associated with one of the fields of the unstructured data. Furthermore, the system(s) may then use the narratives to generate the context data. For example, in some examples, the system(s) may use the narratives to generate context data representing text in the form of a paragraph.
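As a rough sketch of this conversion-and-assembly step (the sentence template and helper names below are assumptions; the disclosure does not prescribe specific wording), each structured field can be rendered as a short narrative sentence and concatenated with the unstructured description into a single paragraph:

```python
def field_to_narrative(topic: str, identifier: str, information: str) -> str:
    # Combine the identifier and its paired information with a few
    # connective words so the field reads like an unstructured sentence.
    return f"The {identifier} of the {topic} is {information}."

def build_context(topic: str, structured: dict, unstructured: str) -> str:
    # Unstructured description first, then one narrative per structured
    # field, joined into a single paragraph of context text.
    narratives = [unstructured]
    narratives += [field_to_narrative(topic, key, value)
                   for key, value in structured.items()]
    return " ".join(narratives)

# build_context("hamburger", structured_record, unstructured_record) yields a
# paragraph ending in sentences such as "The price of the hamburger is $1.00."
```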
The system(s) may then receive and/or generate input data representing the request, such as input data representing text including one or more letters, words, sub-words, numbers, and/or symbols. For a first example, one or more systems may receive audio data representing user speech from a user device and then process the audio data to generate input data. For a second example, one or more systems may receive input data representing a request from a user device. In any of these examples, the request may include a query for information associated with the topic (e.g., objects, items, features, attributes, characteristics, etc.), a request to perform an action associated with the topic (e.g., schedule a dinner reservation, reserve a trip, generate a list, provide content, etc.), and/or any other type of request. The one or more systems may then process the input data using the neural network(s) to generate a response to the request.
For example, the system(s) may input the input data into one or more neural networks along with the context data. In some examples, in addition to or alternatively from inputting the input data, the one or more systems may pre-process the input data to determine an intent and/or one or more slots associated with the input data. In such examples, the one or more systems may input data representing the intent and/or the one or more slots into the one or more neural networks. The one or more neural networks may then process the input data and the context data to generate a response associated with the request. For example, one or more first neural networks may initially process the input data and the context data to generate index data representing one or more words associated with the request, where the one or more first neural networks may determine the one or more words from the context data. One or more second neural networks may then process the input data and the index data to generate response data representing the response. The one or more systems may then provide the response, such as by sending the response data to the user device.
In some examples, the one or more neural networks used by the one or more systems may be similar to one or more neural networks used to process structured data and/or to one or more neural networks used to process unstructured data. As such, the one or more systems may not need to train one or more new neural networks to perform the processes described herein (e.g., to process the context data). For example, because the one or more neural networks may be trained to process unstructured data, and the structured data may be converted into unstructured form, the one or more neural networks may be configured to process the context data without any additional training. However, in some examples, the one or more systems may train one or more neural networks to perform the processes described herein, such as by training the neural network(s) using training data similar to the context data input into the one or more neural networks during deployment. In any of these examples, the one or more systems may be capable of generating a response to the request using a knowledge base that includes both structured data and unstructured data.
The systems and methods described herein may be used for a variety of purposes, by way of example and without limitation, in systems associated with machine control, machine locomotion, machine driving, synthetic data generation, model training, perception, augmented reality, virtual reality, mixed reality, robotics, security and surveillance, simulation and digital twinning, autonomous or semi-autonomous machine applications, deep learning, environment simulation, object or actor simulation and/or digital twinning, data center processing, conversational AI, digital avatars, light transport simulation (e.g., ray tracing, path tracing, etc.), collaborative content creation for 3D assets, cloud computing, and/or any other suitable application.
The disclosed embodiments may be included in a variety of different systems, such as automotive systems (e.g., a chat bot, digital avatar, or conversational AI component of an in-vehicle infotainment (IVI) system for an autonomous, non-autonomous, or semi-autonomous machine), systems implemented using a robot, aerial systems, medical systems, boating systems, smart area monitoring systems, systems for performing deep learning operations, systems for performing simulation operations, systems for performing digital twin operations, systems implemented using an edge device, systems incorporating one or more virtual machines (VMs), systems for performing synthetic data generation operations, systems implemented at least partially in a data center, systems for performing conversational AI operations, systems for generating, rendering, or presenting digital avatars, systems for performing light transport simulation, systems for performing collaborative content creation for 3D assets, systems implemented at least partially using cloud computing resources, and/or other types of systems.
Referring to FIG. 1, FIG. 1 is an example data flow diagram of a process 100 of processing context data generated using structured and unstructured data in order to determine a response to a request, according to some embodiments of the present disclosure. It should be understood that this and other arrangements described herein are set forth only as examples. Other arrangements and elements (e.g., machines, interfaces, functions, orders, groupings of functions, etc.) can be used in addition to or instead of those shown, and some elements may be omitted entirely. Further, many of the elements described herein are functional entities that may be implemented as discrete or distributed components or in combination with other components, and in any suitable combination and location. The various functions described herein as being performed by an entity may be performed by hardware, firmware, and/or software. For example, the functions may be implemented by a processor executing instructions stored in a memory.
The process 100 can include a context component 102 that receives structured data 104 and unstructured data 106, for example, from a knowledge database. The structured data 104 can represent text in a structured format associated with an intent, subject, action, etc., and the unstructured data 106 can represent text in an unstructured format associated with an intent, subject, action, etc. For example, the text represented by the structured data 104 can include fields that associate (e.g., pair) an identifier with information. For example, and for a subject, the text represented by the structured data 104 can include a first field that associates a first identifier with first information describing the first identifier, a second field that associates a second identifier with second information describing the second identifier, a third field that associates a third identifier with third information describing the third identifier, and so forth. Further, the text represented by the unstructured data 106 may include one or more fields that include one or more plain-text descriptions, such as information that is not associated with a particular identifier(s) and/or that is drafted in narrative form.
For example, FIG. 2 illustrates an example of structured data 202 (which may represent and/or include the structured data 104) and unstructured data 204 (which may represent and/or include the unstructured data 106), according to some embodiments of the present disclosure. In the example of FIG. 2, the structured data 202 and the unstructured data 204 may be associated with a particular topic, such as a particular food item (e.g., a hamburger). However, in other examples, the structured data 202 and/or the unstructured data 204 may be associated with an intent, another topic, an action, or the like.
As shown, the structured data 202 may represent at least fields 206 and information 208 associated with (e.g., paired with) the fields 206. In the example of FIG. 2, the fields 206 include three different identifiers 210(1)-(3) (also referred to singularly as "identifier 210" or in plural as "identifiers 210"), such as a name identifier 210(1), an ingredients identifier 210(2), and a price identifier 210(3). Further, the information 208 includes respective information 212(1)-(3) for each of the fields 206, such as "hamburger" information 212(1) associated with the name identifier 210(1), "lettuce, tomato" information 212(2) associated with the ingredients identifier 210(2), and "$1.00" information 212(3) associated with the price identifier 210(3). Using this structured format, one or more systems may be able to identify the information 208 for each identifier 210 using the association (e.g., pairing).
In the example of FIG. 2, the unstructured data 204 includes a single field 214 that includes a description associated with the topic. As shown, unlike the structured data 202, the unstructured data 204 does not associate identifiers with information. Rather, the field 214 includes a plain-text description and/or narrative associated with the topic. Although the example of FIG. 2 shows the structured data 202 as including three fields 206 and the unstructured data 204 as including one field 214, in other examples the structured data 202 and/or the unstructured data 204 may be associated with any number of fields.
In some examples, the context component 102 can be configured to pull the information 208 from one or more databases. For example, the context component 102 can pull the information 208 from a database using one or more application programming interfaces (APIs), where the information 208 is associated with the fields 206. For example, when the context component 102 receives a request associated with the information 208 for a field, an API can be configured to access the one or more databases in order to pull the information 208. As such, even if the information 208 is updated, the API pulls the updated information 208 from the database.
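A minimal sketch of such an API-backed lookup is shown below; the endpoint URL, response shape, and use of the requests library are assumptions, since the disclosure does not specify an API surface.

```python
import requests

# Hypothetical knowledge-base endpoint; the real service is not specified.
KNOWLEDGE_API = "https://knowledge.example.com"

def pull_field(topic: str, field: str) -> str:
    """Pull the current value of one structured field for a topic.

    Fetching at request time (rather than caching at startup) means that
    updated values, e.g., a new price, are reflected in the context.
    """
    response = requests.get(f"{KNOWLEDGE_API}/{topic}/{field}", timeout=5)
    response.raise_for_status()
    return response.json()["value"]  # assumed response shape: {"value": ...}
```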
Referring back to the example of FIG. 1, the process 100 can include the context component 102 using the structured data 104 and the unstructured data 106 to generate context data 108 that represents text associated with an intent, subject, action, and the like. For example, the context component 102 can use the structured data 104 to generate a first portion of the text. In some examples, the context component 102 generates the first portion of the text (e.g., in unstructured form) using one or more first narratives associated with one or more fields of the structured data 104. For example, a respective narrative may include text that includes at least the identifier associated with a field, the information associated with the identifier, and one or more words that provide context to the narrative (and/or convert the structured data into a more natural sentence form, which may be similar to the form or format of the unstructured data). For example, where the structured data includes key:value pairs, the key and the value may be combined with one or more additional words, symbols, and the like that convert the key:value pair into a narrative sentence. The context component 102 can also use the unstructured data 106 to generate a second portion of the text. In some examples, the context component 102 generates the second portion of the text using one or more second narratives associated with one or more fields of the unstructured data 106. For example, a respective narrative may include the plain-text description associated with a field. The context component 102 can then use the narratives to generate the context data 108.
For example, FIG. 3A illustrates a first example of generating context data 302 (which may represent and/or include the context data 108) using the structured data 202 and the unstructured data 204, according to some embodiments of the present disclosure. As shown, the context data 302 may be generated using narratives 304(1)-(4) (also referred to singularly as "narrative 304" or in plural as "narratives 304"). For example, the first narrative 304(1) of the context data 302 includes text associated with the field 214 from the unstructured data 204. Although the example of FIG. 3A shows the text of the first narrative 304(1) matching the text from the field 214, in other examples the text of the first narrative 304(1) may include less text, more text, and/or different text than the text from the field 214. Furthermore, while the example of FIG. 3A shows the context data 302 as including only a single narrative 304(1) associated with the unstructured data 204, in other examples the context data 302 may include additional narratives associated with additional fields of the unstructured data 204.
The context data 302 further includes narratives 304(2)-(4) associated with the fields 206 of the structured data 202. As shown, the second narrative 304(2) may be generated using the identifier 210(1) from the fields 206 and the information 212(1) associated with the identifier 210(1), the third narrative 304(3) may be generated using the identifier 210(2) from the fields 206 and the information 212(2) associated with the identifier 210(2), and the fourth narrative 304(4) may be generated using the identifier 210(3) from the fields 206 and the information 212(3) associated with the identifier 210(3). In the example of FIG. 3A, the narratives 304(2)-(4) are further generated to include additional text, such that the narratives 304(2)-(4) have a format similar to the first narrative 304(1). For example, the narratives 304(2)-(4) are plain-text descriptions that include the identifiers from the fields 206 and the information 208. While the example of FIG. 3A illustrates the context data 302 as including only three narratives 304(2)-(4) associated with the structured data 202, in other examples the context data 302 may include any number of narratives associated with any number of fields of the structured data 202.
Furthermore, while the example of FIG. 3A shows the context component 102 generating the narratives 304(2)-(4) by converting the structured data 202 into a format similar to the unstructured data 204, in other examples the context component 102 may additionally or alternatively generate the narrative 304(1) by converting the unstructured data 204 into a format similar to the structured data 202. For example, FIG. 3B illustrates a second example of generating context data 306 (which may represent and/or include the context data 108) using the structured data 202 and the unstructured data 204, according to some embodiments of the present disclosure. As shown, the context data 306 may be generated using narratives 308(1)-(5) (also referred to singularly as "narrative 308" or in plural as "narratives 308").
For example, as shown in the example of FIG. 3B, the context component 102 can determine one or more words from the text of the unstructured data 204 to use as one or more identifiers. The context component 102 can then associate (e.g., pair) the one or more identifiers with information from the text. For a first example, the context component 102 can determine that the word "made" from the description associated with the field 214 is an identifier, and then associate that identifier with the information "fresh when ordered." As such, the first narrative 308(1) of the context data 306 associates the identifier "made" with the information "fresh when ordered." For a second example, the context component 102 can determine that the word "preference" from the description associated with the field 214 is an identifier, and then associate that identifier with the information "French fries or salad." As such, the second narrative 308(2) of the context data 306 associates the identifier "preference" with the information "French fries or salad."
As further shown in the example of FIG. 3B, the context component 102 further generates narratives 308(3)-(5) using the structured data 202. For example, the third narrative 308(3) associates the identifier "name" with the information "hamburger," the fourth narrative 308(4) associates the identifier "ingredients" with the information "lettuce, tomato," and the fifth narrative 308(5) associates the identifier "price" with the information "$1.00." Although the examples of FIGS. 3A-3B illustrate two techniques for generating the context data 302 and 306 using the structured data 202 and the unstructured data 204, in other examples the context component 102 can perform additional and/or alternative techniques to generate context data using the structured data 202 and the unstructured data 204.
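A toy sketch of this opposite direction, pulling identifier:information pairs out of free text, might use simple patterns as below; the regular expressions and identifier choices are illustrative heuristics, not the claimed method.

```python
import re

def description_to_fields(description: str) -> dict:
    """Pair identifier words found in a plain-text description (e.g., "made",
    "preference") with the information that follows them, as in FIG. 3B."""
    patterns = {
        "made": r"made\s+(\w+\s+\w+\s+\w+)",       # e.g., "fresh when ordered"
        "preference": r"preference of\s+([^.]+)",  # e.g., "French fries or salad"
    }
    fields = {}
    for identifier, pattern in patterns.items():
        match = re.search(pattern, description, flags=re.IGNORECASE)
        if match:
            fields[identifier] = match.group(1).strip()
    return fields
```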
Referring back to the example of FIG. 1, the process 100 may include one or more user devices 110 providing input data 112. In some examples, the input data 112 may include audio data generated (e.g., using one or more microphones) and/or transmitted by the user device 110, where the audio data represents user speech from one or more users. Additionally or alternatively, in some examples, the input data 112 may include text data generated (e.g., using a keyboard, touch screen, and/or other input device) and/or transmitted by the user device 110, where the text data represents one or more letters, words, numbers, and/or symbols. While these are just a few example types of data that the input data 112 may include, in other examples, the input data 112 may include any other type of data.
The process 100 may include a processing component 114 configured to process the input data 112 to generate text data 116. For a first example, such as when the input data 112 includes audio data representing user speech, the processing component 114 may include one or more speech processing models, such as one or more automatic speech recognition (ASR) models, one or more speech-to-text (STT) models, one or more natural language processing (NLP) models, one or more diarization models, and/or the like, configured to generate the text data 116 associated with the audio data. For example, the text data 116 may represent a transcript (e.g., one or more letters, words, symbols, numbers, etc.) associated with the user speech. For a second example, such as when the input data 112 includes text data, the process 100 may not include the processing component 114, such that the text data 116 includes the input data 112.
In some examples, the processing component 114 may be configured to perform additional processing. For example, the processing component 114 can process the input data 112 in order to determine an intent and/or one or more slots associated with the input data 112. As described herein, an intent may include, but is not limited to, requesting information (e.g., information about objects, information about features, etc.), scheduling events (e.g., booking transportation and accommodations, etc.), planning activities (e.g., making reservations, etc.), communicating with others (e.g., making phone calls, starting video conferences, etc.), purchasing items (e.g., purchasing items from an online marketplace, ordering food from a local restaurant, etc.), and so forth. Further, the one or more slots may provide additional information associated with the intent. For example, if the intent is a request for information associated with an object, a first slot may include an identifier (e.g., a name) of the object and a second slot may include the type of information requested for the object. In examples where the processing component 114 performs this additional processing, the text data 116 may additionally and/or alternatively represent the intent and/or the slot(s).
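The sketch below illustrates what such a pre-processing step might produce; the keyword matching and the intent/slot vocabulary are stand-ins for a trained natural-language-understanding model, not the disclosure's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ParsedRequest:
    intent: str  # e.g., "request_information"
    slots: dict  # e.g., {"topic": "hamburger", "field": "ingredients"}

def parse_request(text: str) -> ParsedRequest:
    # Toy keyword-based parser; a deployed system would use a trained model.
    lowered = text.lower()
    slots = {}
    if "hamburger" in lowered:
        slots["topic"] = "hamburger"
    if "ingredient" in lowered:
        slots["field"] = "ingredients"
    intent = ("request_information"
              if "what" in lowered or "?" in text
              else "order_item")
    return ParsedRequest(intent, slots)

# parse_request("What ingredients are on the hamburger?")
# -> intent "request_information", slots {"topic": "hamburger",
#    "field": "ingredients"}
```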
In some examples, the context data 108 may be selected based on the intent and/or information of the slot associated with the intent. For example, if the intent is a request for information associated with an object and the information associated with the slot indicates a type of the object, the context data 108 may be selected based on the context data 108 including information associated with the type of the object.
The process 100 can include an information component 118 configured to process the text data 116 and the context data 108, for example, by using one or more neural networks, to identify information (e.g., one or more words) associated with the request represented by the text data 116. For example, the information component 118 can process the text data 116 to determine an intent associated with the request and/or one or more slots associated with the intent. For example, if the request is a query for information associated with an object, the information component 118 can process the text data 116 to determine that the intent is to request the information. The information component 118 can also process the text data 116 to determine that the first information of the first slot associated with the intent includes an identifier (e.g., a name) of the object and/or the second information of the second slot associated with the intent includes a type of information requested.
The information component 118 can then process the context data 108 to identify a portion of the text (e.g., one or more letters, words, numbers, and/or symbols) associated with the intent and/or the slot(s). In some examples, to identify the portion of the text, the information component 118 can initially identify one or more words within the text represented by the context data 108, such as one or more words associated with the intent and/or the slot(s). For example, if the intent includes "request information" associated with a food item and a slot includes "ingredients," the information component 118 may initially identify the word "ingredients" in the text represented by the context data 108. The information component 118 can then use the identified word(s) to identify the portion of the text as one or more letters, words, numbers, and/or symbols (such as one or more letters, words, numbers, and/or symbols within the text that are located proximate to the identified word(s)). For example, and using the example above, if the text includes "the ingredients include lettuce and kimchi," the information component 118 can identify the portion of the text as including "lettuce and kimchi," because that portion of the text follows the identified word "ingredients." The information component 118 can then generate and output index data 120 representing the portion of the text (e.g., the information 122).
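As a crude, non-neural stand-in for this lookup step, assuming for illustration that the extractor can be approximated by keyword anchoring:

```python
def extract_after(context: str, slot_word: str):
    """Return the text between the slot word and the next period, mimicking
    the "words located proximate to the identified word" behavior above."""
    lowered = context.lower()
    index = lowered.find(slot_word.lower())
    if index < 0:
        return None  # the slot word does not appear in the context
    start = index + len(slot_word)
    end = context.find(".", start)
    span = context[start:] if end < 0 else context[start:end]
    return span.strip(" :,")

# extract_after("The ingredients are lettuce and tomato. The price is $1.00.",
#               "ingredients")  ->  "are lettuce and tomato"
```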
For example, FIG. 4 illustrates an example of using context data to extract information associated with a request, according to some embodiments of the present disclosure. For example, and as shown, the information component 118 can receive context data 402 (which can represent and/or include the context data 108) that represents text associated with a subject. In the example of FIG. 4, the context data 402 may be generated using the context data 302 from the example of FIG. 3A. For example, the context data 402 may be generated by combining the text from the narratives 304 (such as in the form of a paragraph). While the example of FIG. 4 shows the context data 402 being generated using the text of the narrative 304(1), followed by the text of the narrative 304(2), followed by the text of the narrative 304(3), and finally followed by the text of the narrative 304(4), in other examples the context data 402 may be generated by combining the text from the narratives in a different order.
The information component 118 can further receive text data 404 (which can represent and/or include the text data 116) representing a request from a user. In the example of FIG. 4, the request includes a query about the ingredients of the hamburger. As such, the information component 118 can process the text data 404 and the context data 402 using one or more neural networks to identify at least a portion of the context data 402 associated with the text data 404. For example, the information component 118 can determine that the intent associated with the query is "request information," the first slot associated with the query is "hamburger," and the second slot associated with the query is "ingredients." The information component 118 can then use the intent and/or the slots to identify the portion of the context data 402.
The information component 118 can then output index data 406 (which can represent and/or include the index data 120) representing the portion of the context data 402. For example, in the example of FIG. 4, the index data 406 represents a portion of the text including "lettuce and tomato." As such, the information component 118 can determine that the information requested by the query is "lettuce and tomato."
Referring back to the example of FIG. 1, in some examples, the information component 118 may output multiple instances of the information 122. For example, the information component 118 can output first information 122 representing a first portion of text represented by the context data 108, second information 122 representing a second portion of text represented by the context data 108, third information 122 representing a third portion of text represented by the context data 108, and so forth. In some examples, the information component 118 outputs a threshold number of instances of the information 122. The threshold number may include, but is not limited to, one instance, two instances, five instances, ten instances, and/or any other number of instances.
In such an example, the information component 118 can also generate a respective confidence 124 for one or more (e.g., each) of the instances. For example, and using the above examples, the information component 118 can output a first confidence 124 associated with the first information 122, a second confidence 124 associated with the second information 122, a third confidence 124 associated with the third information 122, and so on. The information component 118 can then use the confidence 124 to select at least one of the instances. For example, the information component 118 can select an instance of the information 122 associated with the highest confidence 124 of the confidence 124.
In some examples, the information component 118 may perform additional processes based on the confidences 124. For example, the information component 118 (and/or another component) may determine whether a confidence 124 meets (e.g., is equal to or greater than) a threshold confidence. The threshold confidence may include, but is not limited to, 25%, 50%, 75%, 90%, 99%, and/or any other threshold. If the information component 118 determines that the one or more confidences 124 meet the threshold confidence, the process 100 can include using the information 122 associated with the one or more confidences 124. However, if the information component 118 determines that the one or more confidences 124 do not meet the threshold confidence, the process 100 may include performing additional processing to identify additional information 122.
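A small sketch of this selection-and-threshold logic follows; the Candidate type and the 0.5 default threshold are assumptions used for illustration, and the fallback processing it signals is described next.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    text: str          # extracted portion of the context (the information 122)
    confidence: float  # the model confidence paired with it (the confidence 124)

def select_candidate(candidates: list, threshold: float = 0.5):
    """Pick the highest-confidence candidate, or return None when even the
    best one falls below the threshold (triggering further processing)."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: c.confidence)
    return best if best.confidence >= threshold else None
```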
For a first example, the information component 118 can process the structured data 104 and the text data 116 to determine information 122 associated with the request represented by the text data 116. Similar to the context data 108, if the information component 118 (and/or another component) determines that the confidence 124 associated with the information 122 meets the threshold confidence, the process 100 can include using the information 122. For a second example, the information component 118 can process the unstructured data 106 and the text data 116 to determine information 122 associated with the request represented by the text data 116. Similar to the context data 108, if the information component 118 (and/or another component) determines that the confidence 124 associated with the information 122 meets the threshold confidence, the process 100 can include using the information 122.
However, in the above examples, if the information component 118 (and/or another component) again determines that the confidence 124 associated with the new information 122 does not meet the threshold confidence, the process 100 may include performing even more processing. For a first example, if the information component 118 determines that the confidence 124 associated with the information 122 determined using the structured data 104 does not meet the threshold confidence, the information component 118 can process the unstructured data 106 and the text data 116 to determine information 122 associated with the request represented by the text data 116. Similar to the context data 108, if the information component 118 (and/or another component) determines that the confidence 124 associated with the information 122 meets the threshold confidence, the process 100 can include using the information 122. For a second example, if the information component 118 determines that the confidence 124 associated with the information 122 determined using the unstructured data 106 does not meet the threshold confidence, the information component 118 may process the structured data 104 and the text data 116 to determine information 122 associated with the request represented by the text data 116. Similar to the context data 108, if the information component 118 (and/or another component) determines that the confidence 124 associated with the information 122 meets the threshold confidence, the process 100 can include using the information 122.
In other words, the information component 118 may initially attempt to identify the information 122 using the context data 108 and, if that fails, the information component 118 may attempt to identify the information 122 using the structured data 104 or the unstructured data 106. Further, if attempting to identify the information 122 using the structured data 104 or the unstructured data 106 fails, the information component 118 can attempt to identify the information 122 using the other of the structured data 104 or the unstructured data 106.
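Put as pseudocode, the fallback order might look like the following sketch, where extract stands in for the neural extractor and is assumed to return a (text, confidence) pair:

```python
def answer_with_fallback(query: str, sources: dict, extract,
                         threshold: float = 0.5):
    """Try the combined context first, then the raw structured and
    unstructured data, accepting the first sufficiently confident answer."""
    for name in ("context", "structured", "unstructured"):
        text, confidence = extract(query, sources[name])
        if confidence >= threshold:
            return text  # first extraction that meets the threshold wins
    return None  # no source produced a confident enough answer
```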
The process 100 can include a response component 126 configured to process the text data 116 and the index data 120 (e.g., the information 122) using one or more neural networks to generate a response associated with the request. As described herein, the response may include text including at least the information 122 identified by the information component 118. For example, the response component 126 can process the text data 116 to determine an intent associated with the request and/or one or more slots associated with the intent. For example, if the request is a query for information associated with an object, the response component 126 can process the text data 116 to determine that the intent is to request the information. The response component 126 can also process the text data 116 to determine that the first information of the first slot associated with the intent includes an identifier (e.g., a name) of the object and/or the second information of the second slot associated with the intent includes a type of information requested.
The response component 126 can also process the index data 120 to determine the information 122 that the user is requesting. In addition, the response component 126 can generate the response using the intent, the slot(s), and/or the information 122. The response component 126 can then output the output data 128 representing the response. In some examples, the response component 126 (and/or another component) can send the output data 128 to the user device 110.
For example, FIG. 5 illustrates an example of generating a response to a request using the extracted information, in accordance with some embodiments of the present disclosure. As shown, the response component 126 can receive the index data 406 and the text data 404 from the example of FIG. 4. The response component 126 can then determine that the intent associated with the query is "request information," the first slot associated with the query is "hamburger," and the second slot associated with the query is "ingredients." The response component 126 can then use the intent, the slots, and the text from the index data 406 to generate the response represented by output data 502 (which can represent and/or include the output data 128). As shown in the example of FIG. 5, the response includes the text "lettuce and tomato" from the index data 406, along with additional text to form a complete sentence.
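A template-based sketch of this final step is below; an actual system would typically use a generative model to phrase the response, so the template is only a placeholder for that model.

```python
def render_response(topic: str, field: str, extracted: str) -> str:
    # Wrap the extracted span in a complete sentence, as in FIG. 5.
    return f"The {field} of the {topic} are {extracted}."

# render_response("hamburger", "ingredients", "lettuce and tomato")
# -> "The ingredients of the hamburger are lettuce and tomato."
```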
Referring back to the example of FIG. 1, while the example of FIG. 1 shows each of the processing component 114, the information component 118, and the response component 126 as separate from one another, in some examples one or more of the processing component 114, the information component 118, and the response component 126 may be combined into a single component. For example, the processing component 114, the information component 118, and the response component 126 can be part of a dialog management system. Further, the processing component 114, the information component 118, and the response component 126 can employ any type of neural network(s) and/or model(s) to perform the processes described herein, such as convolutional neural networks (CNNs), feed-forward neural networks, recurrent neural networks (RNNs), extractive question-answering models, answer extender models, large language models (LLMs), and the like.
Further, as described herein, in some examples the one or more neural networks used by the information component 118 may already have been trained to process the structured data 104 or the unstructured data 106, such that the one or more neural networks do not require further training to process the context data 108 (e.g., after the data is converted into a common form that the neural networks are trained or configured to process). For example, the context data 108 may include text in a format similar to the text of the structured data 104 or the text of the unstructured data 106. However, in other examples, the one or more neural networks may be trained using context data similar to the context data 108 that the one or more neural networks process in deployment.
For example, FIG. 6 is a data flow diagram illustrating a process 600 for training one or more neural networks 602 to extract information associated with a request, according to some embodiments of the present disclosure. As shown, the one or more neural networks 602 may be trained using context data 604 (e.g., training context data) and text data 606 (e.g., training text data). The context data 604 may represent text in a format similar to the context data 108 generated by the context component 102. Further, the text data 606 may represent requests associated with the context data 604.
As shown, the one or more neural networks 602 may be trained using the training context data 604, the training text data 606, and corresponding ground truth data 608. The ground truth data 608 may represent the information 610 that the one or more neural networks 602 should extract from the context data 604 based on the text data 606. In some examples, the ground truth data 608 may be generated using a program adapted to generate the ground truth data 608, and/or may be human generated (e.g., drafted by hand). In any example, the ground truth data 608 may be synthetically generated (e.g., generated from computer models or renderings), real generated (e.g., designed and produced from real-world data), machine automated, human annotated (e.g., by a labeler or annotation expert, etc.), and/or a combination thereof. In some examples, for each request represented by the text data 606, there may be corresponding ground truth data 608.
The training engine 612 may use one or more loss functions that measure loss (e.g., error) in the outputs 614 as compared to the ground truth data 608. Any type of loss function may be used, such as cross-entropy loss, mean squared error, mean absolute error, mean bias error, and/or other loss function types. In some examples, different outputs 614 may have different loss functions. In some examples, backward pass computations may be performed to recursively compute gradients of the loss function(s) with respect to training parameters. In some examples, the weights and biases of the one or more neural networks 602 may be used to compute these gradients.
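As a generic illustration of one such training step, here is a sketch using PyTorch and the cross-entropy loss named above; the framework choice, stand-in model, and tensor shapes are assumptions, not details taken from the disclosure.

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 2)        # stand-in for the extractor network 602
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()  # one of the loss types listed above

def train_step(features: torch.Tensor, targets: torch.Tensor) -> float:
    optimizer.zero_grad()
    outputs = model(features)         # forward pass (the outputs 614)
    loss = loss_fn(outputs, targets)  # compare against the ground truth 608
    loss.backward()                   # backward pass: gradients w.r.t. parameters
    optimizer.step()                  # update the network's weights and biases
    return loss.item()
```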
Referring now to FIGS. 7-9, each block of the methods 700, 800, and 900 described herein comprises a computing process that may be performed using any combination of hardware, firmware, and/or software. For example, the functions may be implemented by a processor executing instructions stored in a memory. The methods 700, 800, and 900 may also be embodied as computer-usable instructions stored on computer storage media. The methods 700, 800, and 900 may be provided by a standalone application, a service or hosted service (standalone or in combination with another hosted service), or a plug-in to another product, to name a few. In addition, by way of example, the methods 700, 800, and 900 are described with respect to the system of FIG. 1. However, the methods 700, 800, and 900 may additionally or alternatively be performed by any one system or any combination of systems, including but not limited to those described herein.
Fig. 7 is a flowchart illustrating a method 700 for processing context data generated using structured and unstructured data to determine a response to a request or query, according to some embodiments of the present disclosure. At block B702, the method 700 may include receiving first data representing a request. For example, one or more systems (e.g., information component 118) can receive text data 116 representing a request. As described herein, in some examples, the text data 116 may be generated based on audio data (e.g., the input data 112) received from one or more user devices 110. For example, the text data 116 may represent a transcription of user speech represented by audio data. In some examples, the text data 116 may include text entered into the one or more user devices 110, such as by using an input device.
At block B704, the method 700 may include receiving second data representing text, a first portion of the text being associated with unstructured text and a second portion of the text being associated with structured text. For example, one or more systems (e.g., the information component 118) can receive the context data 108, where the context data 108 is generated using at least the structured data 104 and the unstructured data 106. For example, and as described herein, the context data 108 may be generated using one or more first narratives associated with the structured data 104 and one or more second narratives associated with the unstructured data 106. In some examples, the text represented by the context data 108 has a format similar to the text represented by the unstructured data 106. In some examples, the text represented by the context data 108 has a format similar to the text represented by the structured data 104.
At block B706, the method 700 may include determining a response to the request using one or more neural networks and based at least on the first data and the second data. For example, one or more systems (e.g., information component 118) can process the text data 116 and the context data 108 and, based on the processing, output index data 120 representing at least a portion of the text. One or more systems (e.g., response component 126) can then process the text data 116 and the index data 120 and determine a response to the request based upon the processing. For example, the response may include at least a portion of text.
At block B708, the method 700 may include outputting third data representing the response. For example, one or more systems (e.g., response component 126) can then output data 128 representative of the response. In some examples, one or more systems may output data 128 by sending data 128 to one or more user devices 110.
Fig. 8 is a flow chart illustrating a method 800 for generating context data using structured data and unstructured data, according to some embodiments of the present disclosure. At block B802, the method 800 may include receiving first data representing structured text. For example, one or more systems (e.g., the context component 102) can receive structured data 104 that represents structured text associated with an intent, subject, action, and the like. As described herein, in some examples, the structured text may include fields that associate (e.g., pair) an identifier with information. For example, and for a topic, the structured text can include a first field that associates a first identifier with first information describing the first identifier, a second field that associates a second identifier with second information describing the second identifier, a third field that associates a third identifier with third information describing the third identifier, and so forth.
At block B804, the method 800 may include receiving second data representing unstructured text. For example, one or more systems (e.g., the context component 102) can receive unstructured data 106 that represents unstructured text associated with intent, subject, action, and the like. As described herein, in some examples, unstructured text may represent one or more fields that include one or more plain text descriptions (such as information not associated with a particular identifier).
At block B806, the method 800 may include generating third data representing context text based at least on the first data and the second data. For example, one or more systems (e.g., the context component 102) can use the structured data 104 and the unstructured data 106 to generate the context data 108. In some examples, to generate the context data 108, the one or more systems may generate one or more narratives using the structured data 104 and one or more narratives using the unstructured data 106. In some examples, the one or more systems generate the narratives using a format similar to the text represented by the unstructured data 106. In some examples, the one or more systems generate the narratives using a format similar to the text represented by the structured data 104. In any example, the one or more systems can then use the narratives to generate the context data 108 representing a paragraph.
Fig. 9 is a flow chart illustrating a method 900 for generating a response associated with a request, according to some embodiments of the present disclosure. At block B902, the method 900 may include receiving context data generated using structured data and unstructured data. For example, one or more systems (e.g., the information component 118) can receive the context data 108, where the context data 108 is generated using at least the structured data 104 and the unstructured data 106. For example, as described herein, the context data 108 may be generated based on one or more narratives associated with the structured data 104 and one or more narratives associated with the unstructured data 106. In some examples, the text represented by the context data 108 has a format similar to the text represented by the unstructured data 106. In some examples, the text represented by the context data 108 has a format similar to the text represented by the structured data 104.
At block B904, the method 900 may include determining first information associated with the request and a confidence score associated with the first information using one or more neural networks and based at least on the context data. For example, one or more systems (e.g., information component 118) can process the text data 116 and the context data 108 and, based on the processing, output index data 120 representing the first information 122 and a confidence 124 associated with the first information 122. As described herein, the first information 122 may include at least a portion of text represented by the context data 108.
At block B906, the method 900 may include determining whether the confidence score meets a threshold score. For example, one or more systems (e.g., information component 118) may compare the confidence 124 to a threshold confidence score. Based on the comparison, the one or more systems may determine whether the confidence 124 meets (e.g., is equal to or greater than) a threshold confidence score or whether the confidence 124 does not meet (e.g., is less than) the threshold confidence score.
If it is determined at block B906 that the confidence score meets the threshold score, at block B908, the method 900 may include generating a first response using the first information. For example, if the system determines that the confidence 124 meets a threshold confidence score, the system (e.g., response component 126) may generate a first response using the first information 122. The system may then output data 128 representing the first response.
However, if it is determined at block B906 that the confidence score does not meet the threshold score, at block B910, the method 900 may include determining second information associated with the request using one or more neural networks and based at least on structured data and/or unstructured data. For example, if one or more systems determine that the confidence 124 does not meet the threshold confidence score, the one or more systems (e.g., the information component 118) may process the text data 116 and the structured data 104 and/or the unstructured data 106 and, based on the processing, output index data 120 representative of the second information 122. In some examples, one or more systems may initially use one of structured data 104 or unstructured data 106 to determine second information 122. In such an example, one or more systems may use the second information 122 if the confidence 124 associated with the second information 122 meets a threshold confidence score. However, if the confidence 124 associated with the second information 122 does not meet the threshold confidence score, then the one or more systems may use the other of the structured data 104 or the unstructured data 106 to determine the third information 122 associated with the request.
At block B912, the method 900 may include generating a second response using the second information. For example, one or more systems (e.g., response component 126) can use the second information 122 to generate a second response. One or more systems may then output data 128 representing the second response.
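Putting blocks B904 through B912 together, the following sketch shows one way the confidence gate and fallback could be orchestrated, reusing the hypothetical extract_information helper from the previous sketch; the threshold value and the order in which the structured and unstructured sources are retried are assumptions, since the disclosure permits either order.

```python
# A hedged sketch of the confidence-gated fallback (blocks B906-B912).
THRESHOLD = 0.5  # illustrative threshold confidence score

def answer_request(request: str, context_text: str,
                   structured_text: str, unstructured_text: str) -> str:
    # Block B904: query the combined context first.
    info, score = extract_information(request, context_text)
    if score >= THRESHOLD:
        return info                       # block B908: respond using info
    # Blocks B910-B912: retry against one source, then the other.
    for source in (structured_text, unstructured_text):
        info, score = extract_information(request, source)
        if score >= THRESHOLD:
            return info
    return info                           # best effort if no source passes
```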
Example computing device
Fig. 10 is a block diagram of an example computing device 1000 suitable for use in implementing some embodiments of the disclosure. Computing device 1000 may include an interconnect system 1002 that directly or indirectly couples the following devices: memory 1004, one or more Central Processing Units (CPUs) 1006, one or more Graphics Processing Units (GPUs) 1008, a communication interface 1010, input/output (I/O) ports 1012, input/output components 1014, a power supply 1016, one or more presentation components 1018 (e.g., one or more displays), and one or more logic units 1020. In at least one embodiment, one or more computing devices 1000 may include one or more Virtual Machines (VMs), and/or any of their components may include virtual components (e.g., virtual hardware components). As non-limiting examples, the one or more GPUs 1008 may include one or more vGPUs, the one or more CPUs 1006 may include one or more vCPUs, and/or the one or more logic units 1020 may include one or more virtual logic units. As such, one or more computing devices 1000 may include discrete components (e.g., a full GPU dedicated to computing device 1000), virtual components (e.g., a portion of a GPU dedicated to computing device 1000), or a combination thereof.
Although the various blocks of fig. 10 are shown as connected with lines via the interconnect system 1002, this is not intended to be limiting and is for clarity only. For example, in some embodiments, a presentation component 1018 (such as a display device) may be considered an I/O component 1014 (e.g., if the display is a touch screen). As another example, the CPU 1006 and/or GPU 1008 may include memory (e.g., the memory 1004 may represent a storage device in addition to the memory of the GPU 1008, the CPU 1006, and/or other components). In other words, the computing device of fig. 10 is merely illustrative. No distinction is made between such categories as "workstation," "server," "laptop," "desktop," "tablet," "client device," "mobile device," "handheld device," "game console," "Electronic Control Unit (ECU)," "virtual reality system," and/or other device or system types, as all are contemplated within the scope of the computing device of fig. 10.
The interconnect system 1002 may represent one or more links or buses, such as an address bus, a data bus, a control bus, or a combination thereof. The interconnect system 1002 may include one or more bus or link types, such as an Industry Standard Architecture (ISA) bus, an Extended ISA (EISA) bus, a Video Electronics Standards Association (VESA) bus, a Peripheral Component Interconnect (PCI) bus, a peripheral component interconnect express (PCIe) bus, and/or another type of bus or link. In some embodiments, there is a direct connection between the components. By way of example, the CPU 1006 may be directly connected to the memory 1004. Further, the CPU 1006 may be directly connected to the GPU 1008. Where there is a direct or point-to-point connection between the components, the interconnect system 1002 may include PCIe links to perform the connection. In these examples, the PCI bus need not be included in computing device 1000.
Memory 1004 may include any of a variety of computer-readable media. Computer readable media can be any available media that can be accessed by computing device 1000. Computer readable media can include both volatile and nonvolatile media, as well as removable and non-removable media. By way of example, and not limitation, computer readable media may comprise computer storage media and communication media.
Computer storage media may include volatile and nonvolatile media and/or removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules, and/or other data types. For example, memory 1004 may store computer-readable instructions (e.g., representing one or more programs and/or one or more program elements, such as an operating system). Computer storage media may include, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVDs) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 1000. As used herein, computer storage media does not include signals per se.
Communication media may embody computer readable instructions, data structures, program modules, and/or other data types in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. The term "modulated data signal" may refer to a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared, and other wireless media. Combinations of any of the above should also be included within the scope of computer readable media.
The one or more CPUs 1006 may be configured to execute at least some of the computer readable instructions to control one or more components of the computing device 1000 to perform one or more of the methods and/or processes described herein. The one or more CPUs 1006 may each include one or more cores (e.g., one, two, four, eight, twenty-eight, seventy-two, etc.) capable of processing numerous software threads simultaneously. The one or more CPUs 1006 may include any type of processor, and may include different types of processors depending on the type of computing device 1000 implemented (e.g., processors with fewer cores for mobile devices and processors with more cores for servers). For example, depending on the type of computing device 1000, the processor may be an Advanced RISC Machine (ARM) processor implemented using Reduced Instruction Set Computing (RISC) or an x86 processor implemented using Complex Instruction Set Computing (CISC). Computing device 1000 may include one or more CPUs 1006 in addition to one or more microprocessors or supplementary coprocessors, such as math coprocessors.
In addition to or in lieu of the one or more CPUs 1006, the one or more GPUs 1008 may be configured to execute at least some of the computer readable instructions to control one or more components of the computing device 1000 to perform one or more of the methods and/or processes described herein. One or more of the GPUs 1008 may be integrated GPUs (e.g., integrated with one or more of the CPUs 1006), and/or one or more of the GPUs 1008 may be discrete GPUs. In an embodiment, one or more of the GPUs 1008 may be coprocessors of one or more of the CPUs 1006. The one or more GPUs 1008 may be used by the computing device 1000 to render graphics (e.g., 3D graphics) or perform general-purpose computations. For example, the one or more GPUs 1008 may be used for General-Purpose computing on GPUs (GPGPU). The one or more GPUs 1008 may include hundreds or thousands of cores capable of processing hundreds or thousands of software threads simultaneously. The one or more GPUs 1008 may generate pixel data for outputting an image in response to rendering commands (e.g., rendering commands from the one or more CPUs 1006 received via a host interface). The one or more GPUs 1008 may include graphics memory, such as display memory, for storing pixel data or any other suitable data, such as GPGPU data. Display memory may be included as part of memory 1004. The one or more GPUs 1008 may include two or more GPUs that operate in parallel (e.g., via a link). The link may connect the GPUs directly (e.g., using NVLINK) or may connect the GPUs through a switch (e.g., using NVSwitch). When combined together, each GPU 1008 may generate pixel data or GPGPU data for different portions of the output or for different outputs (e.g., a first GPU for a first image and a second GPU for a second image). Each GPU may include its own memory, or may share memory with other GPUs.
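The multi-GPU arrangement described above can be illustrated with a short sketch in which each GPU processes a different portion of the output; the trivial workload and the assumption of at least two CUDA devices are for illustration only.

```python
# A hedged sketch of splitting work across two GPUs, each handling a
# different portion of the batch; requires a machine with two CUDA devices.
import torch

batch = torch.randn(8, 3, 224, 224)       # e.g., a batch of images
first, second = batch.chunk(2)            # split the work into two portions

out0 = (first.to("cuda:0") * 2.0).cpu()   # first GPU, first portion
out1 = (second.to("cuda:1") * 2.0).cpu()  # second GPU, second portion
result = torch.cat([out0, out1])          # combine the per-GPU outputs
```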
In addition to or alternatively to the one or more CPUs 1006 and/or the one or more GPUs 1008, the one or more logic units 1020 may be configured to execute at least some of the computer readable instructions to control one or more components of the computing device 1000 to perform one or more of the methods and/or processes described herein. In embodiments, the one or more CPUs 1006, the one or more GPUs 1008, and/or the one or more logic units 1020 may perform any combination of the methods, processes, and/or portions thereof, either discretely or jointly. One or more of the logic units 1020 may be part of and/or integrated into one or more of the CPUs 1006 and/or the GPUs 1008, and/or one or more of the logic units 1020 may be discrete components or otherwise external to the CPUs 1006 and/or the GPUs 1008. In an embodiment, one or more of the logic units 1020 may be coprocessors of one or more of the CPUs 1006 and/or one or more of the GPUs 1008.
Examples of the one or more logic units 1020 include one or more processing cores and/or components thereof, such as a Data Processing Unit (DPU), Tensor Core (TC), Tensor Processing Unit (TPU), Pixel Visual Core (PVC), Vision Processing Unit (VPU), Graphics Processing Cluster (GPC), Texture Processing Cluster (TPC), Streaming Multiprocessor (SM), Tree Traversal Unit (TTU), Artificial Intelligence Accelerator (AIA), Deep Learning Accelerator (DLA), Arithmetic Logic Unit (ALU), Application-Specific Integrated Circuit (ASIC), Floating Point Unit (FPU), input/output (I/O) element, Peripheral Component Interconnect (PCI) or Peripheral Component Interconnect Express (PCIe) element, and the like.
The communication interface 1010 may include one or more receivers, transmitters, and/or transceivers that enable the computing device 1000 to communicate with other computing devices, including wired and/or wireless communications, via an electronic communication network. The communication interface 1010 may include components and functionality for enabling communication over any of a number of different networks, such as wireless networks (e.g., Wi-Fi, Z-Wave, Bluetooth LE, ZigBee, etc.), wired networks (e.g., Ethernet or InfiniBand), low-power wide-area networks (e.g., LoRaWAN, SigFox, etc.), and/or the Internet. In one or more embodiments, the logic units 1020 and/or communication interface 1010 may include one or more Data Processing Units (DPUs) to send data received via a network and/or through the interconnect system 1002 directly to one or more GPUs 1008 (e.g., to the memory of the one or more GPUs 1008).
The I/O ports 1012 can enable the computing device 1000 to be logically coupled to other devices including an I/O component 1014, one or more presentation components 1018, and/or other components, some of which can be built into (e.g., integrated into) the computing device 1000. Illustrative I/O components 1014 include a microphone, mouse, keyboard, joystick, game pad, game controller, satellite dish, scanner, printer, wireless device, or the like. The I/O component 1014 can provide a Natural User Interface (NUI) that processes air gestures, voice, or other physiological input generated by a user. In some cases, the input may be sent to an appropriate network element for further processing. NUI may enable any combination of speech recognition, stylus recognition, facial recognition, biometric recognition, on-screen and near-screen gesture recognition, air gesture, head and eye tracking, and touch recognition (as described in more detail below) associated with a display of computing device 1000. Computing device 1000 may include a depth camera, such as a stereoscopic camera system, an infrared camera system, an RGB camera system, touch screen technology, and combinations of these, for gesture detection and recognition. Additionally, the computing device 1000 may include an accelerometer or gyroscope (e.g., as part of an Inertial Measurement Unit (IMU)) that enables detection of motion. In some examples, the computing device 1000 may use the output of the accelerometer or gyroscope to render immersive augmented reality or virtual reality.
The power source 1016 may include a hard-wired power source, a battery power source, or a combination thereof. The power supply 1016 may provide power to the computing device 1000 to enable components of the computing device 1000 to operate.
The one or more presentation components 1018 may include a display (e.g., monitor, touch screen, television screen, head-up display (HUD), other display types, or combinations thereof), speakers, and/or other presentation components. The one or more presentation components 1018 may receive data from other components (e.g., one or more GPUs 1008, one or more CPUs 1006, DPUs, etc.) and output the data (e.g., as images, video, sound, etc.).
Example data center
FIG. 11 illustrates an example data center 1100 that can be used in at least one embodiment of the present disclosure. The data center 1100 may include a data center infrastructure layer 1110, a framework layer 1120, a software layer 1130, and/or an application layer 1140.
As shown in fig. 11, the data center infrastructure layer 1110 may include a resource coordinator 1112, grouped computing resources 1114, and node computing resources ("node c.r.") 1116 (1) -1116 (N), where "N" represents any positive whole integer. In at least one embodiment, nodes c.r.1116 (1) -1116 (N) may include, but are not limited to, any number of central processing units ("CPUs") or other processors (including DPUs, accelerators, Field Programmable Gate Arrays (FPGAs), graphics processors or Graphics Processing Units (GPUs), etc.), memory devices (e.g., dynamic read-only memory), storage devices (e.g., solid state or disk drives), network input/output ("NW I/O") devices, network switches, Virtual Machines ("VMs"), power modules, and/or cooling modules, and the like. In some embodiments, one or more of nodes c.r.1116 (1) -1116 (N) may correspond to a server having one or more of the computing resources described above. Further, in some embodiments, nodes c.r.1116 (1) -1116 (N) may include one or more virtual components, such as vGPUs, vCPUs, and the like, and/or one or more of nodes c.r.1116 (1) -1116 (N) may correspond to a Virtual Machine (VM).
In at least one embodiment, the grouped computing resources 1114 may include separate groupings of nodes c.r.1116 housed within one or more racks (not shown), or many racks housed in data centers at various geographical locations (also not shown). Separate groupings of nodes c.r.1116 within the grouped computing resources 1114 may include grouped compute, network, memory, or storage resources that may be configured or allocated to support one or more workloads. In at least one embodiment, several nodes c.r.1116 including CPUs, GPUs, DPUs, and/or other processors may be grouped within one or more racks to provide computing resources to support one or more workloads. The one or more racks may also include any number of power modules, cooling modules, and/or network switches, in any combination.
The resource coordinator 1112 may configure or otherwise control one or more nodes c.r.1116 (1) -1116 (N) and/or grouped computing resources 1114. In at least one embodiment, the resource coordinator 1112 may include a software design infrastructure ("SDI") management entity for the data center 1100. The resource coordinator 1112 may include hardware, software, or some combination thereof.
In at least one embodiment, as shown in FIG. 11, the framework layer 1120 can include a job scheduler 1128, a configuration manager 1134, a resource manager 1136, and/or a distributed file system 1138. The framework layer 1120 may include a framework to support the software 1132 of the software layer 1130 and/or the one or more applications 1142 of the application layer 1140. The software 1132 or applications 1142 may include web-based service software or applications, such as those provided by Amazon Web Services, Google Cloud, and Microsoft Azure, respectively. The framework layer 1120 may be, but is not limited to, a type of free and open-source software web application framework, such as APACHE SPARK™ (hereinafter "Spark"), that can use the distributed file system 1138 for large-scale data processing (e.g., "big data"). In at least one embodiment, the job scheduler 1128 may include a Spark driver to facilitate scheduling the workloads supported by the various layers of the data center 1100. The configuration manager 1134 may be capable of configuring different layers, such as the software layer 1130 and the framework layer 1120 (including Spark and the distributed file system 1138), to support large-scale data processing. The resource manager 1136 may be capable of managing clustered or grouped computing resources mapped to, or allocated for support of, the distributed file system 1138 and the job scheduler 1128. In at least one embodiment, the clustered or grouped computing resources may include the grouped computing resources 1114 at the data center infrastructure layer 1110. The resource manager 1136 may coordinate with the resource coordinator 1112 to manage these mapped or allocated computing resources.
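As a brief illustration of the kind of large-scale processing the framework layer supports, the following Spark sketch reads records from a distributed file system and aggregates them; the path, column name, and application name are hypothetical assumptions.

```python
# A hedged sketch of framework-layer data processing with PySpark; the HDFS
# path and "intent" column are illustrative assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("data-center-etl").getOrCreate()

# Read from the distributed file system and aggregate records by intent.
logs = spark.read.json("hdfs:///datasets/query_logs/*.json")
counts = logs.groupBy("intent").count().orderBy("count", ascending=False)
counts.show()

spark.stop()
```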
In at least one embodiment, the software 1132 included in the software layer 1130 can include software used by at least portions of the nodes c.r.1116 (1) -1116 (N), the grouped computing resources 1114, and/or the distributed file system 1138 of the framework layer 1120. One or more types of software may include, but are not limited to, internet web search software, email virus scanning software, database software, and streaming video content software.
In at least one embodiment, the one or more application programs 1142 included in the application layer 1140 may include one or more types of applications used by at least portions of the nodes c.r.1116 (1) -1116 (N), the grouped computing resources 1114, and/or the distributed file system 1138 of the framework layer 1120. The one or more types of applications may include, but are not limited to, any number of genomics applications, cognitive computing, and machine learning applications, including training or inference software, machine learning framework software (e.g., PyTorch, TensorFlow, Caffe, etc.), and/or other machine learning applications used in connection with one or more embodiments.
In at least one embodiment, any of the configuration manager 1134, resource manager 1136, and resource coordinator 1112 may implement any number and type of self-modifying actions based on any number and type of data acquired in any technically feasible manner. The self-modifying actions may relieve a data center operator of the data center 1100 from making potentially poor configuration decisions and may help avoid underutilized and/or poorly performing portions of the data center.
According to one or more embodiments described herein, the data center 1100 may include tools, services, software, or other resources to train or predict or infer information using one or more machine learning models. For example, one or more machine learning models may be trained by calculating weight parameters from neural network architecture using software and/or computing resources described above with respect to data center 1100. In at least one embodiment, a trained or deployed machine learning model corresponding to one or more neural networks may be used to infer or predict information using the resources described above with respect to the data center 1100 by using weight parameters calculated by one or more training techniques, such as, but not limited to, those described herein.
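To make the train-then-infer pattern concrete, the following is a minimal sketch of computing weight parameters for a small neural network and then using them for inference; the tiny architecture and synthetic data are illustrative assumptions, not a model from the disclosure.

```python
# A hedged sketch of training (computing weight parameters) and inference.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training: adjust the weight parameters to fit (synthetic) data.
inputs = torch.randn(64, 16)
targets = torch.randint(0, 2, (64,))
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

# Inference: use the trained weights to predict on new data.
with torch.no_grad():
    prediction = model(torch.randn(1, 16)).argmax(dim=1)
```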
In at least one embodiment, the data center 1100 can use a CPU, application Specific Integrated Circuit (ASIC), GPU, FPGA, and/or other hardware (or virtual computing resources corresponding thereto) to perform training and/or reasoning using the resources described above. Further, one or more of the software and/or hardware resources described above may be configured to allow a user to train or perform services that infer information, such as image recognition, voice recognition, or other artificial intelligence services.
Example network Environment
A network environment suitable for implementing embodiments of the present disclosure may include one or more client devices, servers, Network Attached Storage (NAS), other back-end devices, and/or other device types. Client devices, servers, and/or other device types (e.g., each device) can be implemented on one or more instances of the one or more computing devices 1000 of fig. 10 (e.g., each device can include similar components, features, and/or functionality of the one or more computing devices 1000). Further, where a back-end device (e.g., server, NAS, etc.) is implemented, the back-end device may be included as part of the data center 1100, examples of which are described in more detail herein with respect to fig. 11.
Components of the network environment may communicate with each other via one or more networks, which may be wired, wireless, or both. The network may comprise a plurality of networks or one of a plurality of networks. For example, the network may include one or more Wide Area Networks (WANs), one or more Local Area Networks (LANs), one or more public networks such as the internet and/or a Public Switched Telephone Network (PSTN), and/or one or more private networks. Where the network comprises a wireless telecommunications network, components such as base stations, communication towers, or even access points (among other components) may provide wireless connectivity.
Compatible network environments may include one or more peer-to-peer network environments (in which case the server may not be included in the network environment) and one or more client-server network environments (in which case the one or more servers may be included in the network environment). In a peer-to-peer network environment, the functionality described herein with respect to one or more servers may be implemented on any number of client devices.
In at least one embodiment, the network environment may include one or more cloud-based network environments, distributed computing environments, combinations thereof, and the like. The cloud-based network environment may include a framework layer, a job scheduler, a resource manager, and a distributed file system implemented on one or more servers, which may include one or more core network servers and/or edge servers. The framework layer may include a framework to support software of the software layer and/or one or more applications of the application layer. The software or applications may include web-based service software or applications, respectively. In embodiments, one or more client devices may use the web-based service software or applications (e.g., by accessing the service software and/or applications via one or more Application Programming Interfaces (APIs)). The framework layer may be, but is not limited to, a type of free and open-source software web application framework, such as one that may use a distributed file system for large-scale data processing (e.g., "big data").
The cloud-based network environment may provide cloud computing and/or cloud storage that performs any combination of the computing and/or data storage functions described herein (or one or more portions thereof). Any of these different functions may be distributed across multiple locations from a central or core server (e.g., of one or more data centers that may be distributed across states, regions, countries, the earth, etc.). If a connection with a user (e.g., a client device) is relatively close to one or more edge servers, the one or more core servers may assign at least a portion of the functionality to the one or more edge servers. The cloud-based network environment may be private (e.g., limited to a single organization), public (e.g., available to many organizations), and/or a combination thereof (e.g., a hybrid cloud environment).
The one or more client devices may include at least some of the components, features, and functionality of the one or more example computing devices 1000 described herein with respect to fig. 10. By way of example and not limitation, a client device may be implemented as a Personal Computer (PC), laptop computer, mobile device, smart phone, tablet computer, smart watch, wearable computer, Personal Digital Assistant (PDA), MP3 player, virtual reality headset, Global Positioning System (GPS) device, video player, camera, surveillance device or system, vehicle, boat, flying vessel, virtual machine, drone, robot, handheld communication device, hospital device, gaming device or system, entertainment system, vehicle computer system, embedded system controller, remote control, appliance, consumer electronics device, workstation, edge device, any combination of these depicted devices, or any other suitable device.
The disclosure may be described in the general context of computer code or machine-useable instructions, including computer-executable instructions such as program modules, being executed by a computer or other machine, such as a personal data assistant or other handheld device. Generally, program modules, including routines, programs, objects, components, data structures, etc., refer to code that performs particular tasks or implements particular abstract data types. The present disclosure may be practiced in a variety of system configurations, including handheld devices, consumer electronics, general-purpose computers, more specialized computing devices, and the like. The disclosure may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
As used herein, recitation of "and/or" with respect to two or more elements should be interpreted to mean only one element or a combination of elements. For example, "element A, element B, and/or element C" may include only element A, only element B, only element C, element A and element B, element A and element C, element B and element C, or elements A, B, and C. Further, "at least one of element A or element B" may include at least one of element A, at least one of element B, or at least one of element A and at least one of element B. Further, "at least one of element A and element B" may include at least one of element A, at least one of element B, or at least one of element A and at least one of element B.
The subject matter of the present disclosure is described with specificity herein to meet statutory requirements. However, the description itself is not intended to limit the scope of this disclosure. Rather, the inventors have contemplated that the claimed subject matter might also be embodied in other ways, to include different steps or combinations of steps similar to the ones described in this document, in conjunction with other present or future technologies. Furthermore, although the terms "step" and/or "block" may be used herein to connote different elements of methods employed, the terms should not be interpreted as implying any particular order among or between various steps herein disclosed unless and except when the order of individual steps is explicitly described.

Claims (20)

1. A method, comprising:
Receiving first data representing a query;
receiving second data representing text, the first portion of text being associated with unstructured text and the second portion of text being associated with structured text;
determining a response to the query using one or more neural networks and based at least on the first data and the second data; and
So that third data representing the response is output.
2. The method of claim 1, wherein determining a response to the query comprises:
determining one or more words from the text using one or more first neural networks of the one or more neural networks and based at least on the first data and the second data; and
The response to the query is determined using one or more second neural networks of the one or more neural networks and based at least on the one or more words.
3. The method of claim 2, further comprising:
determining that confidence scores associated with the one or more words are equal to or greater than a threshold score,
Wherein determining a response to the query is further based at least on the confidence score being equal to or greater than the threshold score.
4. The method of claim 1, wherein determining a response to the query comprises:
Determining one or more first words from the text using one or more first neural networks of the one or more neural networks and based at least on the first data and the second data;
Determining that confidence scores associated with the one or more first words are less than a threshold score;
Determining one or more second words using the one or more first neural networks and based at least on fourth data representing at least one of the unstructured text or the structured text based at least on the confidence score being less than the threshold score; and
The response to the query is determined using one or more second neural networks of the one or more neural networks and based at least on the one or more second words.
5. The method of claim 1, further comprising:
receiving fourth data representing the unstructured text associated with a topic, the unstructured text including a description associated with the topic;
Receiving fifth data representing the structured text associated with the topic, the structured text including one or more identifiers and one or more values associated with the one or more identifiers; and
The second data representing the text is generated based at least on the fourth data and the fifth data.
6. The method of claim 5, wherein generating the second data representing the text comprises:
Generating one or more first narratives based at least on the unstructured text, the first portion of the text including at least the one or more first narratives; and
One or more second narratives are generated based at least on the structured text, the second portion of the text including at least the one or more second narratives.
7. The method of claim 1, wherein one of:
the text represented by the second data includes a first format based at least on the unstructured text; or
The text represented by the second data includes a second format based at least on the structured text.
8. The method of claim 1, further comprising:
Determining intent associated with the query using one or more second neural networks and based at least on the first data; and
The second data associated with the query is determined based at least on the intent.
9. A system, comprising:
One or more processing units, the one or more processing units to:
Receiving first data representing user input;
receiving second data representing text, the first portion of text being associated with unstructured text and the second portion of text being associated with structured text;
Determining, using one or more neural networks and based at least on the first data and the second data, one or more words associated with the user input from the text; and
Third data representing at least the one or more words is output.
10. The system of claim 9, wherein the one or more processing units are further to:
determining, using one or more second neural networks and based at least on the first data and the third data, a response to the user input, the response including at least the one or more words; and
Fourth data representing the response is output.
11. The system of claim 9, wherein the one or more processing units are further to:
receiving fourth data representing the unstructured text associated with a topic, the unstructured text including a description associated with the topic;
Receiving fifth data representing the structured text associated with the topic, the structured text including one or more identifiers and one or more values associated with the one or more identifiers; and
The second data representing the text is generated based at least on the fourth data and the fifth data.
12. The system of claim 11, wherein generating the second data representative of the text comprises:
Generating one or more first narratives based at least on the unstructured text, the first portion of the text including at least the one or more first narratives; and
One or more second narratives are generated based at least on the structured text, the second portion of the text including at least the one or more second narratives.
13. The system of claim 9, wherein the one or more processing units are further to:
Determining confidence scores associated with the one or more words using the one or more neural networks;
determining that the confidence score is equal to or greater than a threshold score; and
Determining to generate a response associated with the user input using the third data based at least on the confidence score being equal to or greater than the threshold score.
14. The system of claim 9, wherein the one or more processing units are further to:
Determining confidence scores associated with the one or more words using the one or more neural networks;
Determining that the confidence score is less than a threshold score; and
Further processing is performed to generate a response associated with the user input based at least on the confidence score being less than the threshold score.
15. The system of claim 14, wherein performing further processing to generate a response associated with the user input comprises:
Determining, using the one or more neural networks and based at least on the first data and fourth data representing the unstructured text, one or more second words associated with the user input from the unstructured text; and
The response is generated using at least the one or more second words.
16. The system of claim 14, wherein performing further processing to generate a response associated with the user input comprises:
determining, using the one or more neural networks and based at least on the first data and fourth data representing the structured text, one or more second words associated with the user input from the structured text; and
The response is generated using at least the one or more second words.
17. The system of claim 9, wherein the one or more processing units are further to:
Determining, using one or more second neural networks and based at least on the first data, an intent associated with the user input; and
The second data associated with the user input is determined based at least on the intent.
18. The system of claim 9, wherein the system is included in at least one of:
a control system for an autonomous or semi-autonomous machine;
a perception system for an autonomous or semi-autonomous machine;
a system for performing simulation operations;
a system for performing digital twin operations;
a system for performing light transport simulation;
a system for performing collaborative content creation for 3D assets;
a system for performing deep learning operations;
a system implemented using edge devices;
a system implemented using a robot;
a system for performing conversational AI operations;
a system for generating synthetic data;
a system incorporating one or more virtual machines (VMs);
a system implemented at least in part in a data center; or
a system implemented at least in part using cloud computing resources.
19. A processor, comprising:
One or more processing units to process data based at least on one or more neural networks to generate a response to a query, the data including query data corresponding to the query and unstructured context data generated at least in part by converting structured context data into unstructured form.
20. The processor of claim 19, wherein the processor is included in at least one of:
a control system for an autonomous or semi-autonomous machine;
a perception system for an autonomous or semi-autonomous machine;
a system for performing simulation operations;
a system for performing digital twin operations;
a system for performing light transport simulation;
a system for performing collaborative content creation for 3D assets;
a system for performing deep learning operations;
a system implemented using edge devices;
a system implemented using a robot;
a system for performing conversational AI operations;
a system for generating synthetic data;
a system incorporating one or more virtual machines (VMs);
a system implemented at least in part in a data center; or
a system implemented at least in part using cloud computing resources.
CN202311622222.2A 2022-11-30 2023-11-29 Query response generation using structured and unstructured data for conversational AI systems and applications Pending CN118113823A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US63/428,843 2022-11-30
US18/172,571 US20240176808A1 (en) 2022-11-30 2023-02-22 Query response generation using structured and unstructured data for conversational ai systems and applications
US18/172,571 2023-02-22

Publications (1)

Publication Number Publication Date
CN118113823A true CN118113823A (en) 2024-05-31

Family

ID=91216202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311622222.2A Pending CN118113823A (en) 2022-11-30 2023-11-29 Query response generation using structured and unstructured data for conversational AI systems and applications

Country Status (1)

Country Link
CN (1) CN118113823A (en)

Similar Documents

Publication Publication Date Title
JP2022040183A (en) Selection of synthetic speech for agent by computer
US20210358188A1 (en) Conversational ai platform with rendered graphical output
US11741949B2 (en) Real-time video conference chat filtering using machine learning models
US11769495B2 (en) Conversational AI platforms with closed domain and open domain dialog integration
US20220391175A1 (en) Machine learning application deployment using user-defined pipeline
US11757974B2 (en) System and method for online litigation platform
CN114764896A (en) Automatic content identification and information in live adapted video games
CN115774774A (en) Extracting event information from game logs using natural language processing
US20230153612A1 (en) Pruning complex deep learning models based on parent pruning information
US20240111894A1 (en) Generative machine learning models for privacy preserving synthetic data generation using diffusion
CN116610777A (en) Conversational AI platform with extracted questions and answers
US20230177583A1 (en) Playstyle analysis for game recommendations
US20240176808A1 (en) Query response generation using structured and unstructured data for conversational ai systems and applications
US20230205797A1 (en) Determining intents and responses using machine learning in conversational ai systems and applications
CN118113823A (en) Query response generation using structured and unstructured data for conversational AI systems and applications
US20240193445A1 (en) Domain-customizable models for conversational ai systems and applications
US20240062014A1 (en) Generating canonical forms for task-oriented dialogue in conversational ai systems and applications
US20240045662A1 (en) Software code verification using call graphs for autonomous systems and applications
US20240184991A1 (en) Generating variational dialogue responses from structured data for conversational ai systems and applications
US20240177034A1 (en) Simulating quantum computing circuits using kronecker factorization
US20240160888A1 (en) Realistic, controllable agent simulation using guided trajectories and diffusion models
US20230244985A1 (en) Optimized active learning using integer programming
US20230385687A1 (en) Estimating optimal training data set size for machine learning model systems and applications
US20230376849A1 (en) Estimating optimal training data set sizes for machine learning model systems and applications
WO2023080806A1 (en) Synthetic audio-driven body animation using voice tempo

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination