CN111753080B - Method and device for outputting information - Google Patents

Info

Publication number: CN111753080B (application CN201910243599.4A)
Authority: CN (China)
Prior art keywords: data, sentences, sentence, semantic expression, subset
Legal status: Active (granted)
Application number: CN201910243599.4A
Other languages: Chinese (zh)
Other versions: CN111753080A
Inventors: 卜建辉, 黄强, 谢炜坚, 吴伟佳
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Application filed 2019-03-28 by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201910243599.4A (priority date 2019-03-28)
Publication of CN111753080A: 2020-10-09
Application granted; publication of CN111753080B: 2023-08-22

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the present application disclose a method and a device for outputting information. One embodiment of the method comprises: acquiring a target sentence; classifying the target sentence and determining a pre-established semantic expression model corresponding to the classification result, wherein the semantic expression model is used to represent a correspondence between sentences and vectors; determining a vector of the target sentence according to the determined semantic expression model; and outputting information related to the target sentence based on the determined vector. This embodiment improves the accuracy of the semantic expression of sentences.

Description

Method and device for outputting information
Technical Field
Embodiments of the present application relate to the field of computer technology, and in particular to a method and a device for outputting information.
Background
Semantic representation of text refers to encoding natural-language text into a vector such that the vector captures the semantic information of the text. A good semantic representation helps improve the effectiveness and performance of tasks such as text similarity retrieval, sentiment classification, and domain classification.
Disclosure of Invention
The embodiment of the application provides a method and a device for outputting information.
In a first aspect, an embodiment of the present application provides a method for outputting information, including: acquiring a target sentence; classifying the target sentences, and determining a pre-established semantic expression model corresponding to the classification result, wherein the semantic expression model is used for representing the corresponding relation between the sentences and the vectors; determining the vector of the target sentence according to the determined semantic expression model; based on the determined vector, information related to the above-described target sentence is output.
In some embodiments, the target sentence is a sentence input by the user through a search engine; the method further comprises: in response to detecting a click operation by the user on a search result page returned by the search engine for the target sentence, acquiring the title of the page corresponding to the click operation; determining a vector of the title; storing the target sentence and the vector of the title in association in a first data set; and in response to determining that the first data set satisfies a preset condition, training the determined semantic expression model with sentences in the first data set as input and the vectors associated with the input sentences as expected output, to obtain a target semantic expression model.
In some embodiments, the above method further comprises: acquiring a second data set, wherein the second data set comprises sentences and vectors corresponding to the sentences; classifying sentences in the second data set to obtain at least one data subset; and determining a semantic expression model corresponding to the at least one data subset according to the at least one data subset.
In some embodiments, determining the semantic expression model corresponding to the at least one subset of data according to the at least one subset of data includes: and training to obtain a semantic expression model corresponding to the data subset by taking sentences in the data subset as input and taking vectors corresponding to the input sentences in the data subset as expected output for the data subset in the at least one data subset.
In some embodiments, determining the semantic expression model corresponding to the at least one data subset according to the at least one data subset includes: for a data subset in the at least one data subset, selecting at least one sentence from the data subset as training sentences; adding the training sentences selected from the other data subsets and the vectors corresponding to those training sentences to the data subset to obtain an updated data subset; and training the semantic expression model corresponding to the updated data subset with sentences in the updated data subset as input and the vectors corresponding to the input sentences as expected output.
In a second aspect, an embodiment of the present application provides an apparatus for outputting information, including: a sentence acquisition unit configured to acquire a target sentence; a model determining unit configured to classify the target sentence and determine a pre-established semantic expression model corresponding to the classification result, wherein the semantic expression model is used for representing a corresponding relationship between the sentence and the vector; a first vector determination unit configured to determine a vector of the target sentence according to the determined semantic expression model; and an information output unit configured to output information related to the target sentence based on the determined vector.
In some embodiments, the target sentence is a sentence input by the user through a search engine; the above apparatus further comprises: the title acquisition unit is configured to respond to detection of clicking operation of a user on a search result page returned by the search engine according to the target sentence, and acquire a title of the page corresponding to the clicking operation; a second vector determination unit configured to determine a vector of the header; a data storage unit configured to store vector associations of the target sentence and the headline in a first data set; the first model training unit is configured to train the determined semantic expression model to obtain a target semantic expression model by taking sentences in the first data set as input and taking vectors associated with the input sentences as expected output in response to determining that the first data set meets preset conditions.
In some embodiments, the apparatus further comprises: a data acquisition unit configured to acquire a second data set, wherein the second data set includes a sentence and a vector corresponding to the sentence; a sentence classification unit configured to classify sentences in the second data set to obtain at least one data subset; and a second model training unit configured to determine a semantic expression model corresponding to the at least one data subset according to the at least one data subset.
In some embodiments, the second model training unit is further configured to: and training to obtain a semantic expression model corresponding to the data subset by taking sentences in the data subset as input and taking vectors corresponding to the input sentences in the data subset as expected output for the data subset in the at least one data subset.
In some embodiments, the second model training unit is further configured to: selecting at least one sentence from the data subset as a training sentence for the data subset in the at least one data subset; adding training sentences selected from other data subsets and vectors corresponding to the training sentences into the data subsets to obtain updated data subsets; sentences in the updated data subsets are used as input, vectors corresponding to the input sentences are used as expected output, and the semantic expression model corresponding to the updated data subsets is trained.
In a third aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage device having one or more programs stored thereon, which when executed by the one or more processors cause the one or more processors to implement the method as described in any of the embodiments of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a method as described in any of the embodiments of the first aspect.
The method and apparatus for outputting information provided in the above embodiments of the present application may first obtain a target sentence. Then, the target sentence is classified, and a pre-established semantic expression model corresponding to the classification result is determined. And determining the vector of the target sentence according to the determined semantic expression model. Finally, based on the determined vector, information related to the target sentence is output. The method of the embodiment can determine the vector of the sentence, thereby outputting the information related to the sentence and improving the accuracy of the semantic expression of the sentence.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 is an exemplary system architecture diagram in which an embodiment of the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method for outputting information in accordance with the present application;
FIG. 3 is a schematic diagram of an application scenario of a method for outputting information according to the present application;
FIG. 4 is a flow chart of another embodiment of a method for outputting information in accordance with the present application;
FIG. 5 is a flow chart of the creation of a semantic expression model corresponding to each category in a method for outputting information according to the present application;
FIG. 6 is a schematic diagram of an embodiment of an apparatus for outputting information in accordance with the present application;
FIG. 7 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
Fig. 1 shows an exemplary system architecture 100 to which an embodiment of a method for outputting information or an apparatus for outputting information of the present application may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices having a display screen and supporting information input, including but not limited to smartphones, tablet computers, laptop and desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices. Which may be implemented as multiple software or software modules (e.g., to provide distributed services), or as a single software or software module. The present application is not particularly limited herein.
The server 105 may be a server that provides various services, such as a background server that processes sentences input on the terminal devices 101, 102, 103. The background server may perform analysis or the like on the received data such as the target sentence, and feed back the processing result (e.g., information about the target sentence) to the terminal devices 101, 102, 103.
The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When server 105 is software, it may be implemented as a plurality of software or software modules (e.g., to provide distributed services), or as a single software or software module. The present application is not particularly limited herein.
It should be noted that, the method for outputting information provided by the embodiment of the present application may be performed by the terminal devices 101, 102, 103, or may be performed by the server 105. Accordingly, the means for outputting information may be provided in the terminal devices 101, 102, 103 or in the server 105. It will be appreciated that when the method for outputting information provided by the embodiments of the present application is performed by the terminal devices 101, 102, 103, the network 104 and the server 105 may not be included in the system architecture 100.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to FIG. 2, a flow 200 of one embodiment of a method for outputting information in accordance with the present application is shown. The method for outputting information of the present embodiment includes the steps of:
step 201, a target sentence is acquired.
In the present embodiment, the execution subject of the method for outputting information (e.g., the terminal devices 101, 102, 103 or the server 105 shown in fig. 1) can acquire a target sentence in various ways. The target sentence may be a sentence input by a user through the search entry of a search engine, a sentence input by the user through social software, or any other sentence to be analyzed.
Step 202, classifying the target sentences, and determining a pre-established semantic expression model corresponding to the classification result.
In this embodiment, after the execution subject acquires the target sentence, the execution subject may classify the target sentence. In particular, the execution subject may classify the target sentence using various clustering algorithms or other classification algorithms. After determining the classification result of the target sentence, the execution body may determine a pre-established semantic expression model corresponding to the classification result. The semantic expression model is used for representing the corresponding relation between sentences and vectors, and the vectors are used for representing the semantics of the sentences. The semantic expression model may be constructed from neural networks, or from other vector generation algorithms. The execution subject may store a plurality of pre-established semantic expression models corresponding to the respective classifications locally, or the execution subject may acquire the semantic expression models corresponding to the respective classifications from the connected devices.
Step 203, determining the vector of the target sentence according to the determined semantic expression model.
After determining the semantic expression model corresponding to the classification result, the execution subject may input the target sentence into the determined semantic expression model to obtain a vector of the target sentence. It will be appreciated that this vector may be used to represent the semantics of the target sentence.
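As a minimal illustration of steps 202 and 203 taken together, the sketch below uses a keyword classifier and per-category hashing vectorizers as stand-ins for the pre-established semantic expression models. The category names, keyword lists, and vectorizer settings are assumptions made for illustration, not details taken from this application.

```python
from typing import Dict, List

from sklearn.feature_extraction.text import HashingVectorizer

# Hypothetical keyword lists per classification result (illustrative only).
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "medical": ["cold", "fever", "throat", "symptom", "nose"],
    "insurance": ["policy", "premium", "claim"],
}

def classify_sentence(sentence: str) -> str:
    """Step 202 (first half): assign the target sentence to a category."""
    lowered = sentence.lower()
    scores = {category: sum(word in lowered for word in words)
              for category, words in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

# Stand-ins for the pre-established semantic expression models, one per category.
# A real system would load a model trained for each category (see FIG. 5); here
# each "model" is simply a hashing vectorizer that maps a sentence to a vector.
MODELS: Dict[str, HashingVectorizer] = {
    category: HashingVectorizer(n_features=128, alternate_sign=False)
    for category in ("medical", "insurance", "general")
}

target = "I have a runny nose and a sore throat, is it a cold?"
category = classify_sentence(target)             # step 202: classify the target sentence
model = MODELS[category]                         # step 202: select the matching model
vector = model.transform([target]).toarray()[0]  # step 203: vector of the target sentence
```

In the application's terms, MODELS plays the role of the locally stored semantic expression models corresponding to the respective classifications; any trained encoder could be substituted for the hashing stand-in.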
Step 204, based on the determined vector, outputting information related to the target sentence.
The execution subject may output information related to the target sentence based on the vector of the target sentence. For example, given a plurality of target sentences, the execution subject may determine, based on their vectors, two sentences whose semantics are similar, and then output those sentences. Alternatively, the execution subject may determine a reply sentence corresponding to the vector of the target sentence, and then output the reply sentence.
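Both examples reduce to a nearest-neighbour comparison over vectors. A rough sketch of the reply-selection case, with toy candidate replies and toy vectors (all values are illustrative, not from the application):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two sentence vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Candidate reply sentences with pre-computed vectors (toy values).
candidates = {
    "Based on your symptoms, you are advised to visit the respiratory clinic.": np.array([0.9, 0.1, 0.0]),
    "Your premium is due on the first day of each month.": np.array([0.0, 0.2, 0.9]),
}

target_vector = np.array([0.8, 0.2, 0.1])  # vector of the target sentence from step 203

# Step 204: output the reply whose vector is closest to the target sentence's vector.
best_reply = max(candidates, key=lambda reply: cosine(target_vector, candidates[reply]))
print(best_reply)
```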
With continued reference to fig. 3, fig. 3 is a schematic diagram of an application scenario of the method for outputting information according to the present embodiment. In the application scenario of fig. 3, a user converses with an intelligent robot through social software. The user inputs the sentence "I have a runny nose and a sore throat, is it a cold?". The intelligent robot takes the sentence input by the user as the target sentence and classifies it, then determines the corresponding semantic expression model to obtain a vector of the target sentence, and determines from the vector the reply sentence "Based on your symptoms, you are advised to register at the respiratory department outpatient clinic."
The method for outputting information provided by the above-described embodiment of the present application may first acquire a target sentence. Then, the target sentence is classified, and a pre-established semantic expression model corresponding to the classification result is determined. And determining the vector of the target sentence according to the determined semantic expression model. Finally, based on the determined vector, information related to the target sentence is output. The method of the embodiment can determine the vector of the sentence, thereby outputting the information related to the sentence and improving the accuracy of the semantic expression of the sentence.
With continued reference to fig. 4, a flow 400 of another embodiment of a method for outputting information in accordance with the present application is shown. In this embodiment, the target sentence is a sentence input by the user through the search engine. The search engine may return search results after receiving the target sentence. As shown in fig. 4, after determining the semantic expression model corresponding to the classification result, the method may further include the steps of:
In step 401, in response to detecting a click operation by the user on a search result page returned by the search engine for the target sentence, the title of the page corresponding to the click operation is acquired.
The user may browse the search result page returned by the search engine through the terminal, and may click a title in the search result page to browse the page indicated by that title. In this embodiment, after detecting the click operation of the user on the search result page, the execution body may acquire the title of the page corresponding to the click operation. It will be appreciated that the title of the page is correlated with the sentence input by the user through the search engine.
In step 402, a vector of the title is determined.
After acquiring the title of the page, the execution body may determine the vector of the title in various ways. For example, the execution body may determine the vector of the title through the semantic expression model determined in step 202. Alternatively, the execution body may use other vector calculation algorithms to determine the vector of the title.
In step 403, the target sentence and the vector of the title are stored in association in the first data set.
After obtaining the vector of the title, the execution body may store the target sentence and the vector of the title in association in the first data set. That is, an association exists between the target sentence and the vector of the title.
In step 404, in response to determining that the first data set satisfies a preset condition, the determined semantic expression model is trained with sentences in the first data set as input and the vectors associated with the input sentences as expected output, to obtain a target semantic expression model.
The execution body may determine in real time whether the first data set satisfies the preset condition. The preset condition may include, but is not limited to: the number of sentences being greater than a first threshold and the data capacity of the first data set being greater than a second threshold. When the execution body determines that the first data set satisfies the preset condition, it may train the determined semantic expression model with sentences in the first data set as input and the vectors associated with the input sentences as expected output, to obtain the target semantic expression model.
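Read as a rough sketch, this step could look like the following, where a hashed-feature ridge regressor stands in for the semantic expression model being retrained, and the threshold and toy data are assumptions rather than values from the application:

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import Ridge

FIRST_SET_MIN_SIZE = 3  # preset condition: minimum number of sentences (assumed value)

# First data set: sentences stored in association with the vectors of clicked titles (toy data).
first_data_set = [
    ("runny nose and sore throat",       np.array([0.9, 0.1, 0.0])),
    ("is a mild fever a cold",           np.array([0.8, 0.2, 0.1])),
    ("how do i file an insurance claim", np.array([0.1, 0.1, 0.9])),
]

if len(first_data_set) >= FIRST_SET_MIN_SIZE:  # step 404: preset condition met
    sentences = [sentence for sentence, _ in first_data_set]       # inputs
    expected = np.stack([vector for _, vector in first_data_set])  # expected outputs

    featurizer = HashingVectorizer(n_features=256, alternate_sign=False)
    features = featurizer.transform(sentences)

    # Train the target semantic expression model: sentence features -> associated title vector.
    target_model = Ridge(alpha=1.0).fit(features, expected)

    # The trained model now maps a new sentence to a semantic vector.
    new_vector = target_model.predict(featurizer.transform(["sore throat and cough"]))[0]
```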
Most of the existing semantic expression schemes for sentences rely on large-scale machine learning techniques, and a large amount of supervised or weakly supervised data needs to be accumulated to train a machine learning model. For some emerging professional application fields, such as insurance, medical treatment and the like, long-term data accumulation is needed to train a more targeted semantic expression model. In this embodiment, when there are enough sentences in the first data set, the determined semantic expression model may be trained using the accumulated sentences and the corresponding vectors. Thus, the obtained target semantic expression model can express sentences more accurately.
According to the method for outputting information provided by the embodiment of the application, the determined semantic expression model can be trained by utilizing the sentences input by the user in the search engine and the vectors corresponding to the titles of the clicked pages, so that the semantic expression accuracy of the finally obtained target semantic expression model can be improved.
With continued reference to FIG. 5, a flow 500 for building a semantic expression model corresponding to each category in a method for outputting information according to the present application is shown. As shown in fig. 5, the present embodiment can determine the semantic expression model corresponding to each category by:
step 501, a second data set is obtained.
In this embodiment, the second data set includes sentences and vectors corresponding to the sentences. It is understood that the vector herein is a vector for expressing the semantics of a sentence.
Step 502, classifying sentences in the second data set to obtain at least one data subset.
The execution body may classify sentences in the second data set. In particular, the executing body may employ a clustering algorithm or other classification algorithm to classify sentences in the second data set. After classification, at least one subset of data is obtained.
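A minimal sketch of this clustering step, assuming k-means over the vectors already stored in the second data set; the number of clusters and the toy data are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans

# Second data set: sentences paired with their semantic vectors (toy values).
second_data_set = [
    ("is a runny nose a cold",            np.array([0.9, 0.1, 0.0])),
    ("what does a sore throat mean",      np.array([0.8, 0.2, 0.1])),
    ("how much is the insurance premium", np.array([0.1, 0.0, 0.9])),
    ("how do i file a claim",             np.array([0.2, 0.1, 0.8])),
]

vectors = np.stack([vector for _, vector in second_data_set])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

# Step 502: group the (sentence, vector) pairs into data subsets by cluster label.
data_subsets = {}
for (sentence, vector), label in zip(second_data_set, labels):
    data_subsets.setdefault(int(label), []).append((sentence, vector))
```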
Step 503, determining a semantic expression model corresponding to at least one data subset according to at least one data subset.
After the execution body obtains each data subset, the semantic expression model corresponding to each data subset can be determined.
In some specific implementations, the execution body may derive the semantic expression models using the following steps, not shown in fig. 5: and taking sentences in the data subset as input, taking vectors corresponding to the input sentences in the data subset as expected output, and training to obtain a semantic expression model corresponding to the data subset.
In this implementation, the execution body may train a semantic expression model using each data subset. Specifically, for each data subset, the execution body may train the semantic expression model corresponding to that data subset by using the sentences in the data subset as input and the vectors corresponding to the input sentences as the expected output.
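Continuing the same illustrative setup, training one stand-in model per data subset, with sentences as input and their stored vectors as expected output, might look like this:

```python
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

def train_subset_model(subset):
    """Train a semantic expression model for one data subset: the sentences are
    the inputs and their stored vectors are the expected outputs."""
    sentences = [sentence for sentence, _ in subset]
    vectors = np.stack([vector for _, vector in subset])
    model = make_pipeline(
        HashingVectorizer(n_features=256, alternate_sign=False),
        Ridge(alpha=1.0),
    )
    return model.fit(sentences, vectors)

# Example: one subset of the second data set (toy values).
medical_subset = [
    ("is a runny nose a cold",       np.array([0.9, 0.1, 0.0])),
    ("what does a sore throat mean", np.array([0.8, 0.2, 0.1])),
]
medical_model = train_subset_model(medical_subset)
```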
In some specific implementations, the execution body may derive the semantic expression models using the following steps, not shown in fig. 5: for a data subset in the at least one data subset, selecting at least one sentence from the data subset as training sentences; adding the training sentences selected from the other data subsets and the vectors corresponding to those training sentences to the data subset to obtain an updated data subset; and training the semantic expression model corresponding to the updated data subset with sentences in the updated data subset as input and the vectors corresponding to the input sentences as expected output.
In this implementation, the execution body may first select at least one sentence from each data subset as training sentences, then add the training sentences selected from the other data subsets to each data subset to obtain the updated data subsets, and finally train the semantic expression model corresponding to each updated data subset with sentences in the updated data subset as input and the vectors corresponding to the input sentences as expected output.
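A rough sketch of this variant: sample training sentences from every subset, add them to each of the other subsets, and then train one model per updated subset, for example with the train_subset_model helper sketched above. The number of sentences sampled per subset is an assumed parameter:

```python
import random

def build_updated_subsets(data_subsets, samples_per_subset=1, seed=0):
    """For each data subset, add training sentences (with their vectors) sampled
    from every other subset, yielding the updated data subsets."""
    rng = random.Random(seed)
    sampled = {
        name: rng.sample(subset, min(samples_per_subset, len(subset)))
        for name, subset in data_subsets.items()
    }
    updated = {}
    for name, subset in data_subsets.items():
        extra = [pair for other, pairs in sampled.items() if other != name for pair in pairs]
        updated[name] = list(subset) + extra
    return updated

# Usage: updated = build_updated_subsets(data_subsets)
# followed by training one semantic expression model per updated subset, e.g.
# models = {name: train_subset_model(subset) for name, subset in updated.items()}
```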
According to the method for outputting information provided by this embodiment of the application, sentences in the second data set can be classified, and the sentences and vectors in each class can then be used for training, so that each resulting semantic expression model can accurately express the semantics of sentences. For application scenarios of the semantic expression model in a new domain, this optimizes the semantic expression of sentences before the target semantic expression model is obtained.
With further reference to fig. 6, as an implementation of the method shown in the above figures, the present application provides an embodiment of an apparatus for outputting information, which corresponds to the method embodiment shown in fig. 2, and which is particularly applicable to various electronic devices.
As shown in fig. 6, the apparatus 600 for outputting information of the present embodiment includes: a sentence acquisition unit 601, a model determination unit 602, a first vector determination unit 603, and an information output unit 604.
The sentence acquisition unit 601 is configured to acquire a target sentence.
A model determining unit 602 configured to classify the target sentence and determine a pre-established semantic expression model corresponding to the classification result. The semantic expression model is used for representing the corresponding relation between sentences and vectors.
The first vector determination unit 603 is configured to determine a vector of the target sentence according to the determined semantic expression model.
The information output unit 604 is configured to output information related to the target sentence based on the determined vector.
In some alternative implementations of the present embodiment, the target sentence is a sentence input by the user through a search engine. The apparatus 600 may further include a title acquisition unit, a second vector determination unit, a data storage unit, and a first model training unit, which are not shown in fig. 6.
The title acquisition unit is configured to, in response to detecting a click operation by the user on a search result page returned by the search engine for the target sentence, acquire the title of the page corresponding to the click operation.
The second vector determination unit is configured to determine a vector of the title.
The data storage unit is configured to store the target sentence and the vector of the title in association in the first data set.
The first model training unit is configured to, in response to determining that the first data set satisfies a preset condition, train the determined semantic expression model with sentences in the first data set as input and the vectors associated with the input sentences as expected output, to obtain a target semantic expression model.
In some optional implementations of this embodiment, the apparatus 600 may further include a data acquisition unit, a sentence classification unit, and a second model training unit, which are not shown in fig. 6.
The data acquisition unit is configured to acquire the second data set, wherein the second data set includes sentences and vectors corresponding to the sentences.
The sentence classification unit is configured to classify sentences in the second data set to obtain at least one data subset.
The second model training unit is configured to determine a semantic expression model corresponding to the at least one data subset according to the at least one data subset.
In some optional implementations of the present embodiment, the second model training unit may be further configured to: for a data subset in the at least one data subset, train a semantic expression model corresponding to the data subset by taking sentences in the data subset as input and taking the vectors corresponding to the input sentences in the data subset as expected output.
In some optional implementations of the present embodiment, the second model training unit may be further configured to: selecting at least one sentence from the subset of data as a training sentence for a subset of data in the at least one subset of data; adding training sentences selected from other data subsets and vectors corresponding to the training sentences into the data subsets to obtain updated data subsets; sentences in the updated data subsets are used as input, vectors corresponding to the input sentences are used as expected output, and the semantic expression model corresponding to the updated data subsets is trained.
It should be understood that the units 601 to 604 described in the apparatus 600 for outputting information correspond to the respective steps in the method described with reference to fig. 2, respectively. Thus, the operations and features described above with respect to the method for outputting information are equally applicable to the apparatus 600 and the units contained therein, and are not described in detail herein.
Referring now to fig. 7, a schematic diagram of an electronic device (e.g., server or terminal device of fig. 1) 700 suitable for use in implementing embodiments of the present disclosure is shown. The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 7 is only one example and should not impose any limitations on the functionality and scope of use of embodiments of the present disclosure.
As shown in fig. 7, the electronic device 700 may include a processing means (e.g., a central processor, a graphics processor, etc.) 701, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage means 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the electronic device 700 are also stored. The processing device 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
In general, the following devices may be connected to the I/O interface 705: input devices 706 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, and the like; an output device 707 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 708 including, for example, magnetic tape, hard disk, etc.; and a communication device 709. The communication means 709 may allow the electronic device 700 to communicate wirelessly or by wire with other devices to exchange data. While fig. 7 shows an electronic device 700 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 7 may represent one device or a plurality of devices as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via communication device 709, or installed from storage 708, or installed from ROM 702. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 701.
It should be noted that the computer readable medium according to the embodiments of the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In an embodiment of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In embodiments of the present disclosure, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquiring a target sentence; classifying the target sentences, and determining a pre-established semantic expression model corresponding to the classification result, wherein the semantic expression model is used for representing the corresponding relation between the sentences and the vectors; determining a vector of the target sentence according to the determined semantic expression model; based on the determined vector, information related to the target sentence is output.
Computer program code for carrying out operations of embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments described in the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes a sentence acquisition unit, a model determination unit, a first vector determination unit, and an information output unit. The names of these units do not constitute a limitation on the unit itself in some cases, and for example, a sentence acquisition unit may also be described as a "unit that acquires a target sentence".
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by those skilled in the art that the scope of the application in the embodiments of the present disclosure is not limited to technical solutions formed by the specific combination of the above technical features, and also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the inventive concept, for example, technical solutions formed by substituting the above features with (but not limited to) technical features having similar functions disclosed in the embodiments of the present disclosure.

Claims (10)

1. A method for outputting information, comprising:
acquiring a target sentence, wherein the target sentence is a sentence input by a user through a search engine;
classifying the target sentences, and determining a pre-established semantic expression model corresponding to the classification result, wherein the semantic expression model is used for representing the corresponding relation between the sentences and the vectors;
determining a vector of the target sentence according to the determined semantic expression model;
outputting information related to the target sentence based on the determined vector;
responding to the detection of the click operation of a user on a search result page returned by a search engine according to the target sentence, and acquiring the title of the page corresponding to the click operation;
determining a vector of the header;
storing the target sentence and the vector of the title in association in a first data set;
and in response to determining that the first data set meets a preset condition, taking sentences in the first data set as input, taking vectors associated with the input sentences as expected output, and training the determined semantic expression model to obtain a target semantic expression model.
2. The method of claim 1, wherein the method further comprises:
acquiring a second data set, wherein the second data set comprises sentences and vectors corresponding to the sentences;
classifying sentences in the second data set to obtain at least one data subset;
and determining a semantic expression model corresponding to the at least one data subset according to the at least one data subset.
3. The method of claim 2, wherein the determining a semantic expression model corresponding to the at least one subset of data from the at least one subset of data comprises:
and for the data subsets in the at least one data subset, taking sentences in the data subsets as input, taking vectors corresponding to the input sentences in the data subsets as expected output, and training to obtain a semantic expression model corresponding to the data subsets.
4. The method of claim 2, wherein the determining a semantic expression model corresponding to the at least one subset of data from the at least one subset of data comprises:
selecting at least one sentence from the subset of data as a training sentence for a subset of data in the at least one subset of data; adding training sentences selected from other data subsets and vectors corresponding to the training sentences into the data subsets to obtain updated data subsets;
sentences in the updated data subsets are used as input, vectors corresponding to the input sentences are used as expected output, and the semantic expression model corresponding to the updated data subsets is trained.
5. An apparatus for outputting information, comprising:
a sentence acquisition unit configured to acquire a target sentence, the target sentence being a sentence input by a user through a search engine;
a model determining unit configured to classify the target sentence, and determine a pre-established semantic expression model corresponding to the classification result, wherein the semantic expression model is used for representing a corresponding relationship between the sentence and the vector;
a first vector determination unit configured to determine a vector of the target sentence according to the determined semantic expression model;
an information output unit configured to output information related to the target sentence based on the determined vector;
the title acquisition unit is configured to respond to detection of clicking operation of a user on a search result page returned by a search engine according to the target sentence, and acquire a title of the page corresponding to the clicking operation;
a second vector determination unit configured to determine a vector of the title;
a data storage unit configured to store the target sentence and the vector of the title in association in a first data set;
the first model training unit is configured to train the determined semantic expression model to obtain a target semantic expression model by taking sentences in the first data set as input and taking vectors associated with the input sentences as expected output in response to determining that the first data set meets preset conditions.
6. The apparatus of claim 5, wherein the apparatus further comprises:
a data acquisition unit configured to acquire a second data set, wherein the second data set includes a sentence and a vector corresponding to the sentence;
a sentence classification unit configured to classify sentences in the second data set to obtain at least one data subset;
a second model training unit configured to determine a semantic expression model corresponding to the at least one subset of data from the at least one subset of data.
7. The apparatus of claim 6, wherein the second model training unit is further configured to:
and for the data subsets in the at least one data subset, taking sentences in the data subsets as input, taking vectors corresponding to the input sentences in the data subsets as expected output, and training to obtain a semantic expression model corresponding to the data subsets.
8. The apparatus of claim 6, wherein the second model training unit is further configured to:
selecting at least one sentence from the subset of data as a training sentence for a subset of data in the at least one subset of data; adding training sentences selected from other data subsets and vectors corresponding to the training sentences into the data subsets to obtain updated data subsets;
sentences in the updated data subsets are used as input, vectors corresponding to the input sentences are used as expected output, and the semantic expression model corresponding to the updated data subsets is trained.
9. An electronic device, comprising:
one or more processors;
a storage device having one or more programs stored thereon,
when executed by the one or more processors, causes the one or more processors to implement the method of any of claims 1-4.
10. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-4.
CN201910243599.4A, filed 2019-03-28 (priority date 2019-03-28): Method and device for outputting information. Granted as CN111753080B, status: Active.

Priority Applications (1)

CN201910243599.4A, priority date 2019-03-28, filing date 2019-03-28: Method and device for outputting information (CN111753080B)


Publications (2)

CN111753080A (en), published 2020-10-09
CN111753080B (en), published 2023-08-22

Family

ID=72671610

Family Applications (1)

CN201910243599.4A (priority date 2019-03-28, filing date 2019-03-28): Method and device for outputting information, granted as CN111753080B (Active)

Country Status (1)

Country Link
CN (1) CN111753080B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107329949A (en) * 2017-05-24 2017-11-07 北京捷通华声科技股份有限公司 A kind of semantic matching method and system
CN107784048A (en) * 2016-11-14 2018-03-09 平安科技(深圳)有限公司 The problem of question and answer corpus sorting technique and device
CN108829719A (en) * 2018-05-07 2018-11-16 中国科学院合肥物质科学研究院 The non-true class quiz answers selection method of one kind and system
CN108932342A (en) * 2018-07-18 2018-12-04 腾讯科技(深圳)有限公司 A kind of method of semantic matches, the learning method of model and server
CN109522394A (en) * 2018-10-12 2019-03-26 北京奔影网络科技有限公司 Knowledge base question and answer system and method for building up

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9984772B2 (en) * 2016-04-07 2018-05-29 Siemens Healthcare Gmbh Image analytics question answering


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘飞龙, 郝文宁, 陈刚, 靳大尉, 宋佳星. 基于双线性函数注意力Bi-LSTM模型的机器阅读理解 (Machine reading comprehension based on a bilinear-function attention Bi-LSTM model). 计算机科学 (Computer Science), 2017 (S1). *

Also Published As

CN111753080A (en), published 2020-10-09


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant