CN108959256A - Generation method, device, storage medium and the terminal device of short text - Google Patents

Info

Publication number
CN108959256A
Authority
CN
China
Prior art keywords
short text
slot position
word
position word
template
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810712807.6A
Other languages
Chinese (zh)
Other versions
CN108959256B (en)
Inventor
王臻
刘家辰
肖欣延
吕雅娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810712807.6A
Publication of CN108959256A
Application granted
Publication of CN108959256B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a short text generation method, device, storage medium and terminal device. The method includes: obtaining slot words for generating a short text; extracting, according to the slot words, a short text template associated with the slot words, wherein the short text template is stored in advance in association with the slot words and comprises a short text with embedded slots; and filling the slot words into the slots of the short text template that match them, to generate the short text. With the present invention, the semantics of the generation process can be kept controllable.

Description

Short text generation method, device, storage medium and terminal device
Technical field
The present invention relates to the field of computer technology, and in particular to a short text generation method, device, storage medium and terminal device.
Background
With the development of the Internet, the volume of online information has expanded accordingly. As the amount of information keeps growing, manual writing and editing becomes inefficient if its depth and breadth must be guaranteed. Techniques that use machines to generate text automatically have therefore emerged; they can save much of the time and effort of human editors and improve writing and editing efficiency. However, machine-generated content is difficult to control and often fails to match the preset semantics. How to guarantee semantic controllability during text generation is therefore one of the problems that urgently needs to be solved.
Summary of the invention
Embodiments of the present invention provide a short text generation method, device, storage medium and terminal device, to solve or alleviate one or more of the above technical problems in the prior art.
In a first aspect, an embodiment of the present invention provides a short text generation method, comprising:
obtaining slot words for generating a short text;
extracting, according to the slot words, a short text template associated with the slot words, wherein the short text template is stored in advance in association with the slot words, and the short text template comprises a short text with embedded slots; and
filling the slot words into the slots of the short text template that match the slot words, to generate the short text.
With reference to the first aspect, in a first embodiment of the first aspect, the method further comprises:
determining, according to the text field and text attributes of a short text template to be generated, the keywords that the short text template contains;
retrieving candidate titles from the query titles of a search log according to the keywords that the short text template contains, wherein the search log records searches performed by a search engine and search results containing the query titles, and the query titles contain the keywords;
determining the entity words of the candidate titles according to entity word types; and
taking the entity words as slot words, removing the slot words from the candidate title to generate the short text template, and storing the slot words in association with the short text template.
With reference to the first embodiment of the first aspect, in a second embodiment of the first aspect, the method further comprises:
excluding, according to preset filter words for short text templates, the candidate titles that contain the filter words from the candidate titles.
With reference to the first embodiment of the first aspect, in a third embodiment of the first aspect, the method further comprises:
de-duplicating the retrieved candidate titles; and
de-duplicating the generated short text templates, and merging the slot words associated with the de-duplicated short text templates.
With reference to the first aspect or any embodiment thereof, in a fourth embodiment of the first aspect, the method further comprises:
forming training data from the slot words used to generate short texts and the short texts generated from those slot words; and
training a sequence generation model with the training data, wherein the sequence generation model is used to output a corresponding short text from input slot words.
With reference to the fourth embodiment of the first aspect, in a fifth embodiment of the first aspect, the sequence generation model comprises a sequence generation model based on an attention mechanism, and the training data further comprises the keywords contained in the short text template used when the short text is generated from the slot words.
With reference to the fourth embodiment of the first aspect, in a sixth embodiment of the first aspect, the sequence generation model comprises a variational autoencoder model, and the method further comprises:
encoding the slot words with the encoder of the variational autoencoder model to obtain the hidden vector of the short text corresponding to the slot words;
adjusting the hidden vector to obtain a generalized hidden vector; and
decoding the hidden vector and the generalized hidden vector with the decoder of the variational autoencoder model to obtain short texts.
With reference to the fourth embodiment of the first aspect, in a seventh embodiment of the first aspect, the method further comprises:
inputting the slot words in the training data into the sequence generation model for calculation; and
comparing the calculation result with the short texts in the training data to adjust the sequence generation model, wherein the calculation uses the optimization approach of grid beam search.
In a second aspect, an embodiment of the present invention provides a short text generation device, comprising:
a slot word obtaining module, configured to obtain slot words for generating a short text;
a short text template extraction module, configured to extract, according to the slot words, a short text template associated with the slot words, wherein the short text template is stored in advance in association with the slot words, and the short text template comprises a short text with embedded slots; and
a short text generation module, configured to fill the slot words into the slots of the short text template that match the slot words, to generate the short text.
With reference to the second aspect, in a first embodiment of the second aspect, the device further comprises:
a keyword determining module, configured to determine, according to the text field and text attributes of a short text template to be generated, the keywords that the short text template contains;
a candidate title retrieval module, configured to retrieve candidate titles from the query titles of a search log according to the keywords that the short text template contains, wherein the search log records searches performed by a search engine and search results containing the query titles, and the query titles contain the keywords;
an entity word determining module, configured to determine the entity words of the candidate titles according to entity word types; and
a short text template generation module, configured to take the entity words as slot words, remove the slot words from the candidate title to generate the short text template, and store the slot words in association with the short text template.
The functions of the device may be implemented in hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the short text generation device includes a processor and a memory. The memory is configured to store a program that supports the short text generation device in executing the short text generation method of the first aspect, and the processor is configured to execute the program stored in the memory. The short text generation device may further include a communication interface for communication between the short text generation device and other devices or communication networks.
In a third aspect, an embodiment of the present invention further provides a computer-readable storage medium for storing the computer software instructions used by the short text generation device, including the program involved in executing the short text generation method of the first aspect.
Any one of the above technical solutions has the following advantages or beneficial effects:
Embodiments of the present invention store in advance association pairs of slot words and short text templates, where a short text template is a short text containing slots. When the associated slot words are filled into the matching slots of the short text template, a short text with complete syntax and semantics is obtained. Because the short text is generated by filling slots, the semantics of the generation process can be kept controllable.
The above summary is for the purpose of the description only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments and features described above, further aspects, embodiments and features of the present invention will be readily apparent by reference to the drawings and the following detailed description.
Brief description of the drawings
In the drawings, unless otherwise specified, the same reference numerals denote the same or similar parts or elements throughout the several drawings. The drawings are not necessarily drawn to scale. It should be understood that the drawings depict only some embodiments disclosed according to the present invention and should not be taken as limiting the scope of the present invention.
Fig. 1 is a flow diagram of an embodiment of the short text generation method provided by the present invention;
Fig. 2 is a flow diagram of an embodiment of the method provided by the present invention for generating association pairs of short text templates and slot words;
Fig. 3 is a flow diagram of an embodiment of the short text generation method using a sequence generation model provided by the present invention;
Fig. 4 is a schematic diagram of an embodiment of the standard sequence generation model provided by the present invention;
Fig. 5 is a schematic diagram of an embodiment of the sequence generation model based on an attention mechanism provided by the present invention;
Fig. 6 is a flow diagram of an embodiment of the method for generating short texts provided by the present invention;
Fig. 7 is a schematic diagram of an embodiment of the variational autoencoder model provided by the present invention;
Fig. 8 is a flow diagram of an embodiment of the model training method provided by the present invention;
Fig. 9 is a schematic diagram of an embodiment of the grid beam search provided by the present invention;
Fig. 10 is a structural diagram of an embodiment of the short text generation device provided by the present invention;
Fig. 11 is a structural diagram of an embodiment of the terminal device provided by the present invention.
Detailed description of the embodiments
In the following, only certain exemplary embodiments are briefly described. As those skilled in the art will recognize, the described embodiments may be modified in various ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and the description are to be regarded as illustrative in nature and not restrictive.
Referring to Fig. 1, an embodiment of the present invention provides a short text generation method that can be applied to a terminal device. The terminal device may include a processor, a computer, a smartphone, a tablet and the like. This embodiment includes steps S100 to S300, as follows:
S100: obtain slot words for generating a short text.
In this embodiment, a short text may contain fewer words than a preset number, for example a sentence of fewer than 30, 40 or 50 words. Short texts include, but are not limited to, lists and titles. Taking the travel category as an example, short texts may include: "Hot-search list of XX tourist attractions for month X: XX ranks No. X, XXX takes first place", "Ranking of the top ten tourist attractions in XX: XX is No. X, XXX comes last", "Ranking of the top ten tourist attractions in XX province/city: which one do you want to visit?", "Where to go when travelling in XX? Take a quick look at the ranking of the ten most-searched attractions in XX for month X", and so on. Slot words may include entity words, proper nouns and the like, for example: Guangdong Province, spring, May.
S200: extract, according to the slot words, the short text template associated with the slot words. The terminal system contains a large number of short text templates, each stored in association with slot words; each short text template comprises a short text with embedded slots.
In this embodiment, a short text template may be the short text obtained by removing the entity words at specific positions in an original short text; these specific positions are the slots. For example, take the original short text d1 "A look at the top ten tourist attractions in China best suited to spring", in which the entity words are "China" and "spring". Removing the entity words at their positions in the original short text yields the short text template D1 "A look at the top ten tourist attractions in [entity: place] best suited to [entity: time]". Here [entity: place] and [entity: time] are the slots of the template, and the slot words K1 that match these slots are "China" and "spring".
S300: fill the slot words into the slots of the short text template that match the slot words, to generate the short text.
Continuing the example, suppose the association pair of short text template D1 and slot words K1 has been stored in advance. When slot words K1 are obtained, short text template D1 can be matched. According to the attribute of each word in K1 (the attribute of "China" is place and the attribute of "spring" is time), "China" and "spring" are filled into the slots [entity: place] and [entity: time] of template D1 "A look at the top ten tourist attractions in [entity: place] best suited to [entity: time]", yielding the short text "A look at the top ten tourist attractions in China best suited to spring".
By filling slots, this embodiment can form short texts with complete syntax and semantics, which makes the semantics of text generation controllable.
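Steps S100 to S300 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the in-memory `TEMPLATE_STORE`, the function names, and the rule that a template matches when its slot types equal the attributes of the supplied slot words are all assumptions made for the example.

```python
import re

# Hypothetical store: each template is kept together with slot words it was built from.
TEMPLATE_STORE = [
    {
        "template": "A look at the top ten tourist attractions in "
                    "[entity: place] best suited to [entity: time]",
        "slot_words": [{"place": "China", "time": "spring"}],
    },
]

SLOT_PATTERN = re.compile(r"\[entity: (\w+)\]")


def extract_templates(slot_words):
    """S200 (assumed matching rule): keep templates whose slot types
    are exactly the attributes of the given slot words."""
    wanted = set(slot_words)
    return [
        entry["template"]
        for entry in TEMPLATE_STORE
        if set(SLOT_PATTERN.findall(entry["template"])) == wanted
    ]


def fill_template(template, slot_words):
    """S300: put each slot word into the slot with the matching attribute."""
    return SLOT_PATTERN.sub(lambda m: slot_words[m.group(1)], template)


# S100: slot words keyed by their attribute.
slot_words = {"place": "Guangdong Province", "time": "May"}
short_texts = [fill_template(t, slot_words) for t in extract_templates(slot_words)]
```

Because every generated sentence is a stored template with only its slots changed, the wording outside the slots stays fixed, which is the sense in which the generation is semantically controllable.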
In one possible implementation, this embodiment requires that association pairs of short text templates and slot words be stored in advance. As shown in Fig. 2, this embodiment provides a method for generating association pairs of short text templates and slot words, which may include steps S410 to S440, as follows:
S410: determine, according to the text field and text attributes of the short text template to be generated, the keywords that the short text template contains.
In this embodiment, the text field may be a category name, such as tourism or education. Taking the tourism field as an example, the keywords that appear in short texts may include "tourism". Text attributes may include top ten, ranking, list, first place, hot search and rank; accordingly, the keywords that appear in short texts may include words such as "top ten", "ranking", "list", "first place", "hot search" and "rank".
S420: retrieve candidate titles from the query titles of a search log according to the keywords that the short text template contains.
In this embodiment, the search log records the searches performed by a search engine and the search results containing query titles, and the query titles contain keywords. The search log can therefore serve as the raw data to be searched.
In this embodiment, the candidate titles retrieved in step S420 may be split into clauses, and the clauses that contain the required keywords are selected as candidate titles.
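The retrieval and clause selection of step S420 might look like the sketch below. The keyword set, the punctuation used for clause splitting and the function names are assumptions for illustration; the source does not specify how clauses are delimited.

```python
import re

# Assumed output of S410 for the tourism field.
KEYWORDS = {"tourist", "top ten", "ranking"}


def split_clauses(title):
    """Split a title into clauses at common English and Chinese punctuation."""
    parts = re.split(r"[,，;；!！?？。]", title)
    return [p.strip() for p in parts if p.strip()]


def retrieve_candidates(query_titles, keywords):
    """Keep every clause of a query title that contains at least one keyword."""
    candidates = []
    for title in query_titles:
        for clause in split_clauses(title):
            if any(k in clause.lower() for k in keywords):
                candidates.append(clause)
    return candidates
```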
S430: determine the entity words of the candidate titles according to entity word types.
In this embodiment, the entities in a candidate title can be tagged. Entity word types may include place, time, person and so on. For example, for the candidate title "A look at the top ten tourist attractions in China best suited to spring", the place is determined to be "China" and the time to be "spring", so the entity words of this candidate title are "China" and "spring".
S440: take the entity words as slot words, remove the slot words from the candidate title to generate the short text template, and store the slot words in association with the short text template.
Continuing the example, the entity words "China" and "spring" are the slot words of the candidate title. Removing them from the candidate title yields the corresponding short text template: "A look at the top ten tourist attractions in [entity: place] best suited to [entity: time]". The slot words "China" and "spring" and this short text template are stored in association in the terminal device.
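Steps S430 and S440 can be illustrated with a lexicon lookup. The `ENTITY_LEXICON` and `make_template` names are hypothetical, and a plain substring lookup stands in for whatever entity tagger is actually used; a real tagger would also handle word boundaries and ambiguity.

```python
# Hypothetical lexicon mapping surface words to entity word types.
ENTITY_LEXICON = {
    "China": "place",
    "Guangdong Province": "place",
    "spring": "time",
    "May": "time",
}


def make_template(candidate_title, lexicon):
    """S430 + S440: find entity words and replace each with a typed slot.

    Returns the template and the slot words that were removed from it.
    """
    template = candidate_title
    slot_words = {}
    # Try longer entity words first so a short word cannot split a longer one.
    for word in sorted(lexicon, key=len, reverse=True):
        if word in template:
            entity_type = lexicon[word]
            template = template.replace(word, f"[entity: {entity_type}]")
            slot_words[entity_type] = word
    return template, slot_words


title = "A look at the top ten tourist attractions in China best suited to spring"
template, slot_words = make_template(title, ENTITY_LEXICON)
```

The returned pair is what would be stored as one association pair of short text template and slot words.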
In one possible implementation, this embodiment can specify words that must not appear in candidate titles (filter words). When selecting retrieved titles, the method may therefore further include: excluding, according to the preset filter words for short text templates, the candidate titles that contain filter words. Filter words can be set according to the needs of the actual application, for example words describing pornographic, outdated or violent content, or words unsuitable for children. Filter words can also be selected from candidate titles that do not meet the standard.
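A minimal sketch of the filter-word exclusion follows; the filter words shown are placeholders chosen for the example, since the source leaves the actual list to the application.

```python
# Placeholder filter words; the real list is application-specific.
FILTER_WORDS = {"violence", "gambling"}


def exclude_filtered(candidates, filter_words):
    """Drop every candidate title that contains any filter word."""
    return [
        title
        for title in candidates
        if not any(word in title.lower() for word in filter_words)
    ]
```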
In one possible implementation, de-duplication can be performed while the association pairs are generated. It may include: first, de-duplicating the retrieved candidate titles; second, de-duplicating the generated short text templates and merging the slot words associated with the de-duplicated templates.
In this embodiment, de-duplication keeps one copy of identical content. For example, of several identical candidate titles only one is kept after de-duplication, and of several identical short text templates only one is kept. Because identical short text templates may each be associated with different slot words, the slot words of the identical templates can be merged. After merging, sorting can be performed from high to low by the frequency with which the slot words appear in the same short text template.
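The template de-duplication, slot word merging and frequency sorting can be sketched as below. The reading that the slot words of one template are ordered by how often they occurred with it is an interpretation of the text, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def dedupe_and_merge(pairs):
    """pairs: iterable of (template, slot_words) with slot_words as a tuple.

    Identical templates collapse into one entry whose slot words are merged
    and ordered from most to least frequent.
    """
    merged = defaultdict(Counter)
    for template, slot_words in pairs:
        merged[template][slot_words] += 1
    return {
        template: [slot_words for slot_words, _ in counts.most_common()]
        for template, counts in merged.items()
    }
```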
In this embodiment, the short text templates can also be refined, for example by merging adjacent time and place entity words into a single entity word.
This embodiment can generate a large number of association pairs of short text templates and slot words in each text field. The short text templates in these association pairs can be used to generate short texts by filling in slot words.
In the field of translation, machine translation implemented with sequence generation models can greatly improve translation accuracy compared with traditional machine translation. Considering the flexibility of the decoding stage of sequence generation models, embodiments of the present invention can use the aforementioned slot words, together with the short texts generated from the slot words and short text templates, as training data for training a sequence generation model. Generating short texts with the trained sequence generation model not only keeps the semantics controllable but also produces a large number of short texts, increasing the richness of the generated short texts.
In one possible implementation, as shown in Fig. 3, an embodiment of the present invention provides a method for generating a sequence generation model, which may include steps S510 and S520, as follows:
S510: form training data from the slot words used to generate short texts and the short texts generated from those slot words.
S520: train a sequence generation model with the training data; the sequence generation model is used to output a corresponding short text from input slot words.
In this embodiment, the standard sequence generation model may be as shown in Fig. 4. Both the input (the source side, denoted S) and the output (the target side, denoted T) of a sequence generation model are text organized as sequences, such as slot words and short texts. The model aims to learn a mapping f: S -> T from the source side to the target side, for example the mapping from slot words to short texts. Once the model has learned this mapping, it can be applied to a wider range of unlabeled corpora (S' -> T') to solve practical problems. For example, slot words that have not been seen before, or whose specific meaning is unknown, can be input into the model to obtain short texts that meet the requirements.
In this embodiment, the training data can be organized as in the following example:
Source side: [entity: place] [entity: time], i.e. the slot words;
Target side: A look at the top ten tourist attractions in [entity: place] best suited to [entity: time], i.e. the short text.
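Building such source/target pairs from stored association pairs could be sketched as follows. The dictionary layout of `store` and the function name are assumptions; the source only states that the source side holds the slot words and the target side holds the short text.

```python
def build_training_pairs(store):
    """store: {template: [ {slot_type: slot_word, ...}, ... ]}.

    For every association pair, the source side is the list of slot words and
    the target side is the short text obtained by filling the template.
    """
    pairs = []
    for template, slot_word_sets in store.items():
        for slot_words in slot_word_sets:
            text = template
            for slot_type, word in slot_words.items():
                text = text.replace(f"[entity: {slot_type}]", word)
            pairs.append({"source": list(slot_words.values()), "target": text})
    return pairs
```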
However, the source-side information of a sequence generation model is highly compressed, so the target side has difficulty capturing more abstract information during generation. For example, with the slot words "China" and "spring", the relationship between the words is unclear and each word consists of very few characters, so it is difficult to capture more detailed information about "China" and "spring" when generating the short text.
For this reason, this embodiment can train on the training data with a sequence generation model based on an attention mechanism to generate short texts. In the generation stage this model uses not only the compressed source-side information but also the information of each linguistic unit of the source side. A schematic diagram of the model is shown in Fig. 5.
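How attention lets the decoder look at each source-side unit can be shown with a small generic sketch. Dot-product scoring is an assumption here; the source does not say which scoring function its attention mechanism uses, and the vectors are plain Python lists rather than learned states.

```python
import math


def attention(decoder_state, encoder_states):
    """Weight each source-side unit by its similarity to the decoder state.

    Returns the attention weights and the weighted sum (context vector).
    """
    scores = [
        sum(d * e for d, e in zip(decoder_state, state))
        for state in encoder_states
    ]
    # Numerically stable softmax over the scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context
```

The context vector is recomputed at every decoding step, so each generated word can draw on a different mix of the slot words instead of a single compressed summary.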
During model training, character-level and word-level language granularity can each be tried, or the two can be considered jointly, which enriches the meanings and interaction forms of linguistic units in the modelling process. Further, on top of the source-side fields above, the semantics that this embodiment attends to can be extended. The training data may include additional keywords, namely the words of the short text template that corresponds to the target-side short text. Such keywords, for example top ten, ranking and rank, can serve as source-side input in the training data.
In one possible implementation, the model provided by the preceding embodiment can largely solve the problem of short text richness without much loss of semantic controllability. However, generating several short texts of different forms from the same input places higher demands on the model's generative ability. To meet this requirement, the sequence generation model of this embodiment may include a variational autoencoder (Variational Auto-encoder, VAE) model. Slot words can be input into the sequence generation model to generate rich and varied short texts. As shown in Fig. 6, the method for generating short texts provided by this embodiment may include steps S610 to S630, as follows:
S610: encode the slot words with the encoder of the variational autoencoder model to obtain the hidden vector of the short text corresponding to the slot words.
S620: adjust the hidden vector to obtain a generalized hidden vector.
S630: decode the hidden vector and the generalized hidden vector with the decoder of the variational autoencoder model to obtain short texts.
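The flow of S610 to S630 can be sketched with the encoder and decoder passed in as callables. Adding Gaussian noise is only one assumed way of adjusting the hidden vector; the source does not specify the adjustment, and `generalize`, `generate_variants`, `scale` and `seed` are names introduced for this example.

```python
import random


def generalize(z, scale=0.1, n=3, seed=0):
    """S620 (assumed form): perturb the hidden vector with Gaussian noise
    to obtain n generalized hidden vectors."""
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, scale) for value in z] for _ in range(n)]


def generate_variants(slot_words, encode, decode, n=3, scale=0.1, seed=0):
    """encode and decode stand in for the trained VAE encoder and decoder."""
    z = encode(slot_words)                        # S610
    variants = generalize(z, scale, n, seed)      # S620
    return [decode(v) for v in [z] + variants]    # S630
```

Decoding the original vector together with its perturbed copies is what yields several differently worded short texts for the same slot words.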
In this embodiment, the variational autoencoder model may be as shown in Fig. 7. In Fig. 7, the recognition network denotes the posterior network, the prior network denotes the network that produces the prior, MLP stands for Multi-Layer Perceptron, and softmax is an activation function. The optimization used in training the variational autoencoder model differs from that of other models, which directly optimize a loss function on the generation probability. The variational autoencoder model additionally introduces the KL distance (the first term of the VAE objective, also called relative entropy) between the prior probability and the posterior probability of the hidden vector z obtained by encoding the input data. The purpose is to let training refer to the output information (expressed by the posterior probability of z) without letting this have too much influence on generation at inference time, when only the prior probability of z is available. The optimization objective is as follows:
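A standard conditional VAE training loss that agrees with the term-by-term description is sketched here. It is an assumption based on the symbols named in the text, not a reproduction of the patent's own formula; in particular the form of the posterior $q_\phi(z \mid T, S)$ and the parameter subscripts are assumed.

```latex
\mathcal{L}(\theta, \phi; S, T)
  = \mathrm{KL}\big( q_\phi(z \mid T, S) \,\|\, p(z \mid S) \big)
  - \mathbb{E}_{q_\phi(z \mid T, S)}\big[ \log p_\theta(T \mid z, S) \big]
```

Written as a loss, both terms are minimized: the first pulls the posterior of z toward the prior, and the second is the reconstruction error of the target short text T.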
First item in formula after equal signRelative entropy is the coding point of the slot position word of input The distance between cloth and prior distribution, also referred to as KL distance, the output distribution that can have measured encoder are uniformly distributed with center Tightness.Section 2 in formula after equal sign For reconstructed error, retouch Slot position word is stated by the information Loss Rate of coding the further decoding short text obtained and target short text, the lower numerical value the better.Numerical value Low presentation code quality is better.It is the posterior probability (coding distribution) of the slot position word of input.P (z | S) it is priori Distribution is that the ideal coding of encoder is distributed.P θ (T | z, S`) is the short essay of acquisition of the slot position word after encoding further decoding This output distribution probability, p (z | S`) are the output distribution probabilities of target short text.S is the slot position word of input, and T is When output short text, z is hidden vector, and S` is the slot position word for training input.
It is the hidden vector of nature sentence for the natural sentence of input.Optimization aim in this way not only ensure that deduction Generate the stage of short text and the consistency as far as possible of training stage, it is often more important that, inferring the stage for generating short text to not Sampling with hidden vector z can simulate the different expections for generating result, and the text of different style then can be generated.
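As a point of reference, an objective with the two terms described above can be written in the usual conditional VAE form. This is a sketch under assumptions: the encoder notation $q_\phi$, the conditioning of the posterior on both $S$ and $T$, and the use of $S$ in place of S` are not taken from the original formula and are chosen only to match the term-by-term description.

```latex
\mathcal{L}(\theta,\phi)
  = \mathrm{KL}\big(q_\phi(z \mid S, T)\,\|\,p(z \mid S)\big)
  + \mathbb{E}_{q_\phi(z \mid S, T)}\big[-\log p_\theta(T \mid z, S)\big]
```

Minimizing the first term pulls the posterior toward the prior, and minimizing the second term lowers the reconstruction error.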
For the organization of training data, this embodiment continues to use the foregoing scheme, i.e., the way training data is organized for the sequence generation model. Short texts generated in this manner can not only yield diverse titles for different slot words, but can also yield results of different styles for the same source by generalizing the latent vector z. Meanwhile, semantic controllability is maintained.
In one possible implementation, explicit semantics may also be introduced. Introducing explicit semantic constraints into the training process of the sequence generation model can improve the fluency and diversity of the short texts produced in the model's generation stage. The model training method provided by this embodiment, as shown in Fig. 8, may include steps S710 and S720, as follows:
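The idea of obtaining several styles from one source by varying z can be sketched as follows. This is a minimal illustration that assumes a diagonal Gaussian prior over z; the function name `sample_latents` and the values of `mu` and `sigma` are hypothetical and do not come from the patent.

```python
import random


def sample_latents(mu, sigma, n, seed=0):
    """Draw n latent vectors z = mu + sigma * eps, with eps ~ N(0, 1)."""
    rng = random.Random(seed)
    return [
        [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]
        for _ in range(n)
    ]


# Hypothetical prior parameters predicted for one set of slot words.
mu = [0.0, 0.5, -0.2]
sigma = [1.0, 0.3, 0.8]

# Each sampled z would be passed to the decoder together with the same
# slot words, giving one candidate title per sample.
samples = sample_latents(mu, sigma, n=3)
```

Setting every entry of `sigma` to zero collapses all samples onto `mu`, which corresponds to generating a single style.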
S710: input the slot words in the training data into the sequence generation model for calculation.
S720: compare the calculation result with the short text in the training data to adjust the sequence generation model, wherein the manner of calculation includes grid beam search optimization.
In this embodiment, the sequence generation model can generate rich and varied short texts, but its inherently probabilistic form also brings uncertainty to the generated result, so that certain semantic components are sometimes lost. The Grid Beam Search decoding process can introduce explicit semantics at the inference and generation stage, improving how well semantics are satisfied compared with traditional beam search. The Grid Beam Search decoding process may be as shown in Fig. 9. In the figure, "time steps" is the iteration time and "constraint number" is the number of semantic constraints. Unlike the flat generation of traditional beam search, Grid Beam Search uses the vertical dimension in the figure to track how many semantic constraints the generated text satisfies. Text that reaches the top row is therefore regarded as qualified text that satisfies all semantic constraints.
In this embodiment, if the model training stage and the stage of inferring and generating short text are treated differently, so that the model has no awareness of the semantic constraints during training but the constraints are added at inference, the fluency and constraint satisfaction of the generated results may suffer. For this reason, this embodiment proposes Grid Beam Search Optimization, which makes the model aware of explicit semantics during training. Specifically, the whole sentence of the short text is taken as the optimization target, and the sequence generation model is trained in combination with explicit semantic constraints. For example, consider the output of text substrings at one layer in Fig. 9 (the calculation result in step S720). If the target-side text substring (for example, the short text in step S720) is not contained in the text substrings output at that layer, the weight or score of the text substring consistent with the target side can be raised in that layer's search space, and the weights or scores of the text substrings output in that layer's search space can be suppressed. In this way, training and inference are fully unified.
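The bucketing of hypotheses by the number of satisfied constraints can be sketched on a toy model. This is a simplified illustration, not the decoder of Fig. 9: constraints are single tokens, the scorer `unigram` and its probabilities are hypothetical, and every bucket simply keeps its own top `beam_size` hypotheses.

```python
import math


def grid_beam_search(step_logprob, vocab, constraints, max_len, beam_size=2):
    """Simplified grid beam search with single-token constraints.

    Hypotheses are bucketed by how many constraints they already satisfy
    (the vertical axis of the grid); each bucket keeps its own beam.
    """
    n_cons = len(constraints)
    # grid[c] holds (score, tokens) pairs that satisfy exactly c constraints.
    grid = {0: [(0.0, ())]}
    for _ in range(max_len):
        nxt = {}
        for hyps in grid.values():
            for score, toks in hyps:
                for w in vocab:
                    new = toks + (w,)
                    met = sum(1 for c in constraints if c in new)
                    cand = (score + step_logprob(toks, w), new)
                    nxt.setdefault(met, []).append(cand)
        # Prune each bucket independently so constrained hypotheses survive.
        grid = {c: sorted(h, reverse=True)[:beam_size] for c, h in nxt.items()}
    top = grid.get(n_cons, [])
    return max(top)[1] if top else None


# Hypothetical unigram scorer: "Beijing" and "spots" are unlikely words.
LOGP = {
    "top": math.log(0.5),
    "ten": math.log(0.3),
    "spots": math.log(0.15),
    "Beijing": math.log(0.05),
}


def unigram(prefix, word):
    return LOGP[word]


constrained = grid_beam_search(unigram, list(LOGP), ["Beijing", "spots"], max_len=3)
unconstrained = grid_beam_search(unigram, list(LOGP), [], max_len=3)
```

Without constraints the search returns only the most probable tokens, while the constrained run is forced to return a hypothesis from the top bucket, which contains both required words.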
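One way to express "raise the target's score and suppress the beam's scores" is a margin loss computed per layer. The patent does not give the exact loss, so the sketch below borrows the margin form used in beam-search optimization as an assumption; `beam_violation_loss` and its arguments are hypothetical names.

```python
def beam_violation_loss(beam, gold_prefix, gold_score, margin=1.0):
    """Margin loss for one layer of the search.

    beam: list of (prefix, score) pairs kept at this layer.
    Returns 0 when the target prefix is still in the beam; otherwise a
    positive value whose gradient would raise the target's score and
    lower the score of the weakest hypothesis that was kept instead.
    """
    if any(prefix == gold_prefix for prefix, _ in beam):
        return 0.0
    worst_kept = min(score for _, score in beam)
    return max(0.0, margin + worst_kept - gold_score)


beam = [(("top", "ten"), -1.0), (("top", "spots"), -1.5)]
loss_missing = beam_violation_loss(beam, ("best", "ten"), gold_score=-3.0)
loss_present = beam_violation_loss(beam, ("top", "ten"), gold_score=-1.0)
```

In a real model the scores would be differentiable outputs of the sequence generation model, and the loss would be summed over layers of the grid.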
Embodiments of the present invention also provide an application example of the short text generation method, described with reference to Fig. 4, Fig. 5, Fig. 7 and Fig. 9 and using the generation of list titles as an example. It should be noted that embodiments of the present invention are not limited to this application scenario.
1. Logic of the application example
This example aims to generate list-type titles for given categories or fields. Taking the travel category as an example, typical list titles may be as follows:
(1) List title T1: "X-month XX tourist attractions hot-search list: XX ranks XX X, XXX takes first place".
(2) List title T2: "Top ten tourist attractions XX ranking: XX X, XXX comes last".
(3) List title T3: "XX province/city XX top ten tourist attractions ranking list: which one do you want to visit?".
(4) List title T4: "Where to go for XX tourism? Take a quick look at the 'X-month XX top ten hot-search attractions ranking list'".
This embodiment can automatically generate list titles under each category, which not only greatly reduces editors' workload but also effectively improves the vividness and richness of the list titles.
The semantic controllability of this embodiment is reflected in control over the keywords that a list title contains. For example, the keywords may include marker words such as place names, times or proper nouns, category keywords (e.g., tourism), and list-title keywords (e.g., ranking, top ten). These specific semantic markers can be incorporated into the generated title.
2. Technical framework
Taking the above application scenario as an example, this embodiment proposes four different technical solutions for semantic control, from shallow to deep.
(1) Generation of list title templates and association of list title templates with entity words
① Using the query titles recorded in the search log as raw data, set, according to the category name (e.g., travel), the keywords that must appear in list titles and corresponding query statements in that field, e.g., tourism, attractions. According to the attributes of list titles, also require that a list title in general contain at least one keyword such as top ten, ranking, list, first place, hot search or rank. Optionally, also set blacklist words (i.e., filter words) that must not appear in list titles;
② Split the list titles retrieved in the previous step into clauses. Each clause may also be screened according to requirements similar to the above. The list titles obtained are then de-duplicated to produce the original candidate list titles;
③ Perform entity tagging on each original candidate list title, and generalize the entity types of interest under the category to form pairs of title template and entities, for example:
Original candidate: Look at China's ten tourist attractions best suited to spring;
Title template: Look at [entity: place]'s ten tourist attractions best suited to [entity: time];
Entities (slot words): (China, spring).
④ De-duplicate identical title templates, merge the entities corresponding to each template, and sort the title templates in descending order of the number of entity occurrences.
⑤ Refine the templates. For example, merge adjacent time and place entities. Extract blacklist words from list titles that are generally judged to be poor, and then filter the title templates according to those blacklist words.
Through the above process, a large number of list title templates can be obtained for each category, and these templates can be instantiated by filling in their slots. This embodiment ensures semantic controllability by means of slot filling. However, because the form of a template is fixed, the list titles generated in this way may not be rich enough.
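Steps ② to ⑤ can be sketched as a small pipeline. This is an illustration under assumptions: the entity lexicon stands in for a real entity tagger, the slot notation `[entity:type]` is simplified, and templates are ranked by the number of distinct entity tuples rather than by raw occurrence counts. All names are hypothetical.

```python
from collections import defaultdict

# Hypothetical entity lexicon; a real system would use an entity tagger.
ENTITY_LEXICON = {
    "China": "place",
    "Japan": "place",
    "spring": "time",
    "autumn": "time",
}


def to_template(title):
    """Replace known entity words with typed slots; return (template, slot words)."""
    slots, out = [], []
    for tok in title.split():
        etype = ENTITY_LEXICON.get(tok)
        if etype:
            out.append(f"[entity:{etype}]")
            slots.append(tok)
        else:
            out.append(tok)
    return " ".join(out), tuple(slots)


def build_templates(titles, blacklist=()):
    """De-duplicate titles, merge slot words per template, rank templates."""
    merged = defaultdict(set)
    for title in dict.fromkeys(titles):  # order-preserving de-duplication
        if any(word in title for word in blacklist):
            continue
        template, slots = to_template(title)
        if slots:
            merged[template].add(slots)
    # Templates backed by more distinct entity tuples come first.
    return sorted(merged.items(), key=lambda kv: -len(kv[1]))


titles = [
    "top ten attractions in China for spring",
    "top ten attractions in Japan for autumn",
    "top ten attractions in China for spring",
    "ten best hotels in China",
]
ranked = build_templates(titles)
filtered = build_templates(titles, blacklist=("hotels",))
```

The returned pairs correspond to the stored association between each template and its slot words, which is what the later slot-filling step relies on.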
(2) Sequence generation model learning
In the field of translation, machine translation implemented with sequence generation model techniques can greatly improve translation accuracy compared with traditional machine translation. Considering the flexibility of the decoding stage of a sequence generation model, embodiments of the present invention may use the aforementioned slot words, together with the list titles generated from the slot words and list title templates, as training data for training a sequence generation model. This embodiment can then use the trained sequence generation model to generate list titles, which not only achieves semantic controllability but also produces a large number of list titles, improving the richness of the generated list titles.
The sequence generation models used in embodiments of the present invention include a standard sequence generation model, an attention-based sequence generation model, a VAE model, and the like. The standard sequence generation model is shown in Fig. 4, the attention-based sequence generation model in Fig. 5, and the VAE model in Fig. 7. Explicit semantic optimization may also be introduced during training of the above models, for example by adjusting the training process based on grid beam search optimization. The technical effects of these models have been described above and are not repeated here.
For the process of generating list titles, embodiments of the present invention can satisfy the given semantic constraints while performing well in fluency and diversity.
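The construction of training data from slot words and templates can be sketched as follows. This is a minimal illustration that assumes templates mark slots as `[entity:type]` and that slot words are stored per template in slot order; `fill` and `make_training_pairs` are hypothetical helper names.

```python
def fill(template, slot_words):
    """Fill slots from left to right with the given slot words.

    Raises ValueError if there are more slot words than slots.
    """
    out = template
    for word in slot_words:
        start = out.index("[entity:")
        end = out.index("]", start) + 1
        out = out[:start] + word + out[end:]
    return out


def make_training_pairs(template_store):
    """Build (source, target) pairs: slot words in, instantiated title out.

    template_store maps each template to a set of slot-word tuples.
    """
    pairs = []
    for template, slot_sets in template_store.items():
        for slots in sorted(slot_sets):
            pairs.append((" ".join(slots), fill(template, slots)))
    return pairs


store = {
    "top ten attractions in [entity:place] for [entity:time]": {
        ("China", "spring"),
        ("Japan", "autumn"),
    },
}
pairs = make_training_pairs(store)
```

Each pair gives the sequence generation model the slot words as input and the corresponding list title as the target output.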
Referring to Fig. 10, an embodiment of the present invention provides a short text generation apparatus, comprising:
a slot word obtaining module 100, configured to obtain slot words for generating a short text;
a short text template extraction module 200, configured to extract, according to the slot words, a short text template associated with the slot words; wherein the short text template is stored in advance in association with the slot words; the short text template includes a short text with embedded slots; and
a short text generation module 300, configured to fill the slot words into the slots in the short text template that match the slot words, to generate a short text.
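The interplay of the extraction and generation modules can be sketched as a small class. This is an illustration only: the class name, the `{name}` slot syntax and the association store are assumptions, and obtaining the slot words (module 100) is left to the caller.

```python
from string import Formatter


def slot_names(template):
    """Names of the slots embedded in a template such as 'in {place}'."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


class ShortTextGenerator:
    def __init__(self, associations):
        # associations: slot word -> set of templates stored with that word
        self.associations = associations

    def extract_templates(self, slot_words):
        """Module 200: templates associated with every given slot word."""
        sets = [self.associations.get(w, set()) for w in slot_words.values()]
        shared = set.intersection(*sets) if sets else set()
        # Keep only templates whose slots can all be filled.
        return sorted(t for t in shared if slot_names(t) <= set(slot_words))

    def generate(self, slot_words):
        """Module 300: fill the matching slots of each extracted template."""
        return [t.format(**slot_words) for t in self.extract_templates(slot_words)]


T1 = "Top ten attractions in {place} for {time}"
T2 = "Where to go in {time}?"
generator = ShortTextGenerator({"China": {T1}, "spring": {T1, T2}})
titles_full = generator.generate({"place": "China", "time": "spring"})
titles_time = generator.generate({"time": "spring"})
```

Because templates are looked up through the stored association, a slot word that was never stored with any template yields no output rather than an ill-formed title.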
In one possible implementation, the apparatus further comprises:
a keyword determining module, configured to determine the keywords that a short text template contains according to the text field and text attributes of the short text template to be generated;
a candidate title retrieval module, configured to retrieve candidate titles from the query titles of a search log according to the keywords that the short text template contains; wherein the search log records searches performed by a search engine and search results containing the query titles; the query titles contain the keywords;
an entity word determining module, configured to determine the entity words of the candidate titles according to entity word types; and
a short text template generation module, configured to take the entity words as slot words, remove the slot words from the candidate titles to generate the short text template, and store the slot words in association with the short text template.
The functions of the apparatus may be implemented by hardware, or by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions.
In one possible design, the structure of the short text generation apparatus includes a processor and a memory. The memory is configured to store a program that supports the apparatus in executing the short text generation method of the first aspect above, and the processor is configured to execute the program stored in the memory. The apparatus may further include a communication interface for communication between the apparatus and other devices or a communication network.
An embodiment of the present invention also provides a terminal device for generating short text. As shown in Fig. 11, the device includes a memory 21 and a processor 22, the memory 21 storing a computer program that can run on the processor 22. The processor 22 implements the short text generation method of the above embodiments when executing the computer program. There may be one or more memories 21 and processors 22.
The device further includes:
a communication interface 23, for communication between the processor 22 and external devices.
The memory 21 may include high-speed RAM, and may also include non-volatile memory, for example at least one disk memory.
If the memory 21, the processor 22 and the communication interface 23 are implemented independently, they may be connected to one another and communicate with one another through a bus. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is shown in Fig. 11, but this does not mean that there is only one bus or one type of bus.
Optionally, in a specific implementation, if the memory 21, the processor 22 and the communication interface 23 are integrated on a single chip, they may communicate with one another through an internal interface.
In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that a specific feature, structure, material or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. Moreover, the specific features, structures, materials or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those different embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means two or more, unless otherwise specifically defined.
Any process or method description in a flowchart or otherwise described herein may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process. The scope of the preferred embodiments of the present invention includes other implementations in which functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order depending on the functions involved, as should be understood by those skilled in the art to which the embodiments of the present invention belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example an ordered list of executable instructions for implementing logical functions, may be embodied in any computer-readable medium for use by, or in combination with, an instruction execution system, apparatus or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus or device and execute them). For the purposes of this specification, a "computer-readable medium" may be any means that can contain, store, communicate, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device.
The computer-readable medium of embodiments of the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. More specific examples (a non-exhaustive list) of computer-readable storage media include at least the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). In addition, the computer-readable storage medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting or otherwise processing it in a suitable manner if necessary, and then stored in a computer memory.
In embodiments of the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by, or in combination with, an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the above.
It should be understood that each part of the present invention may be implemented in hardware, software, firmware or a combination thereof. In the above embodiments, multiple steps or methods may be implemented with software or firmware that is stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, they may be implemented by any one or a combination of the following techniques well known in the art: a discrete logic circuit having logic gates for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gates, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art will understand that all or part of the steps carried by the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one of the steps of the method embodiments or a combination thereof.
In addition, the functional units in the embodiments of the present invention may be integrated in one processing module, may each exist physically on their own, or two or more units may be integrated in one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. The storage medium may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any person skilled in the art can readily conceive of various changes or substitutions within the technical scope disclosed by the present invention, and these shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (12)

1. A short text generation method, characterized by comprising:
obtaining slot words for generating a short text;
extracting, according to the slot words, a short text template associated with the slot words; wherein the short text template is stored in advance in association with the slot words; the short text template includes a short text with embedded slots; and
filling the slot words into the slots in the short text template that match the slot words, to generate a short text.
2. The short text generation method according to claim 1, characterized in that the method further comprises:
determining the keywords that a short text template contains according to the text field and text attributes of the short text template to be generated;
retrieving candidate titles from the query titles of a search log according to the keywords that the short text template contains; wherein the search log records searches performed by a search engine and search results containing the query titles; the query titles contain the keywords;
determining the entity words of the candidate titles according to entity word types; and
taking the entity words as slot words, removing the slot words from the candidate titles to generate the short text template, and storing the slot words in association with the short text template.
3. The short text generation method according to claim 2, characterized in that the method further comprises:
excluding, according to preset filter words for short text templates, candidate titles that contain the filter words from the candidate titles.
4. The short text generation method according to claim 2, characterized in that the method further comprises:
de-duplicating the retrieved candidate titles; and
de-duplicating the generated short text templates, and merging the slot words associated with the de-duplicated short text templates.
5. The short text generation method according to any one of claims 1 to 4, characterized by further comprising:
forming training data from the slot words used to generate short texts and the short texts generated according to the slot words;
training with the training data to obtain a sequence generation model, the sequence generation model being used to output a corresponding short text from input slot words.
6. The short text generation method according to claim 5, characterized in that the sequence generation model includes an attention-based sequence generation model, and the training data further includes the keywords contained in the short text template used when generating the short text according to the slot words.
7. The short text generation method according to claim 5, characterized in that the sequence generation model includes a variational autoencoder model; and the method further comprises:
encoding the slot words with the encoder of the variational autoencoder model to obtain the latent vector of the short text corresponding to the slot words;
adjusting the latent vector to obtain a generalized latent vector; and
decoding the latent vector and the generalized latent vector with the decoder of the variational autoencoder model to obtain short texts.
8. The short text generation method according to claim 5, characterized in that the method further comprises:
inputting the slot words in the training data into the sequence generation model for calculation;
comparing the calculation result with the short text in the training data to adjust the sequence generation model; wherein the manner of calculation includes grid beam search optimization.
9. A short text generation apparatus, characterized by comprising:
a slot word obtaining module, configured to obtain slot words for generating a short text;
a short text template extraction module, configured to extract, according to the slot words, a short text template associated with the slot words; wherein the short text template is stored in advance in association with the slot words; the short text template includes a short text with embedded slots; and
a short text generation module, configured to fill the slot words into the slots in the short text template that match the slot words, to generate a short text.
10. The short text generation apparatus according to claim 9, characterized in that the apparatus further comprises:
a keyword determining module, configured to determine the keywords that a short text template contains according to the text field and text attributes of the short text template to be generated;
a candidate title retrieval module, configured to retrieve candidate titles from the query titles of a search log according to the keywords that the short text template contains; wherein the search log records searches performed by a search engine and search results containing the query titles; the query titles contain the keywords;
an entity word determining module, configured to determine the entity words of the candidate titles according to entity word types; and
a short text template generation module, configured to take the entity words as slot words, remove the slot words from the candidate titles to generate the short text template, and store the slot words in association with the short text template.
11. A terminal device for generating short text, characterized in that the terminal device comprises:
one or more processors;
a storage device, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the short text generation method according to any one of claims 1 to 8.
12. A computer-readable storage medium storing a computer program, characterized in that the program, when executed by a processor, implements the short text generation method according to any one of claims 1 to 8.
CN201810712807.6A 2018-06-29 2018-06-29 Short text generation method and device, storage medium and terminal equipment Active CN108959256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810712807.6A CN108959256B (en) 2018-06-29 2018-06-29 Short text generation method and device, storage medium and terminal equipment

Publications (2)

Publication Number Publication Date
CN108959256A true CN108959256A (en) 2018-12-07
CN108959256B CN108959256B (en) 2023-04-07

Family

ID=64485036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810712807.6A Active CN108959256B (en) 2018-06-29 2018-06-29 Short text generation method and device, storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN108959256B (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109960749A (en) * 2019-02-22 2019-07-02 清华大学 Model acquisition methods, keyword generation method, device, medium and calculating equipment
CN110209838A (en) * 2019-06-10 2019-09-06 广东工业大学 A kind of text template acquisition methods and relevant apparatus
CN110287461A (en) * 2019-05-24 2019-09-27 北京百度网讯科技有限公司 Text conversion method, device and storage medium
CN110309507A (en) * 2019-05-30 2019-10-08 深圳壹账通智能科技有限公司 Testing material generation method, device, computer equipment and storage medium
CN110727782A (en) * 2019-10-22 2020-01-24 苏州思必驰信息科技有限公司 Question and answer corpus generation method and system
CN110766085A (en) * 2019-10-28 2020-02-07 北京声智科技有限公司 Slot position recognition model training method and device based on user-defined scene
CN110929505A (en) * 2019-11-28 2020-03-27 贝壳技术有限公司 Method and device for generating house source title, storage medium and electronic equipment
CN111241789A (en) * 2020-01-14 2020-06-05 平安科技(深圳)有限公司 Text generation method and device
CN111241832A (en) * 2020-01-15 2020-06-05 北京百度网讯科技有限公司 Core entity labeling method and device and electronic equipment
CN111401044A (en) * 2018-12-27 2020-07-10 北京字节跳动网络技术有限公司 Title generation method and device, terminal equipment and storage medium
CN111414103A (en) * 2019-01-04 2020-07-14 百度在线网络技术(北京)有限公司 Method and device for generating instruction
CN111488450A (en) * 2020-04-08 2020-08-04 北京字节跳动网络技术有限公司 Method and device for generating keyword library and electronic equipment
CN112036164A (en) * 2020-09-17 2020-12-04 深圳市欢太科技有限公司 Sample generation method and device, computer-readable storage medium and electronic device
CN112232052A (en) * 2020-10-23 2021-01-15 中国平安人寿保险股份有限公司 Text splicing method and device, computer equipment and storage medium
CN112597748A (en) * 2020-12-18 2021-04-02 深圳赛安特技术服务有限公司 Corpus generation method, apparatus, device and computer readable storage medium
CN113010768A (en) * 2019-12-19 2021-06-22 北京搜狗科技发展有限公司 Data processing method and device and data processing device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103268339A (en) * 2013-05-17 2013-08-28 中国科学院计算技术研究所 Recognition method and system of named entities in microblog messages
US20170206897A1 (en) * 2016-01-18 2017-07-20 Alibaba Group Holding Limited Analyzing textual data
CN107943774A (en) * 2017-11-20 2018-04-20 北京百度网讯科技有限公司 article generation method and device
CN107832229A (en) * 2017-12-03 2018-03-23 中国直升机设计研究所 A kind of system testing case automatic generating method based on NLP

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
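Sun Ling et al.: "Dynamic Topic Model Based on Variational Autoencoder" (基于变分自动编码器的动态主题模型), 《河北工业科技》 (Hebei Journal of Industrial Science and Technology), no. 06, 15 November 2017 (2017-11-15), pages 422 - 424 *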

Cited By (22)

Publication number Priority date Publication date Assignee Title
CN111401044A (en) * 2018-12-27 2020-07-10 北京字节跳动网络技术有限公司 Title generation method and device, terminal equipment and storage medium
CN111414103B (en) * 2019-01-04 2021-11-16 百度在线网络技术(北京)有限公司 Method and device for generating instruction
CN111414103A (en) * 2019-01-04 2020-07-14 百度在线网络技术(北京)有限公司 Method and device for generating instruction
CN109960749A (en) * 2019-02-22 2019-07-02 清华大学 Model acquisition methods, keyword generation method, device, medium and calculating equipment
CN110287461A (en) * 2019-05-24 2019-09-27 北京百度网讯科技有限公司 Text conversion method, device and storage medium
CN110287461B (en) * 2019-05-24 2023-04-18 北京百度网讯科技有限公司 Text conversion method, device and storage medium
CN110309507A (en) * 2019-05-30 2019-10-08 深圳壹账通智能科技有限公司 Testing material generation method, device, computer equipment and storage medium
CN110209838A (en) * 2019-06-10 2019-09-06 广东工业大学 A kind of text template acquisition methods and relevant apparatus
CN110727782A (en) * 2019-10-22 2020-01-24 苏州思必驰信息科技有限公司 Question and answer corpus generation method and system
CN110766085A (en) * 2019-10-28 2020-02-07 北京声智科技有限公司 Slot position recognition model training method and device based on user-defined scene
CN110929505B (en) * 2019-11-28 2021-04-16 北京房江湖科技有限公司 Method and device for generating house source title, storage medium and electronic equipment
CN110929505A (en) * 2019-11-28 2020-03-27 贝壳技术有限公司 Method and device for generating house source title, storage medium and electronic equipment
CN113010768A (en) * 2019-12-19 2021-06-22 北京搜狗科技发展有限公司 Data processing method and device and data processing device
CN113010768B (en) * 2019-12-19 2024-03-19 北京搜狗科技发展有限公司 Data processing method and device for data processing
CN111241789A (en) * 2020-01-14 2020-06-05 平安科技(深圳)有限公司 Text generation method and device
CN111241832A (en) * 2020-01-15 2020-06-05 北京百度网讯科技有限公司 Core entity labeling method and device and electronic equipment
CN111241832B (en) * 2020-01-15 2023-08-15 北京百度网讯科技有限公司 Core entity labeling method and device and electronic equipment
CN111488450A (en) * 2020-04-08 2020-08-04 北京字节跳动网络技术有限公司 Method and device for generating keyword library and electronic equipment
CN112036164A (en) * 2020-09-17 2020-12-04 深圳市欢太科技有限公司 Sample generation method and device, computer-readable storage medium and electronic device
CN112232052A (en) * 2020-10-23 2021-01-15 中国平安人寿保险股份有限公司 Text splicing method and device, computer equipment and storage medium
CN112597748A (en) * 2020-12-18 2021-04-02 深圳赛安特技术服务有限公司 Corpus generation method, apparatus, device and computer readable storage medium
CN112597748B (en) * 2020-12-18 2023-08-11 深圳赛安特技术服务有限公司 Corpus generation method, corpus generation device, corpus generation equipment and computer-readable storage medium

Also Published As

Publication number Publication date
CN108959256B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN108959256A (en) Generation method, device, storage medium and the terminal device of short text
CN110795543B (en) Unstructured data extraction method, device and storage medium based on deep learning
CN110633409B (en) Automobile news event extraction method integrating rules and deep learning
CN110097085A (en) Lyrics document creation method, training method, device, server and storage medium
CN109271506A (en) A kind of construction method of the field of power communication knowledge mapping question answering system based on deep learning
CN110502621A (en) Answering method, question and answer system, computer equipment and storage medium
KR100533810B1 (en) Semi-Automatic Construction Method for Knowledge of Encyclopedia Question Answering System
CN109492099A (en) It is a kind of based on field to the cross-domain texts sensibility classification method of anti-adaptive
CN111581474B (en) Evaluation object extraction method of case-related microblog comments based on multi-head attention system
CN104346438B (en) Based on big data data management service system
CN107944027A (en) Create the method and system of semantic key index
CN103488724A (en) Book-oriented reading field knowledge map construction method
JP2012198277A (en) Document reading-aloud support device, document reading-aloud support method, and document reading-aloud support program
CN107797998A (en) The recognition methods of user-generated content containing rumour and device
CN106844341A (en) News in brief extracting method and device based on artificial intelligence
CN109815491A (en) Answer methods of marking, device, computer equipment and storage medium
CN108304373A (en) Construction method, device, storage medium and the electronic device of semantic dictionary
CN107193882A (en) Why not query answer methods based on figure matching on RDF data
CN111553159B (en) Question generation method and system
CN107832439A (en) Method, system and the terminal device of more wheel state trackings
CN110008309A (en) A kind of short phrase picking method and device
CN107943940A (en) Data processing method, medium, system and electronic equipment
CN110516145A (en) Information searching method based on sentence vector coding
CN109753650A (en) A kind of Laotian name place name entity recognition method merging multiple features
CN107193941A (en) Story generation method and device based on picture content

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant