CN113377884B - Event corpus purification method based on multi-agent reinforcement learning - Google Patents

Event corpus purification method based on multi-agent reinforcement learning

Info

Publication number
CN113377884B
CN113377884B
Authority
CN
China
Prior art keywords
training
data
agent
model
reinforcement learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110773927.9A
Other languages
Chinese (zh)
Other versions
CN113377884A (en)
Inventor
后敬甲
王悦
白璐
崔丽欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central University of Finance and Economics
Original Assignee
Central University of Finance and Economics
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central University of Finance and Economics
Priority to CN202110773927.9A
Publication of CN113377884A
Application granted
Publication of CN113377884B
Active legal status
Anticipated expiration legal status

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to an event corpus purification method based on multi-agent reinforcement learning. Before model training starts, the environment and the agents are initialized and reset, and the corresponding training parameters are set; the agents perform purification and optimization actions in the environment to produce the data required for training, and this data is sampled and stored in a data buffer for subsequent training; when the amount of data in the buffer reaches a set value, the data is used to start training and updating the real networks of all agents; after the real networks have been updated, the target networks of all agents are updated by copying parameters at intervals; these steps are repeated until the preset number of training iterations is reached. By purifying and optimizing the labeled data, the method alleviates the label-noise problem that the sequence-labeling joint extraction model encounters during training and improves the performance of the event entity-relation joint extraction task.

Description

Event corpus purification method based on multi-agent reinforcement learning
Technical Field
The invention relates to the field of multi-agent reinforcement learning methods, and in particular to an event corpus purification method based on multi-agent reinforcement learning.
Background
Reinforcement learning is a machine learning method that, according to the number of agents, can be divided into single-agent reinforcement learning and multi-agent reinforcement learning (MARL). MARL has wider application scenarios and is a key tool for solving many real-world problems. Depending on the relationships among the agents' tasks, MARL settings can be divided into fully cooperative tasks, fully competitive tasks, and mixed tasks; only fully cooperative tasks are considered here.
In MARL training under a fully cooperative task, the agents aim to maximize the joint reward: each agent selects actions according to its own policy, executes them in the environment, receives the corresponding reward and feedback, and updates its policy. These steps are repeated until the joint reward converges to its maximum, at which point each agent has reached the optimal policy for the current environment.
At present, the MADDPG (Multi-Agent Deep Deterministic Policy Gradient) algorithm is one of the more advanced reinforcement learning methods for multi-agent environments. It addresses the difficulty of applying traditional value-based algorithms (such as DQN) in continuous environments, incorporates deep learning to improve the training efficiency of the traditional policy-gradient algorithm (DPG), and introduces an experience replay pool and a 'centralized training' mechanism to further improve the training effect.
However, MADDPG still suffers from poor exploration of the joint solution space and from suboptimality: in a multi-agent reinforcement learning environment, as the number of agents increases, the size of the joint policy space grows exponentially. This reduces how completely the agents' policy space is explored during training, so the training result tends to converge to a globally suboptimal solution and a better training effect cannot be achieved.
Entity-relation extraction refers to detecting entity mentions in unstructured text and recognizing the semantic relations between them at the same time. Conventional entity-relation extraction handles the task in a serial manner: the entities are extracted first, and their relations are identified afterwards. The serial approach is simple, and the two subtasks are independent and flexible, each handled by its own sub-model, but the correlation between the two subtasks is ignored.
Joint entity-relation extraction uses a single model to combine entity recognition and relation extraction, so entity information and relation information can be integrated effectively, achieving better results than the serial extraction method; however, entities and relations still have to be extracted separately, which causes the model to produce extra redundant information.
To solve the problem of joint extraction models producing extra redundant information, research has proposed converting the joint extraction task into a labeling task: by defining tags that carry relation information, entities and their relations can be extracted directly with a sequence labeling model, so that entities and relations no longer need to be identified separately.
The sequence-labeling joint extraction model is an efficient joint event extraction model, but its training requires a large amount of high-quality labeled data. Distant supervision can label data automatically and effectively, but it relies on the assumption that if two entities hold a relation in a given corpus, then every sentence containing both entities expresses that relation. As a result, a large part of the labeled data set generated by distant supervision suffers from label noise, which adversely affects the joint extraction model.
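To make the source of this noise concrete, the following minimal Python sketch (purely illustrative; the sentences, the knowledge-base triple, and the function name are invented for the example) applies the distant-supervision rule literally and shows how a sentence that merely mentions both entities receives a wrong relation label:

```python
from typing import List, Tuple

def distant_label(sentences: List[str],
                  kb_triples: List[Tuple[str, str, str]]) -> List[Tuple[str, str]]:
    """Apply the distant-supervision rule: label every sentence that contains
    both entities of a knowledge-base triple with that triple's relation."""
    labeled = []
    for sent in sentences:
        for head, relation, tail in kb_triples:
            if head in sent and tail in sent:
                # The rule fires even when the sentence does not actually express
                # the relation -- this is exactly where label noise comes from.
                labeled.append((sent, relation))
    return labeled

if __name__ == "__main__":
    kb = [("Apple", "founded_by", "Steve Jobs")]
    sents = [
        "Steve Jobs founded Apple in 1976.",                      # correctly labeled
        "Steve Jobs was photographed holding an Apple product.",  # noisy label
    ]
    print(distant_label(sents, kb))
```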
Disclosure of Invention
In view of the above technical problems, the invention provides an event corpus purification method based on multi-agent reinforcement learning.
In order to solve the problems in the prior art, the invention provides a multi-agent reinforcement learning-based event corpus purification method, which comprises the following steps:
before model training starts, the environment and the agents are initialized and reset, and the corresponding training parameters are set;
the agents perform purification and optimization actions in the environment to produce the data required for training, and this data is sampled and stored in a data buffer for subsequent training;
when the amount of data in the data buffer reaches a set value, the data is used to start training and updating the real networks of all agents;
after the real networks have been updated, the target networks of all agents are updated by copying parameters at intervals;
the above steps are repeated until the preset number of training iterations is reached.
Preferably, initializing and resetting the environment and the agents before model training starts and setting the corresponding training parameters specifically includes: preprocessing the data of the event corpus and feeding the corpus into the multi-agent reinforcement learning model as its environment parameters.
Preferably, the step in which the agents perform purification and optimization actions in the environment to produce the data required for training, and this data is sampled and stored in a data buffer for subsequent training, specifically includes:
the multi-agent reinforcement learning model generates an action set for the agent group from the input environment parameters;
the agent group executes the action set, selecting the corresponding event knowledge from the corpus to form an event knowledge set;
the event knowledge set is mapped into word vectors, which are input into the sequence labeling joint model;
the sequence labeling joint model labels the input word vectors, compares the result against a test set to verify the event purification effect of the current multi-agent reinforcement learning model, and outputs evaluation indices.
Preferably, the step of using the data to train and update the real networks of all agents when the amount of data in the data buffer reaches the set value specifically includes:
converting the evaluation index into a reward value according to a preset reward function, and feeding the reward back into the training of the multi-agent reinforcement learning model to optimize it.
Preferably, after the real networks have been updated, updating the target networks of all agents by copying parameters at intervals further includes:
extracting the network parameters of each layer of each agent as parameter vectors, subtracting the per-layer parameter vectors one by one to obtain the pairwise parameter-vector differences among the agents, and multiplying each difference by a differentiation factor before feeding it back to the updated agent, thereby completing the agent's final update.
Compared with the prior art, the event corpus purification method based on multi-agent reinforcement learning has the following beneficial effects:
1. In the multi-agent reinforcement learning environment, improving the degree of exploration of the joint policy space improves the training effect of the multi-agent reinforcement learning algorithm and optimizes the multi-agent reinforcement learning model;
2. The invention extracts the parameters of each agent's sub-networks into parameter vectors that represent the agent's policy; by maximizing the differences among these parameter vectors, it reduces redundant policy exploration among the agents and increases the degree of exploration of the joint policy solution space;
3. Based on the optimized multi-agent reinforcement learning model, the labeled data is purified and optimized, alleviating the label-noise problem that the sequence-labeling joint extraction model encounters during training and improving the performance of the event entity-relation joint extraction task.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 is a flowchart of an event corpus purification method based on multi-agent reinforcement learning according to an embodiment of the present invention.
Fig. 2 is a training flowchart of an event corpus purification method based on multi-agent reinforcement learning according to an embodiment of the present invention.
Fig. 3 is a flow chart of a data sampling part of an event corpus purifying method based on multi-agent reinforcement learning according to an embodiment of the present invention.
Fig. 4 is a network updating flowchart of an event corpus purifying method based on multi-agent reinforcement learning according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of a sequence labeling model of an event corpus purification method based on multi-agent reinforcement learning according to an embodiment of the present invention.
Fig. 6 is another flowchart of an event corpus purifying method based on multi-agent reinforcement learning according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As shown in FIG. 1, the invention provides an event corpus purification method based on multi-agent reinforcement learning, which comprises the following steps:
S1. before model training starts, the environment and the agents are initialized and reset, and the corresponding training parameters are set;
S2. the agents perform purification and optimization actions in the environment to produce the data required for training, and this data is sampled and stored in a data buffer for subsequent training;
S3. when the amount of data in the data buffer reaches a set value, the data is used to train and update the real networks of all agents;
S4. after the real networks have been updated, the target networks of all agents are updated by copying parameters at intervals;
S5. the above steps are repeated until the preset number of training iterations is reached.
The event corpus purification method based on multi-agent reinforcement learning provided by the invention mainly comprises the following stages: training environment and parameter initialization, data sampling, real network training, and target network updating.
The invention mainly consists of two models: a multi-agent reinforcement learning model based on a policy exploration method that maximizes neural-network parameter-vector differences, and a sequence labeling joint model based on a Bi-LSTM-CRF structure.
In the invention, the sequence labeling module serves as the effect-verification and reward-feedback part of the corpus purification model. The selected model structure is shown in fig. 5 and mainly comprises two layers: a Bi-LSTM layer and a CRF layer.
The Bi-LSTM model performs very well on sequence labeling tasks: it can effectively combine and exploit long-range context information while retaining a neural network's ability to fit nonlinear data. However, because its optimization objective is to find the most probable tag at each time step and then assemble those tags into a sequence, the model's output tag sequence is often internally inconsistent.
To a certain extent, the CRF model complements the strengths and weaknesses of the Bi-LSTM model. Its advantage is that it can scan the whole input text through feature templates, so it gives more consideration to linear weighted combinations of local features over the whole text, and its optimization objective is the sequence with the highest overall probability rather than the most probable tag at each position. Its disadvantages are, first, that selecting features for the training corpus requires some prior knowledge: the features that strongly influence labeling must be identified from statistics of the relevant information in the corpus, too many features cause the model to overfit, too few cause it to underfit, and deciding how to combine features is difficult work; second, during training the CRF model is limited by the window size specified by the feature template, so long-range context information is hard to take into account.
Given the characteristics of the two models, the Bi-LSTM-CRF model combining them is selected: a linear-chain CRF layer is added on top of the hidden layer of the traditional Bi-LSTM model. This combined model is used as the sequence labeling module of the invention to verify the training effect of the corpus purification model, and the training result is fed back into the training of the corpus purification model to optimize it.
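The following PyTorch sketch shows the general shape of such a Bi-LSTM-CRF tagger (a minimal illustration, not the patent's implementation; it assumes the third-party pytorch-crf package, and the vocabulary size, tag set and dimensions are placeholders):

```python
# Minimal Bi-LSTM-CRF sketch (illustrative only).
# Assumes `pip install pytorch-crf`; all sizes are placeholders.
import torch
import torch.nn as nn
from torchcrf import CRF

class BiLSTMCRF(nn.Module):
    def __init__(self, vocab_size: int, num_tags: int,
                 embed_dim: int = 100, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim // 2, batch_first=True,
                            bidirectional=True)          # Bi-LSTM layer
        self.emission = nn.Linear(hidden_dim, num_tags)  # per-token tag scores
        self.crf = CRF(num_tags, batch_first=True)       # linear-chain CRF layer

    def loss(self, tokens, tags, mask):
        feats = self.emission(self.lstm(self.embed(tokens))[0])
        return -self.crf(feats, tags, mask=mask)         # negative log-likelihood

    def decode(self, tokens, mask):
        feats = self.emission(self.lstm(self.embed(tokens))[0])
        return self.crf.decode(feats, mask=mask)         # best tag sequence per sentence
```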
As shown in fig. 5, initializing and resetting the environment and the agents before model training starts and setting the corresponding training parameters specifically includes: preprocessing the data of the event corpus and feeding the corpus into the multi-agent reinforcement learning model as its environment parameters.
As shown in fig. 6, the step in which the agents perform purification and optimization actions in the environment to produce the data required for training, and this data is sampled and stored in a data buffer for subsequent training, specifically includes:
the multi-agent reinforcement learning model generates an action set for the agent group from the input environment parameters;
the agent group executes the action set, selecting the corresponding event knowledge from the corpus to form an event knowledge set;
the event knowledge set is mapped into word vectors, which are input into the sequence labeling joint model;
the sequence labeling joint model labels the input word vectors, compares the result against a test set to verify the event purification effect of the current multi-agent reinforcement learning model, and outputs evaluation indices.
As shown in fig. 6, the step of using the data to train and update the real networks of all agents when the amount of data in the data buffer reaches the set value specifically includes:
converting the evaluation index into a reward value according to a preset reward function, and feeding the reward back into the training of the multi-agent reinforcement learning model to optimize it.
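The patent only states that a preset reward function maps the evaluation index to a reward; one simple choice, shown below as an assumption rather than the patent's actual function, is to reward the agent group with the scaled change in the tagging model's F1 score on the held-out test set:

```python
def f1_score(precision: float, recall: float) -> float:
    """Standard F1 computed from precision and recall."""
    return 0.0 if precision + recall == 0 else 2 * precision * recall / (precision + recall)

def reward_from_eval(curr_f1: float, prev_f1: float, scale: float = 10.0) -> float:
    """Hypothetical joint reward shared by all agents: scaled improvement of the F1 index."""
    return scale * (curr_f1 - prev_f1)
```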
Data sampling and agent network updating are described in detail as follows:
as shown in fig. 2, the detailed steps of data sampling are as follows:
step 1-1: initializing sampling process parameters: maximum data storage amount max-ep-length, sampled and stored data amount t=1;
step 1-2: acquiring a state X of a current environment, wherein X is a vector formed by a series of environment parameters;
step 1-3: each Agent i takes An environmental state X as input, generates An action Ai through the operation of a real Actor network in the Agent i, and all actions selected by the agents form An action group A (A1, A2, …, an);
step 1-4: all agents perform their respective actions in the current environment, namely: in the environment state X, executing An action group A (A1, A2, …, an) to obtain a new environment state as X', and simultaneously obtaining a combined rewards value R;
step 1-5: obtaining a complete data tuple (X, A, R, X') and storing the complete data tuple in the data cache pool D;
step 1-6: updating the current environmental state: x' - > X;
step 1-7: the steps are executed until the data exchange amount in the data cache pool D reaches the maximum data storage amount, namely: and when t > max-epi-length, ending data sampling and starting learning.
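A minimal sketch of this sampling loop (illustrative only; the environment interface env.state()/env.step() and the agents' act() method are assumptions, not taken from the patent):

```python
from collections import deque

def sample_episode(env, agents, buffer: deque, max_epi_length: int):
    """Fill the data cache pool D following steps 1-1 to 1-7 (interfaces are assumed)."""
    t = 1                                           # Step 1-1: initialize the counter
    x = env.state()                                 # Step 1-2: current environment state X
    while t <= max_epi_length:                      # Step 1-7: stop at the maximum storage amount
        a = [agent.act(x) for agent in agents]      # Step 1-3: actions from the real Actor networks
        x_next, r = env.step(a)                     # Step 1-4: new state X' and joint reward R
        buffer.append((x, a, r, x_next))            # Step 1-5: store the tuple (X, A, R, X') in D
        x = x_next                                  # Step 1-6: X' -> X
        t += 1
    return buffer

# Usage sketch:
# buffer = deque(maxlen=100_000)
# sample_episode(env, agents, buffer, max_epi_length=1024)
```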
As shown in fig. 3, the detailed steps of the agent network update are as follows (a code sketch of the differentiation and soft-update steps is given after the list):
For each Agent i among all agents, the following operations are performed:
Step 2-1: randomly sample a minibatch of data tuples (X, A, R, X') from the data cache pool D, where the minibatch size can be set independently;
Step 2-2: calculate the target Q value from the randomly sampled data tuples;
Step 2-3: update the real Critic network of Agent i by minimizing the loss function, which is computed from the actual Q value and the target Q value;
Step 2-4: update the real Actor network of Agent i by gradient descent, computing the policy gradient of the model network;
Step 2-5: extract the parameter vectors of Agent i's Actor network and Critic network, denoted Mi and Ni respectively;
Step 2-6: compute the differences between the parameter vectors of Agent i and Agent (i-1), denoted Sub-Mi and Sub-Ni;
Step 2-7: multiply Sub-Mi and Sub-Ni by the differentiation factor β and feed the results back to update the corresponding original networks;
Step 2-8: repeat the above steps until all agents have finished updating their real networks;
Step 2-9: update the target networks of all agents by soft update, i.e. the parameters of the real networks are copied into the target networks at intervals.
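The differentiation and soft-update steps (2-5 to 2-9) can be sketched as follows in PyTorch (an illustration under the assumption that each agent object exposes actor and critic networks plus target counterparts; this is not the patent's code):

```python
import torch

@torch.no_grad()
def differentiate(agents, beta: float = 0.01):
    """Steps 2-5 to 2-7: feed the pairwise parameter-vector difference back into each agent."""
    for i in range(1, len(agents)):
        for net_name in ("actor", "critic"):                 # parameter vectors Mi and Ni
            cur = getattr(agents[i], net_name)
            prev = getattr(agents[i - 1], net_name)          # assumes identical architectures
            for p_cur, p_prev in zip(cur.parameters(), prev.parameters()):
                p_cur += beta * (p_cur - p_prev)             # (Sub-Mi / Sub-Ni) scaled by beta

@torch.no_grad()
def soft_update(target_net, real_net, tau: float = 0.01):
    """Step 2-9: theta' <- tau * theta + (1 - tau) * theta'."""
    for p_t, p_r in zip(target_net.parameters(), real_net.parameters()):
        p_t.mul_(1.0 - tau).add_(tau * p_r)
```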
The target Q value is:

y_j = r + γ · Q′_i(X′, a′_1, …, a′_n)

where X is the environment-state characterization parameter, a_i is an action, Q is the Q-value function whose parameters are the state and the actions a_i, r is the reward value R, γ is the decay factor, and the primed quantities are computed with the target networks;

The loss function is:

L(θ_i) = (1/S) · Σ_j ( y_j − Q_i(X_j, a_1^j, …, a_n^j) )²

where S is the total number of agents in the environment, y_j is the agent's target Q value, and Q_i is the agent's actual Q value;

The policy gradient is:

∇_{θ_i} J ≈ (1/S) · Σ_j ∇_{θ_i} μ_i(o_i^j) · ∇_{a_i} Q_i(X_j, a_1^j, …, a_i, …, a_n^j) |_{a_i = μ_i(o_i^j)}

where μ is the agent policy and o_i is the input parameter of the policy network;

Copying the parameters of the real networks into the target networks at intervals uses the following formula:

θ′_i ← τ · θ_i + (1 − τ) · θ′_i

where θ refers to the network parameters and τ refers to the parameter-copy coefficient used when the networks are updated.
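As an illustration of how the target Q value, the loss function and the policy gradient fit together in one update step, the following PyTorch sketch (not the patent's code; the agent objects, their networks and their optimizers are assumed) updates Agent i's Critic and Actor on one sampled minibatch:

```python
import torch
import torch.nn.functional as F

def update_agent_i(i, batch, agents, gamma: float = 0.95):
    # batch holds tensors sampled from the cache pool D:
    # x, x_next: global state; acts: list of per-agent action tensors; r: joint reward
    x, acts, r, x_next = batch
    ag = agents[i]

    # Target Q value: y = r + gamma * Q'_i(X', a'_1, ..., a'_n), actions from the target Actors
    with torch.no_grad():
        next_acts = [other.target_actor(x_next) for other in agents]
        y = r + gamma * ag.target_critic(x_next, torch.cat(next_acts, dim=-1))

    # Loss function: squared difference between the actual and target Q values
    q = ag.critic(x, torch.cat(acts, dim=-1))
    critic_loss = F.mse_loss(q, y)
    ag.critic_opt.zero_grad(); critic_loss.backward(); ag.critic_opt.step()

    # Policy gradient: ascend Q_i with agent i's own action produced by its current Actor
    acts_pg = [a.detach() for a in acts]
    acts_pg[i] = ag.actor(x)
    actor_loss = -ag.critic(x, torch.cat(acts_pg, dim=-1)).mean()
    ag.actor_opt.zero_grad(); actor_loss.backward(); ag.actor_opt.step()
```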
The sequence-labeling joint extraction model is an efficient joint model for event entity-relation extraction, but its training requires a large amount of high-quality labeled data. Distant supervision can effectively increase the amount of labeled data, but the labeled data set it generates suffers from label noise, which adversely affects the model. To address this problem, the method purifies and optimizes the labeled data on the basis of the improved multi-agent reinforcement learning model, thereby alleviating the label-noise problem that the sequence-labeling joint extraction model encounters during training and improving the performance of the event entity-relation joint extraction task.
The embodiment of the invention provides an event corpus purification method based on multi-agent reinforcement learning. In the multi-agent reinforcement learning environment, each agent consists of a multi-layer neural network whose layer parameters are the parameters that generate the agent's current policy. On top of the original MADDPG training procedure, after an agent's policy has been updated, the network parameters of each of its layers are extracted as parameter vectors; the per-layer parameter vectors are then subtracted one by one to obtain the pairwise parameter-vector differences among the agents, and each difference is multiplied by a differentiation factor and fed back to the updated agent, completing the agent's final update. By maximizing the neural-network parameter-vector differences in this way, the agents' exploration of the joint policy space during training is enlarged, so the training result moves closer to the globally optimal solution.
Multi-agent reinforcement learning (MARL) is a key tool for solving many real-world problems, but reinforcement learning algorithms in multi-agent environments face a typical problem: as the number of agents increases, the joint policy solution space grows exponentially, making poor exploration of the policy space and suboptimal policies hard to avoid. By studying policy-space exploration methods, the invention improves the agents' exploration efficiency over the joint policy solution space and increases its degree of exploration, so the explored region tends further toward covering the full policy solution space and the current optimal policy moves closer to the global optimum.
The agents in a group explore the policy solution space independently, and a random exploration process cannot avoid covering parts of that space repeatedly, which reduces exploration efficiency to some extent. The invention proposes a policy exploration method that maximizes neural-network parameter-vector differences: the parameter vectors making up each agent's neural network are extracted, the agent group's exploration of the policy solution space is considered jointly, and repeated exploration of the policy solution space is avoided to some extent by maximizing the differences among the agents' parameter vectors. This increases the degree of exploration of the joint policy solution space, so that it tends further toward covering the full policy solution space, improving the training effect relative to the original algorithm and improving the model.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order; these words may be interpreted as names.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (3)

1. An event corpus purification method based on multi-agent reinforcement learning, characterized by comprising the following steps:
before model training starts, the environment and the agents are initialized and reset, and the corresponding training parameters are set;
the multi-agent reinforcement learning model generates an action set for the agent group from the input environment parameters;
the agent group executes the action set, selecting the corresponding event knowledge from the corpus to form an event knowledge set;
the event knowledge set is mapped into word vectors, which are input into a sequence labeling joint model with a Bi-LSTM-CRF structure;
the sequence labeling joint model with the Bi-LSTM-CRF structure labels the input word vectors, compares the result against a test set to verify the event purification effect of the current multi-agent reinforcement learning model, and outputs an evaluation index;
when the amount of data in the data buffer reaches a set value, the data is used to start training and updating the real networks of all agents;
after the real networks have been updated, the target networks of all agents are updated by copying parameters at intervals;
the network parameters of each layer of each agent are extracted as parameter vectors, the per-layer parameter vectors are subtracted one by one to obtain the pairwise parameter-vector differences among the agents, and each difference is multiplied by a differentiation factor and fed back to the updated agent, completing the agent's final update;
the above steps are repeated until the preset number of training iterations is reached.
2. The event corpus purification method based on multi-agent reinforcement learning according to claim 1, wherein initializing and resetting the environment and the agents before model training starts and setting the corresponding training parameters specifically comprises: preprocessing the data of the event corpus and feeding the corpus into the multi-agent reinforcement learning model as its environment parameters.
3. The event corpus purification method based on multi-agent reinforcement learning according to claim 1, wherein the step of using the data to train and update the real networks of all agents when the amount of data in the data buffer reaches the set value specifically comprises:
converting the evaluation index into a reward value according to a preset reward function, and feeding the reward back into the training of the multi-agent reinforcement learning model to optimize it.
CN202110773927.9A 2021-07-08 2021-07-08 Event corpus purification method based on multi-agent reinforcement learning Active CN113377884B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110773927.9A CN113377884B (en) 2021-07-08 2021-07-08 Event corpus purification method based on multi-agent reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110773927.9A CN113377884B (en) 2021-07-08 2021-07-08 Event corpus purification method based on multi-agent reinforcement learning

Publications (2)

Publication Number Publication Date
CN113377884A CN113377884A (en) 2021-09-10
CN113377884B true CN113377884B (en) 2023-06-27

Family

ID=77581381

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110773927.9A Active CN113377884B (en) 2021-07-08 2021-07-08 Event corpus purification method based on multi-agent reinforcement learning

Country Status (1)

Country Link
CN (1) CN113377884B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114897168A (en) * 2022-06-20 2022-08-12 支付宝(杭州)信息技术有限公司 Fusion training method and system of wind control model based on knowledge representation learning

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160035A (en) * 2019-12-31 2020-05-15 北京明朝万达科技股份有限公司 Text corpus processing method and device
CN112487811A (en) * 2020-10-21 2021-03-12 上海旻浦科技有限公司 Cascading information extraction system and method based on reinforcement learning

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804715A (en) * 2018-07-09 2018-11-13 北京邮电大学 Merge multitask coordinated recognition methods and the system of audiovisual perception
JP7043373B2 (en) * 2018-09-18 2022-03-29 ヤフー株式会社 Information processing equipment, information processing methods, and programs
CN110008332B (en) * 2019-02-13 2020-11-10 创新先进技术有限公司 Method and device for extracting main words through reinforcement learning
CN109978176B (en) * 2019-03-05 2021-01-19 华南理工大学 Multi-agent cooperative learning method based on state dynamic perception
CN110110086A (en) * 2019-05-13 2019-08-09 湖南星汉数智科技有限公司 A kind of Chinese Semantic Role Labeling method, apparatus, computer installation and computer readable storage medium
CN110807069B (en) * 2019-10-23 2022-06-07 华侨大学 Entity relationship joint extraction model construction method based on reinforcement learning algorithm
CN110990590A (en) * 2019-12-20 2020-04-10 北京大学 Dynamic financial knowledge map construction method based on reinforcement learning and transfer learning
CN111312354B (en) * 2020-02-10 2023-10-24 东华大学 Mammary gland medical record entity identification marking enhancement system based on multi-agent reinforcement learning
CN111382575A (en) * 2020-03-19 2020-07-07 电子科技大学 Event extraction method based on joint labeling and entity semantic information
CN112541339A (en) * 2020-08-20 2021-03-23 同济大学 Knowledge extraction method based on random forest and sequence labeling model
CN112801290B (en) * 2021-02-26 2021-11-05 中国人民解放军陆军工程大学 Multi-agent deep reinforcement learning method, system and application

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160035A (en) * 2019-12-31 2020-05-15 北京明朝万达科技股份有限公司 Text corpus processing method and device
CN112487811A (en) * 2020-10-21 2021-03-12 上海旻浦科技有限公司 Cascading information extraction system and method based on reinforcement learning

Also Published As

Publication number Publication date
CN113377884A (en) 2021-09-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant