CN113128238A - Financial information semantic analysis method and system based on natural language processing technology

Info

Publication number
CN113128238A
Authority
CN
China
Prior art keywords
module
company
natural language
language processing
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110469467.0A
Other languages
Chinese (zh)
Other versions
CN113128238B (en)
Inventor
方正平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Zhiyuxin Information Technology Co ltd
Original Assignee
Anhui Zhiyuxin Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Zhiyuxin Information Technology Co ltd
Priority to CN202110469467.0A
Publication of CN113128238A
Application granted
Publication of CN113128238B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/169 Annotation, e.g. comment data or footnotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a financial information semantic analysis method and system based on natural language processing technology, and relates to the technical field of natural language processing. In the method and system, the BERT model parameters in the BERT + CRF module are first fixed while the CRF-related model parameters are trained; after a good effect is obtained, the BERT model and the CRF model are fine-tuned together, and this combination gives a high recognition rate. An adding module adds KW-E special characters before and after the keywords transmitted by the splitting summary module, and a splicing module then splices the label name and the summary together with SEP characters. The BERT model in the network connection module feeds the output word vectors into a two-layer fully connected neural network, which is followed by 195 sigmoid binary classification tasks. The corrected financial event labels are associated with companies through the docking module, so that the system efficiency is improved.

Description

Financial information semantic analysis method and system based on natural language processing technology
Technical Field
The invention relates to the technical field of natural language processing, in particular to a financial information semantic analysis method and a financial information semantic analysis system based on natural language processing technology.
Background
Today, tens of thousands of items of financial public opinion data about companies are generated every day, and it is difficult for people to extract and digest this information in a short time. Natural language processing techniques can automatically structure such data in a short time to facilitate human analysis. Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies the theories and methods that enable effective communication between humans and computers using natural language, and it integrates linguistics, computer science and mathematics.
In current financial information semantic analysis, entity recognition is mainly based on the BiLSTM + CRF model, whose recognition rate is low. In label classification, most systems assign labels that do not depend on a specific object, and the number of labels is generally between 10 and 30; label classification without such dependence cannot map labels to the companies they concern. For this reason, those skilled in the art propose a financial information semantic analysis method and system based on natural language processing technology.
Disclosure of Invention
(I) technical problem to be solved
In view of the defects of the prior art, the invention provides a financial information semantic analysis method and system based on natural language processing technology, which solves the following problems: the recognition rate of BiLSTM is low; label classification without object dependence cannot map labels to companies; and too few labels cannot satisfy business requirements.
(II) technical scheme
In order to achieve the purpose, the invention is realized by the following technical scheme: a financial information semantic analysis method based on natural language processing technology specifically comprises the following steps:
S1, first, a batch of news data is collected from the network by the data acquisition module; the deduplication module then removes duplicates from the collected news data using simhash, ensuring that the deduplicated data falls within the interval of 9000 to 10001; the splitting summary module then splits each news item into individual sentences, which serve as summary sentences, marks the positions of the company name characters in the summary sentences, and transmits the data to both the BIO labeling module and the adding module;
S2, the BIO labeling module converts the positions of the company name characters into BIO labels: the first character of each company name is labeled B, the remaining characters of the name are labeled I, and all other characters in the sentence are labeled O. The BERT model parameters in the BERT + CRF module are first fixed while the CRF-related model parameters are trained; after a good effect is obtained, the BERT model and the CRF model are fine-tuned together, finally yielding a good F1 score. The F1 score combines the precision and the recall of the model, and a larger F1 score indicates a higher-quality model. The extraction module then extracts the company names in the data (which may be full names, short names or brand names);
S3, the database screening module matches each extracted company name to a full company name in the existing company database; when the database screening module cannot find the full company name corresponding to the company name, the network screening module expands the screening range to the network until the full company name corresponding to the extracted company name is found. The result corresponding module then extracts the full company name and transmits it to the splicing module. Meanwhile, the adding module adds KW-E special characters before and after the keywords transmitted from the splitting summary module, and the splicing module splices the label name and the summary together with SEP characters. The BERT model in the network connection module feeds the output word vectors into a two-layer fully connected neural network, which is followed by 195 sigmoid binary classification tasks. The sigmoid function is a common S-shaped function in biology; because it is monotonically increasing and its inverse is also monotonically increasing, it is often used as a threshold function in neural networks, mapping a variable to a value between 0 and 1. The docking module associates the corrected financial event labels with the company. Finally, the data collection module counts all labels of the company over a period of time, and the risk calculation module calculates the marketing risk index of the company according to the weights of the labels.
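The simhash deduplication in step S1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the character-bigram features, the 64-bit fingerprint, the MD5 token hash and the Hamming-distance threshold of 3 are all assumptions, since the source only names simhash.

```python
import hashlib

def simhash(text: str, bits: int = 64) -> int:
    """Simhash fingerprint built from character bigrams (feature choice is an assumption)."""
    tokens = [text[i:i + 2] for i in range(max(len(text) - 1, 1))]
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64-bit hash of the token
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A fingerprint bit is set when the votes for that bit are positive.
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

def deduplicate(articles, max_distance: int = 3):
    """Keep an article only if no previously kept article is within max_distance bits."""
    kept, fingerprints = [], []
    for article in articles:
        fp = simhash(article)
        if all(hamming(fp, seen) > max_distance for seen in fingerprints):
            kept.append(article)
            fingerprints.append(fp)
    return kept
```

The pairwise comparison is quadratic in the number of kept articles, which is acceptable for a batch of roughly ten thousand items; larger collections would normally index fingerprints by bit blocks instead.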
A financial information semantic analysis system based on natural language processing technology comprises a data preprocessing unit, wherein a first output end of the data preprocessing unit is connected with the input end of an entity identification unit, a second output end of the data preprocessing unit is connected with a first input end of a label classification unit, the output end of the entity identification unit is connected with the input end of an entity linking unit, the output end of the entity linking unit is connected with a second input end of the label classification unit, and the output end of the label classification unit is connected with the input end of a risk calculation unit.
Preferably, the data preprocessing unit comprises a data acquisition module, a deduplication module and a splitting summary module, wherein the output end of the data acquisition module is connected with the input end of the deduplication module, and the output end of the deduplication module is connected with the input end of the splitting summary module.
Preferably, the entity identification unit includes a BIO labeling module, a BERT + CRF module, and an extracting module, an output end of the BIO labeling module is connected to an input end of the BERT + CRF module, and an output end of the BERT + CRF module is connected to an input end of the extracting module.
Preferably, the entity linking unit includes a database screening module, a network screening module and a result corresponding module, a first output end of the database screening module is connected with an input end of the network screening module, a second output end of the database screening module is connected with a first input end of the result corresponding module, and an output end of the network screening module is connected with a second input end of the result corresponding module.
Preferably, the label classification unit comprises an adding module, a splicing module, a network connection module and a docking module, wherein the output end of the adding module is connected with the first input end of the splicing module, the output end of the splicing module is connected with the input end of the network connection module, and the output end of the network connection module is connected with the input end of the docking module.
Preferably, the risk calculation unit comprises a data collection module and a risk calculation module.
Preferably, the output end of the splitting abstract module is connected with the input end of the adding module.
Preferably, the output end of the result corresponding module is connected with the second input end of the splicing module.
Preferably, the output end of the data collection module is connected with the input end of the risk calculation module.
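The unit-level connections described above can be read as a directed dataflow graph. The sketch below encodes only the five units and the connections stated in the text; the identifier names are illustrative, and the topological sort is simply one way to derive an execution order from those connections.

```python
from graphlib import TopologicalSorter

# Each unit maps to the set of units whose outputs feed its inputs.
UNIT_INPUTS = {
    "data_preprocessing": set(),
    "entity_identification": {"data_preprocessing"},
    "entity_linking": {"entity_identification"},
    "label_classification": {"data_preprocessing", "entity_linking"},
    "risk_calculation": {"label_classification"},
}

def execution_order(unit_inputs):
    """Return the units in an order where every unit runs after its inputs."""
    return list(TopologicalSorter(unit_inputs).static_order())

print(execution_order(UNIT_INPUTS))
```

Because the label classification unit needs both the preprocessed summaries and the linked company names, it can only run after the entity linking unit has finished.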
(III) advantageous effects
The invention provides a financial information semantic analysis method and system based on natural language processing technology. The method has the following beneficial effects:
(1) In the financial information semantic analysis method and system based on the natural language processing technology, the BERT model parameters in the BERT + CRF module are first fixed while the CRF-related model parameters are trained; after a good effect is obtained, the BERT model and the CRF model are fine-tuned together, finally yielding a good F1 score, and the extraction module extracts the company names in the data. Combining the BERT model with the CRF model thus gives a higher recognition rate and improves the recognition effect.
(2) In the financial information semantic analysis method and system based on the natural language processing technology, the adding module adds KW-E special characters before and after the keywords transmitted by the splitting summary module, the splicing module then splices the label name and the summary together with SEP characters, and the corrected financial event labels are associated with a company through the network connection module and the docking module. By adding special symbols and classifying labels in this way, the financial event labels are rapidly associated with the company, and the system efficiency is improved.
(3) In the financial information semantic analysis method and system based on the natural language processing technology, the BERT model in the network connection module feeds the output word vectors into a two-layer fully connected neural network, which is followed by 195 sigmoid binary classification tasks; these 195 sigmoid binary classification tasks improve the accuracy with which financial event labels are associated with companies.
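The classification head described in (3) can be illustrated with a small pure-Python sketch. It assumes a fixed-length vector already produced by the BERT model; the layer sizes, the ReLU activation, the random untrained weights and the 0.5 decision threshold are all assumptions introduced for illustration, and only the two fully connected layers and the 195 independent sigmoid outputs come from the text.

```python
import math
import random

NUM_LABELS = 195  # one independent binary task per label, as stated in the text

def sigmoid(x: float) -> float:
    """Numerically stable logistic function, mapping a real value to between 0 and 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def init_layer(n_in: int, n_out: int, rng: random.Random):
    """Random weights and zero biases (untrained, for illustration only)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(vector, layer):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def classify(vector, layer1, layer2, threshold: float = 0.5):
    """Two fully connected layers, then one sigmoid per label."""
    hidden = [max(0.0, h) for h in dense(vector, layer1)]  # ReLU is an assumption
    probs = [sigmoid(z) for z in dense(hidden, layer2)]
    active = [i for i, p in enumerate(probs) if p >= threshold]
    return probs, active

rng = random.Random(0)
layer1 = init_layer(8, 16, rng)            # 8 stands in for the BERT vector size
layer2 = init_layer(16, NUM_LABELS, rng)
probs, active = classify([0.5] * 8, layer1, layer2)
print(len(probs), len(active))
```

Using one sigmoid per label rather than a single softmax lets several event labels fire for the same sentence, which matches the multi-label setting the text describes.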
Drawings
FIG. 1 is a system schematic block diagram of the system of the present invention;
FIG. 2 is a system schematic block diagram of a data preprocessing unit of the present invention;
FIG. 3 is a system schematic block diagram of an entity identification unit of the present invention;
FIG. 4 is a system schematic block diagram of a label classification unit of the present invention;
FIG. 5 is a system schematic block diagram of the entity linking unit of the present invention;
FIG. 6 is a system schematic block diagram of a risk calculation unit of the present invention;
In the figures: 1. data preprocessing unit; 2. entity identification unit; 3. label classification unit; 4. entity linking unit; 5. risk calculation unit; 6. data acquisition module; 7. deduplication module; 8. splitting summary module; 9. BIO labeling module; 10. BERT + CRF module; 11. extraction module; 12. database screening module; 13. network screening module; 14. adding module; 15. splicing module; 16. network connection module; 17. docking module; 18. data collection module; 19. risk calculation module; 20. result corresponding module.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1-6, the embodiment of the present invention provides two technical solutions:
Embodiment I,
The financial information semantic analysis method based on the natural language processing technology specifically comprises the following steps:
S1, first, a batch of news data is collected from the network by the data acquisition module 6; the deduplication module 7 then removes duplicates from the collected news data using simhash, ensuring that the deduplicated data falls within the interval of 9000 to 10001; the splitting summary module 8 then splits each news item into individual sentences, which serve as summary sentences, marks the positions of the company name characters in the summary sentences, and transmits the data to both the BIO labeling module 9 and the adding module 14;
S2, the BIO labeling module 9 converts the positions of the company name characters into BIO labels: the first character of each company name is labeled B, the remaining characters of the name are labeled I, and all other characters in the sentence are labeled O. The BERT model parameters in the BERT + CRF module 10 are first fixed while the CRF-related model parameters are trained; after a good effect is obtained, the BERT model and the CRF model are fine-tuned together, finally yielding an F1 score of 93.15. The extraction module 11 then extracts the company names in the data (which may be full names, short names or brand names);
S3, the database screening module 12 matches each extracted company name to a full company name in the existing company database; when the database screening module 12 cannot find the full company name corresponding to the company name, the network screening module 13 expands the screening range to the network until the full company name corresponding to the extracted company name is found. The result corresponding module 20 then extracts the full company name and transmits it to the splicing module 15. Meanwhile, the adding module 14 adds KW-E special characters before and after the keywords transmitted from the splitting summary module 8, and the splicing module 15 splices the label name and the summary together with SEP characters. The BERT model in the network connection module 16 feeds the output word vectors into a two-layer fully connected neural network, which is followed by 195 sigmoid binary classification tasks, and the docking module 17 associates the corrected financial event labels with the company. Finally, the data collection module 18 counts all labels of the company over a period of time, and the risk calculation module 19 calculates the marketing risk index of the company according to the weights of the labels.
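The BIO conversion in step S2, and the reverse step of reading company names back out of a tag sequence, can be sketched as follows. Tagging is per character, as in the text; the example sentence, the company names and the character offsets are made up for illustration, and the BERT + CRF model that would actually predict the tags is not shown.

```python
def to_bio(sentence: str, spans):
    """Convert (start, end) character spans, end exclusive, into per-character B/I/O tags."""
    tags = ["O"] * len(sentence)
    for start, end in spans:
        tags[start] = "B"                 # first character of a company name
        for i in range(start + 1, end):
            tags[i] = "I"                 # remaining characters of the name
    return tags

def extract_entities(sentence: str, tags):
    """Collect the character runs that start with B and continue with I."""
    entities, current = [], []
    for ch, tag in zip(sentence, tags):
        if tag == "B":
            if current:
                entities.append("".join(current))
            current = [ch]
        elif tag == "I" and current:
            current.append(ch)
        else:
            if current:
                entities.append("".join(current))
            current = []
    if current:
        entities.append("".join(current))
    return entities

sentence = "Acme Corp sues Globex"          # hypothetical example sentence
tags = to_bio(sentence, [(0, 9), (15, 21)])
print(extract_entities(sentence, tags))
```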
Embodiment II,
As a modification of the previous embodiment,
the financial information semantic analysis method based on the natural language processing technology comprises the same steps S1 to S3 as described in the first embodiment above.
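The input construction in step S3 (keyword marking followed by SEP splicing) can be sketched as plain string handling. The literal marker strings "[KW]" and "[E]" are assumptions standing in for the KW-E special characters, whose exact form the source does not give; "[SEP]" is the usual BERT separator token. How label names are paired with summaries is also not specified, so the function below simply takes one label name and one summary.

```python
KW_OPEN, KW_CLOSE, SEP = "[KW]", "[E]", "[SEP]"  # marker forms are assumptions

def mark_keyword(summary: str, keyword: str) -> str:
    """Wrap every occurrence of the keyword (e.g. a company name) in marker tokens."""
    return summary.replace(keyword, f"{KW_OPEN}{keyword}{KW_CLOSE}")

def build_input(label_name: str, summary: str, keyword: str) -> str:
    """Splice the label name and the marked summary together with the separator."""
    return f"{label_name} {SEP} {mark_keyword(summary, keyword)}"

# Hypothetical label name, summary sentence and company name.
print(build_input("lawsuit", "Acme Corp was sued on Monday.", "Acme Corp"))
```

Marking the company name inside the sentence tells the classifier which entity the predicted labels should attach to when a sentence mentions more than one company.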
As a preferred scheme, the financial information semantic analysis system based on the natural language processing technology comprises a data preprocessing unit 1, wherein a first output end of the data preprocessing unit 1 is connected with an input end of an entity identification unit 2, a second output end of the data preprocessing unit 1 is connected with a first input end of a label classification unit 3, an output end of the entity identification unit 2 is connected with an input end of an entity linking unit 4, an output end of the entity linking unit 4 is connected with a second input end of the label classification unit 3, and an output end of the label classification unit 3 is connected with an input end of a risk calculation unit 5.
As a preferred scheme, the data preprocessing unit 1 includes a data acquisition module 6, a deduplication module 7 and a split summary module 8, an output end of the data acquisition module 6 is connected with an input end of the deduplication module 7, and an output end of the deduplication module 7 is connected with an input end of the split summary module 8.
Preferably, the entity identification unit 2 includes a BIO labeling module 9, a BERT + CRF module 10, and an extraction module 11, an output end of the BIO labeling module 9 is connected to an input end of the BERT + CRF module 10, and an output end of the BERT + CRF module 10 is connected to an input end of the extraction module 11. The BERT model parameters in the BERT + CRF module 10 are first fixed while the CRF-related model parameters are trained; after a good effect is obtained, the BERT model and the CRF model are fine-tuned together, finally yielding an F1 score of 93.15, and the company names in the data (which may be full names, short names or brand names) are extracted through the extraction module 11.
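For reference, the F1 score cited above is conventionally defined as the harmonic mean of precision and recall. The symbols TP, FP and FN (true positives, false positives and false negatives among the predicted company-name entities) are notation introduced here, not taken from the source:

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 P R}{P + R}
```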
Preferably, the entity linking unit 4 includes a database screening module 12, a network screening module 13, and a result corresponding module 20, a first output end of the database screening module 12 is connected to an input end of the network screening module 13, a second output end of the database screening module 12 is connected to a first input end of the result corresponding module 20, and an output end of the network screening module 13 is connected to a second input end of the result corresponding module 20.
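The database-first, network-second lookup performed by the entity linking unit can be sketched as follows. The dictionary database, the `web_lookup` callable and all company names are hypothetical stand-ins; the source does not describe how either lookup is implemented, and this sketch performs a single fallback rather than an open-ended search.

```python
from typing import Callable, Mapping, Optional

def link_company(
    name: str,
    database: Mapping[str, str],
    web_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Resolve a short name, brand name or full name to a full company name."""
    full_name = database.get(name)          # database screening
    if full_name is None:
        full_name = web_lookup(name)        # network screening as a fallback
    return full_name

# Hypothetical data sources for illustration.
company_db = {"Acme": "Acme Corporation Ltd"}
web_index = {"Globex": "Globex Holdings Inc"}

print(link_company("Acme", company_db, web_index.get))
print(link_company("Globex", company_db, web_index.get))
```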
As a preferred scheme, the label classification unit 3 includes an adding module 14, a splicing module 15, a network connection module 16 and a docking module 17, an output end of the adding module 14 is connected to a first input end of the splicing module 15, an output end of the splicing module 15 is connected to an input end of the network connection module 16, an output end of the network connection module 16 is connected to an input end of the docking module 17, an output end of the splitting summary module 8 is connected to an input end of the adding module 14, and an output end of the result corresponding module 20 is connected to a second input end of the splicing module 15.
Preferably, the risk calculating unit 5 comprises a data collecting module 18 and a risk calculating module 19, and an output end of the data collecting module 18 is connected with an input end of the risk calculating module 19.
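One simple reading of the risk calculation is a weighted sum over the labels collected for a company in a time window. The source only says the index is calculated "according to the weight of the labels", so the weighted-sum formula, the label names and the weight values below are all assumptions.

```python
from collections import Counter
from typing import Iterable, Mapping

def risk_index(
    labels: Iterable[str],
    weights: Mapping[str, float],
    default_weight: float = 0.0,
) -> float:
    """Sum of (label weight x number of occurrences) over the collected labels."""
    counts = Counter(labels)
    return sum(weights.get(label, default_weight) * n for label, n in counts.items())

# Hypothetical labels gathered for one company over a period of time.
collected = ["lawsuit", "lawsuit", "executive_change"]
label_weights = {"lawsuit": 3.0, "executive_change": 1.0}
print(risk_index(collected, label_weights))
```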
The advantages of the second embodiment over the first embodiment are as follows: the BIO labeling module 9 and the BERT + CRF module 10 give a high recognition rate; the BERT model in the network connection module 16 feeds the output word vectors into a two-layer fully connected neural network, which is followed by 195 sigmoid binary classification tasks; and the docking module 17 associates the corrected financial event labels with companies, which improves the system efficiency.
It is noted that, herein, relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprises an … …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A financial information semantic analysis method based on natural language processing technology, characterized in that the method specifically comprises the following steps:
S1, first, a batch of news data is collected from the network by a data acquisition module (6); a deduplication module (7) then removes duplicates from the collected news data using simhash, ensuring that the deduplicated data falls within the interval of 9000 to 10001; a splitting summary module (8) then splits each news item into individual sentences, which serve as summary sentences, marks the positions of the company name characters in the summary sentences, and transmits the data to both a BIO labeling module (9) and an adding module (14);
S2, the BIO labeling module (9) converts the positions of the company name characters into BIO labels, wherein the first character of each company name is labeled B, the remaining characters of the name are labeled I, and all other characters in the sentence are labeled O; the BERT model parameters in a BERT + CRF module (10) are first fixed while the CRF-related model parameters are trained; after a good effect is obtained, the BERT model and the CRF model are fine-tuned together, finally yielding a good F1 score, and the company names in the data are extracted through an extraction module (11);
S3, a database screening module (12) matches each extracted company name to a full company name in the existing company database; when the database screening module (12) cannot find the full company name corresponding to the company name, a network screening module (13) expands the screening range to the network until the full company name corresponding to the extracted company name is found; a result corresponding module (20) then extracts the full company name and transmits it to a splicing module (15); meanwhile, the adding module (14) adds KW-E special characters before and after the keywords transmitted from the splitting summary module (8), and the splicing module (15) splices the label name and the summary together with SEP characters; the BERT model in a network connection module (16) feeds the output word vectors into a two-layer fully connected neural network, which is followed by 195 sigmoid binary classification tasks; a docking module (17) associates the corrected financial event labels with the company; finally, a data collection module (18) counts all labels of the company over a period of time, and a risk calculation module (19) calculates the marketing risk index of the company according to the weights of the labels.
2. A financial intelligence semantic analysis system based on natural language processing technology according to claim 1, comprising a data preprocessing unit (1), characterized in that: a first output end of the data preprocessing unit (1) is connected with the input end of an entity identification unit (2), a second output end of the data preprocessing unit (1) is connected with a first input end of a label classification unit (3), the output end of the entity identification unit (2) is connected with the input end of an entity linking unit (4), the output end of the entity linking unit (4) is connected with a second input end of the label classification unit (3), and the output end of the label classification unit (3) is connected with the input end of a risk calculation unit (5).
3. The natural language processing technology based financial intelligence semantic analysis system of claim 2, wherein: the data preprocessing unit (1) comprises a data acquisition module (6), a duplication removing module (7) and a splitting abstract module (8), the output end of the data acquisition module (6) is connected with the input end of the duplication removing module (7), and the output end of the duplication removing module (7) is connected with the input end of the splitting abstract module (8).
4. The natural language processing technology based financial intelligence semantic analysis system of claim 2, wherein: the entity identification unit (2) comprises a BIO labeling module (9), a BERT + CRF module (10) and an extraction module (11), wherein the output end of the BIO labeling module (9) is connected with the input end of the BERT + CRF module (10), and the output end of the BERT + CRF module (10) is connected with the input end of the extraction module (11).
5. The natural language processing technology based financial intelligence semantic analysis system of claim 2, wherein: the entity linking unit (4) comprises a database screening module (12), a network screening module (13) and a result corresponding module (20), wherein a first output end of the database screening module (12) is connected with an input end of the network screening module (13), a second output end of the database screening module (12) is connected with a first input end of the result corresponding module (20), and an output end of the network screening module (13) is connected with a second input end of the result corresponding module (20).
6. The system of claim 5, wherein: the label classification unit (3) comprises an adding module (14), a splicing module (15), a network connection module (16) and a docking module (17), the output end of the adding module (14) is connected with the first input end of the splicing module (15), the output end of the splicing module (15) is connected with the input end of the network connection module (16), and the output end of the network connection module (16) is connected with the input end of the docking module (17).
7. The natural language processing technology based financial intelligence semantic analysis system of claim 2, wherein: the risk calculation unit (5) comprises a data collection module (18) and a risk calculation module (19).
8. The natural language processing technology based financial intelligence semantic analysis system of claim 6, wherein: the output end of the splitting abstract module (8) is connected with the input end of the adding module (14).
9. The natural language processing technology based financial intelligence semantic analysis system of claim 6, wherein: the output end of the result corresponding module (20) is connected with the second input end of the splicing module (15).
10. The natural language processing technology based financial intelligence semantic analysis system of claim 7, wherein: the output end of the data collection module (18) is connected with the input end of the risk calculation module (19).
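
The data flow recited in step s3 and claims 5 to 10 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the marker strings `[KW]`, `[/KW]` and `[SEP]`, the 0.5 decision threshold, the function names, and the weighted-count form of the risk index are all hypothetical choices that the claims do not specify. The BERT encoder and the two-layer fully-connected network are left out; `predict_tags` only shows how independent sigmoid outputs (195 of them in the claim) could be turned into a set of tags. The claims do not state exactly which string the splicing module places before SEP, so a single parameter stands in for it.

```python
import math
from collections import Counter
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical marker strings; the claims only name "KW-E special characters" and "SEP characters".
KW_OPEN, KW_CLOSE, SEP = "[KW]", "[/KW]", "[SEP]"


def resolve_full_name(
    short_name: str,
    database: Dict[str, str],
    web_lookup: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Look the name up in the existing database first, then fall back to a network search."""
    full_name = database.get(short_name)
    if full_name is None:
        full_name = web_lookup(short_name)
    return full_name


def mark_keywords(abstract: str, keywords: Iterable[str]) -> str:
    """Wrap each keyword in marker strings (assumes keywords do not overlap)."""
    for kw in keywords:
        abstract = abstract.replace(kw, f"{KW_OPEN}{kw}{KW_CLOSE}")
    return abstract


def build_input(first_segment: str, abstract: str, keywords: Iterable[str]) -> str:
    """Splice the first segment and the keyword-marked abstract with a separator."""
    return f"{first_segment}{SEP}{mark_keywords(abstract, keywords)}"


def predict_tags(logits: List[float], tag_names: List[str], threshold: float = 0.5) -> List[str]:
    """Treat every output as an independent binary task: sigmoid, then threshold."""
    return [
        name
        for name, z in zip(tag_names, logits)
        if 1.0 / (1.0 + math.exp(-z)) >= threshold
    ]


def risk_index(tags: Iterable[str], weights: Dict[str, float]) -> float:
    """Count the tags collected over a period and sum them by weight (assumed formula)."""
    counts = Counter(tags)
    return sum(weights.get(tag, 0.0) * n for tag, n in counts.items())
```

In this sketch, `resolve_full_name` plays the role of modules (12), (13) and (20), `mark_keywords` and `build_input` the role of modules (14) and (15), `predict_tags` the output stage of module (16), and `risk_index` the role of modules (18) and (19). A caller would supply its own name database, network lookup function and tag weights.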
CN202110469467.0A 2021-04-28 2021-04-28 Financial information semantic analysis method and system based on natural language processing technology Active CN113128238B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110469467.0A CN113128238B (en) 2021-04-28 2021-04-28 Financial information semantic analysis method and system based on natural language processing technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110469467.0A CN113128238B (en) 2021-04-28 2021-04-28 Financial information semantic analysis method and system based on natural language processing technology

Publications (2)

Publication Number Publication Date
CN113128238A true CN113128238A (en) 2021-07-16
CN113128238B CN113128238B (en) 2023-06-20

Family

ID=76780582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110469467.0A Active CN113128238B (en) 2021-04-28 2021-04-28 Financial information semantic analysis method and system based on natural language processing technology

Country Status (1)

Country Link
CN (1) CN113128238B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102111735B1 (en) * 2018-11-29 2020-05-15 주식회사 솔트룩스 Automatic Question-Answering system having multiple Question-Answering modules
CN112084790A (en) * 2020-09-24 2020-12-15 中国民航大学 Relation extraction method and system based on pre-training convolutional neural network
CN112257443A (en) * 2020-09-30 2021-01-22 华泰证券股份有限公司 MRC-based company entity disambiguation method combined with knowledge base
CN112560484A (en) * 2020-11-09 2021-03-26 武汉数博科技有限责任公司 Improved BERT training model and named entity recognition method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Xie Teng; Yang Jun'an; Liu Hui: "Chinese entity recognition based on the BERT-BiLSTM-CRF model", Computer Systems & Applications (计算机系统应用), no. 07 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114490626A (en) * 2022-04-18 2022-05-13 成都数融科技有限公司 Financial information analysis method and system based on parallel computing
CN114490626B (en) * 2022-04-18 2022-08-16 成都数融科技有限公司 Financial information analysis method and system based on parallel computing

Also Published As

Publication number Publication date
CN113128238B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN111708773B (en) Multi-source scientific and creative resource data fusion method
CN110807328A (en) Named entity identification method and system oriented to multi-strategy fusion of legal documents
CN111783394A (en) Training method of event extraction model, event extraction method, system and equipment
CN111159336B (en) Semi-supervised judicial entity and event combined extraction method
CN112214614B (en) Knowledge-graph-based risk propagation path mining method and system
CN115470871B (en) Policy matching method and system based on named entity recognition and relation extraction model
CN111209362A (en) Address data analysis method based on deep learning
CN111967267A (en) XLNET-based news text region extraction method and system
CN111782793A (en) Intelligent customer service processing method, system and equipment
CN112380868A (en) Petition-purpose multi-classification device based on event triples and method thereof
CN112749283A (en) Entity relationship joint extraction method for legal field
CN114297987A (en) Document information extraction method and system based on text classification and reading understanding
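CN113869055A (en) Power grid project characteristic attribute identification method based on deep learning
CN115953788A (en) Green financial attribute intelligent identification method and system based on OCR (optical character recognition) and NLP (natural language processing) technologies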
CN115953788A (en) Green financial attribute intelligent identification method and system based on OCR (optical character recognition) and NLP (non-line-segment) technologies
Zhao RETRACTED ARTICLE: Application of deep learning algorithm in college English teaching process evaluation
CN111460147A (en) Title short text classification method based on semantic enhancement
CN113128238B (en) Financial information semantic analysis method and system based on natural language processing technology
CN109446522B (en) Automatic test question classification system and method
CN114328841A (en) Question-answer model training method and device, question-answer method and device
CN116842142B (en) Intelligent retrieval system for medical instrument
CN115618085B (en) Interface data exposure detection method based on dynamic tag
CN114492362B (en) Method and system for generating research and report questions and answers and computer readable storage medium
CN115270774A (en) Big data keyword dictionary construction method for semi-supervised learning
CN112488593B (en) Auxiliary bid evaluation system and method for bidding
CN116049385B (en) Method, device, equipment and platform for generating information and create industry research report

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant