CN116150323B - Text language data processing method based on artificial intelligence - Google Patents

Publication number
CN116150323B
Authority
CN
China
Prior art keywords
bidding
auditing
document
entity
bidding document
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310440514.8A
Other languages
Chinese (zh)
Other versions
CN116150323A (en
Inventor
丁靖
骆国荣
鲍宇
赵明
赵春阳
任新蕊
吴利梅
胡红红
胡兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Siji Location Service Co ltd
Tianjin Richsoft Electric Power Information Technology Co ltd
Original Assignee
Tianjin Richsoft Electric Power Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Richsoft Electric Power Information Technology Co ltd
Priority to CN202310440514.8A
Publication of CN116150323A
Application granted
Publication of CN116150323B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/226 Validation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/08 Auctions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Energy or water supply
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of text language processing and discloses a text language data processing method based on artificial intelligence. The method comprises uploading bidding documents, machine auditing of the bidding documents, expert spot-check auditing, evaluating the accuracy of the machine audit, deciding whether to return documents for re-audit, and outputting the results.

Description

Text language data processing method based on artificial intelligence
Technical Field
The invention relates to the technical field of text language processing, in particular to a text language data processing method based on artificial intelligence.
Background
It is well known that the procurement of electric power materials is an extremely important link in power grid construction; it directly and substantially affects the total investment, the construction schedule and the quality of the entire electric power project.
In recent years, as the government has stepped up power grid construction, bidding in the electric power system has received increasing attention. In particular, the bidding scope has been enlarged to safeguard bidding quality, so the number of bidders, and with it the number of bidding documents, has grown enormously. Because of the particular nature of the electric power system, its bidding differs from bidding for other types of materials: the technical requirements and text specification requirements for bidding documents are higher. With the enlarged bidding scope, however, the text quality of bidding documents is uneven, and a large number of bidding documents deviate from the text specification. Bidding documents therefore need a front-end audit of text specification. At present, electric power enterprises rely mainly on manual audit by experts for this purpose; the huge number of bidding documents greatly increases the experts' workload, lowers audit efficiency and raises the cost of manual audit.
With the rapid development of text language processing technology, such technology has been applied to bid auditing to remedy these shortcomings, enabling machine auditing of bidding documents and greatly improving audit efficiency. However, because machine auditing applies a single auditing standard, the audit process is rigid and inflexible, so the soundness and accuracy of the audit results cannot be effectively guaranteed.
In addition, text auditing of electric power bidding documents currently focuses on text terms and ignores the rationality of the documents, so audit coverage is too narrow. The rationality of the text is decisive for the usable value of an electric power bidding document; without a rationality audit, even a highly accurate machine audit cannot guarantee that the bidding documents it passes are usable, which weakens the utility of the audit results.
Disclosure of Invention
In view of the above problems, the invention aims to provide a text language data processing method based on artificial intelligence that combines machine auditing with expert auditing when auditing bidding documents, thereby improving both audit efficiency and audit accuracy and effectively solving the problems mentioned in the background.
The aim of the invention can be achieved by the following technical scheme: a text language data processing method based on artificial intelligence, comprising the following steps:
(1) Bidding document upload: collect all bidding documents corresponding to the designated power system bidding project and upload them to a machine auditing terminal.
(2) Bidding document machine audit: construct an original audit corpus, with which the machine auditing terminal performs text auditing on each bidding document, implemented as follows:
(21) Perform a text term specification audit on each bidding document.
(22) Perform a text scheme rationality audit on each bidding document.
(23) Compute the machine audit compliance of each bidding document from the text term specification audit result and the text scheme rationality audit result.
(3) Expert spot-check audit: extract several spot-check bidding documents from all bidding documents according to a set extraction principle; an expert then manually marks and audits each spot-check bidding document, and the manual audit compliance of each spot-check bidding document is computed.
(4) Machine audit accuracy evaluation: compare the manual audit compliance of each spot-check bidding document with the machine audit compliance of the same document, and evaluate the machine audit accuracy from the comparison.
(5) Return-to-audit decision: compare the machine audit accuracy with a preset accuracy threshold. If the machine audit accuracy is greater than the threshold, output the machine audit compliance of each bidding document. Otherwise, return the bidding documents other than the spot-check bidding documents for re-audit: extract corrected audit standards from the expert audit results of the spot-check bidding documents, add them to the original audit corpus, and then repeat the machine audit, the expert spot check and the machine audit accuracy evaluation according to steps (2) to (4).
(6) Output: compare the accuracy of each repeated machine audit with the preset accuracy threshold until it exceeds the threshold, then output the machine audit compliance of the bidding documents.
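The loop formed by steps (2) to (6) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: every function name and parameter (`machine_audit`, `expert_audit`, `extract_rules`, `accuracy`, `sample`, `max_rounds`) is a hypothetical placeholder supplied by the caller, the corpus is modeled as a plain set of rules, and keeping the expert score for documents that were spot-checked in a failed round is an assumption the text does not settle.

```python
from typing import Callable, Dict, List, Set


def audit_until_accurate(
    docs: List[str],
    corpus: Set[str],
    machine_audit: Callable[[str, Set[str]], float],
    expert_audit: Callable[[str], float],
    extract_rules: Callable[[str], Set[str]],
    accuracy: Callable[[List[float], List[float]], float],
    sample: Callable[[List[str]], List[str]],
    threshold: float,
    max_rounds: int = 10,
) -> Dict[str, float]:
    """Repeat machine audit and expert spot check until accuracy exceeds threshold."""
    results: Dict[str, float] = {}
    pool = list(docs)
    corpus = set(corpus)  # work on a copy of the original audit corpus
    if not pool:
        return results
    for _ in range(max_rounds):
        machine = {d: machine_audit(d, corpus) for d in pool}   # step (2)
        picked = sample(pool)                                    # step (3)
        expert = {d: expert_audit(d) for d in picked}
        acc = accuracy([machine[d] for d in picked],
                       [expert[d] for d in picked])              # step (4)
        if acc > threshold:                                      # steps (5) and (6)
            results.update(machine)
            return results
        for d in picked:
            corpus |= extract_rules(d)  # supplement corrected audit standards
            results[d] = expert[d]      # assumption: keep the expert score
        # Only documents that were not spot-checked go back to machine audit.
        pool = [d for d in pool if d not in expert]
        if not pool:
            return results
    raise RuntimeError("accuracy threshold not reached within max_rounds")
```

The accuracy measure is passed in as a callable because the evaluation formula itself appears only as an image in the source; `max_rounds` is a safety bound added for the sketch.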
According to a further object of the invention, the original audit corpus comprises a power bidding document sensitive term library and a power equipment technical standard term library.
According to a further object of the invention, the text term specification audit comprises a text technical term specification audit and a text expression term specification audit. The text technical term specification audit is implemented as follows: perform sentence segmentation and stop-word removal on the text information of each bidding document to obtain the substantive text information of each bidding document.
Perform entity recognition and entity type labeling on the substantive text information of each bidding document clause by clause, obtaining the entity and entity type corresponding to each clause.
From the entities recognized in the substantive text information of each bidding document, select those whose entity type is power equipment as valid entities, and group the clauses of the same bidding document that belong to the same valid entity, obtaining a clause set for each valid entity in each bidding document.
Extract the power equipment to be bid for from the bid document corresponding to the designated power system bidding project, take each such item as a bid subject, and count the bid subjects.
Match each bid subject against the valid entities in each bidding document, and take the successfully matched valid entities as key entities.
Merge the text of the clauses in the clause set of each key entity in each bidding document, obtaining the integrated text information of each key entity in each bidding document.
Extract the expression used for each technical parameter of each key entity from the integrated text information, and, based on the name of each key entity, extract the standard expression of each of its technical parameters from the power equipment technical standard term library.
Compare the expression used for each technical parameter of each key entity in each bidding document with the standard expression of that technical parameter in the power equipment technical standard term library. If the expression used for a technical parameter of a key entity is inconsistent with the standard expression, apply a term-noncompliance mark, record the key entity as a target entity, and record the technical parameter as a target technical parameter.
According to a further object of the present invention, the text expression term specification audit is implemented as follows: in the first step, extract the entity type of each bidding sensitive word from the power bidding document sensitive term library and de-duplicate the entity types to obtain the sensitive entity types of bidding documents.
In the second step, compare the entity type of each clause identified in the substantive text information of each bidding document with the sensitive entity types; if the entity type of a clause matches a sensitive entity type, record the clause as a sensitive clause.
In the third step, match the entity of each sensitive clause in the substantive text information of each bidding document against the bidding sensitive words stored in the power bidding document sensitive term library; if the entity of a sensitive clause matches a bidding sensitive word, apply an improper-wording mark to that clause.
According to a further object of the present invention, the text scheme rationality audit of each bidding document is implemented as follows: based on the designated power system bidding project, match a reference bidding project from the historical bidding projects, extract the display data of each technical parameter of each bid subject from the bid document corresponding to the reference bidding project, and use it as the reference display data of that technical parameter.
Extract the display data of each technical parameter of each key entity in each bidding document from the integrated text information of that key entity, and compare it with the reference display data of the same technical parameter of the corresponding bid subject. If the display data of a technical parameter of a key entity is inconsistent with the reference display data, apply an unreasonable mark, record the key entity as an abnormal entity, and record the technical parameter as an abnormal technical parameter.
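The comparison against reference display data can be illustrated with a short sketch. It assumes the display data are numeric and already keyed by (entity, parameter); the function name, the key names and the `rel_tol` tolerance are hypothetical, since the source only says the data are "inconsistent" without defining how consistency is measured. With the default `rel_tol=0.0` the check is exact equality.

```python
import math
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (key entity, technical parameter), hypothetical keying


def find_unreasonable(
    doc_values: Dict[Key, float],
    reference_values: Dict[Key, float],
    rel_tol: float = 0.0,
) -> List[Tuple[str, str, float, float]]:
    """Return one unreasonable mark per parameter that deviates from its reference."""
    marks = []
    for (entity, param), value in doc_values.items():
        reference = reference_values.get((entity, param))
        if reference is None:
            continue  # no reference display data for this parameter
        if not math.isclose(value, reference, rel_tol=rel_tol):
            # abnormal entity, abnormal technical parameter, display data, reference
            marks.append((entity, param, value, reference))
    return marks
```

Each returned tuple carries both the display data and the reference display data, which are the two quantities the compliance calculation in step (233) takes as input.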
According to a further object of the present invention, the machine audit compliance of each bidding document is computed as follows: (231) Count the term-noncompliance marks in each bidding document, extract the target entity and target technical parameter corresponding to each mark, obtain the composition importance of each target entity and the use value of each target technical parameter, and use the formula (Figure SMS_1, formula image not reproduced) to calculate the text technical term compliance (Figure SMS_2) of each bidding document. In the formula, the symbols in Figure SMS_3 and Figure SMS_4 denote, respectively, the weight factor of the target entity and the use value of the target technical parameter corresponding to the j-th term-noncompliance mark in the i-th bidding document; i denotes the bidding document number (Figure SMS_5), and j denotes the number of the term-noncompliance mark within each bidding document (Figure SMS_6).
(232) Count the improper-wording marks in each bidding document, extract the bidding sensitive word matched by each mark, obtain the trade-off factor corresponding to each mark, and use the formula (Figure SMS_7, formula image not reproduced) to calculate the text expression term compliance (Figure SMS_8) of each bidding document, where the symbol in Figure SMS_9 denotes the trade-off factor corresponding to the k-th improper-wording mark in the i-th bidding document, k denotes the number of the improper-wording mark within each bidding document (Figure SMS_10), and e denotes the natural constant.
(233) Count the unreasonable marks in each bidding document, and substitute the display data and the reference display data of the abnormal technical parameter of the abnormal entity corresponding to each mark into the formula (Figure SMS_11, formula image not reproduced) to calculate the text scheme rationality (Figure SMS_12) of each bidding document. In the formula, the symbols in Figure SMS_13 and Figure SMS_14 denote, respectively, the display data and the reference display data of the abnormal technical parameter of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document; the symbol in Figure SMS_15 denotes the weight factor of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document; and f denotes the number of the unreasonable mark within each bidding document (Figure SMS_16).
(234) Substitute the three quantities shown in Figure SMS_17, Figure SMS_18 and Figure SMS_19 into the compliance audit model (Figure SMS_20, formula image not reproduced) to calculate the machine audit compliance (Figure SMS_21) of each bidding document.
According to a further object of the invention, the quantities shown in Figure SMS_22 and Figure SMS_23 are obtained as follows: based on the designated power system bidding project, determine the composition importance of the target entity corresponding to each term-noncompliance mark and the composition importance of the abnormal entity corresponding to each unreasonable mark.
Extract the bid amount of the corresponding entity from the bid document corresponding to the designated power system bidding project.
Perform the calculation using the expressions in Figure SMS_24 (formula image not reproduced), where the symbols in Figure SMS_25 and Figure SMS_26 denote, respectively, the composition importance of the target entity corresponding to the j-th term-noncompliance mark and of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document, and the symbols in Figure SMS_27 and Figure SMS_28 denote, respectively, the bid amount of the target entity corresponding to the j-th term-noncompliance mark and of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document.
According to a further object of the present invention, the expert manually marks and audits each spot-check bidding document as follows: while auditing each spot-check bidding document, the expert manually applies term-noncompliance marks, improper-wording marks and unreasonable marks, and annotates each mark.
According to a further object of the present invention, the machine audit accuracy is evaluated with the expression in Figure SMS_29 (formula image not reproduced), where the symbols in Figure SMS_30 and Figure SMS_31 denote, respectively, the machine audit compliance and the manual audit compliance of the d-th spot-check bidding document, d denotes the number of the spot-check bidding document (Figure SMS_32), and u denotes the number of spot-check bidding documents.
According to a further object of the invention, the repeated expert spot check is implemented as follows: obtain the number of spot-check documents and the machine audit accuracy of the previous expert spot check, and substitute them into the formula (Figure SMS_33, formula image not reproduced) to obtain the number of spot-check documents for the repeated expert spot check (Figure SMS_34), where the symbol in Figure SMS_35 denotes the number of spot-check documents of the previous expert spot check, the symbol in Figure SMS_36 denotes the machine audit accuracy of the previous expert spot check, and the symbol in Figure SMS_37 denotes the preset accuracy threshold.
Taken together, the technical schemes above give the invention the following advantages and positive effects: 1. In the compliance audit of electric power bidding documents, the invention constructs an original audit corpus so that bidding documents first undergo machine audit and then expert-assisted audit, and the machine audit is corrected and guided by the expert-assisted audit results. This improves audit efficiency, safeguards audit accuracy, reduces the cost of manual audit to some extent, and is of considerable practical benefit for auditing electric power bidding documents.
2. The invention adds a text scheme rationality audit to the text audit of electric power bidding documents. Compared with auditing text terms alone, this greatly expands audit coverage, making the text audit more comprehensive and effective, allowing unreasonable parts of a bidding document to be found in time, and improving the effectiveness of the text audit.
3. When correcting and guiding the machine audit from the expert-assisted audit results, the invention uses a cycle of expert spot-check audit, audit result feedback and repeated spot-check audit, so the machine audit is corrected multiple times. This strengthens the training of the machine audit, targets audit errors directly, makes each correction accurate and effective, improves the correction effect, and provides a multi-level, in-depth guarantee of machine audit accuracy.
4. Both the machine audit and the expert audit work by marking the bidding document, which makes the audit process more intuitive and clear. On the one hand, this maximizes the efficiency of computing audit compliance and makes it easy to extract corrected audit standards quickly and accurately, which facilitates returning documents to machine audit; on the other hand, the marked content can be retained and traced, providing a reference for subsequent improvement of the machine audit.
Drawings
The invention is further described with reference to the accompanying drawings. The embodiments in the drawings do not limit the invention in any way, and a person of ordinary skill in the art can derive other drawings from the following drawings without inventive effort.
FIG. 1 is a flow chart of the steps of the method of the present invention.
FIG. 2 is a flow chart of the return machine audit operation of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to FIG. 1, the invention provides a text language data processing method based on artificial intelligence, comprising the following steps: (1) Bidding document upload: collect all bidding documents corresponding to the designated power system bidding project and upload them to a machine auditing terminal.
(2) Bidding document machine audit: construct an original audit corpus, with which the machine auditing terminal performs text auditing on each bidding document.
In a preferred scheme, the original audit corpus comprises a power bidding document sensitive term library and a power equipment technical standard term library. The power bidding document sensitive term library stores the common bidding sensitive words of power bidding documents together with the sensitive characterization value of each bidding sensitive word; common bidding sensitive words include brands, personnel names, place names, road names and the like.
The power equipment technical standard term library stores the standard expressions of the technical parameters of power equipment. Taking a transformer as an example, the standard expressions of its technical parameters include maximum operating temperature, short-time load, allowable voltage fluctuation and the like.
The text audit of each bidding document is implemented as follows: (21) Perform a text term specification audit on each bidding document. The text term specification audit comprises a text technical term specification audit and a text expression term specification audit. The text technical term specification audit is implemented as follows: perform sentence segmentation and stop-word removal on the text information of each bidding document to obtain the substantive text information of each bidding document.
It should be understood that sentence segmentation is applied because punctuation marks, periods in particular, act as pauses and separators in the text, and the content of a single sentence generally revolves around one topic.
It should further be understood that the stop words mentioned above are words that carry no meaning of their own and have no obvious category characteristics, including but not limited to "ground" and "having". Stop words do not significantly affect the semantic content of the digitized text, but they do reduce audit efficiency. Filtering them out of the bidding text therefore shortens the text and improves machine audit efficiency, and it also reduces their adverse effect on the audit result and so improves its accuracy.
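A minimal sketch of sentence segmentation followed by stop-word removal is given below. It assumes whitespace-separated tokens, which suits English-like text; Chinese bidding text would need a word segmenter first. The stop-word list, the function names and the punctuation set are hypothetical illustrations, not taken from the source.

```python
import re
from typing import List, Set

# Hypothetical stop-word list; a real deployment would load a curated list.
STOP_WORDS: Set[str] = {"the", "of", "a", "is"}

# Clause boundaries: Western and Chinese sentence punctuation. A period counts
# only when followed by whitespace or end of text, so decimals such as 1.5 survive.
_CLAUSE_END = re.compile(r"[!?;。！？；]+|\.(?:\s+|$)")


def split_clauses(text: str) -> List[str]:
    """Split text into clauses at sentence-ending punctuation."""
    return [part.strip() for part in _CLAUSE_END.split(text) if part.strip()]


def remove_stop_words(clause: str, stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Drop tokens that appear in the stop-word list (case-insensitive)."""
    return [tok for tok in clause.split() if tok.lower() not in stop_words]


def substantive_text(text: str) -> List[List[str]]:
    """Return the remaining tokens of each clause, one list per clause."""
    return [remove_stop_words(clause) for clause in split_clauses(text)]
```

For example, `substantive_text("The transformer is new. The load of a unit is 1.5 MVA!")` yields `[["transformer", "new"], ["load", "unit", "1.5", "MVA"]]`.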
Perform entity recognition and entity type labeling on the substantive text information of each bidding document clause by clause, obtaining the entity and entity type corresponding to each clause.
It should be noted that an entity is an object that exists objectively and can be distinguished from other objects, while an entity type is a category defined manually according to need, used to classify named entities so that they can be treated and used differently. Entity recognition can be performed with a named entity recognition model, and commonly used entity types include time, place, product name and the like.
From the entities recognized in the substantive text information of each bidding document, select those whose entity type is power equipment as valid entities, and group the clauses of the same bidding document that belong to the same valid entity, obtaining a clause set for each valid entity in each bidding document.
Extract the power equipment to be bid for from the bid document corresponding to the designated power system bidding project, take each such item as a bid subject, and count the bid subjects.
Match each bid subject against the valid entities in each bidding document, and take the successfully matched valid entities as key entities.
Merge the text of the clauses in the clause set of each key entity in each bidding document, obtaining the integrated text information of each key entity in each bidding document.
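The grouping and matching steps described so far can be sketched as follows, assuming entity recognition has already produced one (clause text, entity, entity type) triple per clause. The function name, the `"power_equipment"` type label and the exact-name matching of bid subjects to entities are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def integrate_key_entity_text(
    tagged_clauses: Iterable[Tuple[str, str, str]],
    bid_subjects: Set[str],
    equipment_type: str = "power_equipment",
) -> Dict[str, str]:
    """Group clauses by valid entity, keep key entities, and merge their text."""
    clause_sets: Dict[str, List[str]] = defaultdict(list)
    for text, entity, entity_type in tagged_clauses:
        if entity_type == equipment_type:  # valid entity: type is power equipment
            clause_sets[entity].append(text)
    # Key entities are the valid entities that match a bid subject.
    return {
        entity: " ".join(clauses)
        for entity, clauses in clause_sets.items()
        if entity in bid_subjects
    }
```

The returned mapping corresponds to the integrated text information of each key entity in one bidding document; it would be built once per document.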
And extracting expression expressions of the technical parameters corresponding to the key entities from the integrated text information corresponding to the key entities in the bidding documents, and extracting standard expression expressions of the technical parameters corresponding to the key entities from the technical standard expression library of the power equipment based on the names of the key entities.
The term used for each technical parameter of each key entity in each bidding document is compared with the standard term for that technical parameter of that key entity in the power equipment technical standard term library. If the term used for a technical parameter of a key entity is inconsistent with the standard term, a term non-compliance mark is made, the key entity is recorded as a target entity, and the technical parameter is recorded as a target technical parameter.
In an exemplary embodiment, the transformer is a key entity and the standard term for one of its technical parameters is short-term load. When a bidding document expresses this parameter of the transformer with a term that differs from the standard term, the term is marked as non-compliant and short-term load is the target technical parameter.
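The comparison against the technical standard term library can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the dictionary layout, the parameter identifier `"P01"`, the `TermMark` class, the function name and the variant term "short-time load" are all hypothetical and do not appear in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermMark:
    """One term non-compliance mark (hypothetical structure)."""
    entity: str         # target entity, e.g. "transformer"
    parameter_id: str   # target technical parameter (hypothetical id)
    used_term: str      # term found in the bidding document
    standard_term: str  # term stored in the standard term library


def audit_technical_terms(document_terms, standard_terms):
    """Compare the terms used per key entity with the standard terms.

    Both arguments map entity name -> {parameter_id: term}.
    """
    marks = []
    for entity, params in document_terms.items():
        for parameter_id, used_term in params.items():
            standard = standard_terms.get(entity, {}).get(parameter_id)
            # Only parameters known to the library can be compared.
            if standard is not None and used_term != standard:
                marks.append(TermMark(entity, parameter_id, used_term, standard))
    return marks


# Hypothetical data: the library term and a differing term in a document.
standard_terms = {"transformer": {"P01": "short-term load"}}
document_terms = {"transformer": {"P01": "short-time load"}}
marks = audit_technical_terms(document_terms, standard_terms)
```

Each returned mark carries the target entity and target technical parameter, which is the information the later compliance statistics rely on.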
The specific implementation process of the text wording specification audit is as follows: first, the entity type corresponding to each bidding sensitive word is extracted from the power bidding document sensitive term library (for example, when a bidding sensitive word is the name of a road, its entity type is road name), and the extracted entity types are de-duplicated to obtain the bidding document sensitive entity types.
Second, the entity type of each clause identified in the substantial text information of each bidding document is compared with the bidding document sensitive entity types, and if the entity type of a clause is consistent with a bidding document sensitive entity type, the clause is recorded as a sensitive clause.
Third, the entity of each sensitive clause in the substantial text information of each bidding document is matched against the bidding sensitive words stored in the power bidding document sensitive term library, and if the entity of a sensitive clause matches a bidding sensitive word, an improper-wording mark is made on that sensitive clause.
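The three steps above can be sketched in a few lines. This is only an illustration: the tuple layout for clauses, the lexicon layout, the function name and the sample road names are assumptions made for the example, and exact string equality stands in for whatever matching the actual system uses.

```python
def audit_sensitive_wording(clauses, sensitive_lexicon):
    """Return the clauses that receive an improper-wording mark.

    clauses: list of (clause_text, entity, entity_type) tuples.
    sensitive_lexicon: {bidding_sensitive_word: entity_type}.
    """
    # Step 1: de-duplicated entity types of all bidding sensitive words.
    sensitive_types = set(sensitive_lexicon.values())
    # Step 2: clauses whose entity type is a sensitive entity type.
    sensitive_clauses = [c for c in clauses if c[2] in sensitive_types]
    # Step 3: mark a sensitive clause only if its entity is a sensitive word.
    return [(text, entity) for text, entity, _ in sensitive_clauses
            if entity in sensitive_lexicon]


# Hypothetical sample data.
lexicon = {"Zhongshan Road": "road name"}
clauses = [
    ("deliver to the Zhongshan Road depot", "Zhongshan Road", "road name"),
    ("site located near Jiefang Road", "Jiefang Road", "road name"),
    ("rated capacity of the transformer", "transformer", "power equipment"),
]
flagged = audit_sensitive_wording(clauses, lexicon)
```

Filtering by entity type first keeps the word-level matching restricted to clauses that can contain a sensitive word at all, which mirrors the order of the steps in the description.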
(22) The text scheme rationality audit is carried out on each bidding document, and the specific implementation process is as follows: the construction scale of the specified power system bidding project is extracted from the bidding document corresponding to the specified power system bidding project, so that the reference bidding project is matched from the historical bidding project, and further, the display data of the technical parameters corresponding to each bidding subject in the bidding document corresponding to the reference bidding project are extracted and used as the reference display data of the technical parameters corresponding to each bidding subject.
In a specific embodiment of the invention, the reference bidding project is matched as follows: based on the specified power system bidding project, the historical bidding projects that are the same as the specified power system bidding project are screened from the historical bidding projects and recorded as history-associated bidding projects.
The construction scale of each history-associated bidding project and the construction scale of the specified power system bidding project are substituted into the similarity formula [Figure SMS_38] to obtain the similarity corresponding to each history-associated bidding project. The similarities are compared with a similarity threshold, and the history-associated bidding project with the highest similarity among those exceeding the threshold is selected as the reference bidding project.
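The selection of the reference bidding project can be sketched as below. The patent's similarity formula is given only as an image and is not reproduced here, so `scale_similarity` is a stand-in chosen purely for illustration (the ratio of the smaller to the larger scale); the function names and the reading that only projects above the threshold are eligible are likewise assumptions.

```python
def scale_similarity(scale_a, scale_b):
    """ASSUMED similarity in (0, 1]; not the formula from the patent."""
    if scale_a <= 0 or scale_b <= 0:
        raise ValueError("construction scales must be positive")
    return min(scale_a, scale_b) / max(scale_a, scale_b)


def select_reference_project(target_scale, history_scales, threshold):
    """Pick the most similar history-associated project above the threshold.

    history_scales: {project_id: construction_scale}.
    Returns the project id, or None when no project exceeds the threshold.
    """
    scores = {pid: scale_similarity(target_scale, scale)
              for pid, scale in history_scales.items()}
    eligible = {pid: s for pid, s in scores.items() if s > threshold}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


# Hypothetical construction scales in arbitrary units.
history = {"A": 100.0, "B": 180.0, "C": 400.0}
reference = select_reference_project(200.0, history, threshold=0.6)
```

Returning `None` when nothing clears the threshold makes the no-match case explicit; the source does not say how that case is handled.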
And extracting display data of each technical parameter corresponding to each key entity in each bidding document from the integrated text information corresponding to each key entity in each bidding document, comparing the display data with reference display data of the technical parameter corresponding to the corresponding bidding subject, if the display data of the technical parameter corresponding to a certain key entity is inconsistent, carrying out unreasonable marking, marking the key entity as an abnormal entity, and marking the technical parameter as an abnormal technical parameter.
(23) Based on text expression standard auditing results and text scheme rationality auditing results, the machine auditing compliance corresponding to each bidding document is counted, and the specific counting process is as follows:
(231) Counting the number of term non-compliance marks in each bidding document, extracting the target entity and target technical parameter corresponding to each term non-compliance mark, further obtaining the composition importance of the target entity and the use value of the target technical parameter corresponding to each term non-compliance mark, and using the formula [Figure SMS_39] to calculate the text technical term compliance [Figure SMS_40] corresponding to each bidding document, where [Figure SMS_41] and [Figure SMS_42] denote, respectively, the weight factor of the target entity and the use value of the target technical parameter corresponding to the j-th term non-compliance mark in the i-th bidding document; i denotes the number of the bidding document, [Figure SMS_43]; and j denotes the number of the term non-compliance mark in each bidding document, [Figure SMS_44].
The use value of a target technical parameter is obtained by matching the target entity and its target technical parameter against the use value stored for each technical parameter of each power equipment in the power equipment reference information base; the matched value is the use value of the target technical parameter corresponding to that term non-compliance mark.
(232) Counting the number of improper-wording marks in each bidding document, extracting the bidding sensitive word matched by each improper-wording mark, obtaining the trade-off factor corresponding to each improper-wording mark, and using the formula [Figure SMS_45] to calculate the text wording compliance [Figure SMS_46] corresponding to each bidding document, where [Figure SMS_47] denotes the trade-off factor corresponding to the k-th improper-wording mark in the i-th bidding document; k denotes the number of the improper-wording mark in each bidding document, [Figure SMS_48]; and e denotes the natural constant.
The trade-off factor of an improper-wording mark is obtained by looking up the bidding sensitive word matched by that mark among the sensitive characterization values stored for each bidding sensitive word in the power bidding document sensitive term library; the matched sensitive characterization value is used as the trade-off factor of the mark.
(233) Counting the number of unreasonable marks in each bidding document, and substituting the display data and reference display data of the abnormal technical parameter of the abnormal entity corresponding to each unreasonable mark into the formula [Figure SMS_49] to calculate the text scheme reasonableness [Figure SMS_50] corresponding to each bidding document, where [Figure SMS_51] and [Figure SMS_52] denote, respectively, the display data and the reference display data of the abnormal technical parameter of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document; [Figure SMS_53] denotes the weight factor of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document; and f denotes the number of the unreasonable mark in each bidding document, [Figure SMS_54].
In a preferred embodiment of the present invention, the specific acquisition process of [Figure SMS_55] and [Figure SMS_56] is as follows:
The composition importance of the target entity corresponding to each term non-compliance mark and the composition importance of the abnormal entity corresponding to each unreasonable mark are determined based on the specified power system bidding project.
The composition importance in the above technical scheme is determined as follows: the action type of each power equipment is extracted from the bidding document corresponding to the specified power system bidding project and matched against the preset composition importance corresponding to each action type, giving the composition importance of each power equipment in the specified power system bidding project.
As a specific example of the above scheme, assume that the specified power system bidding project is a power transmission and transformation system whose power equipment includes a transformer, a fuse, a relay and an operation switch, where the action type of the transformer is voltage regulation, that of the fuse is circuit protection, that of the relay is extending the control range, and that of the operation switch is switching the circuit on and off.
The power equipment of the target entity corresponding to each term non-compliance mark and of the abnormal entity corresponding to each unreasonable mark in each bidding document is then compared with the composition importance of each power equipment in the specified power system bidding project, thereby determining the composition importance of the target entity corresponding to each term non-compliance mark and the composition importance of the abnormal entity corresponding to each unreasonable mark.
And extracting the bid amount of the corresponding entity from the bid document corresponding to the specified power system bid project.
The calculation is performed using the expression [Figure SMS_57], where [Figure SMS_58] and [Figure SMS_59] denote, respectively, the composition importance of the target entity corresponding to the j-th term non-compliance mark and of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document, and [Figure SMS_60] and [Figure SMS_61] denote, respectively, the bid amount of the target entity corresponding to the j-th term non-compliance mark and of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document.
(234) Substituting [Figure SMS_62], [Figure SMS_63] and [Figure SMS_64] into the compliance audit model [Figure SMS_65] to calculate the machine audit compliance [Figure SMS_66] corresponding to each bidding document.
According to the invention, the rationality audit of the text scheme is added in the text audit process of the electric bidding document, and compared with the text term audit only, the text audit coverage of the bidding document is greatly expanded by the audit mode, so that the text audit of the electric bidding document is more comprehensive and effective, unreasonable parts in the bidding document can be found in time, and the effectiveness of the text audit of the electric bidding document is improved.
(3) Expert spot checking and auditing: a plurality of spot-check bidding documents are extracted from all the bidding documents according to a set extraction principle, each spot-check bidding document is manually marked and audited by an expert, and the manual audit compliance corresponding to each spot-check bidding document is counted. In manual marking and auditing, the expert marks non-compliant terms, improper wording and unreasonable statements while auditing each spot-check bidding document, and annotates each mark.
In the above scheme, the set extraction principle includes an extraction number and an extraction mode. The extraction number is set by counting the total number of uploaded bidding documents and multiplying it by a preset sampling rate to obtain the extraction number [Figure SMS_67]. The extraction mode is random grouped extraction: the uploaded bidding documents are numbered and then extracted by group according to the extraction number.
As an example of the above scheme, assume that the total number of uploaded bidding documents is 100 and the preset sampling rate is 20%, so that the extraction number is 20. When random grouped extraction is used, the uploaded bidding documents are numbered sequentially [Figure SMS_68] and divided equally into 20 groups in number order, that is, the bidding documents numbered 1-5 form one group, those numbered 6-10 form one group, and so on up to those numbered 96-100. One bidding document is then extracted from each group to obtain the spot-check bidding documents.
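The grouped random extraction in this example can be sketched as follows. The function name, the rounding of the extraction number and the handling of totals that do not divide evenly are assumptions; the source only describes the evenly divisible case of 100 documents at a 20% sampling rate.

```python
import random


def grouped_random_sample(total_documents, sampling_rate, rng=None):
    """Number documents 1..total, split into groups, draw one per group."""
    rng = rng or random.Random()
    # Extraction number = total number of documents x preset sampling rate.
    sample_count = min(round(total_documents * sampling_rate), total_documents)
    if sample_count <= 0:
        return []
    numbers = list(range(1, total_documents + 1))
    group_size = total_documents // sample_count
    picks = []
    for group in range(sample_count):
        start = group * group_size
        # ASSUMPTION: the last group absorbs any remainder.
        is_last = group == sample_count - 1
        end = total_documents if is_last else start + group_size
        picks.append(rng.choice(numbers[start:end]))
    return picks


# 100 documents at a 20% sampling rate: 20 groups of 5, one pick per group.
picks = grouped_random_sample(100, 0.20, rng=random.Random(0))
```

Passing a seeded `random.Random` makes a draw reproducible for auditing purposes, while the default unseeded generator keeps the selection free of manual interference.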
The adoption of the random extraction mode can eliminate the artificial interference, so that the extracted bidding document has objectivity, and a real and reliable guarantee basis is provided for the subsequent machine auditing accuracy evaluation.
In this scheme, the manual audit compliance of each spot-check bidding document is counted in the same way as the machine audit compliance, that is, according to the content of each term non-compliance mark, each improper-wording mark and each unreasonable mark. This keeps the statistical rules consistent and avoids large errors in the evaluation of machine audit accuracy caused by inconsistent rules.
In this scheme, the term non-compliance marks, improper-wording marks and unreasonable marks are annotated. The annotation of a term non-compliance mark is the power entity and technical parameter whose term is non-compliant; the annotation of an improper-wording mark is the bidding sensitive word involved in the marked text; and the annotation of an unreasonable mark is the display data of the unreasonably stated technical parameter.
According to the invention, bidding documents are audited by marking in both the machine audit and the expert audit. On the one hand, this makes the audit process more intuitive and clear, maximizes the efficiency of compliance statistics, and makes it easy to refine the correction audit standard quickly and accurately, which facilitates returning documents to machine audit. On the other hand, the marked content is retained and traceable, providing a reference for subsequent improvement of the machine audit.
(4) Evaluating machine auditing accuracy: the manual audit compliance corresponding to each spot-check bidding document is compared with the machine audit compliance of the same bidding document, thereby evaluating the machine audit accuracy. The evaluation expression is [Figure SMS_69], where [Figure SMS_70] and [Figure SMS_71] denote, respectively, the machine audit compliance and the manual audit compliance corresponding to the d-th spot-check bidding document; d denotes the number of the spot-check bidding document, [Figure SMS_72]; and u denotes the number of spot-check bidding documents.
(5) Returning to machine audit judgment: comparing the machine auditing accuracy with a preset accuracy threshold, outputting the machine auditing compliance of each bidding document if the machine auditing accuracy is greater than the preset accuracy threshold, otherwise, returning other bidding documents except the spot check bidding document to the machine auditing, wherein the specific operation flow of returning the machine auditing is to extract the correction auditing standard from expert auditing results corresponding to each spot check bidding document, supplement the correction auditing standard to an original auditing corpus, and further continue to perform machine auditing again, expert spot check again and machine auditing accuracy assessment again according to (2) - (4), as shown in fig. 2.
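The control flow of steps (2) to (5) can be sketched as a loop. This is an outline under stated assumptions rather than the patented procedure: every callable is a hypothetical placeholder supplied by the caller, the accuracy formula is left to the caller because it is given only as an image, `max_rounds` is an added safety limit that the source does not mention, and for simplicity every document is re-audited in each round instead of only those outside the spot check.

```python
def audit_until_accurate(documents, corpus, machine_audit, pick_samples,
                         expert_audit, accuracy, refine, threshold,
                         max_rounds=10):
    """Repeat machine audit and expert spot check until accuracy > threshold.

    machine_audit(doc, corpus) -> machine audit compliance of one document
    pick_samples(documents)    -> list of spot-check documents
    expert_audit(doc)          -> manual audit compliance of one document
    accuracy(machine, manual)  -> machine audit accuracy (caller-defined)
    refine(samples)            -> correction audit standards to add
    """
    corpus = set(corpus)
    for _ in range(max_rounds):
        machine = {doc: machine_audit(doc, corpus) for doc in documents}
        samples = pick_samples(documents)
        manual = {doc: expert_audit(doc) for doc in samples}
        score = accuracy([machine[d] for d in samples],
                         [manual[d] for d in samples])
        if score > threshold:
            return machine  # output machine audit compliance
        # Supplement the audit corpus with the correction audit standards.
        corpus |= set(refine(samples))
    raise RuntimeError("accuracy threshold not reached within max_rounds")
```

Injecting the audit, sampling and accuracy functions keeps the sketch independent of the formulas that are not reproduced in this text.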
In a preferred embodiment of the present invention, the correction audit standard is refined as follows:
(51) Counting, for each spot-check bidding document, the number of term non-compliance marks, the number of improper-wording marks and the number of unreasonable marks made by the expert, and extracting the annotation content of each term non-compliance mark, improper-wording mark and unreasonable mark.
(52) Matching the target entities and target technical parameters corresponding to the term non-compliance marks of each spot-check bidding document under machine audit with the annotation content of the term non-compliance marks of the same document under expert audit, and selecting the term non-compliance marks that fail to match as corrected term non-compliance marks.
(53) Matching the bidding sensitive words matched by the improper-wording marks of each spot-check bidding document under machine audit with the annotation content of the improper-wording marks of the same document under expert audit, and selecting the improper-wording marks that fail to match as corrected improper-wording marks.
(54) Matching the abnormal entities and abnormal technical parameters corresponding to the unreasonable marks of each spot-check bidding document under machine audit with the annotation content of the unreasonable marks of the same document under expert audit, and selecting the unreasonable marks that fail to match as corrected unreasonable marks.
(55) Taking the annotation content of the corrected term non-compliance marks, the corrected improper-wording marks and the corrected unreasonable marks as the correction audit standard.
Further, the correction audit standard is supplemented into the original audit corpus as follows: the annotation content of the corrected term non-compliance marks is added to the power equipment technical standard term library, and the annotation content of the corrected improper-wording marks is added to the power bidding document sensitive term library.
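The matching in (52) to (55) amounts to finding expert annotations that the machine audit did not produce. A minimal sketch follows, assuming each mark is reduced to a hashable annotation item and that a failed match means an expert annotation absent from the machine results; the mark type keys, the function name and the sample items are hypothetical.

```python
def refine_correction_standards(machine_marks, expert_annotations):
    """Return expert annotations with no matching machine mark, per type.

    Both arguments map a mark type to a set of annotation items, e.g.
    {"term": {("transformer", "P01")}, "wording": {"Zhongshan Road"}}.
    """
    return {
        mark_type: annotations - machine_marks.get(mark_type, set())
        for mark_type, annotations in expert_annotations.items()
    }


# Hypothetical marks for one spot-check bidding document.
machine = {"term": {("transformer", "P01")}, "wording": set()}
expert = {
    "term": {("transformer", "P01"), ("relay", "P07")},
    "wording": {"Zhongshan Road"},
    "unreasonable": set(),
}
corrections = refine_correction_standards(machine, expert)

# Supplement the libraries: term items go to the technical standard term
# library, wording items to the sensitive term library (both hypothetical).
technical_term_library = set() | corrections["term"]
sensitive_term_library = set() | corrections["wording"]
```

Using set difference per mark type keeps the three kinds of correction separate, so each can be routed to the library it belongs to.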
In a preferred scheme of the invention, the repeated expert spot check is implemented as follows: the number of spot checks and the machine audit accuracy corresponding to the previous expert spot check are obtained and substituted into the formula [Figure SMS_73] to obtain the number of spot checks [Figure SMS_74] for the repeated expert spot check, where [Figure SMS_75] denotes the number of spot checks in the previous expert spot check, [Figure SMS_76] denotes the machine audit accuracy corresponding to the previous expert spot check, and [Figure SMS_77] denotes the preset accuracy threshold.
According to the method, the number of the re-expert spot checks is dynamically determined based on the accuracy of the last machine check, so that sufficient correction check standard supplement adapting to the check defects of the next machine check can be provided for the next machine check, the accuracy of the next machine check can reach the standard as soon as possible, and the machine check process is accelerated.
According to the invention, when the machine audit is corrected and guided according to the expert auxiliary audit result, the machine audit is corrected in a circulation mode of expert spot check audit, audit result feedback and spot check audit again, so that the machine audit is corrected for multiple times, the machine audit strengthening training is realized, audit errors can be directly hit, each correction is accurate and powerful, the correction effect is improved, and multi-level and deep guarantee is provided for the machine audit accuracy.
(6) Outputting: the repeated machine audit accuracy is compared with the preset accuracy threshold until it is greater than the preset accuracy threshold, and the machine audit compliance of the bidding documents is then output.
According to the invention, the original audit corpus is constructed in the compliance audit process of the bidding document, so that the bidding document is subjected to machine audit firstly, then expert auxiliary audit is performed, and further the machine audit is corrected and guided according to the expert auxiliary audit result, so that the audit efficiency is improved, the audit accuracy is ensured, the manual audit cost is reduced to a certain extent, and the method has a great practical advantage.
The foregoing is merely illustrative of the structure of the invention. Those skilled in the art may make various modifications, additions and substitutions to the specific embodiments described without departing from the structure of the invention or exceeding the scope of the invention as defined by the claims.

Claims (10)

1. The text language data processing method based on artificial intelligence is characterized by comprising the following steps:
(1) Uploading bidding documents: collecting all bidding documents corresponding to bidding projects of a designated power system, and uploading the bidding documents to a machine auditing terminal;
(2) Bidding document machine audit: an original auditing corpus is built, and text auditing is carried out on each bidding document by a machine auditing terminal by means of the original auditing corpus, and the concrete implementation process is as follows:
(21) Performing text term standard auditing on each bidding document;
(22) Performing text scheme rationality auditing on each bidding document;
(23) Based on the text expression standard auditing result and the text scheme rationality auditing result, counting the machine auditing compliance corresponding to each bidding document;
(3) Expert spot checking and auditing: extracting a plurality of spot check bidding documents from all bidding documents according to a set extraction principle, further manually marking and auditing each spot check bidding document by an expert, and counting the manual auditing compliance corresponding to each spot check bidding document;
(4) Evaluating machine auditing accuracy: comparing the manual auditing compliance degree corresponding to each spot check bidding document with the machine auditing compliance degree of the corresponding bidding document, thereby evaluating the machine auditing accuracy;
(5) Returning to audit judgment: comparing the machine auditing accuracy with a preset accuracy threshold, outputting the machine auditing compliance of each bidding document if the machine auditing accuracy is greater than the preset accuracy threshold, otherwise, returning other bidding documents except the spot check bidding document to audit, wherein the specific operation flow is to extract a correction auditing standard from expert auditing results corresponding to each spot check bidding document, supplement the correction auditing standard into an original auditing corpus, and further continue to perform machine auditing again, expert spot check again and machine auditing accuracy evaluation again according to (2) - (4);
(6) Outputting: comparing the repeated machine auditing accuracy with the preset accuracy threshold until the repeated machine auditing accuracy is greater than the preset accuracy threshold, and outputting the machine auditing compliance of the bidding documents.
2. A method for processing text language data based on artificial intelligence according to claim 1, wherein: the original audit corpus comprises a power bidding document sensitive language library and a power equipment technical standard language library.
3. A method for processing text language data based on artificial intelligence according to claim 2, wherein: the text term specification audit comprises a text technical term specification audit and a text wording specification audit, wherein the text technical term specification audit comprises the following specific implementation process:
performing sentence breaking and stop word removing processing on text information of each bidding document to obtain substantial text information of each bidding document;
entity identification and entity type marking are carried out on the substantial text information of each bidding document according to the clauses, and the entity and entity type corresponding to each clause in the substantial text information of each bidding document are obtained;
selecting the entities whose entity type is power equipment from the entities identified in the substantial text information of each bidding document, taking them as effective entities, and grouping the clauses belonging to the same effective entity in the same bidding document to obtain a clause set corresponding to each effective entity in each bidding document;
Extracting power equipment to be bid-tendered from a bid-tendering document corresponding to a specified power system bid-tendering project, taking the power equipment as a bid-tendering main body, and counting the number of the bid-tendering main bodies;
matching each bidding subject with effective entities in each bidding document, screening successfully matched entities from the effective entities, and taking the successfully matched entities as key entities;
integrating text contents of each clause in a clause set corresponding to each key entity in each bidding document to obtain integrated text information corresponding to each key entity in each bidding document;
extracting expression expressions of the technical parameters corresponding to the key entities from the integrated text information corresponding to the key entities in each bidding document, and extracting standard expression expressions of the technical parameters corresponding to the key entities from a technical standard expression library of the power equipment based on the names of the key entities;
comparing the expression term of each key entity corresponding to each technical parameter in each bidding document with the standard expression term of the key entity corresponding to the technical parameter in the technical standard term library of the power equipment, if the expression term of the key entity corresponding to a certain technical parameter is inconsistent, marking the expression term inconsistent, marking the key entity as a target entity, and marking the technical parameter as a target technical parameter.
4. A method of processing text language data based on artificial intelligence according to claim 3, wherein: the text wording specification audit comprises the following specific implementation process:
first, extracting the entity type corresponding to each bidding sensitive word from the power bidding document sensitive term library, and de-duplicating the entity types to obtain the bidding document sensitive entity types;
second, comparing the entity type corresponding to each clause identified in the substantial text information of each bidding document with the bidding document sensitive entity types, and if the entity type corresponding to a clause is consistent with a bidding document sensitive entity type, recording the clause as a sensitive clause;
third, matching the entity corresponding to each sensitive clause in the substantial text information of each bidding document with the bidding sensitive words stored in the power bidding document sensitive term library, and if the entity corresponding to a sensitive clause is successfully matched with a bidding sensitive word, making an improper-wording mark on the sensitive clause.
5. The artificial intelligence based text language data processing method according to claim 4, wherein: the specific implementation process of the text scheme rationality audit for each bidding document is as follows:
Matching a reference bidding project from the historical bidding projects based on the designated power system bidding projects, and further extracting display data of technical parameters corresponding to each bidding subject in a bidding document corresponding to the reference bidding project, and taking the display data as reference display data of technical parameters corresponding to each bidding subject;
and extracting display data of each technical parameter corresponding to each key entity in each bidding document from the integrated text information corresponding to each key entity in each bidding document, comparing the display data with reference display data of the technical parameter corresponding to the corresponding bidding subject, if the display data of the technical parameter corresponding to a certain key entity is inconsistent, carrying out unreasonable marking, marking the key entity as an abnormal entity, and marking the technical parameter as an abnormal technical parameter.
6. The artificial intelligence based text language data processing method of claim 5, wherein: the machine auditing compliance statistics process corresponding to each bidding document is as follows:
(231) Counting the number of term non-compliance marks in each bidding document, extracting the target entity and target technical parameter corresponding to each term non-compliance mark, further obtaining the composition importance of the target entity and the use value of the target technical parameter corresponding to each term non-compliance mark, and using the formula [Figure QLYQS_1] to calculate the text technical term compliance [Figure QLYQS_2] corresponding to each bidding document, where [Figure QLYQS_3] and [Figure QLYQS_4] denote, respectively, the weight factor of the target entity and the use value of the target technical parameter corresponding to the j-th term non-compliance mark in the i-th bidding document; i denotes the number of the bidding document, [Figure QLYQS_5]; and j denotes the number of the term non-compliance mark in each bidding document, [Figure QLYQS_6];
(232) Counting the number of improper-wording marks in each bidding document, extracting the bidding sensitive word matched by each improper-wording mark, obtaining the trade-off factor corresponding to each improper-wording mark, and using the formula [Figure QLYQS_7] to calculate the text wording compliance [Figure QLYQS_8] corresponding to each bidding document, where [Figure QLYQS_9] denotes the trade-off factor corresponding to the k-th improper-wording mark in the i-th bidding document; k denotes the number of the improper-wording mark in each bidding document, [Figure QLYQS_10]; and e denotes the natural constant;
(233) Counting the number of unreasonable marks in each bidding document, and substituting the display data and reference display data of the abnormal technical parameter of the abnormal entity corresponding to each unreasonable mark into the formula [Figure QLYQS_11] to calculate the text scheme reasonableness [Figure QLYQS_12] corresponding to each bidding document, where [Figure QLYQS_13] and [Figure QLYQS_14] denote, respectively, the display data and the reference display data of the abnormal technical parameter of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document; [Figure QLYQS_15] denotes the weight factor of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document; and f denotes the number of the unreasonable mark in each bidding document, [Figure QLYQS_16];
(234) Substituting [Figure QLYQS_17], [Figure QLYQS_18] and [Figure QLYQS_19] into the compliance audit model [Figure QLYQS_20] to calculate the machine auditing compliance degree [Figure QLYQS_21] of each bidding document.
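The formulas in steps (231) to (234) are published as images and are not reproduced in this text, so the following Python sketch only illustrates how such a scoring pipeline could be wired together. Every functional form in it (the `1 / (1 + penalty)` shape, the exponential decay built on the natural constant e, the weighted sum, and all default weights) and every name (`TermMark`, `term_compliance`, and so on) is an assumption made for illustration, not the claimed formula.

```python
import math
from dataclasses import dataclass


@dataclass
class TermMark:
    """One term non-compliance mark (hypothetical container)."""
    entity_weight: float  # weight factor of the target entity
    use_value: float      # use value degree of the target technical parameter


def term_compliance(marks):
    """Step (231), assumed form: more and heavier marks lower the score."""
    penalty = sum(m.entity_weight * m.use_value for m in marks)
    return 1.0 / (1.0 + penalty)


def wording_compliance(tradeoff_factors):
    """Step (232), assumed form: exponential decay in the summed trade-off factors."""
    return math.exp(-sum(tradeoff_factors))


def scheme_reasonableness(deviations):
    """Step (233), assumed form.

    deviations: iterable of (display_value, reference_value, mark_weight).
    The penalty is the weighted relative deviation from the reference value.
    """
    penalty = 0.0
    for display, reference, weight in deviations:
        if reference == 0:
            raise ValueError("reference display data must be non-zero")
        penalty += weight * abs(display - reference) / abs(reference)
    return 1.0 / (1.0 + penalty)


def machine_audit_compliance(term, wording, scheme, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Step (234), assumed form: weighted sum of the three partial scores."""
    return weights[0] * term + weights[1] * wording + weights[2] * scheme
```

Under these assumptions a document with no marks of any kind scores 1.0 on all three partial measures and therefore 1.0 overall; each additional mark can only lower the result.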
7. The artificial intelligence based text language data processing method of claim 6, wherein: the specific acquisition process of [Figure QLYQS_22] and [Figure QLYQS_23] is as follows:
determining, based on the designated power system bidding project, the composition importance of the target entity corresponding to each term non-compliance mark and the composition importance of the abnormal entity corresponding to each unreasonable mark;
extracting the bid amount of the corresponding entity from the bidding document corresponding to the designated power system bidding project;
performing the calculation using the expression [Figure QLYQS_24], where [Figure QLYQS_25] and [Figure QLYQS_26] denote, respectively, the composition importance of the target entity corresponding to the j-th term non-compliance mark and of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document, and [Figure QLYQS_27] and [Figure QLYQS_28] denote, respectively, the bid amount of the target entity corresponding to the j-th term non-compliance mark and of the abnormal entity corresponding to the f-th unreasonable mark in the i-th bidding document.
8. The artificial intelligence based text language data processing method of claim 1, wherein: the expert manually marks and audits each spot-check bidding document as follows: while auditing each spot-check bidding document, the expert manually marks the term non-compliance, misexpression and unreasonable marks, and annotates each mark.
9. The artificial intelligence based text language data processing method of claim 6, wherein: the evaluation expression of the machine auditing accuracy is [Figure QLYQS_29], where [Figure QLYQS_30] and [Figure QLYQS_31] denote, respectively, the machine auditing compliance degree and the manual auditing compliance degree of the d-th spot-check bidding document; d is the number of the spot-check bidding document, [Figure QLYQS_32]; and u is the number of spot-check bidding documents.
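The evaluation expression itself is published as an image and is not available in this text. As a hedged illustration only, the sketch below assumes that accuracy is the mean relative agreement between the machine auditing compliance degree and the manual auditing compliance degree over the u spot-check bidding documents; the function name `machine_audit_accuracy` and the agreement measure are hypothetical.

```python
def machine_audit_accuracy(machine_scores, manual_scores):
    """Mean relative agreement between machine and manual compliance degrees.

    Assumed form, not the claimed formula: for each spot-check document d,
    agreement = 1 - |machine_d - manual_d| / max(|machine_d|, |manual_d|),
    and the accuracy is the average agreement over all u documents.
    """
    if len(machine_scores) != len(manual_scores) or not machine_scores:
        raise ValueError("need two equally long, non-empty score lists")
    u = len(machine_scores)
    total = 0.0
    for machine, manual in zip(machine_scores, manual_scores):
        scale = max(abs(machine), abs(manual))
        # Two zero scores agree perfectly; avoid dividing by zero.
        total += 1.0 if scale == 0 else 1.0 - abs(machine - manual) / scale
    return total / u
```

With this assumed measure, identical machine and manual scores give an accuracy of 1.0, and the value falls toward 0 as the two sets of scores diverge.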
10. The artificial intelligence based text language data processing method of claim 1, wherein: the renewed expert spot check is carried out as follows:
acquiring the number of spot checks and the machine auditing accuracy corresponding to the previous expert spot check, and substituting them into the formula [Figure QLYQS_33] to obtain the number of spot checks [Figure QLYQS_34] for the renewed expert spot check, where [Figure QLYQS_35] denotes the number of spot checks in the previous expert spot check, [Figure QLYQS_36] denotes the machine auditing accuracy of the previous expert spot check, and [Figure QLYQS_37] denotes the preset accuracy threshold.
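The formula for the renewed spot-check count is likewise only available as an image. The sketch below is a hypothetical rule that uses the same three inputs named in the claim (previous count, previous machine auditing accuracy, preset accuracy threshold) and assumes the sample grows in proportion to how far the accuracy fell short of the threshold; the function name and the scaling rule are assumptions, not the claimed formula.

```python
import math


def next_spot_check_count(prev_count, prev_accuracy, threshold):
    """Number of documents for the renewed expert spot check (assumed rule).

    The sample is enlarged by the relative shortfall of the previous
    accuracy against the threshold and rounded up to a whole document.
    """
    if prev_count <= 0:
        raise ValueError("prev_count must be positive")
    if not 0 < threshold <= 1:
        raise ValueError("threshold must lie in (0, 1]")
    shortfall = max(0.0, threshold - prev_accuracy)
    return math.ceil(prev_count * (1.0 + shortfall / threshold))
```

For example, under this assumed rule a previous check of 20 documents with accuracy 0.5 against a threshold of 1.0 leads to 30 documents in the next round, while an accuracy at or above the threshold leaves the count unchanged.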
CN202310440514.8A 2023-04-23 2023-04-23 Text language data processing method based on artificial intelligence Active CN116150323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310440514.8A CN116150323B (en) 2023-04-23 2023-04-23 Text language data processing method based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN116150323A (en) 2023-05-23
CN116150323B (en) 2023-06-23

Family

ID=86354729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310440514.8A Active CN116150323B (en) 2023-04-23 2023-04-23 Text language data processing method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116150323B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111046142A (en) * 2019-12-13 2020-04-21 深圳前海环融联易信息科技服务有限公司 Text examination method and device, electronic equipment and computer storage medium
CN111723571A (en) * 2020-06-12 2020-09-29 上海极链网络科技有限公司 Text information auditing method and system
CN112597768A (en) * 2020-12-08 2021-04-02 北京百度网讯科技有限公司 Text auditing method and device, electronic equipment, storage medium and program product
CN113704498A (en) * 2021-09-01 2021-11-26 云知声(上海)智能科技有限公司 Intelligent auditing method and system for document
CN114677112A (en) * 2022-03-25 2022-06-28 国网江西省电力有限公司电力科学研究院 Power distribution network project large-scale evaluation method and system based on big data
CN114860882A (en) * 2022-05-18 2022-08-05 南京物浦大数据有限公司 Fair competition review auxiliary method based on text classification model
CN114970508A (en) * 2022-05-17 2022-08-30 国网浙江省电力有限公司电力科学研究院 Power text knowledge discovery method and device based on data multi-source fusion
CN115578045A (en) * 2021-06-21 2023-01-06 珠海采筑电子商务有限公司 Tender invitation auditing method, electronic equipment and related products
CN115689696A (en) * 2022-11-03 2023-02-03 安徽皖电招标有限公司 Intelligent bid evaluation method and system based on artificial intelligence technology

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080086298A1 (en) * 2006-10-10 2008-04-10 Anisimovich Konstantin Method and system for translating sentences between langauges

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231113

Address after: Room 608, Building J, Haitai Green Industry Base, No. 6 Haitai Development Sixth Road, Huayuan Industrial Zone, Xiqing District, Tianjin, 300392

Patentee after: TIANJIN RICHSOFT ELECTRIC POWER INFORMATION TECHNOLOGY Co.,Ltd.

Patentee after: State Grid Siji Location Service Co.,Ltd.

Address before: Room 608, Building J, Haitai Green Industry Base, No. 6 Haitai Development Sixth Road, Huayuan Industrial Zone, Xiqing District, Tianjin, 300392

Patentee before: TIANJIN RICHSOFT ELECTRIC POWER INFORMATION TECHNOLOGY Co.,Ltd.
