CN115249007A - Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison - Google Patents

Info

Publication number
CN115249007A
CN115249007A CN202210897373.8A CN202210897373A CN115249007A CN 115249007 A CN115249007 A CN 115249007A CN 202210897373 A CN202210897373 A CN 202210897373A CN 115249007 A CN115249007 A CN 115249007A
Authority
CN
China
Prior art keywords
information
document
bid
bidding
suppliers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210897373.8A
Other languages
Chinese (zh)
Inventor
陈荣木
林傅荣
童晓婷
林妍
陈小雷
林镇勋
牛京杰
查道鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bosi Digital Acquisition Technology Development Co ltd
Original Assignee
Bosi Digital Acquisition Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bosi Digital Acquisition Technology Development Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/194 Calculation of difference between files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0206 Price or cost determination based on market factors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/08 Auctions

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a device for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents. The method comprises the following steps: converting each bid document into plain text, denoising the plain text, and removing content consistent with information in the procurement document to obtain a valid text document; splitting all valid text documents into sentences, screening the sentences, calculating the simhash value of each sentence, finding similar sentences across different valid text documents, and splicing consecutive similar sentences to obtain similar information; extracting basic key information, quotation information, supplier electronic bid document production information, and bid security deposit payment account information from all valid text documents; and comparing the obtained information against the regulations to judge whether bidders have engaged in bid rigging or collusive bidding. The method locates possible bid-rigging and collusive bidding behavior more intuitively and accurately, reduces the routine workload of review experts, and improves their efficiency.

Description

Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison
Technical Field
The invention relates to the field of computer technology, and in particular to a method and a device for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents.
Background
In the prior art, suppliers seeking gain may engage in bid rigging or collusive bidding, which seriously harms the interests of the tenderer. Comparing bid documents purely by manual reading is inefficient and inaccurate; software-based approaches have the following three problems:
1. Reliability (accuracy) is low, and misjudgment is likely:
in practice, a result is judged as bid rigging once the similarity exceeds some value, but no fixed reference value holds across different procurement projects or scenarios, so the detected similarity cannot serve directly as the basis for judging bid rigging;
2. Similarity alone cannot identify which contents of two bid documents are the same:
following from the first point, when two bid documents with high similarity are detected, a review expert must intervene manually to judge whether collusive bidding exists; with only a similarity figure, the expert cannot quickly locate which contents are highly consistent or similar, and must still read and compare the two bid documents in full;
3. Key information that can determine bid-rigging behavior is not reflected:
for example, if the ID card numbers of the legal representatives in two bid documents are identical, the case can be determined as bid rigging; but an identical ID card number alone barely raises the similarity of the two documents, and such key information cannot be located directly from similarity information alone;
Application No. 2019113581250, entitled "A bid document similarity calculation method and device", discloses a method comprising: obtaining the valid text information of a first bid document and of a second bid document; searching, from page N-a to page N+b of the valid text information of the second bid document, for paragraphs with the same semantics as the paragraphs on page N of the valid text information of the first bid document, according to a preset same-word search algorithm; determining the number of identical words in the valid text information of the two bid documents from the paragraphs found; and determining the similarity of the first and second bid documents from the number of identical words. That method is stated to greatly improve the efficiency and accuracy of finding bid rigging and collusive bidding and to reduce labor and expansion costs. Its implementation idea is to calculate the text similarity of two bid documents; although it can indirectly detect possible bid rigging, in practice the likelihood of bid rigging cannot be judged from text similarity alone.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and a device for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents, which can locate possible bid-rigging behavior more intuitively and accurately, reduce the routine workload of review experts, and improve their efficiency.
In a first aspect, the invention provides a method for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents, comprising the following steps:
step 1, converting each bid document into plain text, denoising the plain text, and removing content consistent with information in the procurement document to obtain a valid text document;
step 2, splitting all valid text documents into sentences, screening the sentences, calculating the simhash value of each sentence, finding similar sentences across different valid text documents, and splicing consecutive similar sentences to obtain similar information;
step 3, extracting basic key information, quotation information, supplier electronic bid document production information, and bid security deposit payment account information from all valid text documents;
step 4, comparing and judging, according to the regulations, the information obtained in step 2, and the information obtained in step 3, whether bidders have engaged in bid-rigging or collusive bidding behavior.
Further, the method also comprises step 5: displaying the information from step 2, the information from step 3, and the result of step 4 according to set requirements.
Further, step 2 is specifically: splitting all valid text documents into sentences, using set punctuation marks as sentence separators;
screening the sentences: first removing duplicate sentences within the same valid text document, then selecting the sentences whose length is greater than a preset length;
calculating the simhash value of each selected sentence;
traversing all simhash values of one valid text document and calculating, in turn, the Hamming distance between each of them and every simhash value of another valid text document; marking the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value as similar sentences; and, where similar sentences are consecutive, splicing them to obtain similar information.
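The sentence comparison described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the feature choice (character bigrams), the MD5-based feature hash, the 64-bit fingerprint width, and the function names `simhash`, `hamming`, and `similar_sentences` are all assumptions made for the example. The default threshold of 3 follows the value the description mentions as typical.

```python
import hashlib


def simhash(text: str, bits: int = 64, ngram: int = 2) -> int:
    """Simhash fingerprint built from character n-grams (assumed feature choice)."""
    features = [text[i:i + ngram] for i in range(max(len(text) - ngram + 1, 1))]
    weights = [0] * bits
    for feat in features:
        # Hash each feature to `bits` bits; MD5 is used only as a stable hash here.
        digest = hashlib.md5(feat.encode("utf-8")).digest()[: bits // 8]
        h = int.from_bytes(digest, "big")
        for b in range(bits):
            weights[b] += 1 if (h >> b) & 1 else -1
    fingerprint = 0
    for b in range(bits):
        if weights[b] > 0:
            fingerprint |= 1 << b
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def similar_sentences(doc_a, doc_b, max_distance=3):
    """Pairs (sentence_a, sentence_b) whose Hamming distance is below max_distance."""
    hashes_b = [(s, simhash(s)) for s in doc_b]
    pairs = []
    for sa in doc_a:
        ha = simhash(sa)
        for sb, hb in hashes_b:
            if hamming(ha, hb) < max_distance:
                pairs.append((sa, sb))
    return pairs
```

The nested loop compares every sentence of one document with every sentence of the other, which mirrors the traversal in the text; a production system would likely index fingerprints to avoid the quadratic cost.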
Further, step 3 is specifically: extracting basic key information, quotation information, supplier electronic bid document production information, and bid security deposit payment account information from all valid text documents;
the basic key information includes: Chinese personal names, telephone numbers, addresses, email addresses, and company names;
Chinese name extraction: identifying and extracting name information in the valid text document based on the mmseg algorithm and a Chinese surname lexicon, and storing the extracted name information in a database;
telephone number extraction: extracting the telephone numbers in the valid text document using a regular expression, and storing the extracted telephone number information in a database;
address extraction: identifying and extracting address information in the valid text document based on the mmseg algorithm, an administrative division lexicon, and an address lexicon, and storing the extracted address information in a database;
email extraction: extracting the email addresses in the valid text document using a regular expression, and storing the extracted email information in a database;
company name extraction: identifying and extracting company name information in the valid text document based on the mmseg algorithm and a company name lexicon, and storing the extracted company name information in a database;
quotation information: obtaining the corresponding quotation information from a database for each supplier;
supplier electronic bid document production information: when a supplier uploads an electronic bid document, recording the MAC address and IP address of the computer that encrypted the electronic bid document and of the computer that uploaded it;
bid security deposit payment account information: when a supplier pays the deposit, recording the supplier's transfer-out account number and the unique virtual deposit account number into which the payment is transferred.
Further, step 4 is specifically:
comparing the electronic bid document production information of different suppliers pairwise, and if the MAC addresses of the computers that encrypted or uploaded the bid documents of different suppliers are identical, judging that the two suppliers are suspected of bid rigging;
comparing the basic key information of different suppliers pairwise, and if one or more of the name, telephone number, email address, company name, and address are identical, marking the two suppliers as suspected of bid rigging;
comparing the quotation information of different suppliers pairwise:
(1) the quotations of two or more suppliers are all abnormally high or all abnormally low, and the deviation rate of their quotation amounts is smaller than the overall deviation rate of all suppliers' quotations;
the method for judging whether a quotation is abnormally high or low is:
if the difference between the quotation of each of two or more suppliers and the average quotation of all suppliers is above the abnormally-high/abnormally-low difference threshold, the quotation is abnormally high or abnormally low;
bid evaluation benchmark price = the lowest quotation among all suppliers' quotations;
deviation rate = |bidder's quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%;
overall deviation rate = mean of the deviation rates of the quotations of the suppliers involved in the calculation;
(2) more than two suppliers submit stepped quotations;
the method for judging stepped quotations is:
calculating the absolute value of the difference between every pair of suppliers' quotations, and finding the suppliers whose differences have the same absolute value where more than 2 suppliers are involved; the quotations of these suppliers are stepped quotations;
when the quotation information meets either of the above conditions, marking the corresponding suppliers as suspected of bid rigging;
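The quantities used in condition (1) can be worked through with a short sketch. The function names, the supplier labels, and the example threshold are hypothetical; only the three formulas (benchmark price as the lowest quotation, deviation rate relative to that benchmark, and overall deviation rate as the mean) come from the text. How the flags are finally combined is left out because the text does not pin it down.

```python
from statistics import mean
from typing import Dict


def deviation_rates(quotes: Dict[str, float]) -> Dict[str, float]:
    """Deviation rate (%) of each quotation from the lowest quotation."""
    benchmark = min(quotes.values())  # bid evaluation benchmark price
    return {s: abs(q - benchmark) * 100 / benchmark for s, q in quotes.items()}


def overall_deviation_rate(quotes: Dict[str, float]) -> float:
    """Mean of the individual deviation rates."""
    return mean(deviation_rates(quotes).values())


def abnormal_quotes(quotes: Dict[str, float], diff_threshold: float) -> Dict[str, str]:
    """Suppliers whose quotation differs from the average by more than the threshold."""
    avg = mean(quotes.values())
    return {
        s: ("high" if q > avg else "low")
        for s, q in quotes.items()
        if abs(q - avg) > diff_threshold
    }
```

For quotations of 100, 110, and 150, the benchmark price is 100, the deviation rates are 0%, 10%, and 50%, and the overall deviation rate is 20%; with an assumed threshold of 25, only the quotation of 150 is abnormally high.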
calculating similarity values between different bid documents;
the similarity value Sab of two valid text documents is calculated as follows:
S is the text length of the similar information content of the two valid text documents;
La is the text length of one valid text document, and Lb is the text length of the other;
Sab = S / min(La, Lb) × 100%; if the similarity value is greater than a set threshold, the two suppliers are judged to be suspected of bid rigging;
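As a worked instance of the formula, the sketch below computes Sab from the three lengths. The function name and the zero-length guard are additions for illustration; the text does not say how empty documents are handled.

```python
def similarity_value(similar_len: int, len_a: int, len_b: int) -> float:
    """Sab = S / min(La, Lb) * 100, expressed as a percentage."""
    shorter = min(len_a, len_b)
    if shorter == 0:
        # Assumed behavior: an empty document has no measurable similarity.
        return 0.0
    return similar_len * 100 / shorter
```

With 300 characters of similar content between documents of 1000 and 1500 characters, Sab is 30%. Dividing by the shorter length means a short bid copied wholesale into a longer one still scores close to 100%.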
comparing the bid security deposit payment account information of different suppliers pairwise, and if different suppliers use the same transfer-out account number or the same unique virtual deposit account number for the incoming transfer, directly judging that the corresponding suppliers have engaged in bid rigging.
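The pairwise identity checks in this step (MAC addresses, basic key information, deposit accounts) share one shape: compare every pair of suppliers field by field and report exact matches. A minimal sketch follows; the record layout and the field names `mac` and `out_account` are hypothetical.

```python
from itertools import combinations


def shared_value_pairs(records: dict, fields: tuple) -> list:
    """Return (supplier_a, supplier_b, field) for every field with an identical value.

    `records` maps a supplier name to a dict of extracted values.
    Missing or empty values are never treated as a match.
    """
    hits = []
    for (a, rec_a), (b, rec_b) in combinations(records.items(), 2):
        for field in fields:
            value = rec_a.get(field)
            if value and value == rec_b.get(field):
                hits.append((a, b, field))
    return hits
```

For example, if suppliers A and B uploaded from the same MAC address and suppliers B and C paid from the same account, the result is `[("A", "B", "mac"), ("B", "C", "out_account")]`. Whether a hit means "suspected" or "directly judged" depends on the field, as the text specifies.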
In a second aspect, the invention provides a device for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents, comprising:
a bid document preprocessing module, used for converting each bid document into plain text, denoising the plain text, and removing content consistent with information in the procurement document to obtain a valid text document;
a bid document similar content detection module, used for splitting all valid text documents into sentences, screening the sentences, calculating the simhash value of each sentence, finding similar sentences across different valid text documents, and splicing consecutive similar sentences to obtain similar information;
a bid document key information extraction module, used for extracting basic key information, quotation information, supplier electronic bid document production information, and bid security deposit payment account information from all valid text documents;
a bid document detection module, used for comparing and judging, according to the regulations, the information obtained from the bid document similar content detection module, and the information obtained from the bid document key information extraction module, whether bidders have engaged in bid-rigging or collusive bidding behavior;
and a display module, used for displaying the information from the bid document similar content detection module, the information from the bid document key information extraction module, and the result from the bid document detection module according to set requirements.
Further, the bid document similar content detection module is specifically: splitting all valid text documents into sentences, using set punctuation marks as sentence separators;
screening the sentences: first removing duplicate sentences within the same valid text document, then selecting the sentences whose length is greater than a preset length;
calculating the simhash value of each selected sentence;
traversing all simhash values of one valid text document and calculating, in turn, the Hamming distance between each of them and every simhash value of another valid text document; marking the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value as similar sentences; and, where similar sentences are consecutive, splicing them to obtain similar information.
Further, the bid document key information extraction module is specifically: extracting basic key information, quotation information, supplier electronic bid document production information, and bid security deposit payment account information from all valid text documents;
the basic key information includes: Chinese personal names, telephone numbers, addresses, email addresses, and company names;
Chinese name extraction: identifying and extracting name information in the valid text document based on the mmseg algorithm and a Chinese surname lexicon, and storing the extracted name information in a database;
telephone number extraction: extracting the telephone numbers in the valid text document using a regular expression, and storing the extracted telephone number information in a database;
address extraction: identifying and extracting address information in the valid text document based on the mmseg algorithm, an administrative division lexicon, and an address lexicon, and storing the extracted address information in a database;
email extraction: extracting the email addresses in the valid text document using a regular expression, and storing the extracted email information in a database;
company name extraction: identifying and extracting company name information in the valid text document based on the mmseg algorithm and a company name lexicon, and storing the extracted company name information in a database;
quotation information: obtaining the corresponding quotation information from a database for each supplier;
supplier electronic bid document production information: when a supplier uploads an electronic bid document, recording the MAC address and IP address of the computer that encrypted the electronic bid document and of the computer that uploaded it;
bid security deposit payment account information: when a supplier pays the deposit, recording the supplier's transfer-out account number and the unique virtual deposit account number into which the payment is transferred.
Further, the bid document detection module is specifically:
comparing the electronic bid document production information of different suppliers pairwise, and if the MAC addresses of the computers that encrypted or uploaded the bid documents of different suppliers are identical, judging that the two suppliers are suspected of bid rigging;
comparing the basic key information of different suppliers pairwise, and if one or more of the name, telephone number, email address, company name, and address are identical, marking the two suppliers as suspected of bid rigging;
comparing the quotation information of different suppliers pairwise:
(1) the quotations of two or more suppliers are all abnormally high or all abnormally low, and the deviation rate of their quotation amounts is smaller than the overall deviation rate of all suppliers' quotations;
the method for judging whether a quotation is abnormally high or low is:
if the difference between the quotation of each of two or more suppliers and the average quotation of all suppliers is above the abnormally-high/abnormally-low difference threshold, the quotation is abnormally high or abnormally low;
bid evaluation benchmark price = the lowest quotation among all suppliers' quotations;
deviation rate = |bidder's quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%;
overall deviation rate = mean of the deviation rates of the quotations of the suppliers involved in the calculation;
(2) more than two suppliers submit stepped quotations;
the method for judging stepped quotations is:
calculating the absolute value of the difference between every pair of suppliers' quotations, and finding the suppliers whose differences have the same absolute value where more than 2 suppliers are involved; the quotations of these suppliers are stepped quotations;
when the quotation information meets either of the above conditions, marking the corresponding suppliers as suspected of bid rigging;
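One literal reading of the stepped-quotation rule in condition (2) is sketched below: group all supplier pairs by the absolute difference of their quotations and report any difference that involves more than 2 distinct suppliers. The function name and the rounding precision are assumptions, and the text may intend a stricter rule (for example, requiring the quotations to form one evenly spaced ladder).

```python
from collections import defaultdict
from itertools import combinations


def step_quotations(quotes: dict, ndigits: int = 2) -> dict:
    """Map each shared absolute difference to the suppliers involved (more than 2)."""
    by_diff = defaultdict(set)
    for (a, qa), (b, qb) in combinations(quotes.items(), 2):
        # Round so that floating-point noise does not split equal differences.
        diff = round(abs(qa - qb), ndigits)
        by_diff[diff].update((a, b))
    return {d: sorted(s) for d, s in by_diff.items() if len(s) > 2}
```

With quotations A=100, B=110, C=120, and D=157, the pairs A-B and B-C both differ by 10, so A, B, and C are reported as a stepped group while D is not. Note that this reading would also group two unrelated pairs that happen to share a difference, so results would still need expert review.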
calculating similarity values between different bid documents;
the similarity value Sab of two valid text documents is calculated as follows:
S is the text length of the similar information content of the two valid text documents;
La is the text length of one valid text document, and Lb is the text length of the other;
Sab = S / min(La, Lb) × 100%; if the similarity value is greater than a set threshold, the two suppliers are judged to be suspected of bid rigging;
comparing the bid security deposit payment account information of different suppliers pairwise, and if different suppliers use the same transfer-out account number or the same unique virtual deposit account number for the incoming transfer, directly judging that the corresponding suppliers have engaged in bid rigging.
One or more technical solutions provided in the embodiments of the present invention have at least the following technical effects or advantages:
with the method and the device for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents, possible bid-rigging behavior can be located more intuitively and accurately through the side-by-side display of similar bid document content and the extraction of key information; the routine workload of review experts is reduced and their efficiency improved, and with this assistance experts can notice bid-rigging information that might otherwise have been overlooked.
The above is only an overview of the technical solutions of the invention, given so that its technical means can be understood more clearly and implemented according to the description, and so that its above and other objects, features, and advantages are easier to understand.
Drawings
The invention is further described below through embodiments and with reference to the accompanying drawings.
FIG. 1 is a schematic block diagram of the system of the present invention;
FIG. 2 is a flow chart of a method according to one embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a device according to a second embodiment of the present invention.
Detailed Description
The embodiments of the application provide a method and a device for detecting bid-rigging and collusive bidding behavior based on comparison of electronic bid documents, which address the inability of the prior art to judge such behavior accurately; with the technical scheme of the invention, existing bid-rigging behavior can be located more accurately and the workload of experts is greatly reduced.
The general idea of the technical scheme in the embodiments of the application is as follows:
1. As shown in FIG. 1, the invention consists of the following parts:
1. a bid document preprocessing module;
2. a bid document similar content detection module;
3. a bid document key information extraction module;
4. a bid-rigging behavior analysis and detection module;
5. a display module for similar content and identical key information of bid documents.
2. The characteristics and the functions of each component part are as follows:
1. Bid document preprocessing module
1. Convert the bid document to plain text (txt): bid documents are generally PDF files, and the PDF is converted into a text document (txt) with a tool.
2. Denoise the converted text document: this mainly addresses layout problems. Because of the characteristics of PDF files, the converted text may contain layout artifacts such as redundant spaces and line breaks; this step removes them.
3. Remove key content of the procurement document: to avoid misjudgment in subsequent steps, content in the text document that is consistent with key information in the procurement document is removed.
Using the algorithm of the bid document similar content detection module, the procurement document is treated as one of the compared objects and compared with each bid document in turn to obtain the content or paragraphs that are similar between the bid document and the procurement document; the text corresponding to those similar contents or paragraphs is then deleted.
After these 3 preprocessing steps, the valid text document of the bid document is obtained, hereinafter referred to as the text.
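The denoising in item 2 can be illustrated with a few regular-expression substitutions. This is a sketch under assumptions: the function name and the exact whitespace rules are not from the source, which only says that redundant spaces and line breaks are removed.

```python
import re


def denoise(text: str) -> str:
    """Collapse redundant spaces and line breaks left over from PDF conversion."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")  # normalize line endings
    text = re.sub(r"[ \t\u3000]+", " ", text)  # runs of spaces, tabs, full-width spaces
    text = re.sub(r" *\n *", "\n", text)       # spaces around line breaks
    text = re.sub(r"\n{2,}", "\n", text)       # blank lines
    return text.strip()
```

For instance, `denoise("投标  文件 \n\n\n 报价\t100 ")` yields `"投标 文件\n报价 100"`. Single line breaks are kept because the later sentence-splitting step treats them as separators.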
2. Bid document similar content detection module
1. Split the text into sentences: common punctuation marks (commas, periods, question marks, line breaks, etc., both Chinese and English) are used as sentence separators to split the text into sentences.
2. Screen out valid sentences:
(1) first, duplicate sentences are removed;
(2) then, sentences longer than a preset length are selected: in practice, some short sentences would be judged as repeated and interfere with normal judgment, so only sentences longer than the preset length are processed further; a preset length of 12 characters is generally reasonable.
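Items 1 and 2 above can be sketched as two small functions. The separator set is an assumption covering the punctuation the text names plus exclamation marks and semicolons (standing in for its "etc."); the default of 12 characters is the value the text suggests.

```python
import re

# Chinese and English commas, periods, question marks, exclamation marks,
# semicolons, and line breaks (the last three kinds are assumed).
SEPARATORS = r"[,，.。?？!！;；\n]"


def split_sentences(text: str) -> list:
    """Split text on the separator set, dropping empty fragments."""
    return [s.strip() for s in re.split(SEPARATORS, text) if s.strip()]


def screen_sentences(sentences: list, min_length: int = 12) -> list:
    """Remove duplicates (keeping first occurrences), then keep only long sentences."""
    seen = set()
    kept = []
    for s in sentences:
        if s in seen:
            continue
        seen.add(s)
        if len(s) > min_length:
            kept.append(s)
    return kept
```

Treating the English period as a separator also splits decimal numbers such as prices, which is harmless here because the resulting short fragments fall below the length filter.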
3. Calculate sentence simhash values: the simhash value of each screened sentence is calculated one by one and stored in an in-memory database for later use.
4. The above steps are repeated to calculate the sentence simhash values (hereinafter, simhash values) for the bid documents of all suppliers.
5. Find similar sentences across the bid documents of different suppliers: the bid documents of different suppliers are compared pairwise based on sentence simhash values.
(1) Traverse all simhash values of supplier A's bid document and calculate, in turn, the Hamming distance to every simhash value of supplier B's bid document.
(2) Mark the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value (in practice, 3 is generally a good value) as similar content, and store them in the database.
6. Find similar paragraphs across the bid documents of different suppliers: based on the similar sentences from item 5, similar paragraph information is obtained by a continuous-similarity diffusion search over sentences (that is, starting from the similar sentences found in the preceding step, consecutive similar sentences are concatenated sentence by sentence into a paragraph composed of several sentences) and stored in the database.
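The splicing in item 6 can be sketched as run detection over index pairs. The representation is an assumption: each match is a pair (index in document A, index in document B), and a run is a sequence of matches in which both indices advance by one. The function name `splice_runs` is hypothetical.

```python
def splice_runs(sentences_a: list, matches: list) -> list:
    """Concatenate consecutive similar sentences of document A into paragraphs.

    `matches` holds (index_in_a, index_in_b) pairs of similar sentences.
    """
    match_set = set(matches)
    paragraphs = []
    for i, j in sorted(match_set):
        if (i - 1, j - 1) in match_set:
            continue  # this pair continues a run that started earlier
        run = []
        while (i, j) in match_set:
            run.append(sentences_a[i])
            i += 1
            j += 1
        # Separators were dropped during sentence splitting, so the
        # sentences are joined directly.
        paragraphs.append("".join(run))
    return paragraphs
```

Given sentences `["s0", "s1", "s2", "s3", "s4"]` and matches `[(1, 4), (2, 5), (4, 0)]`, the result is `["s1s2", "s4"]`: sentences 1 and 2 form one run, and sentence 4 stands alone.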
3. Bid document key information extraction module
Based on the effective text documents produced by the bid document preprocessing module in step one, this module extracts the key information that can be used to detect bid-rigging and collusive bidding behavior. The following information is extracted:
1. Basic key information extraction submodule
(1) Chinese name extraction submodule
Based on the mmseg algorithm (a dictionary-based Chinese word segmentation algorithm) and a Chinese surname lexicon, possible personal names in the text are identified and extracted, and the extracted names are stored in a database.
(2) Telephone number extraction submodule
Telephone numbers (both landline and mobile numbers) carried in the text are extracted using a regular expression, which is /^ ([ 1] \ { d {10} | ([ \ ((. The extracted telephone numbers are stored in a database.
(3) Address extraction submodule
Based on the mmseg algorithm (a dictionary-based Chinese word segmentation algorithm), an administrative division lexicon and a common address lexicon, possible address information in the text is identified and extracted, and the extracted addresses are stored in a database.
(4) Email address extraction submodule
Email addresses carried in the text are extracted using a regular expression, which is /[ A-Za-z0-9] + ([ \\. ] [ A-Za-z0-9] +) + [ @ ([ A-Za-z0-9\ - ] + ].) + [ A-Za-z ] {2,6} $/. The extracted email addresses are stored in a database.
(5) Company name extraction submodule
Based on the mmseg algorithm (a dictionary-based Chinese word segmentation algorithm) and a common company name lexicon, possible company names in the text are identified and extracted, and the extracted company names are stored in a database.
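Regex-based extraction of telephone numbers and email addresses can be illustrated with the following Python sketch. The two patterns are simplified stand-ins chosen for illustration (an assumption), not the expressions used by the submodules above; PHONE_RE, EMAIL_RE and extract_contacts are hypothetical names.

```python
import re

# Illustrative patterns (assumptions): an 11-digit mobile number starting with 1,
# or a 7-8 digit landline number with an optional area code such as 0591-.
PHONE_RE = re.compile(r"(?<!\d)(?:1\d{10}|(?:0\d{2,3}-?)?\d{7,8})(?!\d)")
# A simple email shape: local part, "@", dotted domain, 2-6 letter top-level domain.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9]+(?:[_.\-][A-Za-z0-9]+)*@(?:[A-Za-z0-9\-]+\.)+[A-Za-z]{2,6}"
)

def extract_contacts(text: str) -> dict:
    """Collect the distinct telephone numbers and email addresses found in text."""
    return {
        "phones": set(PHONE_RE.findall(text)),
        "emails": set(EMAIL_RE.findall(text)),
    }
```

Returning sets keeps each value once per document, which is convenient when the extracted values of two suppliers are later intersected.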
2. Quotation information extraction submodule
Suppliers submit their quotations as structured data, so the quotations can be read from the database without special processing. (In the actual implementation the quotation information is stored in the database in structured form, and reading it directly from the database is the more convenient option; it could also be extracted from the bid document algorithmically.) Quotation information is an important basis for judging bid-rigging behavior and is therefore included in the key information.
3. Supplier electronic bid document production information extraction submodule
When a supplier uploads an electronic bid document, the system records the MAC address and IP address of the computer that encrypted the electronic bid document and of the computer that uploaded it. These addresses serve as key information for judging bid rigging.
4. Bid deposit payment account information extraction submodule
When a supplier pays the bid deposit, the system records the supplier's transfer-out account number and the unique virtual deposit account number the payment is transferred into. These serve as key information for judging bid rigging.
4. Bid document bid-rigging and collusive bidding behavior analysis and detection module
Article 40 of the Regulations on the Implementation of the Tendering and Bidding Law of the People's Republic of China provides:
In any of the following circumstances, bidders are deemed to have colluded with each other in bidding:
(I) the bid documents of different bidders are prepared by the same unit or individual;
(II) different bidders entrust the same unit or individual to handle bidding matters;
(III) the project management members named in the bid documents of different bidders are the same person;
(IV) the bid documents of different bidders are abnormally consistent, or their bid quotations differ in a regular pattern;
(V) the bid documents of different bidders are mixed up with one another;
(VI) the bid deposits of different bidders are transferred from the account of the same unit or individual.
Based on these provisions, collusive bidding behavior can be identified and detected by technical means.
1. Detection submodule for bid documents of different bidders prepared by the same unit or individual
Using the supplier electronic bid document production information extraction submodule of step three, the bid document production information of different suppliers is compared pairwise. If the MAC addresses of the computers that encrypted or uploaded the bid documents of different suppliers are identical, the two suppliers are judged to be suspected of bid rigging.
2. Detection submodule for different bidders entrusting the same unit or individual to handle bidding matters, and for project management members being the same person
Using the basic key information extraction submodule of step three, the basic key information of different suppliers is compared pairwise. If one or more items such as names, telephone numbers, email addresses, company names or addresses are identical, the two suppliers are recorded as showing suspicious bid-rigging behavior, pending further review and judgment by the evaluation experts.
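A minimal sketch of this pairwise comparison, assuming the extracted values of each supplier are already available as sets keyed by field. The field names and the function shared_key_info are hypothetical and chosen only for illustration.

```python
from itertools import combinations

FIELDS = ("names", "phones", "emails", "companies", "addresses")

def shared_key_info(suppliers):
    """suppliers maps supplier id -> {field: set of extracted values}.

    Returns (supplier_a, supplier_b, overlap) for every pair that shares
    at least one value in at least one field.
    """
    findings = []
    for a, b in combinations(sorted(suppliers), 2):
        overlap = {}
        for field in FIELDS:
            common = suppliers[a].get(field, set()) & suppliers[b].get(field, set())
            if common:
                overlap[field] = sorted(common)
        if overlap:
            findings.append((a, b, overlap))
    return findings
```

The overlap dictionary records which fields matched and on which values, so that the evaluation experts can see why a pair was flagged rather than only that it was.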
3. Detection submodule for abnormally consistent bid documents or regularly differing bid quotations of different bidders
Using the quotation information extraction submodule of step three, the quotation information of different suppliers is compared pairwise. Regular differences are judged as follows:
(1) The quotations of two or more suppliers are abnormally high or abnormally low, and the deviation rates of their quoted amounts are smaller than the overall deviation rate of all suppliers' quotations.
Abnormally high and abnormally low quotations are judged as follows:
The quotations of two or more suppliers differ from the average quotation of all suppliers by more than 20% (an empirical threshold for ordinary cases). The threshold can be adjusted dynamically according to the type of item purchased and the procurement budget; the adjustment algorithm is based on big-data analysis of historical data.
Bid evaluation benchmark price = the lowest of all suppliers' bid quotations
Deviation rate = |bidder quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%
Overall deviation rate = average of the deviation rates of the bid quotations of the suppliers included in the calculation
That the deviation rates of two or more suppliers' quotations are smaller than the overall deviation rate of all suppliers' quotations means that these suppliers' quotations show a clear "clustering" similarity, i.e. their quoted amounts are very close to one another.
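The calculation in (1) might be sketched as follows. The function name quotation_deviation is hypothetical, and measuring the 20% gap relative to the average quotation is an assumption; the benchmark price, deviation rate and overall deviation rate follow the formulas above.

```python
def quotation_deviation(quotes, abnormal_threshold=0.20):
    """quotes maps supplier -> bid quotation (positive numbers)."""
    average = sum(quotes.values()) / len(quotes)
    benchmark = min(quotes.values())  # bid evaluation benchmark price
    # Deviation rate of each quotation from the benchmark price (as a fraction).
    rates = {s: abs(q - benchmark) / benchmark for s, q in quotes.items()}
    overall_rate = sum(rates.values()) / len(rates)
    # Abnormally high or low: further from the average than the threshold
    # (taking the gap relative to the average is an assumption).
    abnormal = {s for s, q in quotes.items()
                if abs(q - average) / average > abnormal_threshold}
    clustered = sorted(s for s in abnormal if rates[s] < overall_rate)
    return {
        "benchmark": benchmark,
        "overall_rate": overall_rate,
        # The condition requires two or more suppliers to satisfy it together.
        "flagged": clustered if len(clustered) >= 2 else [],
    }
```

For example, with quotations of 60, 62, 100, 100 and 100, the two lowest quotations are more than 20% below the average of 84.4 and their deviation rates (0% and about 3.3%) are below the overall deviation rate of about 40.7%, so those two suppliers are flagged.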
(2) Step quotations exist among the quotations of more than two suppliers. For example, three suppliers A, B and C quote 100, 200 and 300 respectively, with a step interval of 100.
Step quotations are judged as follows:
The absolute difference between every pair of supplier bid quotations is calculated; n suppliers yield n × (n-1)/2 absolute differences. The associated suppliers whose pairwise absolute differences are equal, and which number more than 2, are then found. The quotations of these suppliers are step quotations.
When the quotation information meets either of the above conditions, the corresponding suppliers are marked as showing suspicious bid-rigging behavior, pending further review and judgment by the evaluation experts.
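One way to realize the step quotation judgment is sketched below. Interpreting "associated suppliers" as suppliers linked into one group by pairs with the same absolute difference is an assumption, as is the function name step_quotation_groups. Quotations are compared exactly, so integer amounts (for example in cents) are assumed.

```python
from collections import defaultdict
from itertools import combinations

def step_quotation_groups(quotes):
    """quotes maps supplier -> quotation. Returns [(step, [suppliers...]), ...]."""
    pairs_by_diff = defaultdict(list)
    for (a, qa), (b, qb) in combinations(sorted(quotes.items()), 2):
        if qa != qb:
            pairs_by_diff[abs(qa - qb)].append((a, b))

    groups = []
    for diff, pairs in sorted(pairs_by_diff.items()):
        parent = {}

        def find(x):
            # Union-find lookup with path halving.
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        # Link suppliers whose quotations differ by exactly this amount.
        for a, b in pairs:
            parent[find(a)] = find(b)
        members = defaultdict(set)
        for supplier in list(parent):
            members[find(supplier)].add(supplier)
        # Only groups of more than 2 suppliers count as step quotations.
        groups += [(diff, sorted(m)) for m in members.values() if len(m) > 2]
    return groups
```

With the quotations 100, 200 and 300 from the example above plus an unrelated quotation of 457, the only group found is the three suppliers linked by the step interval of 100.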
4. Detection submodule for mixing of bid documents between different bidders
A reference similarity value between different bid documents is calculated from the similar paragraph content obtained by the bid document similar content detection module in step two.
The similarity value Sab of two bid documents A and B is calculated as follows:
let S be the text length (number of characters) of the similar paragraph content of bid documents A and B, as calculated by the bid document similar content detection module in step two;
let La be the text length of bid document A after the text similar to the procurement document has been removed;
let Lb be the text length of bid document B after the text similar to the procurement document has been removed;
the similarity value of bid documents A and B is Sab = S / Min(La, Lb) × 100%, where Min takes the minimum of its arguments.
The result is displayed to the evaluation experts through a bid document comparison interface, and the experts make the further judgment.
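The similarity formula can be written directly in code. In this sketch the function name similarity_percent is hypothetical, and the inputs are assumed to be the similar paragraphs found for the pair and the two effective text documents as strings.

```python
def similarity_percent(similar_paragraphs, doc_a: str, doc_b: str) -> float:
    """Sab = S / Min(La, Lb) x 100%, with all lengths counted in characters."""
    s = sum(len(paragraph) for paragraph in similar_paragraphs)  # S
    shorter = min(len(doc_a), len(doc_b))                        # Min(La, Lb)
    if shorter == 0:
        return 0.0  # an empty document cannot be similar to anything
    return 100.0 * s / shorter
```

Dividing by the shorter document means that a short bid copied wholesale into a longer one still scores close to 100%, which matches the intent of detecting mixed or shared content.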
5. Detection submodule for bid deposits of different bidders transferred from the account of the same unit or individual
Using the bid deposit payment account information extraction submodule of step three, the deposit payments of different suppliers are compared pairwise. If different suppliers use the same transfer-out account or the same transfer-in account, the corresponding suppliers are directly judged to have engaged in collusive bidding.
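Instead of comparing every pair explicitly, the same check can be done by grouping suppliers by account, as in the following sketch. The record layout (supplier, transfer-out account, virtual deposit account) and the function name shared_accounts are assumptions for illustration.

```python
from collections import defaultdict

def shared_accounts(payments):
    """payments: iterable of (supplier, transfer_out_account, virtual_deposit_account).

    Returns every account used by more than one supplier, keyed by
    ("out", account) or ("in", account).
    """
    by_account = defaultdict(set)
    for supplier, out_account, in_account in payments:
        by_account[("out", out_account)].add(supplier)
        by_account[("in", in_account)].add(supplier)
    return {key: sorted(suppliers)
            for key, suppliers in by_account.items()
            if len(suppliers) > 1}
```

Any two suppliers that appear together under one key would also have been matched by the pairwise comparison, so the result is equivalent while needing only one pass over the payment records.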
5. Bid document similar content and identical key information display module
Based on the processing results of steps two, three and four, an intuitive and clear interface is provided so that the evaluation experts can make the final judgment on suppliers' bid-rigging behavior. The display module has the following features:
1. a single page supports displaying the bid documents of up to 4 suppliers;
2. the similar sentences or paragraphs detected in step two are highlighted for side-by-side comparison;
3. the key information detected in step three is compared and displayed using different colors and marking styles;
4. the confirmed and suspected bid-rigging results detected in step four are displayed, helping the evaluation experts complete the final evaluation quickly.
Embodiment one
As shown in fig. 2, the present embodiment provides a method for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison, including:
step 1, converting a bid document into plain text, denoising the plain text, and removing content consistent with the information in the procurement document to obtain an effective text document;
step 2, dividing all the effective text documents into sentences, using set punctuation marks as sentence separators;
screening the sentences: first removing duplicate sentences within the same effective text document, and then selecting the sentences whose length is greater than a preset length;
calculating the simhash value of each selected sentence;
traversing all the simhash values of one effective text document and calculating, in turn, the Hamming distance between each of them and all the simhash values of another effective text document; marking the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value as similar sentences; if the similar sentences include consecutive sentences, splicing them to obtain similar information;
step 3, extracting basic key information, quotation information, supplier electronic bid document production information and bid deposit payment account information from all effective text documents;
the basic key information includes: Chinese names, telephone numbers, addresses, email addresses and company names;
Chinese name extraction: identifying and extracting name information in the effective text document based on the mmseg algorithm and a Chinese surname lexicon, and storing the extracted name information in a database;
telephone number extraction: extracting the telephone numbers in the effective text document using a regular expression, and storing the extracted telephone numbers in a database;
address extraction: identifying and extracting address information in the effective text document based on the mmseg algorithm, an administrative division lexicon and an address lexicon, and storing the extracted address information in a database;
email address extraction: extracting the email addresses in the effective text document using a regular expression, and storing the extracted email addresses in a database;
company name extraction: identifying and extracting company name information in the effective text document based on the mmseg algorithm and a company name lexicon, and storing the extracted company name information in a database;
quotation information: acquiring the corresponding quotation information from the database for each supplier;
supplier electronic bid document production information: when a supplier uploads an electronic bid document, recording the MAC address and IP address of the computer that encrypted the electronic bid document and of the computer that uploaded it;
bid deposit payment account information: when a supplier pays the bid deposit, recording the supplier's transfer-out account number and the unique virtual deposit account number transferred into;
step 4, comparing the electronic bid document production information of different suppliers pairwise, and if the MAC addresses of the computers that encrypted or uploaded the bid documents of different suppliers are identical, judging that the two suppliers are suspected of bid rigging;
comparing the basic key information of different suppliers pairwise, and if one or more of the names, telephone numbers, email addresses, company names and addresses are identical, marking the two suppliers as showing suspicious bid-rigging behavior;
comparing the quotation information of different suppliers pairwise:
(1) the quotations of two or more suppliers are abnormally high or abnormally low, and the deviation rates of their quoted amounts are smaller than the overall deviation rate of all suppliers' quotations;
abnormally high or abnormally low quotations are judged as follows:
if the difference between the quotations of two or more suppliers and the average quotation of all suppliers exceeds the difference threshold for abnormally high and abnormally low quotations, the quotations are abnormally high or abnormally low;
bid evaluation benchmark price = the lowest of all suppliers' bid quotations;
deviation rate = |bidder quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%;
overall deviation rate = average of the deviation rates of the bid quotations of the suppliers included in the calculation;
(2) step quotations exist among the quotations of more than two suppliers;
step quotations are judged as follows:
calculating the absolute difference between every pair of supplier bid quotations, and finding the associated suppliers whose absolute differences are equal and which number more than 2, the quotations of these suppliers being step quotations;
when the quotation information meets either of the above conditions, marking the corresponding suppliers as showing suspicious bid-rigging behavior;
calculating similarity values between different bid documents;
the similarity value Sab of two effective text documents is calculated as follows:
the text length of the similar information content of the two effective text documents is S;
the text length of one effective text document is La, and the text length of the other effective text document is Lb;
the similarity value Sab = S / Min(La, Lb) × 100%; if the similarity value is greater than a set threshold, the two suppliers are judged to be suspected of bid rigging;
comparing the bid deposit payment account information of different suppliers pairwise, and if different suppliers use the same transfer-out account number or the same transferred-in unique virtual deposit account number, directly judging that the corresponding suppliers have engaged in bid rigging;
step 5, displaying the information from step 2, the information from step 3 and the results of step 4 according to set requirements, so that the evaluation experts can review them again.
Based on the same inventive concept, the application also provides a device corresponding to the method in the first embodiment, which is detailed in the second embodiment.
Embodiment two
As shown in fig. 3, the present embodiment provides a device for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison, including:
the bid document preprocessing module, which is used for converting a bid document into plain text, denoising the plain text, and removing content consistent with the information in the procurement document to obtain an effective text document;
the bid document similar content detection module, which is used for dividing all effective text documents into sentences, using set punctuation marks as sentence separators;
screening the sentences: first removing duplicate sentences within the same effective text document, and then selecting the sentences whose length is greater than a preset length;
calculating the simhash value of each selected sentence;
traversing all the simhash values of one effective text document and calculating, in turn, the Hamming distance between each of them and all the simhash values of another effective text document; marking the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value as similar sentences; if the similar sentences include consecutive sentences, splicing them to obtain similar information;
the bid document key information extraction module, which is used for extracting basic key information, quotation information, supplier electronic bid document production information and bid deposit payment account information from all effective text documents;
the basic key information includes: Chinese names, telephone numbers, addresses, email addresses and company names;
Chinese name extraction: identifying and extracting name information in the effective text document based on the mmseg algorithm and a Chinese surname lexicon, and storing the extracted name information in a database;
telephone number extraction: extracting the telephone numbers in the effective text document using a regular expression, and storing the extracted telephone numbers in a database;
address extraction: identifying and extracting address information in the effective text document based on the mmseg algorithm, an administrative division lexicon and an address lexicon, and storing the extracted address information in a database;
email address extraction: extracting the email addresses in the effective text document using a regular expression, and storing the extracted email addresses in a database;
company name extraction: identifying and extracting company name information in the effective text document based on the mmseg algorithm and a company name lexicon, and storing the extracted company name information in a database;
quotation information: acquiring the corresponding quotation information from the database for each supplier;
supplier electronic bid document production information: when a supplier uploads an electronic bid document, recording the MAC address and IP address of the computer that encrypted the electronic bid document and of the computer that uploaded it;
bid deposit payment account information: when a supplier pays the bid deposit, recording the supplier's transfer-out account number and the unique virtual deposit account number transferred into;
the bid document detection module, which is used for:
comparing the electronic bid document production information of different suppliers pairwise, and if the MAC addresses of the computers that encrypted or uploaded the bid documents of different suppliers are identical, judging that the two suppliers are suspected of bid rigging;
comparing the basic key information of different suppliers pairwise, and if one or more of the names, telephone numbers, email addresses, company names and addresses are identical, marking the two suppliers as showing suspicious bid-rigging behavior;
comparing the quotation information of different suppliers pairwise:
(1) the quotations of two or more suppliers are abnormally high or abnormally low, and the deviation rates of their quoted amounts are smaller than the overall deviation rate of all suppliers' quotations;
abnormally high or abnormally low quotations are judged as follows:
if the difference between the quotations of two or more suppliers and the average quotation of all suppliers exceeds the difference threshold for abnormally high and abnormally low quotations, the quotations are abnormally high or abnormally low;
bid evaluation benchmark price = the lowest of all suppliers' bid quotations;
deviation rate = |bidder quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%;
overall deviation rate = average of the deviation rates of the bid quotations of the suppliers included in the calculation;
(2) step quotations exist among the quotations of more than two suppliers;
step quotations are judged as follows:
calculating the absolute difference between every pair of supplier bid quotations, and finding the associated suppliers whose absolute differences are equal and which number more than 2, the quotations of these suppliers being step quotations;
when the quotation information meets either of the above conditions, marking the corresponding suppliers as showing suspicious bid-rigging behavior;
calculating similarity values between different bid documents;
the similarity value Sab of two effective text documents is calculated as follows:
the text length of the similar information content of the two effective text documents is S;
the text length of one effective text document is La, and the text length of the other effective text document is Lb;
the similarity value Sab = S / Min(La, Lb) × 100%; if the similarity value is greater than a set threshold, the two suppliers are judged to be suspected of bid rigging;
comparing the bid deposit payment account information of different suppliers pairwise, and if different suppliers use the same transfer-out account number or the same transferred-in unique virtual deposit account number, directly judging that the corresponding suppliers have engaged in bid rigging;
the display module, which is used for displaying the information from the bid document similar content detection module, the information from the bid document key information extraction module and the results of the bid document detection module according to set requirements, so that the evaluation experts can review them again.
Since the device described in the second embodiment of the present invention is the device used to implement the method of the first embodiment, a person skilled in the art can, on the basis of the method described in the first embodiment, understand the specific structure and variations of the device, so the details are not repeated here. All devices used to implement the method of the first embodiment fall within the protection scope of the present invention.
Although specific embodiments of the invention have been described above, it will be understood by those skilled in the art that the specific embodiments described are illustrative only and are not limiting upon the scope of the invention, and that equivalent modifications and variations can be made by those skilled in the art without departing from the spirit of the invention, which is to be limited only by the appended claims.

Claims (10)

1. A method for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison, characterized by comprising the following steps:
step 1, converting a bid document into plain text, denoising the plain text, and removing content consistent with the information in the procurement document to obtain an effective text document;
step 2, dividing all effective text documents into sentences, screening the sentences, calculating the simhash values of the sentences, finding similar sentences across different effective text documents, and splicing consecutive sentences to obtain similar information;
step 3, extracting basic key information, quotation information, supplier electronic bid document production information and bid deposit payment account information from all effective text documents;
step 4, comparing and judging whether bidders have engaged in bid-rigging or collusive bidding behavior according to the regulations, the information obtained in step 2 and the information obtained in step 3.
2. The method for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison according to claim 1, wherein the method further comprises step 5, displaying the information from step 2, the information from step 3 and the results of step 4 according to set requirements.
3. The method for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison according to claim 1, wherein step 2 is further specifically: dividing all the effective text documents into sentences, using set punctuation marks as sentence separators;
screening the sentences: first removing duplicate sentences within the same effective text document, and then selecting the sentences whose length is greater than a preset length;
calculating the simhash value of each selected sentence;
traversing all the simhash values of one effective text document and calculating, in turn, the Hamming distance between each of them and all the simhash values of another effective text document; marking the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value as similar sentences; and if the similar sentences include consecutive sentences, splicing them to obtain similar information.
4. The method for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison according to claim 1, wherein step 3 is further specifically: extracting basic key information, quotation information, supplier electronic bid document production information and bid deposit payment account information from all effective text documents;
the basic key information includes: Chinese names, telephone numbers, addresses, email addresses and company names;
Chinese name extraction: identifying and extracting name information in the effective text document based on the mmseg algorithm and a Chinese surname lexicon, and storing the extracted name information in a database;
telephone number extraction: extracting the telephone numbers in the effective text document using a regular expression, and storing the extracted telephone numbers in a database;
address extraction: identifying and extracting address information in the effective text document based on the mmseg algorithm, an administrative division lexicon and an address lexicon, and storing the extracted address information in a database;
email address extraction: extracting the email addresses in the effective text document using a regular expression, and storing the extracted email addresses in a database;
company name extraction: identifying and extracting company name information in the effective text document based on the mmseg algorithm and a company name lexicon, and storing the extracted company name information in a database;
quotation information: acquiring the corresponding quotation information from the database for each supplier;
supplier electronic bid document production information: when a supplier uploads an electronic bid document, recording the MAC address and IP address of the computer that encrypted the electronic bid document and of the computer that uploaded it;
bid deposit payment account information: when a supplier pays the bid deposit, recording the supplier's transfer-out account number and the unique virtual deposit account number transferred into.
5. The method for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison according to claim 1, wherein step 4 is further specifically:
comparing the electronic bid document production information of different suppliers pairwise, and if the MAC addresses of the computers that encrypted or uploaded the bid documents of different suppliers are identical, judging that the two suppliers are suspected of bid rigging;
comparing the basic key information of different suppliers pairwise, and if one or more of the names, telephone numbers, email addresses, company names and addresses are identical, marking the two suppliers as showing suspicious bid-rigging behavior;
comparing the quotation information of different suppliers pairwise:
(1) the quotations of two or more suppliers are abnormally high or abnormally low, and the deviation rates of their quoted amounts are smaller than the overall deviation rate of all suppliers' quotations;
abnormally high or abnormally low quotations are judged as follows:
if the difference between the quotations of two or more suppliers and the average quotation of all suppliers exceeds the difference threshold for abnormally high and abnormally low quotations, the quotations are abnormally high or abnormally low;
bid evaluation benchmark price = the lowest of all suppliers' bid quotations;
deviation rate = |bidder quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%;
overall deviation rate = average of the deviation rates of the bid quotations of the suppliers included in the calculation;
(2) step quotations exist among the quotations of more than two suppliers;
step quotations are judged as follows:
calculating the absolute difference between every pair of supplier bid quotations, and finding the associated suppliers whose absolute differences are equal and which number more than 2, the quotations of these suppliers being step quotations;
when the quotation information meets either of the above conditions, marking the corresponding suppliers as showing suspicious bid-rigging behavior;
calculating similarity values between different bid documents;
the similarity value Sab of two effective text documents is calculated as follows:
the text length of the similar information content of the two effective text documents is S;
the text length of one effective text document is La, and the text length of the other effective text document is Lb;
the similarity value Sab = S / Min(La, Lb) × 100%; if the similarity value is greater than a set threshold, the two suppliers are judged to be suspected of bid rigging;
and comparing the bid deposit payment account information of different suppliers pairwise, and if different suppliers use the same transfer-out account number or the same transferred-in unique virtual deposit account number, directly judging that the corresponding suppliers have engaged in bid rigging.
6. A device for detecting bid-rigging and collusive bidding behavior based on electronic bid document comparison, characterized by comprising:
the bid document preprocessing module, which is used for converting a bid document into plain text, denoising the plain text, and removing content consistent with the information in the procurement document to obtain an effective text document;
the bid document similar content detection module, which is used for dividing all effective text documents into sentences, screening the sentences, calculating the simhash values of the sentences, finding similar sentences across different effective text documents, and splicing consecutive sentences to obtain similar information;
the bid document key information extraction module, which is used for extracting basic key information, quotation information, supplier electronic bid document production information and bid deposit payment account information from all effective text documents;
and the bid document detection module, which is used for comparing and judging whether bidders have engaged in bid-rigging or collusive bidding behavior according to the regulations, the information obtained from the bid document similar content detection module and the information obtained from the bid document key information extraction module.
7. The device for detecting bid-rigging and collusive bidding behavior based on electronic bidding document comparison as claimed in claim 6, wherein: the device further comprises a display module, which displays, according to set requirements, the information in the bid document similar content detection module, the information in the bid document key information extraction module, and the result in the bid document key information extraction module.
8. The device for detecting bid-rigging and collusive bidding behavior based on electronic bidding document comparison as claimed in claim 6, wherein: the bid document similar content detection module is further embodied as follows: dividing all valid text documents into sentences, using set punctuation marks as sentence separators;
screening the sentences: first deduplicating identical sentences within the same valid text document, and then selecting the sentences whose length is greater than a preset length;
calculating the simhash value of each selected sentence;
traversing all simhash values of one valid text document and, in turn, calculating the Hamming distance between each of them and every simhash value of another valid text document; marking the sentences corresponding to two simhash values whose Hamming distance is smaller than a preset value as similar sentences; and, if the similar sentences include consecutive sentences, splicing them to obtain the similar information.
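The sentence-matching steps of this claim can be sketched in Python as follows. This is a minimal illustration, not the applicant's implementation: the separator set, the use of character bigrams as features, the MD5-based 64-bit hash, the minimum length of 10 and the Hamming threshold of 3 are all assumptions, and the final splicing of consecutive similar sentences is omitted.

```python
import hashlib
import re


def split_sentences(text, separators="。！？；.!?;"):
    """Split text into sentences on the configured punctuation marks."""
    parts = re.split("[" + re.escape(separators) + "]", text)
    return [p.strip() for p in parts if p.strip()]


def simhash(sentence, bits=64):
    """Compute a simhash fingerprint from character bigrams (assumed features)."""
    weights = [0] * bits
    features = [sentence[i:i + 2] for i in range(max(len(sentence) - 1, 1))]
    for feat in features:
        digest = hashlib.md5(feat.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        for b in range(bits):
            weights[b] += 1 if (h >> b) & 1 else -1
    return sum(1 << b for b in range(bits) if weights[b] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def similar_sentences(doc_a, doc_b, min_len=10, max_distance=3):
    """Return sentence pairs whose simhash Hamming distance is below max_distance."""
    def candidates(doc):
        # dict.fromkeys deduplicates while keeping the original order.
        unique = dict.fromkeys(split_sentences(doc))
        return [s for s in unique if len(s) > min_len]

    hashes_b = [(s, simhash(s)) for s in candidates(doc_b)]
    pairs = []
    for sent_a in candidates(doc_a):
        hash_a = simhash(sent_a)
        for sent_b, hash_b in hashes_b:
            if hamming(hash_a, hash_b) < max_distance:
                pairs.append((sent_a, sent_b))
    return pairs
```

Comparing every fingerprint of one document with every fingerprint of the other is quadratic in the number of sentences, which follows the traversal described in the claim; an index over fingerprint segments would be a possible optimisation, but the claim does not mention one.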
9. The device for detecting bid-rigging and collusive bidding behavior based on electronic bidding document comparison as claimed in claim 6, wherein: the bid document key information extraction module is further embodied as follows: extracting basic key information, quotation information, supplier electronic bid document preparation information, and bid security payment account information from all valid text documents;
the basic key information includes: Chinese name, telephone number, address, email, and company name;
the Chinese name extraction: identifying and extracting name information in the valid text document based on the mmseg algorithm and a Chinese surname lexicon, and storing the extracted name information in a database;
the telephone number extraction: extracting the telephone numbers in the valid text document using a regular expression, and storing the extracted telephone number information in the database;
the address extraction: identifying and extracting address information in the valid text document based on the mmseg algorithm, an administrative division lexicon and an address lexicon, and storing the extracted address information in the database;
the email extraction: extracting email information in the valid text document using a regular expression, and storing the extracted email information in the database;
the company name extraction: identifying and extracting company name information in the valid text document based on the mmseg algorithm and a company name lexicon, and storing the extracted company name information in the database;
the quotation information: acquiring the corresponding quotation information from the database according to the supplier;
the supplier electronic bid document preparation information: when a supplier uploads an electronic bidding document, recording the MAC address and IP address of the computer that encrypted the electronic bidding document and of the computer that uploaded it;
the bid security payment account information: when the supplier pays the bid security, recording the supplier's transfer-out account number and the unique transferred-in virtual bid security account number.
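The regular-expression steps of this claim (telephone numbers and email addresses) might look like the Python sketch below. The patterns are assumptions chosen for illustration: an 11-digit mainland China mobile number, a hyphenated landline number with area code, and a simplified email pattern. The claim does not disclose the actual expressions, and the lexicon-based extraction of names, addresses and company names with mmseg is not shown.

```python
import re

# Hypothetical patterns; the claim does not specify the real expressions.
MOBILE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")
LANDLINE_RE = re.compile(r"(?<!\d)0\d{2,3}-\d{7,8}(?!\d)")
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_contacts(text):
    """Extract telephone numbers and email addresses from a valid text document."""
    phones = set(MOBILE_RE.findall(text)) | set(LANDLINE_RE.findall(text))
    # Lower-case emails so that comparison is case-insensitive.
    emails = {e.lower() for e in EMAIL_RE.findall(text)}
    return {"phones": phones, "emails": emails}


def shared_contacts(text_a, text_b):
    """Return the contact details that appear in both suppliers' documents."""
    a, b = extract_contacts(text_a), extract_contacts(text_b)
    return {key: a[key] & b[key] for key in a}


if __name__ == "__main__":
    doc_a = "Contact: tel 13812345678, office 0591-87654321, mail Sales@Example.com."
    doc_b = "Bidder B hotline 13812345678; e-mail sales@example.com"
    print(shared_contacts(doc_a, doc_b))
```

A non-empty intersection corresponds to the "one or more items are consistent" condition that the detection step uses to mark two suppliers as suspicious.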
10. The device for detecting bid-rigging and collusive bidding behavior based on electronic bidding document comparison as claimed in claim 6, wherein: the bid document key information extraction module is further embodied as follows:
comparing the electronic bid document preparation information of different suppliers pairwise, and if the MAC addresses of the computers that encrypted the bidding documents, or of the computers that uploaded them, are identical for different suppliers, judging that the two suppliers are suspected of bid rigging or collusion;
comparing the basic information of different suppliers pairwise, and if one or more of the name, telephone number, email address, company name and address are identical, marking the two suppliers as suspected of bid rigging or collusion;
comparing the quotation information of different suppliers pairwise:
(1) the quotations of two or more suppliers are abnormally high or abnormally low, and the deviation ratio of their quotation amounts is smaller than the overall deviation ratio of the quotations of all suppliers;
an abnormally high or abnormally low quotation is judged as follows:
if the difference between the quotations of two or more suppliers and the average quotation of all suppliers exceeds the abnormally-high or abnormally-low difference threshold, the quotations are abnormally high or abnormally low;
bid evaluation benchmark price = lowest bid quotation among all supplier bid quotations;
deviation ratio = |bidder quotation - bid evaluation benchmark price| / bid evaluation benchmark price × 100%;
overall deviation ratio = mean of the deviation ratios of the bid quotations of the suppliers included in the calculation;
(2) more than two suppliers submit step quotations.
The step quotation is judged as follows:
calculating the absolute value of the difference between every pair of bid quotations of all suppliers, and finding the related suppliers whose absolute differences are equal, where the number of such suppliers is greater than 2; the quotations of these related suppliers are step quotations;
when the quotation information meets either of the above conditions, marking the corresponding suppliers as suspected of bid rigging or collusion;
calculating similarity values between different bid documents;
the similarity value Sab of two valid text documents is calculated as follows:
the text length of the similar information content of the two valid text documents is S;
the text length of one valid text document is La, and the text length of the other valid text document is Lb;
the similarity value Sab = S / min(La, Lb) × 100%; if the similarity value is greater than a set threshold, the two suppliers are judged to be suspected of bid rigging or collusion;
and comparing the bid security payment account information of different suppliers pairwise, and if different suppliers use the same transfer-out account number or the same unique transferred-in virtual bid security account number, directly judging that the corresponding suppliers have engaged in bid rigging or collusion.
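The quotation tests recited earlier in this claim can be sketched in Python as below. The deviation-ratio functions follow the stated formulas directly. The step-quotation function is one possible reading of the claim wording, namely that a difference value counts as a "step" when the pairs sharing it involve more than two suppliers; that reading, the function names, and the use of integer prices (for example, amounts in the smallest currency unit, so that differences compare exactly) are assumptions. The abnormally-high and abnormally-low threshold test is not shown.

```python
from collections import defaultdict
from itertools import combinations
from statistics import mean


def deviation_ratios(bids):
    """Deviation ratio (%) of each quotation from the lowest quotation."""
    benchmark = min(bids.values())  # bid evaluation benchmark price
    if benchmark <= 0:
        raise ValueError("benchmark price must be positive")
    return {s: 100.0 * abs(p - benchmark) / benchmark for s, p in bids.items()}


def overall_deviation_ratio(bids):
    """Mean of the deviation ratios of all suppliers in the calculation."""
    return mean(deviation_ratios(bids).values())


def step_quotation_groups(bids, more_than=2):
    """Map each shared price difference to the suppliers involved in it.

    A difference is reported only when the pairs that share it involve
    more than `more_than` suppliers (assumed reading of the claim).
    """
    by_diff = defaultdict(set)
    for (sup_a, price_a), (sup_b, price_b) in combinations(bids.items(), 2):
        diff = abs(price_a - price_b)
        if diff > 0:  # assumption: identical quotations are not steps
            by_diff[diff].update((sup_a, sup_b))
    return {d: s for d, s in by_diff.items() if len(s) > more_than}


if __name__ == "__main__":
    bids = {"A": 100, "B": 110, "C": 120, "D": 175}
    print(deviation_ratios(bids))
    print(overall_deviation_ratio(bids))
    print(step_quotation_groups(bids))
```

In the sample data, suppliers A, B and C are spaced 10 apart, so the difference 10 is shared by pairs covering three suppliers and is reported as a step, while D is not involved.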
CN202210897373.8A 2021-12-27 2022-07-28 Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison Pending CN115249007A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111612185.8A CN114492323A (en) 2021-12-27 2021-12-27 Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison
CN2021116121858 2021-12-27

Publications (1)

Publication Number Publication Date
CN115249007A (en) 2022-10-28

Family

ID=81495415

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202111612185.8A Pending CN114492323A (en) 2021-12-27 2021-12-27 Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison
CN202210897373.8A Pending CN115249007A (en) 2021-12-27 2022-07-28 Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202111612185.8A Pending CN114492323A (en) 2021-12-27 2021-12-27 Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison

Country Status (1)

Country Link
CN (2) CN114492323A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252194A (en) * 2023-11-17 2023-12-19 上海百通项目管理咨询有限公司 Bid file detection method and system based on natural semantic model

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117114720B (en) * 2023-10-25 2024-02-20 湖南华菱电子商务有限公司 E-commerce platform management system based on Internet

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117252194A (en) * 2023-11-17 2023-12-19 上海百通项目管理咨询有限公司 Bid file detection method and system based on natural semantic model
CN117252194B (en) * 2023-11-17 2024-02-23 上海百通项目管理咨询有限公司 Bid file detection method and system based on natural semantic model

Also Published As

Publication number Publication date
CN114492323A (en) 2022-05-13

Similar Documents

Publication Publication Date Title
CN110069623B (en) Abstract text generation method and device, storage medium and computer equipment
CN107330752B (en) Method and device for identifying brand words
RU2679209C2 (en) Processing of electronic documents for invoices recognition
US20170235820A1 (en) System and engine for seeded clustering of news events
CN115249007A (en) Method and device for detecting enclosing and bidding behavior based on electronic bidding document comparison
CN111797210A (en) Information recommendation method, device and equipment based on user portrait and storage medium
US20130275451A1 (en) Systems And Methods For Contract Assurance
CN110766486A (en) Method and device for determining item category
US9256805B2 (en) Method and system of identifying an entity from a digital image of a physical text
CN112131348B (en) Method for preventing repeated declaration of project based on similarity of text and image
CA2956627A1 (en) System and engine for seeded clustering of news events
US10699112B1 (en) Identification of key segments in document images
CN111078839A (en) Structured processing method and processing device for referee document
CN111191614A (en) Document classification method and device
US20240193522A1 (en) Citation and policy based document classification
CN111582314A (en) Target user determination method and device and electronic equipment
CN115098440A (en) Electronic archive query method, device, storage medium and equipment
CN116739626A (en) Commodity data mining processing method and device, electronic equipment and readable medium
CN113408660A (en) Book clustering method, device, equipment and storage medium
CN111640025B (en) Method for realizing information labeling processing based on label system
CN112487808A (en) Big data based news message pushing method, device, equipment and storage medium
CN111159399A (en) Automobile vertical website water army discrimination method
CN113706207A (en) Order transaction rate analysis method, device, equipment and medium based on semantic analysis
CN112862305A (en) Method, device, equipment and storage medium for determining risk state of object
CN112818215A (en) Product data processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination