CN116204594A - Data processing method, device and equipment based on block chain - Google Patents


Publication number
CN116204594A
Authority
CN
China
Prior art keywords
input
word
input information
information
detected
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310493762.9A
Other languages
Chinese (zh)
Inventor
李劲松
于明亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Travelsky Technology Co Ltd
Original Assignee
China Travelsky Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33: Querying
    • G06F16/335: Filtering based on additional data, e.g. user or group profiles
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a blockchain-based data processing method, device and equipment. The method comprises: receiving input information sent by a first node on a blockchain; performing, through a smart contract of the blockchain, similarity compliance detection on the part-of-speech sequence to be detected of the input information and target comparison information to obtain a detection result; and, if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, and otherwise feeding back to the first node feedback information indicating that the detection was not passed. By performing compliance detection on the input information, the scheme of the invention can improve the accuracy of data exchange.

Description

Data processing method, device and equipment based on block chain
Technical Field
The present invention relates to the field of computer information processing technologies, and in particular, to a data processing method, apparatus, and device based on a blockchain.
Background
A blockchain is a chain made up of blocks. Each block stores certain information, and the blocks are linked into a chain in the chronological order in which they were generated. The chain is stored on all servers, and as long as one server in the whole system can work, the whole blockchain is safe. These servers are called nodes in the blockchain system, and each node provides storage space and computing support for the whole system. To modify information in the blockchain, the consent of a plurality of nodes must be obtained and the information in all nodes must be modified; because the nodes are usually controlled by different parties, the information recorded on the blockchain is more authentic and reliable. The blockchain is also characterized by information synchronization and by open, transparent information.
However, because wording habits and the focus of the required content differ among the parties, it becomes harder to deliver the right data to the right place when data is provided, which reduces accuracy during information exchange. For example, in the scenario of passenger air travel where outbound data transfer is involved, an airport or an airline is usually required to provide relevant materials to explain the situation and file an application. Because the data offered by the data provider differs from the data required by the data receiver, the accuracy of data exchange is low, which in turn makes data exchange inefficient.
Disclosure of Invention
The invention aims to provide a data processing method, device and equipment based on a block chain, which can improve the accuracy of data exchange by detecting the compliance of input information.
In order to solve the technical problems, the technical scheme of the invention is as follows:
A blockchain-based data processing method, the method comprising:
receiving input information sent by a first node on a blockchain;
performing, through a smart contract of the blockchain, similarity compliance detection on the part-of-speech sequence to be detected of the input information and target comparison information to obtain a detection result;
and if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, and otherwise feeding back to the first node feedback information indicating that the detection was not passed.
Optionally, receiving the input information sent by the first node on the blockchain includes:
acquiring an initial questionnaire template image;
performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input;
and recombining the sub-input information corresponding to each field to be input to obtain the input information.
Optionally, performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input includes:
performing text recognition scanning on the initial questionnaire template image through a first scanning frame;
determining a plurality of first fields to be input and a plurality of second fields to be input according to the types of the objects to be input contained in the first scanning frame;
acquiring a coding similarity set of each first field to be input and each second field to be input;
when the maximum coding similarity in the coding similarity set is greater than a coding threshold, replacing the second text label of the second field to be input corresponding to the maximum coding similarity with the first text label;
and adding the first field to be input and the second field to be input that have the same text label into the same initial questionnaire template image to obtain a plurality of texts to be input.
Optionally, performing, through a smart contract of the blockchain, similarity compliance detection on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result includes:
acquiring a comparison set of at least one target comparison information corresponding to the input information;
performing word segmentation on the input information to obtain a plurality of words to be detected corresponding to the input information;
performing part-of-speech detection on each word to be detected, and determining the part-of-speech tag corresponding to each word to be detected;
determining the part-of-speech sequence to be detected of the input information according to the part-of-speech tags contained in the input information;
obtaining, according to the words to be detected in the part-of-speech sequence to be detected and the target comparison information, a similarity value sequence formed by the similarity value of the input information and each target comparison information;
and if the maximum value in the similarity value sequence is greater than the similarity threshold, obtaining a detection result indicating that the detection is passed, and otherwise obtaining a detection result indicating that the detection is not passed.
Optionally, obtaining, according to the words to be detected in the part-of-speech sequence to be detected and the target comparison information, a similarity value sequence formed by the similarity value of the input information and each target comparison information includes:
obtaining, according to the formula in Figure SMS_1 (formula image not reproduced in this text), a similarity value sequence formed by the similarity value of the input information and each target comparison information;
wherein Figure SMS_2 is the similarity value of the input information and the i-th target comparison information, and Figure SMS_3 is the sub-similarity value of the b-th word to be detected in the input information and the i-th target comparison information; b = 1, 2, 3, …, y, and y is the total number of words to be detected in the input information.
Optionally, Figure SMS_4 is determined by the following process (Figures SMS_4 to SMS_11 are formula images not reproduced in this text):
acquiring a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part-of-speech class sequence and vocabulary sequence located after the matching point position corresponding to the (b-1)-th word to be detected in the comparison set of the i-th target comparison information corresponding to the input information; the matching point position corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information differs from every comparison part-of-speech tag in the target matching sequence, the sub-similarity value is as given in Figure SMS_5;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any comparison part-of-speech tag in the target matching sequence, determining the comparison word corresponding to that part-of-speech tag as an initial comparison word;
when the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word, the sub-similarity value is as given in Figure SMS_6;
when the b-th word to be detected in the input information belongs to the vocabulary corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and the sub-similarity value is as given in Figure SMS_7;
wherein Figure SMS_8 is the first sub-similarity value, Figure SMS_9 is the second sub-similarity value, and Figure SMS_10 > Figure SMS_11.
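The matching procedure above can be sketched as follows. This is only an illustration under stated assumptions, because the formula images are not available: the numeric values (1.0, 0.5, 0.0), the assignment of the first sub-similarity value to a full match and the second to a part-of-speech-only match, the use of a sum to aggregate sub-similarity values, the choice of the first matching comparison tag, and the function and parameter names are all hypothetical.

```python
from typing import Dict, List, Set, Tuple


def similarity(words: List[Tuple[str, str]],
               comparison: List[Tuple[str, str]],
               vocabularies: Dict[str, Set[str]],
               first_value: float = 1.0,      # assumed first sub-similarity value
               second_value: float = 0.5,     # assumed second sub-similarity value
               no_match_value: float = 0.0) -> float:  # assumed value when no tag matches
    """words: (word, pos_tag) pairs of the input information.
    comparison: (pos_tag, vocabulary_name) pairs of one target comparison information.
    Returns the summed sub-similarity values (the sum is an assumption)."""
    start = 0      # position just after the previous target comparison word
    total = 0.0
    for word, tag in words:
        # Target matching sequence: comparison entries from `start` onward.
        idx = next((k for k in range(start, len(comparison))
                    if comparison[k][0] == tag), None)
        if idx is None:
            total += no_match_value           # tag differs from every comparison tag
            continue
        vocab_name = comparison[idx][1]       # vocabulary of the initial comparison word
        if word in vocabularies.get(vocab_name, set()):
            total += first_value              # tag and vocabulary both match
            start = idx + 1                   # this entry becomes the matching point
        else:
            total += second_value             # tag matches, vocabulary does not
    return total


vocabularies = {
    "airline name": {"A Airlines"},
    "time": {"2000"},
    "airport name": {"A International Airport"},
    "activity name": {"data security self-check activity"},
}
comparison = [("noun", "airline name"), ("noun", "time"),
              ("noun", "airport name"), ("noun", "activity name")]
compliant = [("A Airlines", "noun"), ("2000", "noun"),
             ("A International Airport", "noun"),
             ("data security self-check activity", "noun")]
off_topic = [("A International Airport", "noun"), ("A Airlines", "noun"),
             ("2000", "noun"), ("service improvement training", "noun")]
```

With these assumed values, an answer whose keywords match both the part-of-speech class sequence and the vocabulary sequence scores higher than one that matches only on part of speech.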
Optionally, the similarity threshold satisfies the condition in Figure SMS_12 (formula image not reproduced in this text);
wherein Y1 is the similarity threshold, Figure SMS_13 is the threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
The invention also provides a data processing device based on the block chain, which comprises:
the receiving and transmitting module is used for receiving input information sent by a first node on the block chain;
the processing module is used for performing, through a smart contract of the blockchain, similarity compliance detection on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result; and, if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, and otherwise feeding back to the first node feedback information indicating that the detection was not passed.
The present invention also provides a computing device comprising: a processor, a memory and a program or instruction stored on the memory and executable on the processor, which when executed by the processor, implements the steps of the method as described above.
The present invention also provides a readable storage medium having stored thereon a program or instructions which when executed by a processor performs the steps of the method as described above.
The scheme of the invention at least comprises the following beneficial effects:
according to the scheme, input information sent by a first node on the blockchain is received; similarity compliance detection is performed, through a smart contract of the blockchain, on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result; and if the detection result indicates that the compliance detection is passed, the input information is sent to a second node on the blockchain, and otherwise feedback information indicating that the detection was not passed is fed back to the first node. Performing compliance detection on the input information in this way can improve the accuracy of data exchange.
Drawings
FIG. 1 is a flow chart of a blockchain-based data processing method provided by an embodiment of the present invention;
FIG. 2 is a block diagram of a block chain based data processing apparatus according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
As shown in fig. 1, an embodiment of the present invention provides a data processing method based on a blockchain, the method including:
step 11, receiving input information sent by a first node on a block chain;
step 12, performing, through a smart contract of the blockchain, similarity compliance detection on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result;
step 13, if the detection result indicates that the compliance detection is passed, sending the input information to a second node on the blockchain, and otherwise feeding back to the first node feedback information indicating that the detection was not passed.
In the embodiment of the invention, input information sent by a first node on the blockchain is received. After the first node issues an information upload instruction, the corresponding smart contract on the blockchain is triggered and performs compliance detection on the input information to obtain a detection result. If the detection result shows that the compliance detection is passed, the input information is sent to a second node on the blockchain; otherwise, feedback information indicating that the detection was not passed is fed back to the first node. Performing compliance detection on the input information in this way can improve the accuracy of data exchange.
It should be noted that the blockchain may include: a first node, a second node, and a smart contract; the first node is used for receiving input information; the smart contract is used for performing compliance detection on the input information uploaded by the first node;
the first node may be a data provider, for example: an airport or airline; the second node may be a data receiver or data compliance auditor, for example: the aeronautical credit department or the data compliance audit department;
the smart contract encapsulates the capability of performing compliance detection on the input information provided by the first node; if the input information passes the compliance detection of the smart contract, the consistency between the input information and the information required by the second node is higher, which improves the accuracy of data exchange. In addition, after passing the compliance detection, the input information is sent to the second node; if the second node is a data compliance auditor, it can audit the input information again, so the two rounds of compliance checking, by the smart contract and by the second node, can further improve the accuracy of data exchange.
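Steps 11 to 13 can be sketched as a small routing function. This is a minimal illustration, not the patented implementation: the `Node` class, the `handle_upload` function, the feedback string, and the placeholder check passed in as `compliance_check` are all hypothetical names introduced here, and no real blockchain or smart contract API is used.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a blockchain node; it only records what it receives."""
    name: str
    inbox: List[str] = field(default_factory=list)

    def receive(self, message: str) -> None:
        self.inbox.append(message)


def handle_upload(input_info: str,
                  first_node: Node,
                  second_node: Node,
                  compliance_check: Callable[[str], bool]) -> bool:
    """Check the uploaded input, then forward it or send feedback."""
    passed = compliance_check(input_info)      # step 12: logic held by the smart contract
    if passed:
        second_node.receive(input_info)        # step 13: forward to the second node
    else:
        # step 13: tell the first node that the detection was not passed
        first_node.receive("compliance detection not passed")
    return passed


# Usage with a placeholder check; the real check is the similarity detection of step 12.
airport = Node("airport")
auditor = Node("auditor")
handle_upload("A Airlines ran a data security self-check in 2000",
              airport, auditor,
              compliance_check=lambda text: "data security" in text)
```

Passing the check in as a callable keeps the routing logic independent of how the similarity detection is carried out.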
In an alternative embodiment of the present invention, step 11 may include:
step 111, acquiring an initial questionnaire template image;
step 112, performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input; here, a field to be input is an item to be filled in within the text to be input;
step 113, recombining the sub-input information corresponding to each field to be input to obtain the input information.
In this embodiment, after the initial questionnaire template image is split through the text recognition process, a plurality of sub questionnaires, that is, the text to be input, are recombined according to the relevance of the question content. Therefore, the concentration degree of the required content in each text to be input can be improved, so that the first node can conveniently distribute different texts to be input to operators at corresponding positions for filling.
It should be noted that, the initial questionnaire template image may be image information of a questionnaire made by the second node according to the required data; the initial questionnaire template image comprises a plurality of pixel sets of information to be input; the areas of the plurality of information to be input are different;
The pixel set of the information to be input may be the image set of each item to be filled in; the length of each item differs because the collected content differs. For example, the items to be filled in may include simple items such as "Name:" and "Gender:", and may also include complex items such as organization or personal basic information and data security management organization information. An answer area of a corresponding size is set according to the complexity of the item. For example, the answer area for "Name:" is small, so an area of a few characters is reserved after the name; the answer area for data security management organization information is larger, so an area large enough to hold several lines of text is reserved below it.
Therefore, the initial questionnaire template image ensures that the input information is not changed at will, and the typesetting style information in the initial questionnaire template image is preserved.
Note that the initial questionnaire template image may be subjected to text recognition processing by OCR (Optical Character Recognition). This allows questions with similar content to be placed in the same questionnaire, which makes it easier for the first node to fill in the relevant content.
In an alternative embodiment of the present invention, step 112 may include:
step 1121, performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
step 1122, determining a plurality of first fields to be input and a plurality of second fields to be input according to the type of the object to be input contained in the first scan frame; here, the object type to be input may include: a text object and an input identification object;
step 1123, obtaining a coding similarity set of each first field to be input and each second field to be input;
step 1124, when the maximum coding similarity in the coding similarity set is greater than the coding threshold, replacing the second text label of the second field to be input corresponding to the maximum coding similarity with the first text label;
step 1125, adding the first field to be input and the second field to be input that have the same text label into the same initial questionnaire template image, so as to obtain a plurality of texts to be input.
In this embodiment, the similarity between each first field to be input and each second field to be input may be determined, and it is then judged whether the maximum similarity corresponding to each first field to be input is greater than the coding threshold. If it is, the second text label of the corresponding second field to be input is replaced with the first text label, so that a first field to be input and a second field to be input that are similar share the same text label. First and second fields to be input with higher relevance can then be grouped according to the shared text label, so that multiple highly related questions are placed into the same text to be input.
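The grouping in step 1125 can be sketched as a group-by on the text label. This is a minimal sketch; the label strings ("T1", "T2") and the function name `group_fields_by_label` are hypothetical, and the fields are represented as plain strings rather than image regions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def group_fields_by_label(fields: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (label, field_text) pairs so that fields sharing a label form one text to be input."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for label, text in fields:
        groups[label].append(text)
    return dict(groups)


fields = [
    ("T1", "Name:"),
    ("T1", "Gender:"),
    ("T2", "Data security management organization information"),
    # A second field whose label was replaced by the first text label T1.
    ("T1", "Organization or personal basic information"),
]
texts_to_input = group_fields_by_label(fields)
```

Each value of `texts_to_input` then corresponds to one sub-questionnaire that the first node can hand to the relevant operator.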
It should be noted that, the size of the first scanning frame may be set according to actual needs, and in particular, but not limited to, the size of the first scanning frame may be set to cover a whole line of text, and the scanning mode of performing text recognition scanning on the initial questionnaire template image through the first scanning frame may be progressive scanning from top to bottom;
specifically, the coding similarity set of each first field to be input and each second field to be input may be as given in Figure SMS_14 and Figure SMS_15 (Figures SMS_14 to SMS_20 are formula images not reproduced in this text); wherein Figure SMS_16 is the coding similarity set of the d-th first field to be input and each second field to be input, x is the total number of first fields to be input, Figure SMS_17 is the coding similarity of the d-th first field to be input and the q-th second field to be input, and w is the total number of second fields to be input;
specifically, an existing text encoding method may be used to encode each first field to be input and each second field to be input, so as to obtain the encoding vector corresponding to each field, and the vector similarity between each pair of encoding vectors is then calculated;
when the condition in Figure SMS_18 holds, the second text label of the second field to be input corresponding to Figure SMS_19 is replaced with the first text label of Figure SMS_20; wherein Y2 is the coding threshold, which may be set according to the actual use scenario.
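Steps 1123 and 1124 can be sketched as follows. The text only says that an existing text encoding method and a vector similarity are used, so cosine similarity, the toy two-dimensional vectors, and the names `cosine` and `relabel_second_fields` are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two encoding vectors (0.0 if either vector is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def relabel_second_fields(first_vecs: List[Sequence[float]],
                          first_labels: List[str],
                          second_vecs: List[Sequence[float]],
                          second_labels: List[str],
                          y2: float) -> List[str]:
    """For each first field, find its most similar second field; if that maximum
    similarity exceeds the coding threshold y2, the second field takes the first label."""
    labels = list(second_labels)
    for d, fv in enumerate(first_vecs):
        sims = [cosine(fv, sv) for sv in second_vecs]   # coding similarity set of field d
        if not sims:
            continue
        q = max(range(len(sims)), key=sims.__getitem__)  # index of the maximum similarity
        if sims[q] > y2:
            labels[q] = first_labels[d]
    return labels


new_labels = relabel_second_fields(
    first_vecs=[[1.0, 0.0]], first_labels=["T1"],
    second_vecs=[[0.9, 0.1], [0.0, 1.0]], second_labels=["S1", "S2"],
    y2=0.8,
)
```

Here the first second field is close to the first field and takes its label, while the orthogonal one keeps its own label.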
In yet another alternative embodiment of the present invention, step 1122 may include:
step 11221, when at least one object to be input is included in the first scan frame, converting each object to be input into a first field to be input corresponding to the object to be input, and setting a first text mark; the first text labels corresponding to each first field to be input are the same;
step 11222, when the first scan frame includes only text objects, enlarging the scan area of the first scan frame;
step 11223, when the enlarged first scan frame includes an input identification object, converting the object to be input into a second field to be input and setting a second text label; wherein the second text labels corresponding to each second field to be input are different.
In this embodiment, the scanning area of the first scanning frame may be enlarged downward by increasing the length of the first scanning frame in the up-down direction of the initial questionnaire template image, and specifically, the size of the line space may be doubled each time;
because a complex item to be filled in may consist of at least one line of descriptive text with a corresponding blank area set below it, the first scanning frame contains only text when it encounters such an item; the first scanning frame can therefore be enlarged vertically until it completely covers the complex item to be filled in.
It should be noted that the object to be input is the item to be filled in. The text object in the object to be input is the question field, for example "Gender", and the corresponding input identification object is the input mark, for example ":", an underline, or a blank area of a corresponding size. Simple items to be filled in conform to these characteristics and can therefore be converted into first fields to be input that share the same first text label.
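Steps 11221 to 11223 can be sketched on a simplified page model. This is an illustration under assumptions: each scanned line is reduced to a `Row` with its recognized text and a flag saying whether it contains an input mark, the frame grows by one row at a time rather than by a pixel size, and the labels "FIRST" and "SECOND_n" and the function `extract_fields` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Row:
    text: str               # recognized text on this line ("" for a blank answer area)
    has_input_marker: bool  # True if the line holds ":", an underline, or a blank answer area


def extract_fields(rows: List[Row]) -> List[Tuple[str, str]]:
    """Return (label, field_text) pairs; simple items share one label, complex items get unique ones."""
    fields: List[Tuple[str, str]] = []
    second_count = 0
    i = 0
    while i < len(rows):
        if rows[i].has_input_marker:
            # Frame already contains an object to be input: first field, shared label.
            fields.append(("FIRST", rows[i].text))
            i += 1
            continue
        # Frame contains only text: enlarge it downward until an input mark is covered.
        j = i
        while j + 1 < len(rows) and not rows[j].has_input_marker:
            j += 1
        if rows[j].has_input_marker:
            second_count += 1
            text = " ".join(r.text for r in rows[i:j + 1] if r.text)
            fields.append((f"SECOND_{second_count}", text))
        i = j + 1
    return fields


page = [
    Row("Name:", True),
    Row("Gender:", True),
    Row("Data security management organization information", False),
    Row("", True),   # blank multi-line answer area below the complex item
]
page_fields = extract_fields(page)
```

Trailing text with no answer area after it is dropped in this sketch; a real scanner would need a rule for that case.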
In yet another alternative embodiment of the present invention, step 12 may include:
step 121, obtaining a comparison set of at least one target comparison information corresponding to the input information;
step 122, word segmentation processing is performed on the input information, so as to obtain a plurality of words to be detected corresponding to the input information;
step 123, performing part-of-speech detection on each word to be detected, and determining part-of-speech tags corresponding to each word to be detected;
step 124, determining a part-of-speech sequence to be detected of the input information according to the part-of-speech tag contained in the input information;
step 125, obtaining a similarity value sequence formed by the input information and the similarity value of each target comparison information according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information;
step 126, if the maximum value in the similarity value sequence is greater than the similarity threshold, obtaining a detection result indicating that the detection is passed, and otherwise obtaining a detection result indicating that the detection is not passed.
Wherein the similarity threshold satisfies the condition in Figure SMS_21 (formula image not reproduced in this text);
wherein Y1 is the similarity threshold, Figure SMS_22 is the threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
In this embodiment, similarity compliance detection can be performed, through a smart contract of the blockchain, on the part-of-speech sequence to be detected of the input information and the target comparison information to obtain a detection result. The difference between the data can thus be determined more accurately, which ultimately improves the accuracy of data exchange.
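The decision in step 126 and the quantity AVG(a) can be sketched as below. Because the threshold formula itself is not reproduced in this text, the threshold is simply passed in as a parameter; the function names are hypothetical.

```python
from typing import List, Sequence


def average_comparison_word_count(targets: List[Sequence[str]]) -> float:
    """AVG(a): average number of comparison words over all target comparison information."""
    return sum(len(words) for words in targets) / len(targets)


def detection_passed(similarities: Sequence[float], threshold: float) -> bool:
    """Step 126: pass only if the largest similarity value is strictly greater than the threshold."""
    return bool(similarities) and max(similarities) > threshold


targets = [["A Airlines", "2000"],
           ["A Airlines", "2000", "A International Airport", "data security self-check activity"]]
avg_a = average_comparison_word_count(targets)
```

The comparison is strict, matching the wording "greater than the similarity threshold"; an empty similarity sequence is treated as not passed, which is a choice made for this sketch.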
The comparison set of the at least one target comparison information corresponding to the input information may be as given in Figure SMS_23 (Figures SMS_23 to SMS_27 are formula images not reproduced in this text); wherein, as given in Figure SMS_24, Figure SMS_25 is the comparison set of the i-th target comparison information corresponding to the input information, Figure SMS_26 is the part-of-speech class sequence corresponding to the comparison words included in the i-th target comparison information, Figure SMS_27 is the vocabulary sequence corresponding to the comparison words included in the i-th target comparison information, z is the total number of target comparison information corresponding to the input information, and i = 1, 2, …, z;
It should be noted that when different people answer the same question, they may input different answers because of different wording habits. For example, when answering a question about how self-assessment work was carried out, the details to be answered may include: start and end time, organization, implementation process, implementation mode and other content. Different respondents may change the order of these items, may use different phrasings such as inverted sentences, and may combine several items in one reply, for example answering the start and end time, implementation process and implementation mode together. There may therefore be several acceptable answer patterns for the same question, so the input information may correspond to at least one target comparison information;
since the content covered by all the answer patterns is substantially the same, the keywords corresponding to the answer content are also substantially the same; the part-of-speech class sequence of each target comparison information can therefore be generated from the parts of speech of the keywords included in the answer and the order in which the keywords are arranged.
To improve matching precision, a vocabulary sequence of each target comparison information can be generated according to the vocabulary to which each keyword belongs. For example, one correct input is "A Airlines carried out a data security self-check activity at A International Airport in 2000", whose keywords are "A Airlines", "2000", "A International Airport" and "data security self-check activity". The part-of-speech class sequence of the target comparison information corresponding to this input is noun, noun, noun and noun, and the corresponding vocabulary sequence is airline name, time, airport name and activity name. Suppose the answer input by the first node is "A International Airport and A Airlines jointly carried out service improvement training in 2000". Although the part-of-speech class sequence of its keywords "A International Airport", "A Airlines", "2000" and "service improvement training" is the same as above, its vocabulary sequence differs significantly from the one above. Setting the part-of-speech class sequence and the vocabulary sequence at the same time therefore adds a reference dimension to the similarity calculation and improves the accuracy of the similarity computed between the input information and the standard answer;
specifically, a plurality of vocabularies can be set, each configured with its common words and with a vocabulary name; each comparison word in the target comparison information is assigned the name of the vocabulary to which it belongs, and the corresponding vocabulary sequence is generated from these names;
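The vocabulary configuration described above can be sketched as a mapping from vocabulary names to word sets. The vocabulary contents, the "unknown" fallback name, and the function `vocabulary_sequence` are hypothetical and only mirror the airline example in the text.

```python
from typing import Dict, List, Set

VOCABULARIES: Dict[str, Set[str]] = {
    "airline name": {"A Airlines"},
    "time": {"2000"},
    "airport name": {"A International Airport"},
    "activity name": {"data security self-check activity"},
}


def vocabulary_sequence(keywords: List[str],
                        vocabularies: Dict[str, Set[str]]) -> List[str]:
    """Map each keyword to the name of the vocabulary that contains it ("unknown" if none does)."""
    sequence = []
    for word in keywords:
        name = next((n for n, words in vocabularies.items() if word in words), "unknown")
        sequence.append(name)
    return sequence


standard = vocabulary_sequence(
    ["A Airlines", "2000", "A International Airport", "data security self-check activity"],
    VOCABULARIES)
other = vocabulary_sequence(
    ["A International Airport", "A Airlines", "2000", "service improvement training"],
    VOCABULARIES)
```

Both keyword lists consist only of nouns, yet their vocabulary sequences differ, which is the extra comparison dimension the text describes.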
word segmentation processing is carried out on the input information to obtain a plurality of corresponding words to be detected, which may include: [Figure SMS_28]; wherein [Figure SMS_29] is the b-th word to be detected in the input information, y is the total number of words to be detected in the input information, and b = 1, 2, …, y;
performing part-of-speech detection on each word to be detected, and determining the part-of-speech tag corresponding to each word to be detected, where the tags may include: [Figure SMS_30]; wherein [Figure SMS_31] is the part-of-speech tag corresponding to the b-th word to be detected in the input information;
specifically, word segmentation processing and part-of-speech tagging processing can be performed through a CRF (conditional random field) model;
the part-of-speech tags included in the input information may include: [Figure SMS_32], from which the part-of-speech sequence to be detected of the input information is obtained; wherein y is the total number of words to be detected in the input information, and b = 1, 2, …, y.
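As a rough illustration of how word segmentation and part-of-speech tagging produce the part-of-speech sequence to be detected, the following Python sketch uses a greedy longest-match lookup in a small lexicon. This lookup is a deliberately simplified stand-in and is not a CRF model; the lexicon contents, the tag names and the function name `segment_and_tag` are assumptions made for the example.

```python
# Hypothetical lexicon mapping known words to part-of-speech tags.
LEXICON = {
    "A aviation": "noun",
    "carries out": "verb",
    "data security self-checking activities": "noun",
    "A international airport": "noun",
    "2000": "noun",
    "in": "preposition",
    "at": "preposition",
}


def segment_and_tag(text):
    """Greedy longest-match segmentation over whitespace tokens.

    Returns a list of (word, tag) pairs; tokens that are not covered by
    any lexicon entry are emitted one by one with the tag 'unknown'.
    """
    tokens = text.split()
    result = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in LEXICON:
                result.append((phrase, LEXICON[phrase]))
                i = j
                break
        else:
            result.append((tokens[i], "unknown"))
            i += 1
    return result


tagged = segment_and_tag(
    "A aviation carries out data security self-checking activities "
    "at A international airport in 2000"
)
words_to_detect = [word for word, _ in tagged]
pos_sequence = [tag for _, tag in tagged]
print(words_to_detect)
print(pos_sequence)
```

A production system would replace `segment_and_tag` with a trained sequence-labelling model; the sketch only shows the shape of the output, namely y words to be detected and one tag per word.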
In yet another alternative embodiment of the present invention, step 125 may include:
step 1251, according to [Figure SMS_33], obtaining a similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein [Figure SMS_34] is the similarity value of the first input information and the i-th target comparison information, [Figure SMS_35] is the sub-similarity value of the b-th word to be detected in the input information and the i-th target comparison information, b = 1, 2, 3, …, y, and y is the total number of words to be detected in the first input information.
In particular, in step 1251, [Figure SMS_36] may be determined by the following procedure:
step 12511, obtaining a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part-of-speech class sequence and vocabulary sequence located after the matching point position corresponding to the (b-1)-th word to be detected in the comparison set of the i-th target comparison information corresponding to the input information; the matching point position corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
step 12512, when the part-of-speech tag corresponding to the b-th word to be detected in the input information differs from every comparison part-of-speech tag in the target matching sequence, [Figure SMS_37];
step 12513, when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the comparison part-of-speech tags in the target matching sequence, determining the comparison word corresponding to that part-of-speech tag as the initial comparison word;
step 12514, when the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word, [Figure SMS_38];
step 12515, when the b-th word to be detected in the input information belongs to the vocabulary corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and [Figure SMS_39];
wherein [Figure SMS_40] is the first sub-similarity value, [Figure SMS_41] is the second sub-similarity value, and [Figure SMS_42] > [Figure SMS_43].
In this embodiment, when the sub-similarity value of each word to be detected is determined, the target matching sequence corresponding to that word is determined first, and the target comparison word corresponding to the word is then searched for in the target matching sequence. Determining [Figure SMS_44] therefore takes into account the order in which the words to be detected are arranged in the input information input by the first node, which adds a reference factor of one more dimension and further improves the accuracy of the calculated similarity between the input information and the target comparison information. The sub-similarity value of a word to be detected and the target comparison information is related both to the part-of-speech similarity of the word to be detected and to the similarity of its content. The sub-similarity value corresponding to each word to be detected is therefore more accurate, which improves the accuracy of the calculated similarity between the input information and the target comparison information; the difference between the data is thus determined more accurately, and the data exchange accuracy is ultimately improved.
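The order-sensitive matching of steps 12511 to 12515 can be sketched in Python as follows. The concrete score values are not recoverable from this text, so the sketch rests on labelled assumptions: no part-of-speech match scores 0, a part-of-speech match without a word-list match scores `pos_only_score`, a match on both scores the larger `full_match_score`, and the initial comparison word is the first entry after the previous matching point with the same tag. The function name and data layout are likewise hypothetical.

```python
def sub_similarities(words, comparison, pos_only_score=1.0, full_match_score=2.0):
    """Compute one sub-similarity value per word to be detected.

    words:      ordered list of (word, pos_tag) for the input information.
    comparison: ordered list of (pos_tag, word_list) for one target
                comparison information; word_list is a set of words.
    The score values and the choice of the first same-tag entry as the
    initial comparison word are assumptions, not taken from the source.
    """
    values = []
    start = 0  # index just after the previous matching point
    for word, tag in words:
        # Target matching sequence: entries after the previous matching point.
        candidates = [k for k in range(start, len(comparison))
                      if comparison[k][0] == tag]
        if not candidates:
            values.append(0.0)  # no comparison tag equals the word's tag
            continue
        k = candidates[0]  # initial comparison word (assumed: first same-tag entry)
        if word in comparison[k][1]:
            values.append(full_match_score)  # becomes the target comparison word
            start = k + 1                    # new matching point
        else:
            values.append(pos_only_score)    # tag matches, word list does not
    return values


comparison = [
    ("noun", {"A aviation"}),
    ("noun", {"2000"}),
    ("noun", {"A international airport"}),
    ("noun", {"data security self-checking activities"}),
]
in_order = [("A aviation", "noun"), ("2000", "noun"),
            ("A international airport", "noun"),
            ("data security self-checking activities", "noun")]
reordered = [("A international airport", "noun"), ("A aviation", "noun"),
             ("2000", "noun"), ("service improvement training", "noun")]

similarity_in_order = sum(sub_similarities(in_order, comparison))
similarity_reordered = sum(sub_similarities(reordered, comparison))
print(similarity_in_order, similarity_reordered)
```

Summing the sub-similarity values gives the similarity value for one target comparison information. Under these assumptions, an answer whose keywords follow the reference order scores higher than one with the same parts of speech in a different order or with different content.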
It should be noted that the sub-similarity value is positively correlated with the part-of-speech relevance of the word to be detected and with the content relevance of the word to be detected. The part-of-speech relevance is generated according to the part-of-speech tag of the word to be detected and the corresponding part-of-speech class sequence; the vocabulary relevance is generated according to the word to be detected and the vocabulary sequence.
Specifically, for example, if the b-th word to be detected is "A aviation", its part of speech can be compared with the part of speech of the b-th comparison word in the part-of-speech class sequence of the target comparison information to generate the corresponding part-of-speech relevance; for example, if the parts of speech are the same, the part-of-speech relevance is determined to be 1, and if the parts of speech are different, the part-of-speech relevance is determined to be 0.
Meanwhile, the word to be detected can be matched against the b-th word list in the vocabulary sequence corresponding to the target comparison information to generate the corresponding vocabulary relevance; for example, if the word to be detected belongs to the word list, the vocabulary relevance is determined to be 1, and if the word to be detected does not belong to the word list, the vocabulary relevance is determined to be 0.
Finally, the sum of the part-of-speech relevance and the vocabulary relevance of the word to be detected is taken as the sub-similarity value of the word to be detected and the target comparison information.
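The position-by-position variant just described, in which each relevance is 0 or 1 and the two are summed, can be written as a short Python sketch. The function name `positional_sub_similarity` and the example word list are assumptions introduced for illustration.

```python
def positional_sub_similarity(word, tag, ref_tag, ref_word_list):
    """Sub-similarity of the b-th word against the b-th comparison entry.

    Part-of-speech relevance is 1 when the tags are equal, else 0;
    vocabulary relevance is 1 when the word is in the word list, else 0.
    The sub-similarity value is the sum of the two.
    """
    pos_relevance = 1 if tag == ref_tag else 0
    vocab_relevance = 1 if word in ref_word_list else 0
    return pos_relevance + vocab_relevance


airline_names = {"A aviation", "B aviation"}  # hypothetical word list

both_match = positional_sub_similarity("A aviation", "noun", "noun", airline_names)
tag_only = positional_sub_similarity("A international airport", "noun", "noun", airline_names)
no_match = positional_sub_similarity("carries out", "verb", "noun", airline_names)
print(both_match, tag_only, no_match)
```

With these 0/1 relevances the sub-similarity value can only be 0, 1 or 2, so the largest possible similarity for an answer of y words is 2y.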
In the above embodiment of the present invention, the blockchain-based data processing method improves the accuracy of data exchange by performing compliance detection on the input information. After the input information passes the compliance detection, it is sent to a second node; if the second node is a data compliance auditor, the input information can be audited again, so that the accuracy of data exchange can be further improved through the two rounds of compliance detection performed by the intelligent contract and by the second node.
Meanwhile, in the compliance detection of the present invention, the similarity value of the input information and each target comparison information is the sum of the sub-similarity values of the words to be detected included in the input information, and each sub-similarity value is positively correlated with the part-of-speech relevance and with the content relevance of the word to be detected. Because the sub-similarity value reflects both the part of speech and the content of the word to be detected, it has higher accuracy, so the accuracy of the calculated similarity between the input information and the target comparison information can be improved, the data difference is further reduced, and the data exchange accuracy is ultimately further improved.
As shown in fig. 2, an embodiment of the present invention further provides a data processing apparatus 20 based on a blockchain, the apparatus 20 including:
a transceiver module 21, configured to receive input information sent by a first node on a blockchain;
the processing module 22 is configured to perform similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through an intelligent contract of the blockchain, so as to obtain a detection result; and, if the detection result indicates that the compliance detection is passed, to send the input information to a second node on the blockchain, and otherwise to feed back to the first node feedback information indicating that the detection is not passed.
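The division of the apparatus into a transceiver module and a processing module can be pictured with the following Python sketch. It does not model a blockchain or an intelligent (smart) contract: the compliance check is passed in as an ordinary function, and in-memory lists stand in for the second node and for the feedback channel to the first node. All class, field and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ProcessingModule:
    # Hypothetical stand-in for the contract-based compliance detection.
    detect: Callable[[str], bool]
    # In-memory stand-ins for the second node and the first node's feedback.
    sent_to_second_node: List[str] = field(default_factory=list)
    feedback_to_first_node: List[str] = field(default_factory=list)

    def process(self, input_info: str) -> bool:
        """Forward compliant input; otherwise record a failure notice."""
        passed = self.detect(input_info)
        if passed:
            self.sent_to_second_node.append(input_info)
        else:
            self.feedback_to_first_node.append("detection not passed")
        return passed


@dataclass
class TransceiverModule:
    processing: ProcessingModule

    def receive(self, input_info: str) -> bool:
        """Receive input from the first node and hand it to processing."""
        return self.processing.process(input_info)


processing = ProcessingModule(detect=lambda text: "airport" in text)
transceiver = TransceiverModule(processing)
accepted = transceiver.receive("activity at A international airport")
rejected = transceiver.receive("unrelated text")
print(accepted, rejected)
```

Passing the detector in as a parameter keeps the routing logic independent of how the similarity compliance detection itself is implemented.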
Optionally, receiving the input information sent by the first node on the blockchain includes:
acquiring an initial questionnaire template image;
performing word recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input;
and carrying out recombination processing on the sub-input information corresponding to each field to be input to obtain the input information.
Optionally, performing text recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input, including:
performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
determining a plurality of first fields to be input and a plurality of second fields to be input according to the types of the objects to be input contained in the first scanning frame;
acquiring a coding similarity set of each first field to be input and each second field to be input;
when the maximum coding similarity in the coding similarity set is larger than a coding threshold, replacing a second text mark of a second field to be input corresponding to the maximum coding similarity with the first text mark;
and adding the first field to be input and the second field to be input with the same text mark into the same initial questionnaire template image to obtain a plurality of texts to be input.
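The text-mark unification in the steps above can be sketched in Python as follows. This passage does not define how the coding similarity is computed, so the sketch substitutes the character-level ratio from the standard library's `difflib.SequenceMatcher` purely as a placeholder. The field layout (`name`, `code`, `mark`), the threshold value and the function names are also assumptions.

```python
from difflib import SequenceMatcher


def coding_similarity(code_a, code_b):
    """Placeholder similarity in [0, 1]; the source does not define the measure."""
    return SequenceMatcher(None, code_a, code_b).ratio()


def unify_text_marks(first_fields, second_fields, coding_threshold=0.8):
    """For each first field, find the most similar second field and, when the
    similarity exceeds the threshold, overwrite that second field's text mark
    with the first field's text mark (the input dicts are modified in place).
    Returns the field names grouped by text mark."""
    for first in first_fields:
        if not second_fields:
            break
        best = max(second_fields,
                   key=lambda s: coding_similarity(first["code"], s["code"]))
        if coding_similarity(first["code"], best["code"]) > coding_threshold:
            best["mark"] = first["mark"]
    groups = {}
    for item in first_fields + second_fields:
        groups.setdefault(item["mark"], []).append(item["name"])
    return groups


first_fields = [{"name": "airline label", "code": "airline-name", "mark": "T1"}]
second_fields = [
    {"name": "airline blank", "code": "airline-name", "mark": "T7"},
    {"name": "date blank", "code": "0000", "mark": "T8"},
]
groups = unify_text_marks(first_fields, second_fields)
print(groups)
```

Fields that end up sharing a text mark are grouped together, which corresponds to collecting them into the same text to be input.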
Optionally, performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through intelligent contracts of the blockchain to obtain a detection result, where the detection result comprises:
acquiring a comparison set of at least one target comparison information corresponding to the input information;
word segmentation processing is carried out on the input information, so that a plurality of words to be detected corresponding to the input information are obtained;
performing part-of-speech detection on each word to be detected, and determining part-of-speech tags corresponding to each word to be detected;
determining a part-of-speech sequence to be detected of the input information according to part-of-speech tags contained in the input information;
obtaining a similarity value sequence formed by the input information and the similarity value of each target comparison information according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information;
and if the maximum value in the similarity value sequence is larger than the similarity threshold value, obtaining a detection result passing detection, otherwise, obtaining a detection result not passing detection.
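The final decision in the steps above compares the largest entry of the similarity value sequence with the similarity threshold. A minimal Python sketch follows; the function name `compliance_result` and the treatment of an empty sequence as a failure are assumptions.

```python
def compliance_result(similarity_values, similarity_threshold):
    """Return True (detection passed) when the maximum similarity value
    is strictly larger than the threshold, and False otherwise."""
    if not similarity_values:
        return False  # assumption: nothing to compare against means no pass
    return max(similarity_values) > similarity_threshold


passed = compliance_result([6.0, 8.0], 7.0)
failed = compliance_result([6.0], 7.0)
at_threshold = compliance_result([7.0], 7.0)
print(passed, failed, at_threshold)
```

Because the comparison is strict, a maximum similarity exactly equal to the threshold does not pass.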
Optionally, according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information, obtaining a similarity value sequence formed by the input information and a similarity value of each target comparison information includes:
according to [Figure SMS_45], obtaining a similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein [Figure SMS_46] is the similarity value of the first input information and the i-th target comparison information, [Figure SMS_47] is the sub-similarity value of the b-th word to be detected in the input information and the i-th target comparison information, b = 1, 2, 3, …, y, and y is the total number of words to be detected in the first input information.
Optionally, [Figure SMS_48] is determined by the following process:
acquiring a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part-of-speech class sequence and vocabulary sequence located after the matching point position corresponding to the (b-1)-th word to be detected in the comparison set of the i-th target comparison information corresponding to the input information; the matching point position corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information differs from every comparison part-of-speech tag in the target matching sequence, [Figure SMS_49];
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the comparison part-of-speech tags in the target matching sequence, determining the comparison word corresponding to that part-of-speech tag as an initial comparison word;
when the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word, [Figure SMS_50];
when the b-th word to be detected in the input information belongs to the vocabulary corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and [Figure SMS_51];
wherein [Figure SMS_52] is the first sub-similarity value, [Figure SMS_53] is the second sub-similarity value, and [Figure SMS_54] > [Figure SMS_55].
Optionally, the similarity threshold satisfies the following condition: [Figure SMS_56];
wherein Y1 is the similarity threshold, [Figure SMS_57] is the threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
It should be noted that the apparatus corresponds to the above method, and all implementations in the above method embodiments are applicable to the apparatus embodiment and achieve the same technical effects.
Embodiments of the present invention also provide a computing device comprising: a processor, a memory storing a computer program which, when executed by the processor, performs the method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
Embodiments of the present invention also provide a computer-readable storage medium comprising instructions which, when run on a computer, cause the computer to perform a method as described above. All the implementation manners in the method embodiment are applicable to the embodiment, and the same technical effect can be achieved.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, or the like.
Furthermore, it should be noted that in the apparatus and method of the present invention, the components or steps may be decomposed and/or recombined. Such decomposition and/or recombination should be considered equivalent aspects of the present invention. Also, the steps of the series of processes described above may naturally be performed in chronological order in the order of description, but need not be; some steps may be performed in parallel or independently of each other. It will be appreciated by those of ordinary skill in the art, after reading this description, that all or any of the steps or components of the methods and apparatus of the present invention may be implemented in hardware, firmware, software, or a combination thereof in any computing device (including processors, storage media, etc.) or network of computing devices.
The object of the invention can thus also be achieved by running a program or a set of programs on any computing device. The computing device may be a well-known general-purpose device. The object of the invention can also be achieved merely by providing a program product containing program code for implementing said method or apparatus. That is, such a program product also constitutes the present invention, and a storage medium storing such a program product also constitutes the present invention. The storage medium may be any known storage medium or any storage medium developed in the future.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (10)

1. A method of blockchain-based data processing, the method comprising:
receiving input information sent by a first node on a blockchain;
performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through intelligent contracts of the blockchain to obtain a detection result;
and if the detection result indicates that the rule detection is passed, the input information is sent to a second node on the blockchain, otherwise, feedback information which is not passed by the detection is fed back to the first node.
2. The blockchain-based data processing method of claim 1, wherein receiving the input information sent by the first node on the blockchain includes:
acquiring an initial questionnaire template image;
performing word recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input; each text to be input comprises a plurality of fields to be input;
And carrying out recombination processing on the sub-input information corresponding to each field to be input to obtain the input information.
3. The blockchain-based data processing method of claim 2, wherein performing word recognition processing on the initial questionnaire template image to obtain a plurality of texts to be input comprises:
performing character recognition scanning on the initial questionnaire template image through a first scanning frame;
determining a plurality of first fields to be input and a plurality of second fields to be input according to the types of the objects to be input contained in the first scanning frame;
acquiring a coding similarity set of each first field to be input and each second field to be input;
when the maximum coding similarity in the coding similarity set is larger than a coding threshold, replacing a second text mark of a second field to be input corresponding to the maximum coding similarity with the first text mark;
and adding the first field to be input and the second field to be input with the same text mark into the same initial questionnaire template image to obtain a plurality of texts to be input.
4. The blockchain-based data processing method of claim 1, wherein the performing similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through the intelligent contract of the blockchain to obtain a detection result includes:
Acquiring a comparison set of at least one target comparison information corresponding to the input information;
word segmentation processing is carried out on the input information, so that a plurality of words to be detected corresponding to the input information are obtained;
performing part-of-speech detection on each word to be detected, and determining part-of-speech tags corresponding to each word to be detected;
determining a part-of-speech sequence to be detected of the input information according to part-of-speech tags contained in the input information;
obtaining a similarity value sequence formed by the input information and the similarity value of each target comparison information according to the word to be detected in the part-of-speech sequence to be detected and the target comparison information;
and if the maximum value in the similarity value sequence is larger than the similarity threshold value, obtaining a detection result passing detection, otherwise, obtaining a detection result not passing detection.
5. The blockchain-based data processing method of claim 4, wherein obtaining a sequence of similarity values formed by the input information and the similarity value of each target comparison information according to the word to be tested and the target comparison information in the part-of-speech sequence to be tested, comprises:
according to [Figure QLYQS_1], obtaining a similarity value sequence formed by the similarity values of the input information and each target comparison information;
wherein [Figure QLYQS_2] is the similarity value of the first input information and the i-th target comparison information, [Figure QLYQS_3] is the sub-similarity value of the b-th word to be detected in the input information and the i-th target comparison information, b = 1, 2, 3, …, y, and y is the total number of words to be detected in the first input information.
6. The method for blockchain-based data processing of claim 5, wherein [Figure QLYQS_4] is determined by the following process:
acquiring a target matching sequence corresponding to the b-th word to be detected in the input information; the target matching sequence is the part-of-speech class sequence and vocabulary sequence located after the matching point position corresponding to the (b-1)-th word to be detected in the comparison set of the i-th target comparison information corresponding to the input information; the matching point position corresponding to the (b-1)-th word to be detected is the target comparison word corresponding to the (b-1)-th word to be detected;
when the part-of-speech tag corresponding to the b-th word to be detected in the input information differs from every comparison part-of-speech tag in the target matching sequence, [Figure QLYQS_5];
when the part-of-speech tag corresponding to the b-th word to be detected in the input information is the same as any one of the comparison part-of-speech tags in the target matching sequence, determining the comparison word corresponding to that part-of-speech tag as an initial comparison word;
when the b-th word to be detected in the input information does not belong to the vocabulary corresponding to the initial comparison word, [Figure QLYQS_6];
when the b-th word to be detected in the input information belongs to the vocabulary corresponding to the initial comparison word, determining the initial comparison word as the target comparison word, and [Figure QLYQS_7];
wherein [Figure QLYQS_8] is the first sub-similarity value, [Figure QLYQS_9] is the second sub-similarity value, and [Figure QLYQS_10] > [Figure QLYQS_11].
7. The blockchain-based data processing method of claim 5, wherein the similarity threshold satisfies the following condition: [Figure QLYQS_12];
wherein Y1 is the similarity threshold, [Figure QLYQS_13] is the threshold coefficient, and AVG(a) is the average number of comparison words included in the plurality of target comparison information corresponding to the input information.
8. A blockchain-based data processing device, the device comprising:
the receiving and transmitting module is used for receiving input information sent by a first node on the block chain;
the processing module is used for carrying out similarity compliance detection processing on the part-of-speech sequence to be detected of the input information and the target comparison information through intelligent contracts of the blockchain to obtain a detection result; and if the detection result indicates that the rule detection is passed, the input information is sent to a second node on the blockchain, otherwise, feedback information which is not passed by the detection is fed back to the first node.
9. A computing device, comprising: a processor, a memory and a program or instruction stored on the memory and executable on the processor, which program or instruction when executed by the processor implements the steps of the method of any of claims 1-7.
10. A readable storage medium, characterized in that it stores thereon a program or instructions, which when executed by a processor, implement the steps of the method according to any of claims 1-7.
CN202310493762.9A 2023-05-05 2023-05-05 Data processing method, device and equipment based on block chain Pending CN116204594A (en)

Publications (1)

Publication Number Publication Date
CN116204594A true CN116204594A (en) 2023-06-02

Family

ID=86509832

Country Status (1)

Country Link
CN (1) CN116204594A (en)

Similar Documents

Publication Publication Date Title
CN110825882B (en) Knowledge graph-based information system management method
CA3052527C (en) Target document template generation
Huang et al. Automating intention mining
Poesio et al. Anaphora resolution
Wu et al. Question condensing networks for answer selection in community question answering
US20200019595A1 (en) System and method for graphical vector representation of a resume
Ameisen Building Machine Learning Powered Applications: Going from Idea to Product
CN109416705A (en) Using information available in a corpus for data parsing and prediction
US20200004765A1 (en) Unstructured data parsing for structured information
US11410130B2 (en) Creating and using triplet representations to assess similarity between job description documents
Altintas et al. Machine learning based ticket classification in issue tracking systems
US11238410B1 (en) Methods and systems for merging outputs of candidate and job-matching artificial intelligence engines executing machine learning-based models
CN111782793A (en) Intelligent customer service processing method, system and equipment
US11386263B2 (en) Automatic generation of form application
CN113157867A (en) Question answering method and device, electronic equipment and storage medium
CN114626351A (en) Form filling method and device combining RPA and AI, electronic equipment and storage medium
CN110610003B (en) Method and system for assisting text annotation
Iqbal et al. Multimedia based student-teacher smart interaction framework using multi-agents in eLearning
Bhagat et al. Survey on text categorization using sentiment analysis
Mgarbi et al. Towards a new job offers recommendation system based on the candidate resume
Banu et al. An intelligent web app chatbot
EP4300445A1 (en) Generalizable key-value set extraction from documents using machine learning models
Vysotska et al. Sentiment Analysis of Information Space as Feedback of Target Audience for Regional E-Business Support in Ukraine.
CN112732908B (en) Test question novelty evaluation method and device, electronic equipment and storage medium
CN116204594A (en) Data processing method, device and equipment based on block chain

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination