CN111859301B - Data reliability evaluation method based on improved Apriori algorithm and Bayesian network reasoning


Info

Publication number
CN111859301B
CN111859301B CN202010728042.2A CN202010728042A CN111859301B CN 111859301 B CN111859301 B CN 111859301B CN 202010728042 A CN202010728042 A CN 202010728042A CN 111859301 B CN111859301 B CN 111859301B
Authority
CN
China
Prior art keywords
data
node
bayesian network
evaluation method
apriori algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010728042.2A
Other languages
Chinese (zh)
Other versions
CN111859301A (en)
Inventor
邓建新
叶志兴
谢彬
曾向明
贺德强
李先旺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi University
Original Assignee
Guangxi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-07-23
Filing date
2020-07-23
Publication date
2024-02-02
Application filed by Guangxi University
Priority to CN202010728042.2A
Publication of CN111859301A
Application granted
Publication of CN111859301B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Algebra (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a data reliability evaluation method based on an improved Apriori algorithm and Bayesian network reasoning, belonging to the field of data processing. The algorithm evaluates data reliability from the data structure and the relationships among the data. Because it requires neither predefined reliability indices nor knowledge of the data distribution, it reduces the subjectivity of earlier data reliability evaluations. The algorithm is general: it applies to discrete data values and also to interval numbers. It also achieves higher accuracy and helps mine the association relationships within the same dimension and between different dimensions of high-dimensional data, yielding the local reliability and the global reliability of each data item.

Description

Data reliability evaluation method based on improved Apriori algorithm and Bayesian network reasoning
Technical Field
The invention relates to the field of data processing, in particular to a data reliability evaluation method based on an improved Apriori algorithm and Bayesian network reasoning.
Background
With the advent of the big data era, data mining algorithms have been widely applied in many fields, making data the most valuable raw material for many organizations. Many organizations sell data, while others offer services and solutions for mining it. In practice, reliance on secondary data sources such as estimates and predictions is increasing, and such sources may have different characteristics that affect overall reliability. In this setting, more traditional reliability approaches become less useful, because the metadata would need to contain information that the system's data representation keeps hidden in the background.
Reliability originates in quality control in industrial engineering, where it was initially defined as the ability of a product to operate successfully for a predetermined time under specified conditions. This ability is usually expressed as a probability value, that is, the probability that a given function is completed within a given time and range under given environmental conditions. When data is treated as a product, however, there is no unified standard for defining its reliability, unlike the reliability of ordinary products. Building on existing reliability theory, a more objective definition of data reliability is proposed: the reliability of data is related to the conditional probabilities among data in different dimensions.
Conventional data reliability methods mainly focus on three approaches: 1) setting standards through scoring by users or experts, and building a scoring table using statistical methods and domain background; 2) evaluating the reliability of the data transmission process; 3) formulating reliability indices from the source information of the data and performing a global reliability evaluation with a data mining algorithm. However, whether experts score a table, the transmission process is evaluated, or the evaluation depends on the source information of the data, the formulated reliability indices are subjective to some degree. A relatively objective data reliability evaluation method is therefore needed, so that a relatively complete data reliability evaluation system can be built by combining the two kinds of evaluation methods.
Disclosure of Invention
The invention aims to provide a data reliability evaluation method based on an improved Apriori algorithm and Bayesian network reasoning, which solves the technical problems in the background art.
The method combines ensemble clustering based on a nonlinear dimension reduction algorithm with an ensemble algorithm based on association rules and a Bayesian network. The former mines association relationships within the same dimension of high-dimensional data; the latter mines association relationships between different dimensions and expresses them in probabilistic form. The data reliability calculation method provided by the invention does not require reliability indices or the data distribution to be determined, which reduces the subjectivity of earlier data reliability evaluations. The method applies to discrete data values as well as to interval numbers, so it is general. It can also serve as a reference for data reliability evaluation in other fields where the data are correlated. The method has market prospects in data-driven service applications, data preprocessing for big data, prediction applications based on similar principles, collaborative recommendation in e-commerce, and so on.
A data reliability evaluation method based on an improved Apriori algorithm and Bayesian network reasoning, the evaluation method comprising the following steps:
Step 1: Let the input be multidimensional correlated data with diverse characteristics, $S_{ij} = \{a_{ji}\}$, a mixed set of interval values and discrete values, where $i = 1, 2, \dots, n$ indexes the dimensions of the data and $j = 1, 2, \dots, m$ indexes the samples. Every value is treated as an interval number $a_{ji} = [x_{ji}, y_{ji}]$, where $x_{ji}$ and $y_{ji}$ may be equal. The set of left endpoints of $S_{ij}$, denoted $S_{ij}^-$, is the minimum value set of the data, and the set of right endpoints, denoted $S_{ij}^+$, is the maximum value set. The multidimensional interval number sets of minimum and maximum values form the sample matrix $S_{ij} = (S_{ij}^-, S_{ij}^+)$. Data encoding is applied to the minimum value set $S_{ij}^-$ and the maximum value set $S_{ij}^+$ to obtain the data code Code and the coding rule Rule;
Step 2: Construct a directed acyclic graph of the Bayesian network according to the data correlations and attribute characteristics. After the original data has been encoded according to Step 1, the data of each dimension is represented as a node of the Bayesian network, where i denotes the dimension of the data and k denotes the state of that dimension, that is, the coding rule Rule under the corresponding code. The node variables are divided into independent node variables, which have no parent node, and dependent node variables, which have parent nodes. The directed edges, each running from a parent node to its child node, represent the relationships among the data of the individual dimensions;
Step 3: Use the improved Apriori algorithm to obtain the support of each node, and use it as the conditional probability table $L(V)$ of the Bayesian network;
Step 4: Perform reasoning on the Bayesian network of the data according to the evidence correlation method, and calculate the reliability of each data item.
Further, in Step 1, the data encoding process is:
Step 1.1: Perform unsupervised cluster learning on the data $S_{ij}^-$ and $S_{ij}^+$ separately to obtain the maximum neighbor number N. Using N neighbors, linearly reconstruct the sample matrix $S_{ij}$ with the locally linear embedding algorithm and calculate the eigenvectors of the sample matrix. Cluster the eigenvectors to obtain the data code Code of the sample matrix and the set Rule of data-dimension clusters, where Rule is the coding rule.
Further, the specific process of Step 1.1 is as follows:
Step 1.1.1: Input the data matrix $S_{ij} = (S_{ij}^-, S_{ij}^+)$ and determine the threshold T by cross-validation;
Step 1.1.2: Take a point from the data set $S_{ij}^-$ or $S_{ij}^+$ and add it to the classification set Canopy;
Step 1.1.3: Take a point P from the data set $S_{ij}^-$ or $S_{ij}^+$ and calculate the distance between P and the classification set Canopy;
Step 1.1.4: Judge against the classification set Canopy: if the distance to Canopy is smaller than T, store P in Canopy; otherwise delete P from $S_{ij}^-$ or $S_{ij}^+$;
Step 1.1.5: Repeat steps 1.1.3 and 1.1.4 until no data remains in $S_{ij}^-$ or $S_{ij}^+$, then output the number of data $K^-$ or $K^+$ in the classification set Canopy to obtain the cluster number K;
Step 1.1.6: Randomly select K data from $S_{ij}^-$ or $S_{ij}^+$ and put them into $C^-$ or $C^+$;
Step 1.1.7: According to the Euclidean distance, assign the data in $S_{ij}^-$ or $S_{ij}^+$ to $C^-$ or $C^+$, forming the data sets $Q^-$ or $Q^+$;
Step 1.1.8: Calculate the center of each class in $Q^-$ or $Q^+$ and take it as the new $C^-$ or $C^+$;
Step 1.1.9: Repeat steps 1.1.7 and 1.1.8 until $C^-$ or $C^+$ no longer changes;
Step 1.1.10: Output the maximum neighbor number N in $Q^-$ or $Q^+$;
Step 1.1.11: Linearly reconstruct the sample matrix $S_{ij} = \{S_{ij}^-, S_{ij}^+\}$ using the maximum neighbor number N to obtain the weight coefficient matrix $W = \{w_j\}$ $(j = 1, 2, \dots, m)$;
Step 1.1.12: Calculate the matrix $M = (I - W)^T (I - W)$ and obtain the eigenvector $d_j$ of its 2nd eigenvalue. Replace $S_{ij}^-$ or $S_{ij}^+$ with $d_j$ and repeat steps 1.1.6 to 1.1.9 until $C^-$ or $C^+$ no longer changes, giving the new clustering result $Q_i'$ based on $d_j$;
Step 1.1.13: Use the operation Code = index($d_j$, $Q_i'$) to obtain the index of $d_j$ in $Q_i'$, which is the data code Code; index($d_j$, $Q_i'$) denotes the index assigned to $d_j$ when it is distributed into $Q_i'$. The combination of data sets sharing the same cluster index is the coding rule Rule, and the operation is expressed as Rule 'C' $S_{ij}$.
Further, the cluster number is $K = \min(K^-, K^+)$, and the maximum neighbor number in step 1.1.10 is $N = \max(Q^-, Q^+)$.
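The Canopy pass of steps 1.1.1 to 1.1.5 can be sketched as follows. This is one plausible reading of the single-threshold procedure, not the patent's exact routine: each remaining point becomes a canopy center, points closer than T to it are absorbed, and the number of centers gives $K^-$ or $K^+$. The function name `canopy_count`, the sample values and the threshold are hypothetical.

```python
import math

def canopy_count(points, threshold):
    """Count canopies using a single distance threshold T (simplified Canopy)."""
    remaining = list(points)
    centers = []
    while remaining:
        # Take the next remaining point as a new canopy centre.
        center = remaining.pop(0)
        centers.append(center)
        # Points closer than T are absorbed by this canopy and leave the pool.
        remaining = [p for p in remaining if math.dist(p, center) >= threshold]
    return len(centers), centers

# Hypothetical one-dimensional endpoint sets (rows of S^- and S^+).
lower = [(1.0,), (1.2,), (5.0,), (5.1,), (9.0,)]
upper = [(1.5,), (1.6,), (5.5,), (9.5,), (9.6,), (13.0,)]

k_minus, _ = canopy_count(lower, threshold=1.0)
k_plus, _ = canopy_count(upper, threshold=1.0)
K = min(k_minus, k_plus)  # cluster number passed on to K-means
```

Taking the minimum of the two counts follows the rule $K = \min(K^-, K^+)$ stated above; the threshold would in practice come from the cross-validation of step 1.1.1.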
Further, the method for calculating the weight coefficient in the step 1.1.11 is as follows:
wherein eta represents S ij Is a single-point network.
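For reference, the textbook locally linear embedding weights solve the constrained least-squares problem below. Whether the patent's expression matches it term by term is an assumption, as is reading η as the set of the N nearest neighbors of a sample; $s_j$ here stands for the j-th row of the sample matrix.

```latex
\min_{W}\; \sum_{j=1}^{m} \Bigl\| s_j - \sum_{k \in \eta(j)} w_{jk}\, s_k \Bigr\|^2
\quad \text{subject to} \quad \sum_{k \in \eta(j)} w_{jk} = 1 ,
\qquad
w_j = \frac{G_j^{-1}\mathbf{1}}{\mathbf{1}^{\top} G_j^{-1}\mathbf{1}},
\quad
(G_j)_{kl} = (s_j - s_k)^{\top}(s_j - s_l), \; k, l \in \eta(j).
```

The closed form on the right is the usual solution obtained with a Lagrange multiplier for the sum-to-one constraint; $G_j$ is the local Gram matrix of the neighbors of $s_j$.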
Further, the specific process of the improved Apriori algorithm in Step 3 is as follows:
Step 3.1: Input the node variables;
Step 3.2: For the set of independent node variables without parent nodes, calculate the conditional probability table and the data encoding $S_C$;
Step 3.3: For the set of dependent node variables with parent nodes, connect branches, that is, combine each node with all of its parent nodes;
Step 3.4: For the combined node variable set, calculate the proportion of the total data samples that fall in state k, that is, the support;
Step 3.5: Calculate the confidence of the node variable set and the conditional probability table;
Step 3.6: Output the final conditional probability table $L(V)$.
Further, Step 3.1 divides the node variables into those with parent nodes and those without, and the branch connection in Step 3.3 starts scanning from the node variables without parent nodes.
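In standard association rule terms, the support and confidence used in steps 3.4 and 3.5 can be written as below. The notation is an assumption made for illustration: $v_i^{(j)}$ is the encoded state of node $v_i$ in sample j, u is a joint state of its parents, and $\hat{P}$ is the empirical probability.

```latex
\operatorname{supp}(v_i = k) = \frac{\bigl|\{\, j : v_i^{(j)} = k \,\}\bigr|}{m},
\qquad
\operatorname{conf}\bigl(\mathrm{Par}(v_i) = u \Rightarrow v_i = k\bigr)
= \frac{\operatorname{supp}\bigl(\mathrm{Par}(v_i) = u,\; v_i = k\bigr)}{\operatorname{supp}\bigl(\mathrm{Par}(v_i) = u\bigr)}
= \hat{P}\bigl(v_i = k \mid \mathrm{Par}(v_i) = u\bigr).
```

Under this reading, the support of an independent node is already its probability table, while the confidence of a dependent node joined with its parents is the corresponding entry of its conditional probability table.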
Further, the reasoning formula of the evidence correlation method in Step 4 is:
where $\mathrm{Sub}(v_i)$ denotes the child nodes of node $v_i$, $F_i^{\theta} = \mathrm{Par}(v_i)$ denotes the parent nodes of node $v_i$, and $A(v_i)$ denotes the probability value of the state of node $v_i$; $|S|$ is the number of elements in the child node set S, and $|F|$ is the number of elements in the parent node set F.
By adopting the above technical solution, the invention has the following technical effects:
1) Objectivity: neither reliability indices nor the data distribution need to be determined, which reduces the subjectivity of earlier data reliability evaluations;
2) Universality: the method applies not only to discrete data values but also to the reliability of interval numbers, and it can serve as a reference for data reliability evaluation in other fields where the data are correlated;
3) The model can also distinguish data and identify noise in the samples, even when the samples contain interval values. This provides a basis for improving sample quality and the accuracy of data-driven algorithms, and the method has market prospects in data-driven service applications, data preprocessing for big data, prediction applications based on similar principles, data reliability evaluation for e-commerce, and so on.
Drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a schematic diagram of data encoding of the present invention.
Fig. 3 is a data encoding flow chart of the present invention.
Fig. 4 is a diagram of the Bayesian network of the related data according to the present invention.
Fig. 5 is a diagram of the Bayesian network model of the present invention, taking squeeze casting as an example.
Detailed Description
The present invention will be described in further detail with reference to preferred embodiments for the purpose of making the objects, technical solutions and advantages of the present invention more apparent. It should be noted, however, that many of the details set forth in the description are merely provided to provide a thorough understanding of one or more aspects of the invention, and that these aspects of the invention may be practiced without these specific details.
As shown in fig. 1, the data reliability evaluation method based on the improved Apriori algorithm and bayesian network reasoning according to the present invention comprises the following steps:
Step 1: Data encoding. Let the input be multidimensional correlated data with diverse characteristics, $S_{ij} = \{a_{ji}\}$, a mixed set of interval values and discrete values, where $i = 1, 2, \dots, n$ indexes the dimensions of the data and $j = 1, 2, \dots, m$ indexes the samples. A specific example is shown in Table 1.
Table 1 Data sample matrix, taking squeeze casting process data as an example
Every value is treated as an interval number $a_{ji} = [x_{ji}, y_{ji}]$, where $x_{ji}$ and $y_{ji}$ may be equal. The set of left endpoints of $S_{ij}$, denoted $S_{ij}^-$, is the minimum value set of the data, and the set of right endpoints, denoted $S_{ij}^+$, is the maximum value set. The multidimensional interval number sets of minimum and maximum values form the sample matrix $S_{ij} = (S_{ij}^-, S_{ij}^+)$. Unsupervised cluster learning is performed on $S_{ij}^-$ and $S_{ij}^+$ separately to obtain the maximum neighbor number N. Using N neighbors, the sample matrix $S_{ij}$ is linearly reconstructed with the locally linear embedding (LLE) algorithm and the eigenvectors of the sample matrix are calculated. The eigenvectors are then clustered to obtain the data code Code of the sample matrix and the set Rule of data-dimension clusters, where Rule is the coding rule. The calculation process of the algorithm is shown schematically in Fig. 2, and the basic flow is shown in Fig. 3. Taking squeeze casting process data as an example, the data of Table 1 are encoded and the results are shown in Table 2;
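The split of mixed interval and discrete values into the minimum value set and the maximum value set can be illustrated with a short sketch. The function name `split_intervals` and the sample values are hypothetical; the only convention taken from the text is that a discrete value x is the degenerate interval [x, x].

```python
def split_intervals(samples):
    """Split rows of mixed interval/discrete values into endpoint matrices."""
    lower, upper = [], []
    for row in samples:
        lo_row, hi_row = [], []
        for value in row:
            # A discrete value x is treated as the degenerate interval [x, x].
            lo, hi = value if isinstance(value, tuple) else (value, value)
            if lo > hi:
                raise ValueError(f"interval endpoints out of order: {value}")
            lo_row.append(float(lo))
            hi_row.append(float(hi))
        lower.append(lo_row)
        upper.append(hi_row)
    return lower, upper

# Hypothetical samples: m = 2 rows (samples), n = 3 columns (dimensions).
samples = [
    [(680, 720), 50, (0.1, 0.3)],
    [700, (40, 60), 0.2],
]
s_minus, s_plus = split_intervals(samples)  # S^- (left ends), S^+ (right ends)
```

Each returned matrix has one row per sample and one column per dimension, so the two can be clustered separately or placed side by side as the sample matrix.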
The specific steps are as follows:
Step 1.1: Input the data matrix $S_{ij} = (S_{ij}^-, S_{ij}^+)$ and determine the threshold T by cross-validation;
Step 1.2: Take a point from the data set $S_{ij}^-$ (or $S_{ij}^+$) and add it to the classification set Canopy;
Step 1.3: Take a point P from the data set $S_{ij}^-$ (or $S_{ij}^+$) and calculate the distance between P and Canopy;
Step 1.4: Judge against Canopy: if the distance is less than T, store P in Canopy; otherwise delete P from $S_{ij}^-$ (or $S_{ij}^+$);
Step 1.5: Repeat steps 1.3 and 1.4 until no data remains in $S_{ij}^-$ (or $S_{ij}^+$), then output the number of data $K^-$ (or $K^+$) in Canopy and set $K = \min(K^-, K^+)$;
Step 1.6: Randomly select K data from $S_{ij}^-$ or $S_{ij}^+$ and put them into $C^-$ or $C^+$;
Step 1.7: According to the Euclidean distance, assign the data in $S_{ij}^-$ or $S_{ij}^+$ to $C^-$ or $C^+$, forming the data sets $Q^-$ or $Q^+$;
Step 1.8: Calculate the center of each class in $Q^-$ or $Q^+$ and take it as the new $C^-$ or $C^+$;
Step 1.9: Repeat steps 1.7 and 1.8 until $C^-$ or $C^+$ no longer changes;
Step 1.10: Output the maximum neighbor number N in $Q^-$ or $Q^+$;
Step 1.11: Linearly reconstruct the sample matrix $S_{ij} = \{S_{ij}^-, S_{ij}^+\}$ using the maximum neighbor number N to obtain the weight coefficient matrix $W = \{w_j\}$ $(j = 1, 2, \dots, m)$;
Step 1.12: Calculate the matrix $M = (I - W)^T (I - W)$ and obtain the eigenvector $d_j$ of its 2nd eigenvalue. Replace $S_{ij}^-$ or $S_{ij}^+$ with $d_j$ and repeat steps 1.6 to 1.9 until $C^-$ or $C^+$ no longer changes, giving the new clustering result $Q_i'$ based on $d_j$;
Step 1.13: Use the operation Code = index($d_j$, $Q_i'$) to obtain the index of $d_j$ in $Q_i'$, which is the data code Code; index($d_j$, $Q_i'$) denotes the index assigned to $d_j$ when it is distributed into $Q_i'$. The combination of data sets sharing the same cluster index is the coding rule Rule, and the operation is expressed as Rule 'C' $S_{ij}$.
Table 2 Code-Rule after encoding of the data samples, taking squeeze casting process data as an example
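Steps 1.11 to 1.13 can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the weights use the textbook regularized LLE solution, the K-means restart is replaced by a deterministic quantile initialization on the one-dimensional embedding, and all names (`lle_weights`, `second_eigenvector`, `encode`) and the synthetic data are hypothetical.

```python
import numpy as np

def lle_weights(X, n_neighbors, reg=1e-3):
    """Reconstruction weights W: each row rebuilds a sample from its neighbours."""
    m = X.shape[0]
    W = np.zeros((m, m))
    for j in range(m):
        dist = np.linalg.norm(X - X[j], axis=1)
        dist[j] = np.inf                                  # exclude the point itself
        neighbors = np.argsort(dist)[:n_neighbors]
        Z = X[neighbors] - X[j]
        G = Z @ Z.T                                       # local Gram matrix
        G += reg * (np.trace(G) + 1e-12) * np.eye(n_neighbors)  # regularise
        w = np.linalg.solve(G, np.ones(n_neighbors))
        W[j, neighbors] = w / w.sum()                     # each row sums to 1
    return W

def second_eigenvector(W):
    """Eigenvector of M = (I - W)^T (I - W) for the 2nd smallest eigenvalue."""
    I = np.eye(W.shape[0])
    M = (I - W).T @ (I - W)
    eigvals, eigvecs = np.linalg.eigh(M)                  # ascending eigenvalues
    return eigvals, eigvecs[:, 1]

def encode(d, k, iters=100):
    """1-D K-means on d; returns Code (cluster index per sample) and Rule."""
    centers = np.quantile(d, np.linspace(0, 1, k))        # deterministic start
    for _ in range(iters):
        codes = np.argmin(np.abs(d[:, None] - centers[None, :]), axis=1)
        new = np.array([d[codes == c].mean() if np.any(codes == c) else centers[c]
                        for c in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    # Rule groups the sample indices that share the same cluster index.
    rule = {c: np.flatnonzero(codes == c).tolist() for c in range(k)}
    return codes, rule

# Synthetic stand-in for the sample matrix [S^-, S^+]: 10 samples, 2 dimensions.
rng = np.random.default_rng(0)
lower = np.vstack([rng.normal(0, 0.1, (5, 2)), rng.normal(3, 0.1, (5, 2))])
X = np.hstack([lower, lower + 0.5])
W = lle_weights(X, n_neighbors=3)
eigvals, d = second_eigenvector(W)
codes, rule = encode(d, k=2)
```

Because every row of W sums to one, the constant vector lies in the null space of M, which is why the eigenvector of the 2nd eigenvalue, rather than the first, carries the embedding.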
Step 2: Construct a directed acyclic graph of the Bayesian network according to the data correlations and attribute characteristics, as shown in Fig. 4. A specific example based on squeeze casting process data is shown in Fig. 5. After the original data has been encoded according to Step 1, the data of each dimension is represented as a node of the Bayesian network, where i denotes the dimension of the data and k denotes the state of that dimension (corresponding to the coding rule Rule). The node variables are divided into independent node variables, which have no parent node, and dependent node variables, which have parent nodes. The directed edges, each running from a parent node to its child node, represent the relationships among the data of the individual dimensions;
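A directed acyclic graph of this kind can be held as a map from each node to its parent nodes. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later) to check acyclicity and to separate independent from dependent nodes; the node names are hypothetical and are not the dimensions of Fig. 5.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical dimensions; keys are nodes, values are their parent nodes.
parents = {
    "material": [],
    "pouring_temperature": ["material"],
    "pressure": ["material"],
    "hold_time": ["pressure", "pouring_temperature"],
}

def split_nodes(parents):
    """Return independent nodes, dependent nodes and a parent-first order."""
    try:
        # TopologicalSorter takes {node: predecessors}, so parents come first.
        order = list(TopologicalSorter(parents).static_order())
    except CycleError as exc:
        raise ValueError("edges do not form a directed acyclic graph") from exc
    independent = [v for v in order if not parents.get(v)]
    dependent = [v for v in order if parents.get(v)]
    return independent, dependent, order

def children(parents, node):
    """Child nodes of `node`, i.e. Sub(node) in the notation of Step 4."""
    return [v for v, ps in parents.items() if node in ps]

independent, dependent, order = split_nodes(parents)
```

The parent-first order is the natural scanning order for Step 3, where branch connection starts from the nodes without parents.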
Step 3: Use the improved Apriori algorithm to determine the support of each node and obtain the conditional probability table $L(V)$ of the Bayesian network; the pseudocode of the algorithm is shown in Table 3.
Table 3 improved Apriori algorithm based on bayesian network nodes
The specific steps are as follows:
Step 3.1: Input the node variables;
Step 3.2: For the set of node variables without parent nodes, calculate the conditional probability table and the data encoding $S_C$;
Step 3.3: For the set of dependent node variables with parent nodes, connect branches, that is, combine each node with all of its parent nodes;
Step 3.4: For the combined node variable set, calculate the proportion of the total data samples that fall in state k, that is, the support;
Step 3.5: Calculate the confidence of the node variable set and the conditional probability table;
Step 3.6: Output the final conditional probability table $L(V)$.
Step 4: Perform reasoning on the Bayesian network of the data according to the evidence correlation method and calculate the reliability of the data. Taking the squeeze casting process data in Table 1 as an example, the resulting reliability inference results are shown in Table 4, where the global reliability is the relative reliability of the process parameters as a whole for each material composition.
Table 4 Reliability inference results for the squeeze casting process data
The specific steps are as follows:
Step 4.1: Input: the current node variable $v_i$;
Step 4.2: Assign the node variables $v_i = \{v_P, v_S\}$ and the connections Join($v_i$, $v_{i+1}$) to construct the Bayesian network;
Step 4.3: Initialize the node states and the conditional probability table $L(v_i)$;
Step 4.4: Input the evidence Sub($v_i$), Par($v_i$);
Step 4.6: Perform reasoning according to the evidence correlation formula;
Step 4.7: Output: the reliability $R(v_i)$ of the current node.
Evidence correlation formula:
where $\mathrm{Sub}(v_i)$ denotes the child nodes of node $v_i$, $F_i^{\theta} = \mathrm{Par}(v_i)$ denotes the parent nodes of node $v_i$, and $A(v_i)$ denotes the probability value of the state of node $v_i$; $|S|$ is the number of elements in the child node set S, and $|F|$ is the number of elements in the parent node set F.
The invention forms a minimum value data set and a maximum value data set from the left and right endpoints of the interval numbers, performs Canopy-Kmeans clustering on each of them, and uses the LLE manifold learning algorithm to reduce the dimension of the whole interval data set, which yields the data code of the interval data set. The data of each dimension is then taken as a node and the relationships among the data of different dimensions as directed edges to construct the Bayesian network graph. The conditional probability table of each node is obtained with the improved Apriori algorithm, and finally reasoning about the data of each node is carried out according to the evidence correlation method.
The foregoing is merely a preferred embodiment of the present invention and it should be noted that modifications and adaptations to those skilled in the art may be made without departing from the principles of the present invention, which are intended to be comprehended within the scope of the present invention.

Claims (8)

1. A squeeze casting process data reliability evaluation method based on an improved Apriori algorithm and Bayesian network reasoning, characterized in that the evaluation method comprises the following steps:
Step 1: Let the input be multidimensional correlated data with diverse characteristics, $S_{ij} = \{a_{ji}\}$, a mixed set of interval values and discrete values, where $i = 1, 2, \dots, n$ indexes the dimensions of the data and $j = 1, 2, \dots, m$ indexes the samples. Every value is treated as an interval number $a_{ji} = [x_{ji}, y_{ji}]$, where $x_{ji}$ and $y_{ji}$ may be equal. The set of left endpoints of $S_{ij}$, denoted $S_{ij}^-$, is the minimum value set of the data, and the set of right endpoints, denoted $S_{ij}^+$, is the maximum value set. The multidimensional interval number sets of minimum and maximum values form the sample matrix $S_{ij} = (S_{ij}^-, S_{ij}^+)$. Data encoding is applied to the minimum value set $S_{ij}^-$ and the maximum value set $S_{ij}^+$ to obtain the data code Code and the coding rule Rule;
Step 2: Construct a directed acyclic graph of the Bayesian network according to the data correlations and attribute characteristics. After the original data has been encoded according to Step 1, the data of each dimension is represented as a node of the Bayesian network, where i denotes the dimension of the data and k denotes the state of that dimension, that is, the coding rule Rule under the corresponding code. The node variables are divided into independent node variables, which have no parent node, and dependent node variables, which have parent nodes, and each directed edge runs from a parent node to its child node;
Step 3: Use the improved Apriori algorithm to obtain the support of each node, and use it as the conditional probability table $L(V)$ of the Bayesian network;
Step 4: Perform reasoning on the Bayesian network of the data according to the evidence correlation method, and calculate the reliability of the data.
2. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 1, characterized in that, in Step 1, the data encoding process is:
Step 1.1: Perform unsupervised cluster learning on the data $S_{ij}^-$ and $S_{ij}^+$ separately to obtain the maximum neighbor number N; using N neighbors, linearly reconstruct the sample matrix $S_{ij}$ with the locally linear embedding algorithm and calculate the eigenvectors of the sample matrix; cluster the eigenvectors to obtain the data code Code of the sample matrix and the set Rule of data-dimension clusters, where Rule is the coding rule.
3. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 2, characterized in that the specific process of Step 1.1 is as follows:
Step 1.1.1: Input the data matrix $S_{ij} = (S_{ij}^-, S_{ij}^+)$ and determine the threshold T by cross-validation;
Step 1.1.2: Take a point from the data set $S_{ij}^-$ or $S_{ij}^+$ and add it to the classification set Canopy;
Step 1.1.3: Take a point P from the data set $S_{ij}^-$ or $S_{ij}^+$ and calculate the distance between P and the classification set Canopy;
Step 1.1.4: Judge against the classification set Canopy: if the distance to Canopy is smaller than T, store P in Canopy; otherwise delete P from $S_{ij}^-$ or $S_{ij}^+$;
Step 1.1.5: Repeat steps 1.1.3 and 1.1.4 until no data remains in $S_{ij}^-$ or $S_{ij}^+$, then output the number of data $K^-$ or $K^+$ in the classification set Canopy to obtain the cluster number K;
Step 1.1.6: Randomly select K data from $S_{ij}^-$ or $S_{ij}^+$ and put them into $C^-$ or $C^+$;
Step 1.1.7: According to the Euclidean distance, assign the data in $S_{ij}^-$ or $S_{ij}^+$ to $C^-$ or $C^+$, forming the data sets $Q^-$ or $Q^+$;
Step 1.1.8: Calculate the center of each class in $Q^-$ or $Q^+$ and take it as the new $C^-$ or $C^+$;
Step 1.1.9: Repeat steps 1.1.7 and 1.1.8 until $C^-$ or $C^+$ no longer changes;
Step 1.1.10: Output the maximum neighbor number N in $Q^-$ or $Q^+$;
Step 1.1.11: Linearly reconstruct the sample matrix $S_{ij} = \{S_{ij}^-, S_{ij}^+\}$ using the maximum neighbor number N to obtain the weight coefficient matrix $W = \{w_j\}$ $(j = 1, 2, \dots, m)$;
Step 1.1.12: Calculate the matrix $M = (I - W)^T (I - W)$ and obtain the eigenvector $d_j$ of its 2nd eigenvalue; replace $S_{ij}^-$ or $S_{ij}^+$ with $d_j$ and repeat steps 1.1.6 to 1.1.9 until $C^-$ or $C^+$ no longer changes, giving the new clustering result $Q_i'$ based on $d_j$;
Step 1.1.13: Use the operation Code = index($d_j$, $Q_i'$) to obtain the index of $d_j$ in $Q_i'$, which is the data code Code; index($d_j$, $Q_i'$) denotes the index assigned to $d_j$ when it is distributed into $Q_i'$; the combination of data sets sharing the same cluster index is the coding rule Rule, and the operation is expressed as Rule 'C' $S_{ij}$.
4. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 3, characterized in that the cluster number is $K = \min(K^-, K^+)$ and the maximum neighbor number in step 1.1.10 is $N = \max(Q^-, Q^+)$.
5. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 3, characterized in that the weight coefficient in step 1.1.11 is calculated as follows:
wherein S is ij Is denoted by eta.
6. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 1, characterized in that the specific process of the improved Apriori algorithm in Step 3 is as follows:
Step 3.1: Input the node variables;
Step 3.2: For the set of independent node variables without parent nodes, calculate the conditional probability table and the data encoding $S_C$;
Step 3.3: For the set of dependent node variables with parent nodes, connect branches, that is, combine each node with all of its parent nodes;
Step 3.4: For the combined node variable set, calculate the proportion of the total data samples that fall in state k, that is, the support;
Step 3.5: Calculate the confidence of the node variable set and the conditional probability table;
Step 3.6: Output the final conditional probability table $L(V)$.
7. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 6, characterized in that Step 3.1 divides the node variables into those with parent nodes and those without, and the branch connection in Step 3.3 starts scanning from the node variables without parent nodes.
8. The squeeze casting process data reliability evaluation method based on the improved Apriori algorithm and Bayesian network reasoning according to claim 1, characterized in that the reasoning formula of the evidence correlation method in Step 4 is:
where $\mathrm{Sub}(v_i)$ denotes the child nodes of node $v_i$, $F_i^{\theta} = \mathrm{Par}(v_i)$ denotes the parent nodes of node $v_i$, and $A(v_i)$ denotes the probability value of the state of node $v_i$; $|S|$ is the number of elements in the child node set S, and $|F|$ is the number of elements in the parent node set F.
CN202010728042.2A 2020-07-23 2020-07-23 Data reliability evaluation method based on improved Apriori algorithm and Bayesian network reasoning Active CN111859301B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010728042.2A CN111859301B (en) 2020-07-23 2020-07-23 Data reliability evaluation method based on improved Apriori algorithm and Bayesian network reasoning

Publications (2)

Publication Number Publication Date
CN111859301A CN111859301A (en) 2020-10-30
CN111859301B CN111859301B (en) 2024-02-02

Family

ID=72950179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010728042.2A Active CN111859301B (en) 2020-07-23 2020-07-23 Data reliability evaluation method based on improved Apriori algorithm and Bayesian network reasoning

Country Status (1)

Country Link
CN (1) CN111859301B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007147166A2 (en) * 2006-06-16 2007-12-21 Quantum Leap Research, Inc. Consilence of data-mining
CN102956023A (en) * 2012-08-30 2013-03-06 南京信息工程大学 Bayes classification-based method for fusing traditional meteorological data with perception data
CN106570525A (en) * 2016-10-26 2017-04-19 昆明理工大学 Method for evaluating online commodity assessment quality based on Bayesian network
CN107247995A (en) * 2016-09-29 2017-10-13 上海交通大学 Transmission line of electricity running status association rule mining and Forecasting Methodology based on Bayesian model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078678A1 (en) * 2010-09-23 2012-03-29 Infosys Technologies Limited Method and system for estimation and analysis of operational parameters in workflow processes

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant