CN117827937B - Monitoring method, system and storage medium based on multi-source data integration and data mining - Google Patents

Monitoring method, system and storage medium based on multi-source data integration and data mining

Info

Publication number
CN117827937B
Authority
CN
China
Prior art keywords
data
monitoring
information
acquiring
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410245153.6A
Other languages
Chinese (zh)
Other versions
CN117827937A (en)
Inventor
杜红阳
王淑玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Tianda Qingyuan Information Technology Co ltd
Original Assignee
Shandong Tianda Qingyuan Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Tianda Qingyuan Information Technology Co ltd filed Critical Shandong Tianda Qingyuan Information Technology Co ltd
Priority to CN202410245153.6A priority Critical patent/CN117827937B/en
Publication of CN117827937A publication Critical patent/CN117827937A/en
Application granted granted Critical
Publication of CN117827937B publication Critical patent/CN117827937B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a monitoring method, system and storage medium based on multi-source data integration and data mining, relating to the technical field of intelligent monitoring. The method comprises: acquiring monitoring information; acquiring multi-source monitoring data according to monitoring data source information; acquiring a monitoring data set from the multi-source monitoring data based on data integration; and carrying out early warning on data risk according to a data risk assessment index and a data risk assessment index threshold. By integrating the multi-source monitoring data and screening it with a data rule correlation index, the invention avoids data conflicts caused by differences or contradictions between data from different sources on certain attributes and improves data accuracy. Association rules are screened by their support and confidence to analyze the associations among the monitoring data, the risk level of the monitoring feature data is assessed accurately through the data risk assessment index, and early warnings are issued in time.

Description

Monitoring method, system and storage medium based on multi-source data integration and data mining
Technical Field
The invention relates to the technical field of intelligent monitoring, in particular to a monitoring method, a monitoring system and a storage medium based on multi-source data integration and data mining.
Background
Real-time monitoring technology is widely applied across industries. It helps enterprises and organizations keep track of key business processes, find potential problems and handle them promptly and effectively. By monitoring changes in the data of a monitored object, it gives timely early warning of abnormal conditions so that they can be avoided.
However, existing monitoring systems have several problems. They depend on a single data source, so it is difficult to collect information from different data sources and to judge the overall state of a monitored object comprehensively. Monitoring is therefore prone to being one-sided and key indicators may be missed, so abnormality warnings for the monitored data are neither timely nor accurate. Such systems also cannot integrate the multi-source data of a monitored object in a reasonable way, cannot resolve data conflicts among data with different attributes, and cannot assess risk accurately.
Disclosure of Invention
To solve the above technical problems, this technical solution provides a monitoring method, system and storage medium based on multi-source data integration and data mining. It addresses the problems raised in the Background: information is difficult to collect from different data sources, so the overall state of a monitored object cannot be judged comprehensively; monitoring is prone to being one-sided and key indicators may be missed, so abnormality warnings for the monitored data are neither timely nor accurate; the multi-source data of a monitored object cannot be integrated in a reasonable way; data conflicts among data with different attributes cannot be resolved; and risk cannot be assessed accurately.
In order to achieve the above purpose, the invention adopts the following technical scheme:
The monitoring method based on multi-source data integration and data mining comprises the following steps:
acquiring monitoring information, wherein the monitoring information comprises monitoring object information and monitoring environment information;
acquiring monitoring data source information based on actual monitoring requirements according to the monitoring information;
acquiring multi-source monitoring data according to the monitoring data source information;
acquiring a monitoring data set based on data integration according to the multi-source monitoring data;
acquiring a monitoring historical data set, wherein the monitoring historical data set comprises monitoring historical data and monitoring data source information;
acquiring monitoring history warning information and monitoring history warning data according to the monitoring historical data set;
mining according to the monitoring history warning information and the monitoring history warning data based on the data association rule to acquire data association information;
acquiring monitoring feature data according to the data association information and the monitoring data set;
acquiring a data risk assessment index based on a monitoring data risk assessment model according to the monitoring feature data;
acquiring a data risk assessment index threshold based on the data monitoring requirement;
and carrying out early warning on the data risk according to the data risk assessment index and the data risk assessment index threshold.
Preferably, the acquiring the monitoring dataset based on data integration according to the multi-source monitoring data specifically includes:
acquiring multi-source monitoring data format information according to the multi-source monitoring data;
Acquiring standard format information of monitoring data based on data analysis requirements according to the multi-source monitoring data format information;
according to the standard format information of the monitoring data and the format information of the multi-source monitoring data, carrying out data format conversion on the multi-source monitoring data to obtain the standard format data of the multi-source monitoring;
acquiring attribute information of the multi-source monitoring standard format data according to the multi-source monitoring standard format data;
Acquiring data attribute rule information and data public attribute information based on data integration requirements according to the multi-source monitoring standard format data attribute information;
acquiring a data rule related index according to the data attribute rule information and the multi-source monitoring standard format data attribute information;
acquiring a data rule related index threshold based on data analysis requirements;
Screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold;
According to the multi-source monitoring standard format data attribute information and the data public attribute information, performing data mapping on the screened multi-source monitoring standard format data to obtain multi-source monitoring attribute mapping data;
acquiring a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data;
The calculation formula of the data rule correlation index is as follows:
[formula image not reproduced in the source text]
where Q is the data rule correlation index; the formula's two per-attribute terms are the influence coefficient of the i-th data attribute on the data and the correlation coefficient between the i-th data attribute and the data attribute rule; and n is the total number of data attributes.
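The Z-score standardization step named above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each mapped attribute is numeric, uses the population standard deviation, and maps constant columns to 0.0. The function names `z_score_standardize` and `standardize_records` are hypothetical.

```python
from statistics import mean, pstdev


def z_score_standardize(values):
    """Return (x - mean) / std for each x.

    Assumption: population standard deviation; a constant column maps to 0.0.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


def standardize_records(records, attributes):
    """Standardize each named numeric attribute across a list of dict records."""
    columns = {a: z_score_standardize([r[a] for r in records]) for a in attributes}
    return [{a: columns[a][i] for a in attributes} for i in range(len(records))]


if __name__ == "__main__":
    # Hypothetical attribute-mapped records from two sources.
    mapped = [
        {"temperature": 20.0, "pressure": 1.0},
        {"temperature": 22.0, "pressure": 1.2},
        {"temperature": 24.0, "pressure": 1.1},
    ]
    for row in standardize_records(mapped, ["temperature", "pressure"]):
        print(row)
```

After standardization every attribute has zero mean and unit variance, so attributes measured in different units by different sources become comparable.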
Preferably, the step of mining based on the data association rule according to the monitoring history warning information and the monitoring history warning data to obtain the data association information specifically includes:
acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data;
Acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set;
Acquiring association rule information according to the frequent item set information, wherein each association rule consists of an antecedent (front) frequent item set and a consequent (back) frequent item set;
Acquiring association rule support information and association rule confidence information according to the association rule information, wherein the support of an association rule is the frequency with which the rule occurs in the monitoring history warning transaction set, that is, the probability that its antecedent and consequent frequent item sets occur together, and the confidence of an association rule is the probability that the consequent frequent item set is satisfied given that the antecedent frequent item set is satisfied;
acquiring an association rule support threshold and an association rule confidence threshold based on the data association rule mining requirement;
and screening the association rule according to the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold and the association rule confidence degree threshold to acquire data association information.
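The support and confidence definitions above can be made concrete with a short sketch. It assumes each monitoring history warning transaction is a set of item labels; the labels and function names are hypothetical and chosen only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items.issubset(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """P(consequent | antecedent) = support(antecedent + consequent) / support(antecedent)."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / base


if __name__ == "__main__":
    # Hypothetical warning transactions.
    history = [
        {"high_temp", "high_vibration", "alarm"},
        {"high_temp", "alarm"},
        {"high_vibration"},
        {"high_temp", "high_vibration", "alarm"},
    ]
    print(support(history, {"high_temp", "alarm"}))       # 0.75
    print(confidence(history, {"high_temp"}, {"alarm"}))  # 1.0
```

A rule would then be kept only if both values reach the support threshold and the confidence threshold chosen for the mining task.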
Preferably, the acquiring monitoring feature data according to the data association information and the monitoring data set specifically includes:
acquiring monitoring deduplication data based on a hash deduplication method according to the monitoring dataset;
acquiring a box plot of the monitoring deduplication data according to the monitoring deduplication data;
acquiring box plot data of the monitoring deduplication data according to the box plot, wherein the box plot data comprises the minimum value, lower quartile, median, upper quartile and maximum value of the box plot;
acquiring a box plot inner limit coefficient based on the data abnormality detection requirement;
acquiring box plot threshold information according to the box plot inner limit coefficient and the box plot data;
acquiring monitoring outlier data according to the box plot of the monitoring deduplication data and the box plot threshold information;
removing the monitoring outlier data from the monitoring deduplication data to obtain monitoring correction data;
acquiring standard monitoring data, wherein the standard monitoring data is monitoring data in a normal standard state;
acquiring monitoring abnormal data according to the monitoring correction data and the standard monitoring data;
acquiring monitoring feature data according to the data association information and the monitoring abnormal data, wherein the monitoring feature data comprises event monitoring feature data and data related feature information;
wherein the event monitoring feature data represents monitoring feature data related to a risk event, and the data related feature information represents monitoring feature data that are associated with one another.
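The hash deduplication step at the start of this procedure can be sketched as below. The text does not say which hash or record encoding is used, so SHA-256 over a canonical JSON encoding is an assumption, and `record_fingerprint` and `hash_deduplicate` are hypothetical names.

```python
import hashlib
import json


def record_fingerprint(record):
    """Hash a record via a canonical JSON encoding (assumed; keys sorted)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hash_deduplicate(records):
    """Keep the first occurrence of each distinct record, preserving order."""
    seen = set()
    unique = []
    for record in records:
        fingerprint = record_fingerprint(record)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(record)
    return unique


if __name__ == "__main__":
    # Hypothetical monitoring records; the second repeats the first.
    data = [
        {"sensor": "s1", "value": 10.5},
        {"value": 10.5, "sensor": "s1"},
        {"sensor": "s2", "value": 11.0},
    ]
    print(hash_deduplicate(data))
```

Sorting keys before hashing means two records with the same content but different key order are treated as duplicates.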
Preferably, the early warning of the data risk is performed according to the data risk assessment index and the data risk assessment index threshold, which specifically includes:
Acquiring a data risk assessment index according to the monitoring characteristic data;
acquiring a data risk assessment index threshold based on the data monitoring requirement;
Judging whether the data risk assessment index exceeds the data risk assessment index threshold according to the data risk assessment index and the data risk assessment index threshold, if not, recording the data risk assessment index and the monitoring characteristic data, and if so, outputting and displaying data risk early warning information;
the calculation formula of the data risk assessment index is as follows:
[formula image not reproduced in the source text]
where R is the data risk assessment index; the formula's terms are the influence coefficient of the s-th feature data on the j-th risk event, the weight of the s-th feature data, and the impact index of all feature data associated with the s-th feature data on the j-th risk event; h is the total number of feature data; and m is the total number of data risk event types.
Furthermore, a monitoring system based on multi-source data integration and data mining is provided, which is used for implementing the above-mentioned monitoring method, and includes:
the main control module, which is used for: converting the data format of the multi-source monitoring data according to the monitoring data standard format information and the multi-source monitoring data format information to obtain multi-source monitoring standard format data; acquiring the data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information; screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold; acquiring the monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data; acquiring monitoring feature data according to the data association information and the monitoring data set; acquiring the data risk assessment index according to the monitoring feature data; and carrying out early warning on the data risk according to the data risk assessment index and the data risk assessment index threshold;
The information acquisition module is used for acquiring monitoring information, monitoring object information and monitoring environment information, acquiring multi-source monitoring data format information according to multi-source monitoring data, acquiring a monitoring historical data set, monitoring historical data and monitoring data source information, and acquiring monitoring historical warning information and monitoring historical warning data according to the monitoring historical data set;
The data mining module is used for acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data, acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set, acquiring association rule information according to the frequent item set information, acquiring association rule support degree information and association rule confidence degree information according to the association rule information, and screening association rules according to the association rule support degree information, the association rule confidence degree information, an association rule support degree threshold value and an association rule confidence degree threshold value to acquire data association information;
and the display module is interacted with the main control module and is used for displaying the data risk assessment index and outputting data risk early warning information.
Optionally, the main control module specifically includes:
the control unit is used for acquiring monitoring characteristic data according to the data association information and the monitoring data set, acquiring a data risk assessment index according to the monitoring characteristic data, and carrying out early warning on the data risk according to the data risk assessment index and a data risk assessment index threshold;
The information receiving unit is interacted with the information acquisition module and the data mining module, and is used for receiving data and transmitting the data to the data processing unit;
The data processing unit is used for: converting the data format of the multi-source monitoring data according to the monitoring data standard format information and the multi-source monitoring data format information to obtain multi-source monitoring standard format data; acquiring the data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information; screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold; and acquiring the monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data.
Optionally, the information acquisition module specifically includes:
The first acquisition unit is used for acquiring monitoring information, monitoring object information and monitoring environment information and acquiring multi-source monitoring data format information according to multi-source monitoring data;
the second acquisition unit is used for acquiring a monitoring historical data set, monitoring historical data and monitoring data source information, and acquiring monitoring historical warning information and monitoring historical warning data according to the monitoring historical data set.
Optionally, the data mining module specifically includes:
The transaction set unit is used for acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data, and acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set;
The association rule unit is used for acquiring association rule information according to the frequent item set information, acquiring association rule support degree information and association rule confidence degree information according to the association rule information, screening the association rule according to the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold value and the association rule confidence degree threshold value, and acquiring data association information.
Still further, a computer-readable storage medium is proposed, on which a computer-readable program is stored, which when called performs the monitoring method as described above.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides a monitoring method, system and storage medium based on multi-source data integration and data mining. The multi-source monitoring data are integrated, and the data rule correlation index is used to avoid data conflicts caused by differences or contradictions between data from different sources on certain attributes, which makes the monitoring result more comprehensive and improves data accuracy. Association rules are screened by their support and confidence so that the associations among the monitoring data can be analyzed, the risk level of the monitoring feature data is assessed accurately through the data risk assessment index, and early warnings are issued in time.
Drawings
FIG. 1 is a flow chart of a monitoring method based on multi-source data integration and data mining according to the present invention;
FIG. 2 is a flow chart of the monitoring dataset acquisition of the present invention;
FIG. 3 is a flow chart of data association information acquisition in the present invention;
FIG. 4 is a flow chart of the acquisition of monitoring feature data in the present invention;
FIG. 5 is a block diagram of a monitoring system based on multi-source data integration and data mining according to the present invention.
Detailed Description
The following description is presented to enable one of ordinary skill in the art to make and use the invention. The preferred embodiments in the following description are by way of example only and other obvious variations will occur to those skilled in the art.
Referring to fig. 1 to fig. 4, a monitoring method based on multi-source data integration and data mining according to an embodiment of the present invention includes:
acquiring monitoring information, wherein the monitoring information comprises monitoring object information and monitoring environment information;
Acquiring monitoring data source information based on actual monitoring requirements according to the monitoring information;
Acquiring multi-source monitoring data according to the monitoring data source information;
acquiring a monitoring data set based on data integration according to the multi-source monitoring data;
specifically, according to the multi-source monitoring data, based on data integration, a monitoring data set is obtained, which specifically includes:
acquiring multi-source monitoring data format information according to the multi-source monitoring data;
Acquiring standard format information of monitoring data based on data analysis requirements according to the multi-source monitoring data format information;
according to the standard format information of the monitoring data and the format information of the multi-source monitoring data, carrying out data format conversion on the multi-source monitoring data to obtain the standard format data of the multi-source monitoring;
acquiring attribute information of the multi-source monitoring standard format data according to the multi-source monitoring standard format data;
Acquiring data attribute rule information and data public attribute information based on data integration requirements according to the multi-source monitoring standard format data attribute information;
acquiring a data rule related index according to the data attribute rule information and the multi-source monitoring standard format data attribute information;
acquiring a data rule related index threshold based on data analysis requirements;
Screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold;
According to the multi-source monitoring standard format data attribute information and the data public attribute information, performing data mapping on the screened multi-source monitoring standard format data to obtain multi-source monitoring attribute mapping data;
acquiring a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data;
The calculation formula of the data rule correlation index is as follows:
[formula image not reproduced in the source text]
where Q is the data rule correlation index; the formula's two per-attribute terms are the influence coefficient of the i-th data attribute on the data and the correlation coefficient between the i-th data attribute and the data attribute rule; and n is the total number of data attributes.
In this scheme, data attribute rule information and data public attribute information are acquired based on the data integration requirements, which establishes unified data attribute standards and rules. The data rule correlation index is acquired from the data attribute rule information and the multi-source monitoring standard format data attribute information, and the multi-source monitoring standard format data are screened against the data rule correlation index threshold, which avoids data conflicts among data with different attributes. The screened data are then mapped using the multi-source monitoring standard format data attribute information and the data public attribute information to obtain multi-source monitoring attribute mapping data, from which the monitoring data set is acquired based on Z-score standardization. This achieves integration of the multi-source data, makes the monitoring result more comprehensive, and improves data accuracy.
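The screening step can be illustrated with a hedged sketch. The formula for the data rule correlation index is not available in this text, so the code assumes, purely for illustration, that the index is the mean over the n attributes of (influence coefficient × correlation coefficient), and that a source is kept when its index is at least the threshold. Both the aggregation and the direction of the comparison are assumptions, and all names are hypothetical.

```python
def rule_correlation_index(influence, correlation):
    """Assumed form: (1/n) * sum_i influence[i] * correlation[i].

    This is an illustrative stand-in, not the formula from the patent.
    """
    if len(influence) != len(correlation):
        raise ValueError("one influence and one correlation value per attribute")
    n = len(influence)
    if n == 0:
        return 0.0
    return sum(a * b for a, b in zip(influence, correlation)) / n


def screen_sources(sources, threshold):
    """Keep the names of sources whose index reaches the threshold (assumed rule)."""
    return [
        name
        for name, (influence, correlation) in sources.items()
        if rule_correlation_index(influence, correlation) >= threshold
    ]


if __name__ == "__main__":
    # Hypothetical per-source coefficients for two attributes.
    sources = {
        "sensor_a": ([0.6, 0.4], [0.9, 0.8]),
        "sensor_b": ([0.6, 0.4], [0.2, 0.1]),
    }
    print(screen_sources(sources, threshold=0.3))
```

Sources whose attributes agree poorly with the shared attribute rules score low and are dropped before attribute mapping, which is how the screening avoids carrying conflicting data into the integrated set.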
Acquiring a monitoring historical data set, wherein the monitoring historical data set comprises monitoring historical data and monitoring data source information;
acquiring monitoring history warning information and monitoring history warning data according to the monitoring history data set;
Mining according to the monitoring history warning information and the monitoring history warning data based on the data association rule to acquire data association information;
specifically, according to the monitoring history warning information and the monitoring history warning data, mining is performed based on a data association rule, and the data association information is obtained, which specifically comprises the following steps:
acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data;
Acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set;
Acquiring association rule information according to the frequent item set information, wherein each association rule consists of an antecedent (front) frequent item set and a consequent (back) frequent item set;
Acquiring association rule support information and association rule confidence information according to the association rule information, wherein the support of an association rule is the frequency with which the rule occurs in the monitoring history warning transaction set, that is, the probability that its antecedent and consequent frequent item sets occur together, and the confidence of an association rule is the probability that the consequent frequent item set is satisfied given that the antecedent frequent item set is satisfied;
acquiring an association rule support threshold and an association rule confidence threshold based on the data association rule mining requirement;
and screening the association rule according to the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold and the association rule confidence degree threshold to acquire data association information.
In this scheme, association rule information is acquired from the frequent item set information, and the support and confidence of each association rule are acquired from the association rule information. The association rules are then screened against the association rule support threshold and the association rule confidence threshold to acquire the data association information. This realizes the analysis and mining of association information in the data and facilitates the later feature data extraction and risk assessment.
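The mining pipeline described here (frequent item sets, then rules, then threshold screening) can be sketched with a small level-wise search in the style of Apriori. The text does not name a specific frequent item set generation algorithm, so this choice is an assumption, and the function names and item labels are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise search: grow item set size until no candidate stays frequent."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    frequent = {}
    current = [frozenset([i]) for i in sorted({i for t in transactions for i in t})]
    size = 1
    while current:
        level = {}
        for candidate in current:
            sup = sum(1 for t in transactions if candidate <= t) / n
            if sup >= min_support:
                level[candidate] = sup
        frequent.update(level)
        size += 1
        # Next candidates: unions of two frequent sets that have the next size.
        current = list({a | b for a in level for b in level if len(a | b) == size})
    return frequent


def mine_rules(transactions, min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for rules passing both thresholds."""
    frequent = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for combo in combinations(sorted(itemset), r):
                antecedent = frozenset(combo)
                conf = sup / frequent[antecedent]
                if conf >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, sup, conf))
    return rules


if __name__ == "__main__":
    history = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
    for rule in mine_rules(history, min_support=0.5, min_confidence=0.9):
        print(rule)
```

Every subset of a frequent item set is itself frequent, so the antecedent's support is always available in `frequent` when a rule's confidence is computed.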
Acquiring monitoring characteristic data according to the data association information and the monitoring data set;
Specifically, according to a monitoring data set, based on a hash deduplication method, monitoring deduplication data are obtained;
acquiring a box plot of the monitoring deduplication data according to the monitoring deduplication data;
acquiring box plot data of the monitoring deduplication data according to the box plot, wherein the box plot data comprises the minimum value, lower quartile, median, upper quartile and maximum value of the box plot;
acquiring a box plot inner limit coefficient based on the data abnormality detection requirement;
acquiring box plot threshold information according to the box plot inner limit coefficient and the box plot data;
Wherein the box plot threshold includes an upper limit and a lower limit for outliers:
$T_{\mathrm{up}} = B + k\,(B - A), \qquad T_{\mathrm{low}} = A - k\,(B - A)$
where $T_{\mathrm{up}}$ is the upper limit of outliers, $T_{\mathrm{low}}$ is the lower limit of outliers, $A$ is the lower quartile, $B$ is the upper quartile, and $k$ is the box plot inner limit coefficient;
acquiring monitoring outlier data according to the box plot of the monitoring deduplication data and the box plot threshold information;
removing the monitoring outlier data from the monitoring deduplication data to obtain monitoring correction data;
acquiring standard monitoring data, wherein the standard monitoring data is monitoring data in a normal standard state;
acquiring monitoring abnormal data according to the monitoring correction data and the standard monitoring data;
acquiring monitoring feature data according to the data association information and the monitoring abnormal data, wherein the monitoring feature data comprises event monitoring feature data and data related feature information;
wherein the event monitoring feature data represents monitoring feature data related to a risk event, and the data related feature information represents monitoring feature data that are associated with one another.
In this scheme, monitoring outlier data are acquired from the box plot of the monitoring deduplication data and the box plot threshold information, and are removed from the monitoring deduplication data to obtain monitoring correction data, which ensures the reliability and accuracy of the data. Monitoring abnormal data are acquired from the monitoring correction data and the standard monitoring data, and monitoring feature data are acquired from the data association information and the monitoring abnormal data, which facilitates the later risk assessment.
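The outlier removal described here can be sketched with the usual box plot inner fences. The sketch assumes the thresholds are the lower quartile minus k times the interquartile range and the upper quartile plus k times the interquartile range, with k = 1.5 as a conventional default; the quartile method, the default, and the function names are assumptions for illustration.

```python
from statistics import quantiles


def box_plot_thresholds(values, k=1.5):
    """Return (lower, upper) outlier limits from the quartiles and coefficient k."""
    lower_q, _median, upper_q = quantiles(values, n=4, method="inclusive")
    iqr = upper_q - lower_q
    return lower_q - k * iqr, upper_q + k * iqr


def remove_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the box plot thresholds."""
    low, high = box_plot_thresholds(values, k)
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers


if __name__ == "__main__":
    # Hypothetical deduplicated readings with one obvious outlier.
    readings = [10, 11, 12, 13, 14, 15, 100]
    print(box_plot_thresholds(readings))  # (7.0, 19.0)
    print(remove_outliers(readings))      # ([10, 11, 12, 13, 14, 15], [100])
```

A larger k widens the accepted range and flags fewer points, so the coefficient would be chosen according to how strict the abnormality detection needs to be.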
Acquiring a data risk assessment index based on a monitoring data risk assessment model according to the monitoring characteristic data;
acquiring a data risk assessment index threshold based on the data monitoring requirement;
and carrying out early warning on the data risk according to the data risk assessment index and the data risk assessment index threshold value.
Specifically, according to the data risk assessment index and the data risk assessment index threshold, the data risk is pre-warned, and the method specifically comprises the following steps:
Acquiring a data risk assessment index according to the monitoring characteristic data;
acquiring a data risk assessment index threshold based on the data monitoring requirement;
Judging whether the data risk assessment index exceeds the data risk assessment index threshold; if not, recording the data risk assessment index and the monitoring characteristic data; if so, outputting and displaying data risk early warning information;
the calculation formula of the data risk assessment index is as follows:
wherein R is the data risk assessment index; the remaining quantities in the formula are the influence coefficient of the s-th characteristic data on the j-th risk event, the weight of the s-th characteristic data, and the impact index of all characteristic data associated with the s-th characteristic data on the j-th risk event; h is the total number of characteristic data, and m is the total number of data risk event types.
In this scheme, the data risk assessment index is obtained from the monitoring characteristic data, and the data risk assessment index threshold is obtained based on the data monitoring requirement. Comparing the index with the threshold determines whether the threshold is exceeded, so that an early warning can be issued in time.
It will be appreciated that the risk-related characteristic data are also associated with one another, and that different associations affect the risk differently; when certain characteristic data occur together, the risk increases exponentially.
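The threshold decision described above can be sketched in Python. The formula for R is not reproduced in the text, so the form used here, a weighted sum over characteristic data and risk event types in which the impact index enters through an exponential factor, is purely an assumption chosen to mirror the remark that associated characteristic data raise the risk exponentially. The function names, argument layout and message text are likewise hypothetical.

```python
import math


def risk_index(weights, influence, impact):
    """Assumed form: R = sum over s, j of w[s] * a[s][j] * exp(b[s][j]).

    weights[s]      weight of the s-th characteristic data
    influence[s][j] influence coefficient of the s-th data on the j-th risk event
    impact[s][j]    impact index of the data associated with the s-th data
    """
    total = 0.0
    for s, w in enumerate(weights):
        for j, a in enumerate(influence[s]):
            total += w * a * math.exp(impact[s][j])
    return total


def early_warning(index, threshold, features, records):
    """Record the index and features if within the threshold, else return a warning."""
    if index > threshold:
        return f"data risk early warning: index {index:.3f} exceeds threshold {threshold:.3f}"
    records.append((index, features))
    return None


weights = [0.5, 0.5]
influence = [[0.2, 0.4], [0.1, 0.3]]
no_association = [[0.0, 0.0], [0.0, 0.0]]
with_association = [[1.0, 0.0], [0.0, 0.0]]

r_low = risk_index(weights, influence, no_association)
r_high = risk_index(weights, influence, with_association)
records = []
quiet = early_warning(r_low, 1.0, {"source": "example"}, records)
alert = early_warning(2.0, 1.0, {"source": "example"}, records)
```

With all impact indices at zero the exponential factor is 1 and the index reduces to a plain weighted sum, which is why this form was picked for the illustration.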
Referring to FIG. 5, and in combination with the monitoring method based on multi-source data integration and data mining described above, a monitoring system based on multi-source data integration and data mining is further provided, comprising:
The main control module is used for carrying out data format conversion on the multi-source monitoring data according to the monitoring data standard format information and the multi-source monitoring data format information to obtain multi-source monitoring standard format data, obtaining a data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information, screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold, obtaining a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data, obtaining monitoring characteristic data according to the data association information and the monitoring data set, obtaining a data risk assessment index according to the monitoring characteristic data, and carrying out early warning on data risk according to the data risk assessment index and the data risk assessment index threshold;
The information acquisition module is used for acquiring monitoring information, monitoring object information and monitoring environment information, acquiring multi-source monitoring data format information according to multi-source monitoring data, acquiring a monitoring historical data set, monitoring historical data and monitoring data source information, and acquiring monitoring historical warning information and monitoring historical warning data according to the monitoring historical data set;
The data mining module is used for acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data, acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set, acquiring association rule information according to the frequent item set information, acquiring association rule support degree information and association rule confidence degree information according to the association rule information, and screening association rules according to the association rule support degree information, the association rule confidence degree information, an association rule support degree threshold value and an association rule confidence degree threshold value to acquire data association information;
The display module interacts with the main control module and is used for displaying the data risk assessment index and outputting data risk early warning information.
The main control module specifically comprises:
The control unit is used for acquiring monitoring characteristic data according to the data association information and the monitoring data set, acquiring a data risk assessment index according to the monitoring characteristic data, and carrying out early warning on the data risk according to the data risk assessment index and a data risk assessment index threshold;
The information receiving unit interacts with the information acquisition module and the data mining module, and is used for receiving data and transmitting the data to the data processing unit;
The data processing unit is used for carrying out data format conversion on the multi-source monitoring data according to the monitoring data standard format information and the multi-source monitoring data format information to obtain multi-source monitoring standard format data, obtaining a data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information, screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold, and obtaining a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data.
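The attribute mapping and Z-score standardization handled by the data processing unit can be sketched as follows. This is a minimal illustration, assuming that records are dictionaries of numeric attributes, that a lookup table renames each source's attribute names to the shared public attribute names, and that the population standard deviation is used; the attribute names and function names are hypothetical.

```python
import statistics


def map_attributes(record, attribute_map):
    """Rename source-specific attribute names to the shared (public) attribute names."""
    return {attribute_map[k]: v for k, v in record.items() if k in attribute_map}


def zscore_standardize(rows):
    """Standardize each attribute as z = (x - mean) / standard deviation."""
    columns = list(rows[0].keys())
    stats = {}
    for c in columns:
        col = [r[c] for r in rows]
        stats[c] = (statistics.fmean(col), statistics.pstdev(col))
    return [
        # A constant column has zero deviation; map it to 0.0 to avoid dividing by zero.
        {c: (r[c] - stats[c][0]) / stats[c][1] if stats[c][1] else 0.0 for c in columns}
        for r in rows
    ]


source_a = [{"temp_c": 10.0, "site": 1.0}, {"temp_c": 20.0, "site": 1.0}]
source_b = [{"temperature": 30.0, "site_id": 1.0}]
map_a = {"temp_c": "temperature", "site": "site_id"}
map_b = {"temperature": "temperature", "site_id": "site_id"}

mapped = [map_attributes(r, map_a) for r in source_a]
mapped += [map_attributes(r, map_b) for r in source_b]
dataset = zscore_standardize(mapped)
```

After standardization every attribute is expressed in units of its own standard deviation, so attributes from different sources with different scales become directly comparable.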
The information acquisition module specifically comprises:
The first acquisition unit is used for acquiring monitoring information, monitoring object information and monitoring environment information and acquiring multi-source monitoring data format information according to multi-source monitoring data;
The second acquisition unit is used for acquiring a monitoring historical data set, monitoring historical data and monitoring data source information, and acquiring monitoring historical warning information and monitoring historical warning data according to the monitoring historical data set.
The data mining module specifically comprises:
The transaction set unit is used for acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data, and acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set;
The association rule unit is used for acquiring association rule information according to the frequent item set information, acquiring association rule support degree information and association rule confidence degree information according to the association rule information, screening the association rule according to the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold value and the association rule confidence degree threshold value, and acquiring data association information.
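The support and confidence screening performed by the data mining module can be sketched in Python. The text only refers to "a frequent item set generation algorithm" without naming one, so this sketch simply enumerates candidate item sets by brute force rather than implementing Apriori or any other specific algorithm; the warning names, the `max_size` limit and the function names are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the item set."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence, max_size=3):
    """Return (antecedent, consequent, support, confidence) for rules passing both thresholds."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, max_size + 1):
        for candidate in combinations(items, size):
            sup = support(candidate, transactions)
            if sup == 0 or sup < min_support:
                continue  # not a frequent item set
            for k in range(1, size):
                for antecedent in combinations(candidate, k):
                    consequent = tuple(i for i in candidate if i not in antecedent)
                    # confidence = P(consequent | antecedent)
                    conf = sup / support(antecedent, transactions)
                    if conf >= min_confidence:
                        rules.append((antecedent, consequent, sup, conf))
    return rules


warnings = [
    {"overheat", "vibration"},
    {"overheat", "vibration", "pressure"},
    {"overheat"},
    {"pressure"},
]
rules = mine_rules(warnings, min_support=0.5, min_confidence=0.8, max_size=2)
```

In this example the rule from "vibration" to "overheat" passes both thresholds, while the reverse rule is dropped because its confidence is only two thirds.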
Still further, the present solution also proposes a computer-readable storage medium storing a computer-readable program which, when invoked, performs the monitoring method described above.
It is understood that the computer-readable storage medium may be a magnetic medium, such as a floppy disk, hard disk or magnetic tape; an optical medium, such as a DVD; or a semiconductor medium, such as a solid state disk (SSD).
In summary, the invention has the advantages that:
According to the invention, multi-source monitoring data is integrated through data format conversion, data mapping and data standardization. The data rule correlation index is used to evaluate how much the data from different data sources differ, which avoids data conflicts caused by differences or contradictions between sources on certain attributes, makes the monitoring result more comprehensive, and improves the accuracy of the data. Association rule information is obtained from the frequent item set information, and the association rules are screened using the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold and the association rule confidence degree threshold, so that the association information among the monitoring data is fully analyzed. The data risk assessment index is obtained from the monitoring characteristic data, so that the risk level of the monitoring characteristic data is accurately assessed and early warnings are issued in time.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which merely illustrate the principles of the invention, and that various changes and modifications may be made without departing from the spirit and scope of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (9)

1. The monitoring method based on multi-source data integration and data mining is characterized by comprising the following steps:
acquiring monitoring information, wherein the monitoring information comprises monitoring object information and monitoring environment information;
Acquiring monitoring data source information based on actual monitoring requirements according to the monitoring information;
Acquiring multi-source monitoring data according to the monitoring data source information;
acquiring multi-source monitoring data format information according to the multi-source monitoring data;
Acquiring standard format information of monitoring data based on data analysis requirements according to the multi-source monitoring data format information;
according to the standard format information of the monitoring data and the format information of the multi-source monitoring data, carrying out data format conversion on the multi-source monitoring data to obtain the standard format data of the multi-source monitoring;
acquiring attribute information of the multi-source monitoring standard format data according to the multi-source monitoring standard format data;
Acquiring data attribute rule information and data public attribute information based on data integration requirements according to the multi-source monitoring standard format data attribute information;
acquiring a data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information;
acquiring a data rule correlation index threshold based on data analysis requirements;
Screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold;
According to the multi-source monitoring standard format data attribute information and the data public attribute information, performing data mapping on the screened multi-source monitoring standard format data to obtain multi-source monitoring attribute mapping data;
acquiring a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data;
The calculation formula of the data rule correlation index is as follows:
wherein Q is the data rule correlation index; the remaining quantities in the formula are the influence coefficient of the i-th data attribute on the data and the correlation coefficient between the i-th data attribute and the data attribute rule; and n is the total number of data attributes;
Acquiring a monitoring historical data set, wherein the monitoring historical data set comprises monitoring historical data and monitoring data source information;
acquiring monitoring history warning information and monitoring history warning data according to the monitoring history data set;
Mining according to the monitoring history warning information and the monitoring history warning data based on the data association rule to acquire data association information;
acquiring monitoring characteristic data according to the data association information and the monitoring data set;
Acquiring a data risk assessment index based on a monitoring data risk assessment model according to the monitoring characteristic data;
acquiring a data risk assessment index threshold based on the data monitoring requirement;
and carrying out early warning on the data risk according to the data risk assessment index and the data risk assessment index threshold value.
2. The method for monitoring based on multi-source data integration and data mining according to claim 1, wherein the acquiring data association information based on data association rule mining according to the monitoring history warning information and the monitoring history warning data specifically comprises:
acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data;
Acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set;
Acquiring association rule information according to the frequent item set information, wherein each association rule consists of an antecedent frequent item set and a consequent frequent item set;
Acquiring association rule support degree information and association rule confidence degree information according to the association rule information, wherein the association rule support degree is the frequency with which the association rule occurs in the monitoring history warning transaction set, namely the probability that the antecedent frequent item set and the consequent frequent item set of the association rule occur together, and the association rule confidence degree is the probability that the consequent frequent item set is satisfied given that the antecedent frequent item set is satisfied;
acquiring an association rule support threshold and an association rule confidence threshold based on the data association rule mining requirement;
and screening the association rule according to the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold and the association rule confidence degree threshold to acquire data association information.
3. The monitoring method based on multi-source data integration and data mining according to claim 1, wherein the acquiring monitoring characteristic data according to the data association information and the monitoring data set specifically comprises:
acquiring monitoring deduplication data based on a hash deduplication method according to the monitoring data set;
acquiring a monitoring deduplication data box plot according to the monitoring deduplication data;
acquiring monitoring deduplication data box plot data according to the monitoring deduplication data box plot, wherein the monitoring deduplication data box plot data comprises the minimum value, lower quartile, median, upper quartile and maximum value of the monitoring deduplication data box plot;
acquiring a box plot inner limit coefficient based on the data abnormality detection requirement;
acquiring box plot threshold information according to the box plot inner limit coefficient and the monitoring deduplication data box plot data;
acquiring monitoring outlier data according to the monitoring deduplication data box plot and the box plot threshold information;
removing the outliers from the monitoring deduplication data according to the monitoring outlier data to obtain monitoring correction data;
acquiring standard monitoring data, wherein the standard monitoring data is monitoring data recorded in a normal, standard state;
acquiring monitoring abnormal data according to the monitoring correction data and the standard monitoring data;
acquiring monitoring characteristic data according to the data association information and the monitoring abnormal data, wherein the monitoring characteristic data comprises event monitoring characteristic data and data-related characteristic information;
wherein the event monitoring characteristic data is monitoring characteristic data related to a risk event, and the data-related characteristic information is monitoring characteristic data that are associated with one another.
4. The method for monitoring based on multi-source data integration and data mining according to claim 1, wherein the early warning of data risk is performed according to the data risk assessment index and the data risk assessment index threshold, specifically comprising:
Acquiring a data risk assessment index according to the monitoring characteristic data;
acquiring a data risk assessment index threshold based on the data monitoring requirement;
judging whether the data risk assessment index exceeds the data risk assessment index threshold; if not, recording the data risk assessment index and the monitoring characteristic data; if so, outputting and displaying data risk early warning information;
the calculation formula of the data risk assessment index is as follows:
wherein R is the data risk assessment index; the remaining quantities in the formula are the influence coefficient of the s-th characteristic data on the j-th risk event, the weight of the s-th characteristic data, and the impact index of all characteristic data associated with the s-th characteristic data on the j-th risk event; h is the total number of characteristic data, and m is the total number of data risk event types.
5. A monitoring system based on multi-source data integration and data mining for implementing the monitoring method according to any one of claims 1 to 4, comprising:
the main control module is used for carrying out data format conversion on the multi-source monitoring data according to the monitoring data standard format information and the multi-source monitoring data format information to obtain multi-source monitoring standard format data, obtaining a data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information, screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold, obtaining a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data, obtaining monitoring characteristic data according to the data association information and the monitoring data set, obtaining a data risk assessment index according to the monitoring characteristic data, and carrying out early warning on data risk according to the data risk assessment index and the data risk assessment index threshold;
The information acquisition module is used for acquiring monitoring information, monitoring object information and monitoring environment information, acquiring multi-source monitoring data format information according to multi-source monitoring data, acquiring a monitoring historical data set, monitoring historical data and monitoring data source information, and acquiring monitoring historical warning information and monitoring historical warning data according to the monitoring historical data set;
The data mining module is used for acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data, acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set, acquiring association rule information according to the frequent item set information, acquiring association rule support degree information and association rule confidence degree information according to the association rule information, and screening association rules according to the association rule support degree information, the association rule confidence degree information, an association rule support degree threshold value and an association rule confidence degree threshold value to acquire data association information;
and the display module interacts with the main control module and is used for displaying the data risk assessment index and outputting data risk early warning information.
6. The multi-source data integration and data mining based monitoring system of claim 5, wherein the main control module specifically comprises:
the control unit is used for acquiring monitoring characteristic data according to the data association information and the monitoring data set, acquiring a data risk assessment index according to the monitoring characteristic data, and carrying out early warning on the data risk according to the data risk assessment index and a data risk assessment index threshold;
the information receiving unit interacts with the information acquisition module and the data mining module, and is used for receiving data and transmitting the data to the data processing unit;
the data processing unit is used for carrying out data format conversion on the multi-source monitoring data according to the monitoring data standard format information and the multi-source monitoring data format information to obtain multi-source monitoring standard format data, obtaining a data rule correlation index according to the data attribute rule information and the multi-source monitoring standard format data attribute information, screening the multi-source monitoring standard format data according to the data rule correlation index and the data rule correlation index threshold, and obtaining a monitoring data set based on Z-score standardization according to the multi-source monitoring attribute mapping data.
7. The multi-source data integration and data mining based monitoring system according to claim 5, wherein the information acquisition module specifically comprises:
The first acquisition unit is used for acquiring monitoring information, monitoring object information and monitoring environment information and acquiring multi-source monitoring data format information according to multi-source monitoring data;
the second acquisition unit is used for acquiring a monitoring historical data set, monitoring historical data and monitoring data source information, and acquiring monitoring historical warning information and monitoring historical warning data according to the monitoring historical data set.
8. The multi-source data integration and data mining based monitoring system of claim 5, wherein the data mining module specifically comprises:
The transaction set unit is used for acquiring a monitoring history warning transaction set according to the monitoring history warning information and the monitoring history warning data, and acquiring frequent item set information based on a frequent item set generation algorithm according to the monitoring history warning transaction set;
The association rule unit is used for acquiring association rule information according to the frequent item set information, acquiring association rule support degree information and association rule confidence degree information according to the association rule information, screening the association rule according to the association rule support degree information, the association rule confidence degree information, the association rule support degree threshold value and the association rule confidence degree threshold value, and acquiring data association information.
9. A computer-readable storage medium, on which a computer-readable program is stored, characterized in that the computer-readable program, when called, performs the monitoring method according to any one of claims 1-4.
CN202410245153.6A 2024-03-05 2024-03-05 Monitoring method, system and storage medium based on multi-source data integration and data mining Active CN117827937B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410245153.6A CN117827937B (en) 2024-03-05 2024-03-05 Monitoring method, system and storage medium based on multi-source data integration and data mining

Publications (2)

Publication Number Publication Date
CN117827937A CN117827937A (en) 2024-04-05
CN117827937B true CN117827937B (en) 2024-05-24

Family

ID=90523230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410245153.6A Active CN117827937B (en) 2024-03-05 2024-03-05 Monitoring method, system and storage medium based on multi-source data integration and data mining

Country Status (1)

Country Link
CN (1) CN117827937B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639237A (en) * 2020-04-07 2020-09-08 安徽理工大学 Electric power communication network risk assessment system based on clustering and association rule mining
CN114330598A (en) * 2022-01-21 2022-04-12 武汉东湖大数据交易中心股份有限公司 Multi-source heterogeneous data fusion method and system based on fuzzy C-means clustering algorithm
CN115935292A (en) * 2022-12-23 2023-04-07 西南交通大学 Method for constructing full-life-cycle multi-source heterogeneous data fusion mode of complex equipment
CN116090819A (en) * 2022-12-27 2023-05-09 贵州电网有限责任公司 Power distribution network risk situation prediction method based on association rule
CN116185817A (en) * 2022-11-30 2023-05-30 北京航空航天大学 Screening method and system for software defect prediction rules
CN116910824A (en) * 2023-08-28 2023-10-20 广东中山网传媒信息科技有限公司 Safety big data analysis method and system based on distributed multi-source measure
CN117253614A (en) * 2023-11-14 2023-12-19 天津医科大学朱宪彝纪念医院(天津医科大学代谢病医院、天津代谢病防治中心) Diabetes risk early warning method based on big data analysis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11755937B2 (en) * 2018-08-24 2023-09-12 General Electric Company Multi-source modeling with legacy data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
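Research and application of risk assessment for relay protection communication circuits based on Apriori-BPNN analysis; 吕顺利; 施健; 缪巍巍; 吴海洋; 陆涛; Computer and Digital Engineering; 2017-04-20 (04); full text *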

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant