CN117312255B

CN117312255B - Electronic document splitting optimization management method and system

Info

Publication number: CN117312255B
Application number: CN202311605670.1A
Authority: CN
Inventors: 李洪波; 石文博; 米杰; 毛伟
Original assignee: Hunan Zhongsi Information Technology Co ltd
Current assignee: Hunan Zhongsi Information Technology Co ltd
Priority date: 2023-11-29
Filing date: 2023-11-29
Publication date: 2024-02-20
Anticipated expiration: 2043-11-29
Also published as: CN117312255A

Abstract

The invention relates to the technical field of data processing, and in particular to an electronic document splitting optimization management method and system. The method comprises: collecting, in real time, actual water use data of a factory at different moments within a preset time period, and collecting, from historical water use records, historical water use data at different moments within a preset time period; screening the water use data to be analyzed according to the distribution of the data within the neighborhood of each data point, to determine initial segmentation points; obtaining optimal segmentation points according to the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting it with the initial segmentation points of the actual water use data; and performing segmented compression on the water use data to be analyzed using the optimal segmentation points, and storing the compressed data to obtain the split water use electronic document data. The invention achieves a better segmented compression of the water use data to be analyzed.

Description

Electronic document splitting optimization management method and system

Technical Field

The invention relates to the technical field of data processing, in particular to an electronic document splitting optimization management method and system.

Background

Factory water use data are real-time records of a factory's water consumption at each moment of the day. To monitor water use in the factory in real time, the water consumption collected in real time must be stored and uploaded to a management system. However, factory water use data fluctuate frequently and within a small range, so the data collected each day contain many repeated values and are highly redundant when compressed and stored. Existing methods split and compress the data collected in real time directly, considering only runs of consecutive repeated values in the collected data and ignoring the redundancy between the data collected in real time and the historical water use data. As a result, existing methods for splitting and compressing water use data perform poorly.

Disclosure of Invention

In order to solve the technical problem that existing methods for splitting and compressing water use data perform poorly, the invention provides an electronic document splitting optimization management method, which adopts the following technical solution:

collecting, in real time, actual water use data of a factory at different moments within a preset time period, and collecting, from historical water use records, historical water use data at different moments within a preset time period; the actual water use data and the historical water use data are the water use data to be analyzed;

obtaining, for each water use data point to be analyzed, a probability index of that point being a data segmentation point according to the distribution of the water use data to be analyzed within the neighborhood of that point; screening the water use data to be analyzed according to the probability index to determine initial segmentation points;

obtaining optimal segmentation points according to the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting the water use data to be analyzed with the initial segmentation points of the actual water use data;

and performing segmented compression on the water use data to be analyzed using the optimal segmentation points, and storing the compressed data to obtain the split water use electronic document data.

Preferably, obtaining the optimal segmentation points according to the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting it with the initial segmentation points of the actual water use data specifically comprises:

obtaining an effect evaluation index of the initial segmentation points according to the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting it with the initial segmentation points of the actual water use data;

and screening the initial segmentation points according to the effect evaluation index to determine the optimal segmentation points.

Preferably, obtaining the effect evaluation index of the initial segmentation points specifically comprises:

recording any initial segmentation point in the actual water use data as a first target segmentation point, and recording the next initial segmentation point adjacent to the first target segmentation point as a second target segmentation point; recording the initial segmentation points in the historical water use data that have the same ordinal positions as the first and second target segmentation points have in the actual water use data as a first matching segmentation point and a second matching segmentation point, respectively;

taking the actual water use data between the first target segmentation point and the second target segmentation point to form a target actual data sequence, and taking the historical water use data between the first target segmentation point and the second target segmentation point to form a target historical data sequence; taking the actual water use data between the first matching segmentation point and the second matching segmentation point to form a matching actual data sequence, and taking the historical water use data between the first matching segmentation point and the second matching segmentation point to form a matching historical data sequence;

and obtaining the effect evaluation index of the second target segmentation point and the second matching segmentation point according to the data distributions in the target actual data sequence, the target historical data sequence, the matching actual data sequence and the matching historical data sequence.

Preferably, the calculation formula of the effect evaluation index is specifically:

$$G_{r+1} = \frac{\sum_{x=1}^{N^{ma}} n_x^{ma}\, p_x^{ma} + \sum_{x=1}^{N^{mh}} n_x^{mh}\, p_x^{mh}}{\sum_{x=1}^{N^{ta}} n_x^{ta}\, p_x^{ta} + \sum_{x=1}^{N^{th}} n_x^{th}\, p_x^{th}}$$

where $G_{r+1}$ is the effect evaluation index of the second target segmentation point and the second matching segmentation point, and $r+1$ denotes the $(r+1)$-th initial segmentation point; the superscript $ma$ refers to the matching actual data sequence, $N^{ma}$ is the number of distinct data values it contains, $n_x^{ma}$ is the number of occurrences of the $x$-th value in it, and $p_x^{ma}$ is the frequency of occurrence of the $x$-th value in it; the superscript $mh$ refers to the matching historical data sequence, $N^{mh}$ is the number of distinct data values it contains, $n_x^{mh}$ is the number of occurrences of the $x$-th value in it, and $p_x^{mh}$ is the frequency of occurrence of the $x$-th value in it; the superscript $ta$ refers to the target actual data sequence, $N^{ta}$ is the number of distinct data values it contains, $n_x^{ta}$ is the number of occurrences of the $x$-th value in it, and $p_x^{ta}$ is the frequency of occurrence of the $x$-th value in it; the superscript $th$ refers to the target historical data sequence, $N^{th}$ is the number of distinct data values it contains, $n_x^{th}$ is the number of occurrences of the $x$-th value in it, and $p_x^{th}$ is the frequency of occurrence of the $x$-th value in it.

Preferably, screening the initial segmentation points according to the effect evaluation index to determine the optimal segmentation points specifically comprises:

when the effect evaluation index of the second target segmentation point and the second matching segmentation point is greater than a preset value, taking the second matching segmentation point as the optimal segmentation point in the water use data to be analyzed; and when the effect evaluation index of the second target segmentation point and the second matching segmentation point is less than the preset value, taking the second target segmentation point as the optimal segmentation point in the water use data to be analyzed.

Preferably, obtaining the probability index of each water use data point to be analyzed being a data segmentation point according to the distribution of the water use data to be analyzed within the neighborhood of that point specifically comprises:

recording any water use data point to be analyzed as the selected water use data, and forming a left neighborhood data sequence of the selected water use data from a preset number of adjacent water use data points immediately before it; forming a right neighborhood data sequence of the selected water use data from the preset number of adjacent water use data points immediately after it;

and obtaining the probability index of the selected water use data being a data segmentation point according to the matching relationship between the left neighborhood data sequence and the right neighborhood data sequence.

Preferably, obtaining the probability index of the selected water use data being a data segmentation point according to the matching relationship between the left neighborhood data sequence and the right neighborhood data sequence specifically comprises:

counting the data in the left neighborhood data sequence whose values also appear in the right neighborhood data sequence and recording the count as a first number, and counting the data in the right neighborhood data sequence whose values also appear in the left neighborhood data sequence and recording the count as a second number; and obtaining the probability index of the selected water use data being a data segmentation point according to the maximum of the first number and the second number, the probability index being negatively correlated with that maximum.

Preferably, screening the water use data to be analyzed according to the probability index to determine the initial segmentation points specifically comprises:

recording each water use data point to be analyzed whose probability index is greater than or equal to a preset probability threshold as an initial segmentation point.

Preferably, performing segmented compression on the water use data to be analyzed using the optimal segmentation points specifically comprises:

segmenting the actual water use data and the historical water use data in the water use data to be analyzed separately using the optimal segmentation points, and compressing the segmented data using the Huffman coding algorithm.

The invention also provides an electronic document splitting optimization management system, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the electronic document splitting optimization management method.

The embodiment of the invention has at least the following beneficial effects:

The invention first collects both actual water use data and historical water use data so that the two data sets can later be analyzed jointly, which improves the data processing effect. It then analyzes the distribution of the water use data to be analyzed within the neighborhood of each data point to obtain, for each data point, a probability index reflecting how likely that point is to be a data segmentation point, and screens the data accordingly to determine the initial segmentation points; an initial segmentation point reflects only the data distribution within the neighborhood of the corresponding data point. Further, optimal segmentation points are obtained from the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting it with the initial segmentation points of the actual water use data; that is, the segmentation produced by the initial segmentation points is analyzed in both data sets, so that the segmentation points that segment both the historical and the actual water use data well are selected. Finally, the water use data to be analyzed are compressed segment by segment using the optimal segmentation points, and the resulting water use electronic document data facilitate management of the factory's water use data.

Drawings

In order to illustrate the embodiments of the invention or the technical solutions and advantages of the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can derive other drawings from them without inventive effort.

Fig. 1 is a flowchart of a method for electronic document splitting optimization management according to an embodiment of the present invention.

Detailed Description

To further explain the technical means adopted by the invention to achieve its intended purpose and their effects, the specific implementation, structure, features and effects of the electronic document splitting optimization management method and system proposed by the invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, particular features, structures or characteristics of one or more embodiments may be combined in any suitable manner.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The invention provides a specific scheme of an electronic document splitting optimization management method and a system, which are specifically described below with reference to the accompanying drawings.

An embodiment of an electronic document splitting optimization management method comprises the following steps:

Referring to Fig. 1, which shows a flowchart of an electronic document splitting optimization management method according to an embodiment of the present invention, the method includes the following steps:

Step one: collect, in real time, actual water use data of the factory at different moments within a preset time period, and collect, from historical water use records, historical water use data at different moments within a preset time period; the actual water use data and the historical water use data are the water use data to be analyzed.

First, the actual water use data of the factory at different moments within the preset time period, and the historical water use data of the factory at different moments within a preset time period, are collected from the factory's management system. In this embodiment, the length of the preset time period is one day, that is, 24 hours, and the interval between two adjacent moments is 10 minutes; the implementer may set these values according to the specific implementation scenario.

Specifically, the water use data collected for the factory each day must be split and stored. Because historical daily water use is similar to the water use collected in real time, and its fluctuations are also similar, water use data covering the same length of time and the same moments are collected from the historical records at the same time. That is, in this embodiment the actual water use data at each moment of the current day are collected in real time, and the historical water use data at each moment of the previous day are collected from the historical records. To simplify the subsequent analysis, both the actual water use data and the historical water use data are treated as the water use data to be analyzed; the water use data to be analyzed in this embodiment therefore comprise the water use data of two preset time periods.

Step two: obtain, for each water use data point to be analyzed, a probability index of that point being a data segmentation point according to the distribution of the water use data to be analyzed within the neighborhood of that point; screen the water use data to be analyzed according to the probability index to determine the initial segmentation points.

Because the water use data in the factory fluctuate frequently and with similar amplitudes, many of the values collected in real time also occur in the historical data; the historical data and the data collected in real time are therefore split and compressed together. Although the historical water use data and the actual water use data contain the same number of data points, their data segmentation points may differ. Splitting both data sets at the same segmentation points can then give low overall compression efficiency and a poor split-compression result, so the segmentation points must be analyzed to determine those that give the best split-compression effect.

First, the water use data to be analyzed are analyzed to determine where the data segmentation points lie; that is, the probability index of each water use data point being a data segmentation point is obtained from the distribution of the data within its neighborhood. The actual water use data and the historical water use data each have their own data segmentation points within their respective preset time periods; in this embodiment, the procedure is described for an arbitrary water use data point.

Specifically, any water use data point to be analyzed is recorded as the selected water use data. The preset number of adjacent water use data points immediately before the selected water use data form its left neighborhood data sequence, and the preset number of adjacent water use data points immediately after it form its right neighborhood data sequence.

In this embodiment, the preset number is 20. Taking the $i$-th of all the actual water use data as the selected water use data, the $i$-th actual water use data can be denoted $x_i$. The left neighborhood data sequence of the selected water use data, that is, of the $i$-th actual water use data, can be expressed as $\{x_{i-20}, x_{i-19}, \ldots, x_{i-1}\}$, where $x_{i-20}$ is the $(i-20)$-th and $x_{i-1}$ is the $(i-1)$-th of all the actual water use data. The right neighborhood data sequence of the selected water use data can be expressed as $\{x_{i+1}, x_{i+2}, \ldots, x_{i+20}\}$, where $x_{i+1}$ is the $(i+1)$-th and $x_{i+20}$ is the $(i+20)$-th of all the actual water use data.

When fewer than the preset number of data points exist on the left or on the right of the selected water use data, that point is not analyzed as a candidate segmentation point. In other words, the analysis starts at the 21st data point of the actual water use data and ends at the $(n-20)$-th, where $n$ is the total number of actual water use data points.
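The neighborhood construction just described can be sketched in Python as follows. This is a minimal illustration, not part of the original disclosure: the function name `neighborhoods`, the 0-based indexing and the parameter `k` (the preset number, 20 in this embodiment) are assumptions made for the example.

```python
def neighborhoods(data, i, k=20):
    """Return (left, right) neighborhood sequences of data[i], or None at the edges.

    `i` is a 0-based index (an assumption of this sketch); the text counts
    from 1, so its "21st data point" corresponds to i == 20 here.
    """
    if i < k or i + k >= len(data):
        return None  # fewer than k neighbors on one side: point is not analyzed
    left = data[i - k:i]            # the k points immediately before data[i]
    right = data[i + 1:i + k + 1]   # the k points immediately after data[i]
    return left, right
```

With 144 samples per day (one every 10 minutes) and `k = 20`, only indices 20 through 123 would be analyzed under these assumptions.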

The probability index of the selected water use data being a data segmentation point is then obtained from the matching relationship between the left neighborhood data sequence and the right neighborhood data sequence. The number of data in the left neighborhood data sequence whose values also appear in the right neighborhood data sequence is recorded as the first number, and the number of data in the right neighborhood data sequence whose values also appear in the left neighborhood data sequence is recorded as the second number.

For example, suppose the selected water use data has the value 1, the preset number is 5, the left neighborhood data sequence is {1,4,5,3,1}, and the right neighborhood data sequence is {1,1,1,1,4}. The data in the left neighborhood data sequence whose values also appear in the right neighborhood data sequence are 1, 4 and 1, so the first number is 3. The data in the right neighborhood data sequence whose values also appear in the left neighborhood data sequence are 1, 1, 1, 1 and 4, so the second number is 5.
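The counts in this example can be reproduced with a few lines of Python. The helper name `count_matches` is hypothetical and chosen only for this sketch; it counts every element of one sequence whose value occurs anywhere in the other, which is how the example above arrives at 3 and 5.

```python
def count_matches(source, reference):
    """Count the elements of `source` whose value also occurs in `reference`."""
    reference_values = set(reference)
    return sum(1 for value in source if value in reference_values)


left = [1, 4, 5, 3, 1]
right = [1, 1, 1, 1, 4]

first_number = count_matches(left, right)    # matches are 1, 4, 1
second_number = count_matches(right, left)   # matches are 1, 1, 1, 1, 4
print(first_number, second_number)           # prints: 3 5
```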

The probability index of the selected water use data being a data segmentation point is obtained from the maximum of the first number and the second number, and is negatively correlated with that maximum. Taking the $i$-th of all the actual water use data as the selected water use data, the probability index of the $i$-th actual water use data being a data segmentation point can be expressed as:

$$P_i = 1 - \frac{\max(a_i, b_i)}{k}$$

where $P_i$ is the probability index of the $i$-th actual water use data being a data segmentation point, $a_i$ is the first number, $b_i$ is the second number, and $k$ is the number of data contained in either the left or the right neighborhood data sequence.

The first number characterizes how many data in the neighborhood before the selected water use data are repeated in the neighborhood after it, and the second number characterizes how many data in the neighborhood after the selected water use data are repeated in the neighborhood before it. $\max(a_i, b_i)/k$ is the proportion of repeated data for the larger of the two counts. The larger this ratio, the more the data in the neighborhoods on the left and right of the selected water use data repeat one another, and the worse the selected water use data serves as a data segmentation point; that is, the smaller the probability index, the lower the probability that the selected water use data is a data segmentation point.
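As a hedged sketch, the probability index can be computed as below, assuming it takes the form 1 - max(first, second) / k. This is one form consistent with the stated negative correlation and with the normalization by the neighborhood size; the function and parameter names are illustrative only.

```python
def probability_index(first_number, second_number, k):
    """Probability index of a point being a data segmentation point.

    Assumed form: 1 - max(first_number, second_number) / k, where k is the
    number of data in either neighborhood sequence.
    """
    if k <= 0:
        raise ValueError("k must be a positive neighborhood size")
    return 1.0 - max(first_number, second_number) / k


# Every right-neighborhood value reappears on the left (5 of 5), so the
# point is a poor candidate under this assumed form.
print(probability_index(3, 5, 5))   # prints: 0.0
```

Under the same assumption, a point with 4 and 8 matches out of `k = 20` would score about 0.6, which is above the 0.4 threshold used in this embodiment.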

It should be noted that when the Huffman coding algorithm is used to compress the water use data to be analyzed, the compression effect depends on the distribution of the data values: the more frequently a value occurs among the data being compressed, the higher the compression efficiency. Accordingly, when the data in the neighborhoods on the left and right of a water use data point are highly repeated, segmenting at that point is less necessary, and the probability that the point is a data segmentation point is lower.
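The dependence of Huffman compression on the value distribution can be illustrated with a short Python sketch that computes the total encoded length of a sequence in bits. It relies on the standard property that the encoded length equals the sum of the weights of all merged nodes in the Huffman tree. The function name and the convention of one bit per symbol for a single-valued sequence are assumptions of this example.

```python
import heapq
from collections import Counter


def huffman_encoded_bits(sequence):
    """Total number of bits needed to Huffman-encode `sequence`."""
    counts = Counter(sequence)
    if not counts:
        return 0
    if len(counts) == 1:
        return len(sequence)  # assumed convention: 1 bit per symbol
    heap = list(counts.values())
    heapq.heapify(heap)
    total_bits = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)   # two least frequent subtrees
        b = heapq.heappop(heap)
        total_bits += a + b       # each merge adds one bit to every symbol below it
        heapq.heappush(heap, a + b)
    return total_bits


repetitive = [1, 1, 1, 1, 1, 1, 4, 5]
varied = [1, 2, 3, 4, 5, 6, 7, 8]
print(huffman_encoded_bits(repetitive), huffman_encoded_bits(varied))  # prints: 10 24
```

Both sequences have eight values, but the repetitive one needs far fewer bits, which is why segments with highly repeated values are preferred.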

On this basis, the water use data to be analyzed are screened according to the probability index to determine the initial segmentation points. The probability index characterizes the probability that a water use data point is a data segmentation point. When the probability index is greater than or equal to a preset probability threshold, that probability is high, and the corresponding water use data point is recorded as an initial segmentation point. When the probability index is less than the preset probability threshold, that probability is low, and the point is not used as a segmentation point in the subsequent analysis.

In this embodiment, the probability threshold is 0.4; the implementer may set it according to the specific implementation scenario. In this way, all initial segmentation points in the actual water use data and all initial segmentation points in the historical water use data are obtained separately.
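The screening step can be sketched as a simple filter over precomputed probability indices. The function name `select_initial_points` and the use of `None` for points that were not analyzed (too close to either end of the data) are assumptions of this illustration.

```python
def select_initial_points(probability_indices, threshold=0.4):
    """Return positions whose probability index is at least `threshold`.

    `probability_indices[i]` is the index of data point i, or None if the
    point was not analyzed.
    """
    return [
        position
        for position, index in enumerate(probability_indices)
        if index is not None and index >= threshold
    ]


indices = [None, 0.1, 0.4, 0.9, None, 0.39]
print(select_initial_points(indices))   # prints: [2, 3]
```

The same function would be applied once to the actual water use data and once to the historical water use data, giving two separate lists of initial segmentation points.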

Step three: obtain the optimal segmentation points according to the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting the water use data to be analyzed with the initial segmentation points of the actual water use data.

The actual water use data for all moments of a day have their own data segmentation points, and so do the historical water use data for all moments of a day: the initial segmentation points of the actual water use data are obtained from the value distribution of the actual water use data, and those of the historical water use data from the value distribution of the historical water use data. Consequently, using the initial segmentation points of the actual water use data to split the historical water use data as well may give a poor split, and likewise using the initial segmentation points of the historical water use data to split the actual water use data as well may give a poor split. Several different segmentation results must therefore be analyzed together to determine the data segmentation points with the best splitting effect.

On this basis, the effect evaluation index of the initial segmentation points is obtained according to the result of segmenting the water use data to be analyzed with the initial segmentation points of the historical water use data and the result of segmenting it with the initial segmentation points of the actual water use data.

Specifically, any initial segmentation point in the actual water use data is recorded as the first target segmentation point, and the next initial segmentation point adjacent to it is recorded as the second target segmentation point. The initial segmentation points in the historical water use data that have the same ordinal positions as the first and second target segmentation points have in the actual water use data are recorded as the first matching segmentation point and the second matching segmentation point, respectively.

In this embodiment, the $r$-th initial segmentation point in the actual water use data is taken as the first target segmentation point and the $(r+1)$-th as the second target segmentation point; similarly, the $r$-th initial segmentation point in the historical water use data is taken as the first matching segmentation point and the $(r+1)$-th as the second matching segmentation point.

The actual water use data between the first target segmentation point and the second target segmentation point form the target actual data sequence, denoted $S^{ta}$, and the historical water use data between the first target segmentation point and the second target segmentation point form the target historical data sequence, denoted $S^{th}$. The actual water use data between the first matching segmentation point and the second matching segmentation point form the matching actual data sequence, denoted $S^{ma}$, and the historical water use data between the first matching segmentation point and the second matching segmentation point form the matching historical data sequence, denoted $S^{mh}$.
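The four sequences can be sketched with list slicing, as below. The text does not say whether the data "between" two segmentation points include the end points; this example assumes a half-open range that includes the first point and excludes the second. The function name, argument names and dictionary keys are likewise hypothetical.

```python
def build_sequences(actual, history, target_points, matching_points):
    """Slice the actual and historical data at the target and matching points.

    `target_points` and `matching_points` are (first, second) 0-based positions.
    Assumption: each segment is the half-open range [first, second).
    """
    t1, t2 = target_points
    m1, m2 = matching_points
    return {
        "target_actual": actual[t1:t2],
        "target_history": history[t1:t2],
        "matching_actual": actual[m1:m2],
        "matching_history": history[m1:m2],
    }


actual = [5, 5, 5, 6, 6, 7, 7, 7]
history = [5, 5, 6, 6, 6, 6, 7, 7]
sequences = build_sequences(actual, history, (0, 3), (0, 2))
print(sequences["target_actual"], sequences["matching_history"])  # prints: [5, 5, 5] [5, 5]
```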

And obtaining the effect evaluation indexes of the second target segmentation point and the second matching segmentation point according to the target actual data sequence, the target historical data sequence, the matching actual data sequence and the data distribution conditions in the matching historical data sequence. That is, in the present embodiment, the calculation formula of the effect evaluation index of the (r+1) -th initial segment point in the actual water use data and the historical water use data can be expressed as:

$$G_{r+1} = \frac{\sum_{x=1}^{N^{ma}} n_x^{ma}\, p_x^{ma} + \sum_{x=1}^{N^{mh}} n_x^{mh}\, p_x^{mh}}{\sum_{x=1}^{N^{ta}} n_x^{ta}\, p_x^{ta} + \sum_{x=1}^{N^{th}} n_x^{th}\, p_x^{th}}$$

where $G_{r+1}$ is the effect evaluation index of the second target segmentation point and the second matching segmentation point, and $r+1$ denotes the $(r+1)$-th initial segmentation point; the superscript $ma$ refers to the matching actual data sequence, $N^{ma}$ is the number of distinct data values it contains, $n_x^{ma}$ is the number of occurrences of the $x$-th value in it, and $p_x^{ma}$ is the frequency of occurrence of the $x$-th value in it; the superscript $mh$ refers to the matching historical data sequence, $N^{mh}$ is the number of distinct data values it contains, $n_x^{mh}$ is the number of occurrences of the $x$-th value in it, and $p_x^{mh}$ is the frequency of occurrence of the $x$-th value in it; the superscript $ta$ refers to the target actual data sequence, $N^{ta}$ is the number of distinct data values it contains, $n_x^{ta}$ is the number of occurrences of the $x$-th value in it, and $p_x^{ta}$ is the frequency of occurrence of the $x$-th value in it; the superscript $th$ refers to the target historical data sequence, $N^{th}$ is the number of distinct data values it contains, $n_x^{th}$ is the number of occurrences of the $x$-th value in it, and $p_x^{th}$ is the frequency of occurrence of the $x$-th value in it.

In each data sequence, the frequency of occurrence of each distinct value reflects how repetitive the water use data in that sequence are. Taking the matching actual data sequence as an example, the frequency of occurrence of the x-th value reflects the probability that this value repeats within the sequence. Multiplying this frequency by the number of occurrences of the value, used as a coefficient, gives the repetition degree of the x-th value in the matching actual data sequence. The larger this product, the higher the repetition degree of the data under the current division of the matching actual data sequence, and the better the effect of splitting and compressing with that division result.

Similarly, by the same analysis, the numerator of the formula characterizes the repetition degree in the data sequences obtained after the historical water use data and the actual water use data are each split using the r-th and (r+1)-th initial segmentation points in the historical water use data. The higher this repetition degree, the better the splitting effect of the r-th and (r+1)-th initial segmentation points in the historical water use data on the water data to be analyzed.

The denominator of the formula characterizes the repetition degree in the data sequences obtained after the historical water use data and the actual water use data are each split using the r-th and (r+1)-th initial segmentation points in the actual water use data. The higher this repetition degree, the better the splitting effect of the r-th and (r+1)-th initial segmentation points in the actual water use data on the water data to be analyzed.
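As a rough illustration of the index discussed above, the following Python sketch computes a per-sequence repetition score as the sum, over the distinct values, of (number of occurrences × frequency of occurrence), and then forms the numerator-to-denominator ratio. It assumes that the two scores in the numerator, and likewise the two in the denominator, are added together, and that "frequency" means the count divided by the sequence length. The function names `repetition_score` and `effect_index` are illustrative and do not come from the source.

```python
from collections import Counter


def repetition_score(sequence):
    """Sum over distinct values of count * frequency (frequency = count / length)."""
    if not sequence:
        return 0.0
    counts = Counter(sequence)
    total = len(sequence)
    return sum(n * (n / total) for n in counts.values())


def effect_index(match_actual, match_hist, target_actual, target_hist):
    """Ratio of the repetition obtained with the matching (historical-side) split
    to the repetition obtained with the target (actual-side) split.

    Assumption: the per-sequence scores are combined by addition.
    """
    numerator = repetition_score(match_actual) + repetition_score(match_hist)
    denominator = repetition_score(target_actual) + repetition_score(target_hist)
    if denominator == 0:
        raise ValueError("target sequences must not both be empty")
    return numerator / denominator


if __name__ == "__main__":
    # Highly repetitive matching sequences against less repetitive target sequences.
    print(effect_index([5, 5, 5, 5], [5, 5, 5, 5], [1, 2, 3, 4], [5, 5, 6, 6]))
```

A sequence made of one repeated value scores its own length, while a sequence of all-distinct values scores 1, so an index above 1 indicates that the matching-side split yields the more repetitive sequences.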

Based on this, in the present embodiment the preset value is set to 1. When the effect evaluation index of the second target segmentation point and the second matching segmentation point is greater than the preset value, the numerator of the formula is greater than the denominator. That is, splitting the water data to be analyzed with the r-th and (r+1)-th initial segmentation points in the historical water use data yields a higher data repetition degree than splitting it with the r-th and (r+1)-th initial segmentation points in the actual water use data. The initial segmentation points in the historical water use data therefore divide the data better, so the (r+1)-th initial segmentation point in the historical water use data, namely the second matching segmentation point, is taken as an optimal segmentation point of the water data to be analyzed.

When the effect evaluation index of the second target segmentation point and the second matching segmentation point is smaller than the preset value, the numerator of the formula is smaller than the denominator. That is, splitting the water data to be analyzed with the r-th and (r+1)-th initial segmentation points in the historical water use data yields a lower data repetition degree than splitting it with the r-th and (r+1)-th initial segmentation points in the actual water use data. The initial segmentation points in the actual water use data therefore divide the data better, so the (r+1)-th initial segmentation point in the actual water use data, namely the second target segmentation point, is taken as an optimal segmentation point of the water data to be analyzed.

When the effect evaluation index of the second target segmentation point and the second matching segmentation point is equal to the preset value, the data repetition degrees of the two division modes are equal, and either division mode may be used. In the initial calculation, r = 0: the first actual water use data of the day is used as the starting data of the data sequence, the first initial segmentation point is used as the cut-off data of the data sequence, and the analysis proceeds in sequence. If the first initial segmentation point is determined to be the first optimal segmentation point, the first optimal segmentation point is taken as the first target segmentation point and the next initial segmentation point adjacent to it is taken as the second target segmentation point; the corresponding first matching segmentation point and second matching segmentation point are obtained, and the optimal segmentation point is again acquired and analyzed. This is repeated until all actual or historical water use data have been traversed. A plurality of optimal segmentation points are thereby obtained, and dividing the actual water use data and the historical water use data with these optimal segmentation points gives a good result.
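The comparison against the preset value can be sketched for a single step as follows. This is a minimal Python illustration rather than the embodiment's implementation: the function name `choose_optimal_point` and its parameters are hypothetical, and because either point may be used in the tie case, the sketch arbitrarily keeps the target point there.

```python
def choose_optimal_point(index, target_point, matching_point, preset=1.0):
    """Pick the optimal segmentation point for one step.

    index          -- effect evaluation index for this pair of points
    target_point   -- (r+1)-th initial segmentation point in the actual data
    matching_point -- corresponding point in the historical data
    preset         -- preset value, 1 in the embodiment
    """
    if index > preset:
        # Historical-side split gives the higher repetition degree.
        return matching_point
    # index < preset: actual-side split is better.
    # index == preset: either may be used; this sketch keeps the target point.
    return target_point


if __name__ == "__main__":
    print(choose_optimal_point(1.4, "actual@08:15", "history@08:20"))
```

In a full traversal, the point returned here would become the first target segmentation point for the next step, as described above.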

Step four: perform segmented compression on the water data to be analyzed using the optimal segmentation points, and store the segment-compressed data to obtain the split water consumption electronic document data.

The actual water use data and the historical water use data in the water data to be analyzed are each segmented using the optimal segmentation points. The segmented data are then compressed with the Huffman coding algorithm to obtain segment-compressed data, and the segment-compressed data are stored to obtain the split water consumption electronic document data.
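A minimal Python sketch of this step is given below, assuming that each optimal segmentation point is supplied as the index of the last datum of its segment (consistent with the segmentation point serving as the cut-off data of a sequence) and that a separate Huffman code table is built for every segment. The helper names `huffman_code`, `split_at` and `compress_segments` are illustrative only; storage of the code tables and bit strings is not shown.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_code(symbols):
    """Return {symbol: bit string} for the values of one non-empty segment."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct value still needs a one-bit code.
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {s: ""}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)
        n2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


def split_at(data, cut_indices):
    """Split data into consecutive segments; each cut index ends a segment."""
    segments, start = [], 0
    for cut in sorted(cut_indices):
        if cut + 1 > start:  # skip duplicate or out-of-order cuts
            segments.append(data[start:cut + 1])
            start = cut + 1
    if start < len(data):
        segments.append(data[start:])
    return segments


def compress_segments(data, cut_indices):
    """Huffman-encode every segment separately; returns (code table, bits) pairs."""
    result = []
    for segment in split_at(data, cut_indices):
        code = huffman_code(segment)
        result.append((code, "".join(code[v] for v in segment)))
    return result


if __name__ == "__main__":
    readings = [3, 3, 3, 7, 7, 8, 3, 3]
    for table, bits in compress_segments(readings, [2]):
        print(table, bits)
```

Segments with highly repeated values produce short code tables and short bit strings, which is why the choice of segmentation points affects the compression result.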

It should be noted that an obtained optimal segmentation point may be a datum in the historical water use data or a datum in the actual water use data. Because the historical water use data and the actual water use data correspond to each other at each moment of the day, the water use data at the corresponding moment can be found in the historical or actual water use data for any optimal segmentation point. The historical or actual water use data are then split at that corresponding datum, so that compressing the split data is more effective.

An electronic document splitting optimization management system embodiment:

This embodiment provides an electronic document splitting optimization management system comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the steps of the electronic document splitting optimization management method. Since the embodiment of the electronic document splitting optimization management method has been described in detail above, it is not repeated here.

The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the scope of the embodiments of the present application, and are intended to be included within the scope of the present application.

Claims

1. An electronic document splitting optimization management method is characterized by comprising the following steps:

carrying out sectional compression processing on water data to be analyzed by utilizing an optimal sectional point, and storing the data subjected to sectional compression to obtain split water consumption electronic document data;

the method comprises the steps of obtaining an optimal segmentation point according to the segmentation result of the water data to be analyzed by the initial segmentation point in the historical water data and the segmentation result of the water data to be analyzed by the initial segmentation point in the actual water data, and specifically comprises the following steps:

screening the initial segmentation points according to the effect evaluation index to determine optimal segmentation points;

the method comprises the steps of obtaining an effect evaluation index of an initial segmentation point according to the segmentation result of the water data to be analyzed of the initial segmentation point in the historical water data and the segmentation result of the water data to be analyzed of the initial segmentation point in the actual water data, and specifically comprises the following steps:

obtaining an effect evaluation index of a second target segmentation point and a second matching segmentation point according to the target actual data sequence, the target historical data sequence, the matching actual data sequence and the data distribution condition in the matching historical data sequence;

the calculation formula of the effect evaluation index specifically comprises:

$$Q_{r+1}=\frac{\sum_{x=1}^{N_{A'}} n_{x}^{A'}\,p_{x}^{A'}+\sum_{x=1}^{N_{B'}} n_{x}^{B'}\,p_{x}^{B'}}{\sum_{x=1}^{N_{A}} n_{x}^{A}\,p_{x}^{A}+\sum_{x=1}^{N_{B}} n_{x}^{B}\,p_{x}^{B}}$$

wherein $Q_{r+1}$ denotes the effect evaluation index of the second target segmentation point and the second matching segmentation point, and r+1 denotes the (r+1)-th initial segmentation point; $A'$ denotes the matching actual data sequence, $N_{A'}$ denotes the number of different data values contained in the matching actual data sequence, $n_{x}^{A'}$ denotes the number of occurrences of the x-th value in the matching actual data sequence, and $p_{x}^{A'}$ denotes the frequency of occurrence of the x-th value in the matching actual data sequence; $B'$ denotes the matching historical data sequence, $N_{B'}$ denotes the number of different data values contained in the matching historical data sequence, $n_{x}^{B'}$ denotes the number of occurrences of the x-th value in the matching historical data sequence, and $p_{x}^{B'}$ denotes the frequency of occurrence of the x-th value in the matching historical data sequence; $A$ denotes the target actual data sequence, $N_{A}$ denotes the number of different data values contained in the target actual data sequence, $n_{x}^{A}$ denotes the number of occurrences of the x-th value in the target actual data sequence, and $p_{x}^{A}$ denotes the frequency of occurrence of the x-th value in the target actual data sequence; $B$ denotes the target historical data sequence, $N_{B}$ denotes the number of different data values contained in the target historical data sequence, $n_{x}^{B}$ denotes the number of occurrences of the x-th value in the target historical data sequence, and $p_{x}^{B}$ denotes the frequency of occurrence of the x-th value in the target historical data sequence;

according to the distribution condition of the water data to be analyzed in the water data neighborhood range to be analyzed, the probability index that each water data to be analyzed is a data segmentation point is obtained, and the method specifically comprises the following steps:

2. The method for optimizing and managing splitting electronic documents according to claim 1, wherein the step of screening the initial segmentation points according to the effect evaluation index to determine the optimal segmentation points comprises the following steps:

3. The method for optimizing and managing the splitting of the electronic document according to claim 1, wherein the obtaining the probability index of the selected water consumption data as the data segmentation point according to the matching relation between the left neighborhood data sequence and the right neighborhood data sequence specifically comprises:

4. The method for optimizing and managing the splitting of the electronic document according to claim 1, wherein the screening the water data to be analyzed according to the probability index to determine the initial segmentation point specifically comprises:

5. The method for optimizing and managing splitting electronic documents according to claim 1, wherein the step of performing the segment compression processing on the water data to be analyzed by using the optimal segment point comprises the following steps:

6. An electronic document splitting optimisation management system comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the computer program when executed by the processor implements the steps of an electronic document splitting optimisation management method as claimed in any one of claims 1 to 5.