CN108446734A - Disk failure automatic prediction method based on artificial intelligence - Google Patents

Disk failure automatic prediction method based on artificial intelligence

Info

Publication number
CN108446734A
Authority
CN
China
Prior art keywords
disk
time
early warning
status data
artificial intelligence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810228937.2A
Other languages
Chinese (zh)
Inventor
李新明
刘斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Edge Intelligence Information Technology (suzhou) Co Ltd
Original Assignee
Zhongke Edge Intelligence Information Technology (suzhou) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhongke Edge Intelligence Information Technology (suzhou) Co Ltd
Priority to CN201810228937.2A
Publication of CN108446734A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G11INFORMATION STORAGE
    • G11BINFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B19/00Driving, starting, stopping record carriers not specifically of filamentary or web form, or of supports therefor; Control thereof; Control of operating function ; Driving both disc and head
    • G11B19/02Control of operating function, e.g. switching from recording to reproducing
    • G11B19/04Arrangements for preventing, inhibiting, or warning against double recording on the same blank or against other recording or reproducing malfunctions

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to an artificial-intelligence-based method for automatically predicting disk failures, comprising: collecting status data from several groups of disks as training data and training on it with a machine learning algorithm to generate a disk failure identification and classification system, which calculates a disk's early-warning failure time from the disk's status data; collecting some or all of a disk's current status data at a first set time period, feeding it into the disk failure identification and classification system to obtain the disk's early-warning failure time at the current moment, and handling the disk according to that early-warning failure time using a preset alarm rule. The invention uses artificial intelligence and disk status data obtained through S.M.A.R.T. to predict disk failures, so that failing disks can be handled in time and storage system reliability is improved.

Description

Disk failure automatic prediction method based on artificial intelligence
Technical field
The invention relates to the field of artificial intelligence, and in particular to an artificial-intelligence-based method for automatically predicting disk failures.
Background art
A storage system is responsible for persisting data. It is one of the main components of an information system, and its reliability is key to the normal operation of that system. Although solid-state storage, biological storage and other technologies have developed rapidly in recent years, the magnetic disk remains the core component of storage systems to date, and disk reliability directly affects storage system reliability. A disk is a mechanical assembly of platters, heads, a motor and other parts, and this design inherently limits its reliability. Large data centers typically run hundreds of thousands or even millions of disks. Even if manufacturing improvements keep the failure rate of an individual disk low, the installed base is so large that disk failures still occur in large numbers, far more often than failures of other components. If disk failures could be predicted in advance, operations and maintenance would become much easier and the losses caused by disk failures would be greatly reduced.
To make storage systems highly available, redundant array of inexpensive disks (RAID, Redundant Arrays of Inexpensive Disks) technology or distributed storage technologies such as HDFS (Hadoop Distributed File System) and MFS (MooseFS) are commonly used. They improve the system's tolerance of disk failures through data redundancy and thereby improve storage system reliability. These passive fault-tolerance techniques do not reduce the failure rate of the physical disks themselves, however. On the contrary, data redundancy consumes more disks and raises operating costs. Whatever method is used, a disk's service life is limited and failure is inevitable. From an operating-cost perspective, two problems must be solved. The first is storage system reliability, for which passive fault tolerance offers only a limited solution: it improves fault tolerance but cannot rule out data loss when several pieces of hardware fail at the same time, and the higher the required reliability, the more hardware resources are consumed. The second is reducing hardware maintenance cost. Accurately estimating disk failures makes it possible to plan disk inventory and routine maintenance sensibly, which matters greatly for a data center that wants to cut costs and improve service stability.
Predicting disk failures requires collecting a disk's various state parameters and assessing its operating state as a whole. Most disks today support S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), which monitors multiple parameters while the disk is running, including seek errors and parity errors. SMART can raise an alarm on a single indicator by setting a threshold for it. This approach is simple and practical, but its warning accuracy is low: in practice, a single or simple S.M.A.R.T. attribute value cannot accurately predict disk failure.
Summary of the invention
The object of the invention is to provide an artificial-intelligence-based method for automatically predicting disk failures. It uses artificial intelligence and disk status data obtained through S.M.A.R.T. to predict disk failures, so that failing disks can be handled in time and storage system reliability is improved.
To achieve this object, the invention provides the following technical solution:
An artificial-intelligence-based method for automatically predicting disk failures, comprising:
collecting status data from several groups of disks as training data, and training on it with a machine learning algorithm to generate a disk failure identification and classification system, which calculates a disk's early-warning failure time from the disk's status data;
collecting some or all of a disk's current status data at a first set time period, feeding it into the disk failure identification and classification system to obtain the disk's early-warning failure time at the current moment, and handling the disk according to that early-warning failure time using a preset alarm rule.
In a further embodiment, the method further comprises:
using S.M.A.R.T. to collect the disks' status data.
In a further embodiment, the status data includes at least one of: raw read error rate, spin-up time (time for the motor to reach rated speed), reallocated sector count, seek error rate, power-on time, uncorrectable error count, head flying height during writes, disk temperature, hardware ECC recovered count, and count of sectors pending reallocation.
In a further embodiment, the method of generating the disk failure identification and classification system comprises:
obtaining status data from several groups of disks, each status data item having a corresponding state threshold, and quantizing the status data and state thresholds;
training on the quantized status data and state thresholds with the SVM algorithm to obtain a classification hyperplane for the disk state.
In a further embodiment, handling the disk according to its early-warning failure time using a preset alarm rule means the following.
In response to the early-warning failure time of any disk being less than or equal to a first set time threshold, status data is collected from that disk at a second set time period and its early-warning failure time is obtained again, the second set time period being shorter than the first set time period;
In response to the early-warning failure time of any disk being less than or equal to a second set time threshold, a failure alarm is issued, the second set time threshold being smaller than the first set time threshold.
In a further embodiment, the method by which the disk failure identification and classification system obtains a disk's early-warning failure time comprises:
setting a failure time precision;
obtaining the disk's status data, dividing the time from the current moment onward into several periods whose length equals the failure time precision, judging period by period in chronological order whether the disk will fail within that period, and outputting the time range of the period in which failure is judged to occur as the early-warning failure time.
In a further embodiment, the training data includes the disk's status data at the current moment, the change and variance of the disk state over the preceding window of one failure time precision, and the disk's average I/O load over that same window.
In a further embodiment, the failure time precision is 5 days.
In a further embodiment, the method further comprises:
using the Logistic regression algorithm to train on the training data to generate the disk failure identification and classification system.
In a further embodiment, the method by which the disk failure identification and classification system obtains a disk's early-warning failure time comprises:
in response to the probability of the disk failing within any period exceeding a set failure probability threshold, outputting that period as the disk's early-warning failure time.
The beneficial effects of the invention are:
1) Analyzing multiple conditions together improves the disk failure detection rate.
2) The method can judge not only whether a disk has failed from its current state, but also the disk's failure trend from its state and load.
3) A means of balancing the disk failure detection rate (FDR) against the false alarm rate (FAR) is provided.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the specification, preferred embodiments are described in detail below with reference to the accompanying drawings.
Description of the drawings
Fig. 1 is a flow chart of the artificial-intelligence-based disk failure automatic prediction method of the invention.
Fig. 2 is a schematic diagram of the Lead Time of the invention.
Fig. 3 is a schematic diagram of the working principle of the disk failure identification and classification system of the invention.
Fig. 4 is a flow chart of early-warning failure time prediction by the disk failure identification and classification system based on the classification working principle.
Fig. 5 is a flow chart of the method of obtaining an early-warning failure by setting a failure probability threshold.
Detailed description
Specific embodiments of the invention are described in further detail below with reference to the drawings and examples. The following embodiments illustrate the invention and do not limit its scope.
With reference to Fig. 1, the invention provides an artificial-intelligence-based method for automatically predicting disk failures, comprising:
Step 1: collect status data from several groups of disks as training data, and train on it with a machine learning algorithm to generate a disk failure identification and classification system, which calculates a disk's early-warning failure time from the disk's status data.
Step 2: collect some or all of a disk's current status data at a first set time period, feed it into the disk failure identification and classification system to obtain the disk's early-warning failure time at the current moment, and handle the disk according to that early-warning failure time using a preset alarm rule.
The method and the extensions of the related technical solution are elaborated below in five parts.
One, handling a disk according to its early-warning failure time
Handling the disk according to its early-warning failure time using a preset alarm rule means the following.
In response to the early-warning failure time of any disk being less than or equal to a first set time threshold, status data is collected from that disk at a second set time period and its early-warning failure time is obtained again, the second set time period being shorter than the first set time period.
In response to the early-warning failure time of any disk being less than or equal to a second set time threshold, a failure alarm is issued, the second set time threshold being smaller than the first set time threshold.
With reference to Fig. 2, the system issues a failure alarm when it predicts that a disk will fail. At that moment the disk is not yet completely unusable: some time separates the current moment from the moment the disk actually fails.
We define the time difference between the warning and the actual failure as the Lead Time. If the disk is replaced at the moment of the warning, it would still have been usable for some time before its real failure, so part of its usable life is wasted. The amount wasted equals the Lead Time.
Prediction indicates how long a disk has before it fails. The larger the advance margin, the higher the prediction accuracy, but the larger the Lead Time as well.
To improve disk utilization and reduce cost while increasing the accuracy of the prediction algorithm, the invention takes the following technical measures:
A two-level prediction mechanism is introduced into the prediction system. The first level predicts whether a disk is going to fail, and the second level predicts how long it will be before the disk fails. Specifically, once a disk is predicted to fail (first-level prediction), the monitoring interval is shortened and failure time prediction is performed on it, predicting that the failure will occur within the next X days (second-level prediction). For example, if a disk is normally monitored once every 5 days, then once it is predicted to fail it is monitored once a day, and only when the failure is predicted to occur within Y days (the alarm threshold) is a failure alarm raised and the disk replaced.
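The two-level rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name MonitorPolicy, the function next_action and the threshold values of 30 and 10 days are hypothetical, and only the 5-day and 1-day sampling intervals come from the example in the text.

```python
from dataclasses import dataclass


@dataclass
class MonitorPolicy:
    routine_interval_days: int = 5       # first set time period (example in the text)
    intensive_interval_days: int = 1     # second set time period (example in the text)
    watch_threshold_days: float = 30.0   # first set time threshold (assumed value)
    alarm_threshold_days: float = 10.0   # second set time threshold "Y" (assumed value)


def next_action(predicted_days_to_failure, policy):
    """Return (action, sampling interval in days) for one disk."""
    if predicted_days_to_failure <= policy.alarm_threshold_days:
        # Failure is close enough: raise the alarm so the disk can be replaced.
        return ("alarm", policy.intensive_interval_days)
    if predicted_days_to_failure <= policy.watch_threshold_days:
        # Failure is predicted but not imminent: sample more often, no alarm yet.
        return ("watch", policy.intensive_interval_days)
    return ("normal", policy.routine_interval_days)


policy = MonitorPolicy()
print(next_action(60, policy))  # healthy disk, routine sampling
print(next_action(20, policy))  # under the first threshold, sampled daily
print(next_action(10, policy))  # under the second threshold, alarm
```

Keeping the alarm threshold below the watch threshold is what lets a disk stay in service during most of its Lead Time while still being watched closely.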
Two, generating the disk failure identification and classification system
As the preceding steps show, the method presupposes a disk failure identification and classification system that can calculate a disk's early-warning failure time from its status data.
A disk's service life is affected by many factors, such as its initial condition and the damage it sustains in later use, and all of these factors are reflected in the disk's status data.
The invention therefore proposes collecting status data from several groups of disks as training data (sample data) and then training on that data with a machine learning algorithm to generate the disk failure identification and classification system.
It should be understood that the larger and more varied the training data (sample data), the higher the precision and accuracy of the resulting disk failure identification and classification system.
In some examples, the disk failure identification and classification system exchanges data with a cloud server over a network and periodically downloads training data (sample data) from the cloud server so that it keeps learning and updating itself.
As for how the training data (sample data) is collected, in step 1 the invention preferably uses S.M.A.R.T. to collect the disks' status data.
As the disk's internal monitoring and self-test mechanism, SMART can detect and describe the disk's state characteristics well. It converts the current disk state into a set of concrete values and presents the disk's current state characteristics as a vector, which makes it easy for a machine learning algorithm to learn their numerical features.
SMART data contains 23 important data items, of which this method selects 10 main ones as the source of training data for disk failure prediction.
These 10 main status data items are: raw read error rate, spin-up time (time for the motor to reach rated speed), reallocated sector count, seek error rate, power-on time, uncorrectable error count, head flying height during writes, disk temperature, hardware ECC recovered count, and count of sectors pending reallocation.
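As a rough illustration of how the ten items could be turned into the state vector mentioned above, the sketch below orders them into a fixed-length list. The numeric attribute IDs are the conventional S.M.A.R.T. IDs for these items and are an assumption here: the patent does not give IDs, and vendors differ in which attributes they report. The names SELECTED_ATTRIBUTES and to_feature_vector are hypothetical.

```python
# Conventional S.M.A.R.T. attribute IDs (an assumption; not stated in the patent).
SELECTED_ATTRIBUTES = {
    1: "raw_read_error_rate",
    3: "spin_up_time",
    5: "reallocated_sector_count",
    7: "seek_error_rate",
    9: "power_on_hours",
    187: "reported_uncorrectable_errors",
    189: "high_fly_writes",
    194: "temperature",
    195: "hardware_ecc_recovered",
    197: "current_pending_sector_count",
}


def to_feature_vector(smart_reading, missing=0.0):
    """Map {attribute_id: value} to a 10-element vector ordered by attribute ID.

    Attributes the disk does not report are filled with `missing`.
    """
    return [float(smart_reading.get(attr_id, missing))
            for attr_id in sorted(SELECTED_ATTRIBUTES)]


reading = {5: 12, 9: 20000, 194: 38}   # hypothetical partial reading
vector = to_feature_vector(reading)
print(len(vector), vector)
```

A fixed ordering matters because every training sample and every later prediction must present the attributes in the same positions.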
As noted above, the more types and the larger the quantity of disk status data used, the higher the precision and accuracy of the resulting disk failure identification and classification system. At the same time, however, more data types and more data mean more computation and longer training time. To balance the two, the 10 status data items above, which have the greatest influence on disk failure, are selected as training data.
Once enough training data has been collected, the next step is to generate the disk failure identification and classification system.
The method of generating the disk failure identification and classification system comprises:
obtaining status data from several groups of disks, each status data item having a corresponding state threshold, and quantizing the status data and state thresholds;
training on the quantized status data and state thresholds with the SVM algorithm to obtain a classification hyperplane for the disk state.
In concrete terms, a disk failure prediction mechanism mainly consists of finding a state threshold and raising a failure alarm when the status data exceeds the corresponding threshold. As noted above, the state here is not a single specific indicator provided by SMART. Instead, a group of indicators is quantized to represent the disk state and the state threshold. These attributes are then used as training data for a classification model trained with the SVM algorithm, which finds the classification hyperplane of the disk state (this hyperplane can be regarded as a concrete form of the state threshold).
SVM is a classic supervised machine learning algorithm that performs well on high-dimensional data classification. The implementation is based on the open-source LIBSVM. The classification accuracy of an SVM model is mainly affected by the training data, the choice of kernel function and the tuning of the related parameters.
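To make the idea of a classification hyperplane concrete, here is a minimal, dependency-free sketch. It is not the LIBSVM-based implementation the text refers to: it swaps in a linear soft-margin SVM with no kernel, trained by plain subgradient descent on the hinge loss, and it assumes features already scaled to roughly [0, 1]. The function names, the hyperparameters and the toy data are all hypothetical.

```python
def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=500):
    """Minimise lam/2 * |w|^2 + mean hinge loss by batch subgradient descent.

    labels must be +1 (failing disk) or -1 (healthy disk).
    Returns the hyperplane (w, b) of the decision function w.x + b.
    """
    dim, n = len(samples[0]), len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]   # gradient of the regulariser
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:              # only margin violators contribute
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def classify(w, b, x):
    """Return +1 if x falls on the failing side of the hyperplane, else -1."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Toy two-feature data: healthy disks cluster near 0, failing disks near 1.
healthy = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
failing = [[0.9, 0.8], [0.8, 1.0], [1.0, 0.9]]
w, b = train_linear_svm(healthy + failing, [-1] * 3 + [1] * 3)
print(classify(w, b, [0.95, 0.95]), classify(w, b, [0.05, 0.05]))
```

In this picture the learned pair (w, b) plays the role the text assigns to the state threshold: a single boundary over the whole group of indicators rather than one limit per indicator.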
After the disk failure identification and classification system has been generated, failure warning for disks can begin, as follows:
Some or all of a disk's current status data is collected at the first set time period and fed into the disk failure identification and classification system to obtain the disk's early-warning failure time at the current moment, and the disk is handled according to that early-warning failure time using a preset alarm rule.
For example, when a disk's early-warning failure time is below a set threshold, an alarm is raised prompting replacement of the disk; or the early-warning failure times of multiple disks are monitored at once and the disks are ranked by early-warning failure time; or the two-level prediction mechanism described above is used; and so on.
Three, working principle of the disk failure identification and classification system
The method by which the disk failure identification and classification system obtains a disk's early-warning failure time comprises:
setting a failure time precision;
obtaining the disk's status data, dividing the time from the current moment onward into several periods whose length equals the failure time precision, judging period by period in chronological order whether the disk will fail within that period, and outputting the time range of the period in which failure is judged to occur as the early-warning failure time.
For a machine learning algorithm, a yes/no judgment is simpler, needs less computation and runs faster than computing a specific early-warning failure time value, so the resulting disk failure identification and classification system can monitor a larger number of disks on modest hardware.
To exploit the strong performance of machine learning algorithms on classification problems, the invention proposes converting failure time prediction into a classification problem. The classification here is not whether the disk is about to fail, but whether the disk failure will occur within a given upcoming period.
For example, first set the precision of failure time prediction to X days. The prediction then places the failure in the range 0 to X days, X to 2X days, 2X to 3X days, and so on after the prediction. It is only necessary to predict whether the disk failure will occur within a given time range such as the next X days or 2X days, which turns failure time prediction into a problem a classification algorithm can solve.
With reference to Fig. 3, assume the failure time precision is set to 5 days. First judge whether the disk that is about to fail will fail within the next 5 days. If so, a warning is given that the disk will fail within 5 days, meaning the failure falls in the range 0 to 5 days after the prediction. If not, further judge whether the disk will fail within the next 10 days. If so, the predicted early-warning failure time falls in the range 5 to 10 days after the prediction. The process continues in this way, one judgment after another, until the disk's early-warning failure time is obtained. The early-warning failure time is therefore a time range.
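The window-by-window judgment described above can be sketched as a short loop. The sketch assumes a hypothetical callback will_fail_within(features, horizon_days) standing in for the trained yes/no classifier. The cap max_windows is also an assumption, since the text does not say how far ahead the judgments continue.

```python
def predict_failure_window(features, will_fail_within, precision_days=5, max_windows=6):
    """Return (start_day, end_day) of the first window judged to contain the failure.

    will_fail_within(features, horizon_days) -> bool is a stand-in for the
    trained classifier. Returns None if no window up to the cap is flagged.
    """
    for k in range(1, max_windows + 1):
        horizon = k * precision_days
        if will_fail_within(features, horizon):
            # Earlier, shorter horizons were all judged "no", so the failure
            # is placed in the most recent window only.
            return (horizon - precision_days, horizon)
    return None


# Stub classifier for demonstration: pretends the true remaining life is known.
def stub(features, horizon_days):
    return features["days_left"] <= horizon_days


print(predict_failure_window({"days_left": 8}, stub))
print(predict_failure_window({"days_left": 100}, stub))
```

Each call to the classifier is a cheap binary decision, which is the reason the text gives for preferring this formulation over regressing an exact failure time.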
Four, generating the disk failure identification and classification system based on the classification working principle
With reference to Fig. 4, converting disk failure time prediction into a typical binary classification problem that machine learning algorithms can solve requires particular care: the training data must take into account the disk's current state, its rate of state change, its I/O load and other factors at the same time. Apart from the training data, simple failure prediction sorts disks into two classes, will fail or will not fail in the future, whereas here disks must be sorted into will fail or will not fail within X days, which adds a failure time limit. As for selecting the classifier's training data, disk-related data is collected on a daily basis.
Specifically, the training data includes the disk's status data at the current moment, the change and variance of the disk state over the preceding window of one failure time precision, and the disk's average I/O load over that same window.
Assume the failure time precision is again 5 days. Each training data collection point records the disk's current status data for the SMART data items and the disk's I/O load, and computes, over the 5 days preceding the data point, the change in each state value, the variance of each data item and the average I/O load. These items together are recorded as one input sample. That is, each sample records: the current disk status data, the change and variance of the disk state over the last 5 days, and the disk's average I/O load over the last 5 days.
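One way to assemble such a sample from daily readings is sketched below. Several details are assumptions the text leaves open: the window is taken to be the last window_days daily readings including today, the change is the last value minus the first value in that window, and the variance is the population variance. The function name build_sample is hypothetical.

```python
from statistics import fmean, pvariance


def build_sample(daily_states, daily_io_load, window_days=5):
    """Build one training sample from daily readings, oldest first.

    daily_states  : list of per-day attribute vectors (last entry is today)
    daily_io_load : list of per-day I/O load figures
    Returns: current state + per-attribute change + per-attribute variance
             + [average I/O load], all over the last `window_days` readings.
    """
    if len(daily_states) < window_days or len(daily_io_load) < window_days:
        raise ValueError("not enough history for the chosen window")
    window = daily_states[-window_days:]
    current = list(window[-1])
    columns = list(zip(*window))                     # one tuple per attribute
    change = [col[-1] - col[0] for col in columns]   # last minus first reading
    variance = [pvariance(col) for col in columns]
    mean_io = fmean(daily_io_load[-window_days:])
    return current + change + variance + [mean_io]


# Two hypothetical attributes over five days, plus daily I/O load.
states = [[1, 10], [2, 10], [3, 10], [4, 10], [5, 10]]
io_load = [10, 20, 30, 40, 50]
print(build_sample(states, io_load))
```

With the ten status items from the text, each sample would hold 10 current values, 10 changes, 10 variances and one load figure, 31 numbers in total.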
Preferably, the Logistic regression algorithm is used to train on the above training data to generate the disk failure identification and classification system. Logistic regression is a supervised learning algorithm that can be used for classification. It works as follows: given a function with unknown parameters, the function is trained on the training data and an optimization method determines a set of parameters. This set of parameters is the Logistic regression classification model.
When unknown data is then fed into the Logistic regression classification model, the model classifies it and outputs the probability that it belongs to a given class.
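The training procedure just described, a parametric function whose parameters are found by an optimization method, can be sketched as follows. Batch gradient descent on the log-loss is used here as one possible optimizer; the text does not name a specific one. The learning rate, epoch count, function names and the one-feature toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the mean log-loss.

    labels are 1 (fails within the window) or 0 (does not).
    """
    dim, n = len(samples[0]), len(samples)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict_proba(w, b, x):
    """Probability that sample x belongs to the 'will fail' class."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Toy data: one feature, larger values indicate a degrading disk.
w, b = fit_logistic([[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1])
print(predict_proba(w, b, [3.0]), predict_proba(w, b, [0.0]))
```

The fitted pair (w, b) is the "set of parameters" the text calls the classification model, and predict_proba returns the class probability it mentions.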
With reference to Fig. 5, in disk failure time prediction the trained Logistic regression classification model determines the probability that a disk will fail within an upcoming period. A disk failure alarm is actually raised only when that probability exceeds the set probability threshold. That is, the method by which the disk failure identification and classification system obtains a disk's early-warning failure time comprises:
in response to the probability of the disk failing within any period exceeding a set failure probability threshold, outputting that period as the disk's early-warning failure time.
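The thresholding step can be sketched as below, assuming one already-trained logistic model per time window. The per-window weights, the 0.8 threshold and the function names are hypothetical and chosen only to make the example run; they are not values from the patent.

```python
import math


def failure_probability(weights, bias, features):
    """Logistic model output: probability of failure within one window."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def warning_window(window_models, features, threshold=0.8):
    """Return the first window whose failure probability exceeds the threshold.

    window_models is a list of ((start_day, end_day), weights, bias) in
    chronological order. Returns None when no window crosses the threshold.
    """
    for (start, end), weights, bias in window_models:
        if failure_probability(weights, bias, features) > threshold:
            return (start, end)
    return None


# Hypothetical pre-trained models for the 0-5 day and 5-10 day windows.
models = [((0, 5), [1.0], -5.0), ((5, 10), [1.0], -1.0)]
print(warning_window(models, [3.0]))   # second window crosses 0.8
print(warning_window(models, [0.0]))   # no window crosses 0.8
```

Raising the threshold makes the function return None more often, which is the lever the following section uses to trade detection rate against false alarms.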
Setting the failure probability threshold is also the key technical feature for balancing the false alarm rate and the detection rate.
Five, balancing the FAR (false alarm rate) and the FDR (failure detection rate)
FAR, the false alarm rate, is the probability that a normal disk is mispredicted as a disk that is about to fail.
FDR, the failure detection rate, is the proportion of all failures that are predicted.
As noted above, a high FDR may bring a high FAR.
In actual operation, replacing disks because of false alarms wastes disks, yet the point of disk failure prediction is an FDR that is as high as possible. With the prediction basis (SMART data) and machine learning method (SVM) used in this method, a 100% detection rate with a zero false alarm rate cannot be achieved, so a trade-off between FDR and FAR must be made according to actual needs.
In practical application, the prediction system of this method controls the FAR and FDR of the prediction results by adjusting the failure probability threshold according to its own characteristics, so that the prediction results can be tuned to the required FDR and FAR.
A threshold mechanism is introduced to adjust the prediction results: according to the characteristics of the current prediction model, a failure probability threshold is set and a warning is issued only when the failure probability exceeds it, which adjusts FDR and FAR.
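The effect of the probability threshold on FDR and FAR can be checked with a small calculation. The sketch below follows the two definitions given above. The function name fdr_far and the six example disks are made up for illustration and are not results from the patent.

```python
def fdr_far(probabilities, failed, threshold):
    """Compute (FDR, FAR) for one probability threshold.

    probabilities : predicted failure probability per disk
    failed        : True if the disk actually failed, else False
    FDR = flagged failed disks / all failed disks
    FAR = flagged healthy disks / all healthy disks
    """
    flagged_failed = sum(1 for p, f in zip(probabilities, failed) if f and p > threshold)
    flagged_healthy = sum(1 for p, f in zip(probabilities, failed) if not f and p > threshold)
    n_failed = sum(1 for f in failed if f)
    n_healthy = len(failed) - n_failed
    fdr = flagged_failed / n_failed if n_failed else 0.0
    far = flagged_healthy / n_healthy if n_healthy else 0.0
    return fdr, far


# Six hypothetical disks: the first three failed, the last three did not.
probs = [0.9, 0.6, 0.4, 0.7, 0.2, 0.1]
truth = [True, True, True, False, False, False]
for t in (0.3, 0.5, 0.8):
    print(t, fdr_far(probs, truth, t))
```

On this toy data the lowest threshold detects every failure at the cost of one false alarm, while the highest threshold removes the false alarm but misses two of the three failures, which is the trade-off the text describes.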
The technical features of the embodiments above may be combined arbitrarily. For brevity, not every possible combination of these technical features is described, but any combination that involves no contradiction should be considered within the scope of this specification.
The embodiments above express only several implementations of the invention. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art can make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention. The scope of protection of this patent is therefore determined by the appended claims.

Claims (10)

1. A disk failure automatic prediction method based on artificial intelligence, characterized by comprising:
acquiring several groups of disk status data as training data, and training on them with a machine learning algorithm to generate a disk failure identification and classification system, the disk failure identification and classification system being used to calculate the early warning fault time of a disk from the status data of the disk;
acquiring, at a first set time period, some or all of the status data of a disk at the current time, importing them into the aforementioned disk failure identification and classification system to obtain the early warning fault time of the disk at the current time, and handling the disk with a preset alarm rule according to the early warning fault time of the disk.
2. The disk failure automatic prediction method based on artificial intelligence according to claim 1, characterized in that the method further comprises:
using S.M.A.R.T. technology to acquire the status data of the disk.
3. The disk failure automatic prediction method based on artificial intelligence according to claim 1 or 2, characterized in that the status data comprise at least one of: raw read error rate, spin-up time (the time the motor takes to reach rated speed), reallocated sector count, seek error rate, power-on hours, uncorrectable error count, head write height (high-fly writes), disk temperature, hardware ECC recovered, and current pending sector count (sectors waiting to be remapped).
4. The disk failure automatic prediction method based on artificial intelligence according to claim 1 or 2, characterized in that the method of generating the disk failure identification and classification system comprises:
obtaining the status data of several groups of disks, each item of status data being provided with a corresponding status threshold, and quantizing the status data and the status thresholds;
training on the quantized status data and status thresholds with an SVM algorithm to obtain a classification hyperplane for disk status.
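As a rough sketch of the two steps in claim 4, the Python below quantizes each (value, threshold) pair into a single feature and fits a linear separating hyperplane. Several things here are assumptions, not the patented procedure: quantization is taken to mean the scaled gap between a normalized attribute value and its threshold, the SVM is a plain linear one trained by full-batch subgradient descent on the hinge loss, and the function names, hyperparameters, and sample values are all invented.

```python
def quantize(value, threshold, scale=100.0):
    """Assumed quantization: scaled gap between an attribute value and its threshold."""
    return (value - threshold) / scale


def train_linear_svm(samples, labels, lam=0.01, lr=0.1, epochs=1000):
    """Fit hyperplane (w, b) by subgradient descent on the regularised hinge loss.

    labels must be +1 (failing disk) or -1 (healthy disk).
    """
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(samples, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # margin violated, so the hinge term is active
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def classify(w, b, x):
    """Return +1 (failing side of the hyperplane) or -1 (healthy side)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else -1


# Invented training set: two attributes per disk, each as (value, threshold).
healthy = [[(100, 36), (100, 30)], [(98, 36), (90, 30)],
           [(110, 36), (105, 30)], [(90, 36), (88, 30)]]
failing = [[(40, 36), (38, 30)], [(37, 36), (30, 30)],
           [(45, 36), (35, 30)], [(50, 36), (42, 30)]]
samples = [[quantize(v, t) for v, t in disk] for disk in healthy + failing]
labels = [-1] * len(healthy) + [1] * len(failing)
w, b = train_linear_svm(samples, labels)

if __name__ == "__main__":
    new_disk = [quantize(39, 36), quantize(33, 30)]
    print("failing" if classify(w, b, new_disk) == 1 else "healthy")
```

A kernel SVM or a library implementation could be substituted; the linear form is used only because the claim speaks of a classification hyperplane.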
5. The disk failure automatic prediction method based on artificial intelligence according to claim 1, characterized in that handling the disk with a preset alarm rule according to the early warning fault time of the disk means:
in response to the early warning fault time of any disk being less than or equal to a first set time threshold, collecting status data from that disk at a second set time period and then obtaining its early warning fault time, the second set time period being shorter than the first set time period;
in response to the early warning fault time of any disk being less than or equal to a second set time threshold, issuing a fault warning, the second set time threshold being less than the first set time threshold.
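The two-level rule of claim 5 can be sketched in Python as follows. The class `AlarmRule`, the function `apply_rule`, and every numeric default are hypothetical; the claim fixes only the ordering (second period shorter than the first, second threshold less than the first). The early warning fault time is represented here as a single number of days, which is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AlarmRule:
    # All four defaults are illustrative assumptions, not values from the patent.
    first_period_hours: float = 24.0
    second_period_hours: float = 1.0
    first_threshold_days: float = 30.0
    second_threshold_days: float = 10.0

    def __post_init__(self) -> None:
        if self.second_period_hours >= self.first_period_hours:
            raise ValueError("second period must be shorter than the first")
        if self.second_threshold_days >= self.first_threshold_days:
            raise ValueError("second threshold must be less than the first")


def apply_rule(rule: AlarmRule, warning_days: Optional[float]) -> Tuple[float, bool]:
    """Return (next collection period in hours, whether to issue a fault warning).

    warning_days is the early warning fault time in days, or None when no
    failure is predicted.
    """
    if warning_days is None:
        return rule.first_period_hours, False
    period = rule.first_period_hours
    if warning_days <= rule.first_threshold_days:
        period = rule.second_period_hours  # watch this disk more closely
    alarm = warning_days <= rule.second_threshold_days
    return period, alarm


if __name__ == "__main__":
    rule = AlarmRule()
    for days in (None, 60, 20, 5):
        print(days, apply_rule(rule, days))
```

Because the second threshold is smaller than the first, any disk that triggers a fault warning is already being sampled at the shorter period.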
6. The disk failure automatic prediction method based on artificial intelligence according to claim 1, characterized in that the method by which the disk failure identification and classification system obtains the early warning fault time of a disk comprises:
setting a fault time precision;
obtaining the status data of the disk, dividing the natural time starting from the current time into several time periods according to the fault time precision, then judging in chronological order whether the disk will fail within each time period, and outputting the time range of the time period judged to contain a failure as the early warning fault time.
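The period-by-period search in claim 6 might look like the following Python sketch. The callback `will_fail_in` stands in for the trained classifier and is hypothetical, as is the `horizon_days` cut-off (the claim says only "several time periods"). The default precision of 5 days is the value given in claim 8.

```python
from typing import Callable, Optional, Tuple


def early_warning_window(
    will_fail_in: Callable[[int, int], bool],
    precision_days: int = 5,
    horizon_days: int = 60,
) -> Optional[Tuple[int, int]]:
    """Return (start_day, end_day) of the first period judged to contain a failure.

    Days are counted from the current time. Periods are checked in
    chronological order; None means no failure was predicted within the horizon.
    """
    if precision_days <= 0:
        raise ValueError("precision_days must be positive")
    start = 0
    while start < horizon_days:
        end = start + precision_days
        if will_fail_in(start, end):
            return start, end
        start = end
    return None


if __name__ == "__main__":
    # Stub classifier for illustration: pretend the disk fails on day 12.
    def stub(start, end):
        return start <= 12 < end

    print(early_warning_window(stub))
```

The returned range is only as fine as the chosen precision, so a smaller precision gives a tighter warning window at the cost of more classifier evaluations.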
7. The disk failure automatic prediction method based on artificial intelligence according to claim 6, characterized in that the training data comprise the status data of the disk at the current time, the amount of change and the variance of the disk status within the time range of one fault time precision preceding the current time, and the average I/O load of the disk within the time range of one fault time precision preceding the current time.
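One way to assemble the training features listed in claim 7 is sketched below in Python. The function name, the attribute names, and the sample values are hypothetical. "Amount of change" is interpreted as the newest value minus the oldest value in the window, and the variance is the population variance; both are assumptions about the intended calculation.

```python
from statistics import mean, pvariance


def build_features(history, io_loads):
    """Build one feature dictionary from a window of one fault time precision.

    history  : list of dicts (attribute name -> value), oldest first, newest last
    io_loads : I/O load samples collected over the same window
    """
    if not history or not io_loads:
        raise ValueError("history and io_loads must be non-empty")
    oldest, current = history[0], history[-1]
    features = {}
    for name in sorted(current):
        series = [sample[name] for sample in history]
        features[name] = current[name]                        # status now
        features[name + "_delta"] = current[name] - oldest[name]  # change in window
        features[name + "_var"] = pvariance(series)           # variance in window
    features["io_load_avg"] = mean(io_loads)                  # average I/O load
    return features


if __name__ == "__main__":
    # Invented samples for two attributes over one window.
    history = [
        {"realloc": 0, "temp": 35},
        {"realloc": 2, "temp": 37},
        {"realloc": 4, "temp": 36},
    ]
    print(build_features(history, [0.2, 0.4, 0.6]))
```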
8. The disk failure automatic prediction method based on artificial intelligence according to claim 6, characterized in that the fault time precision is 5 days.
9. The disk failure automatic prediction method based on artificial intelligence according to claim 1 or 6, characterized in that the method further comprises:
using a logistic regression algorithm to train on the training data to generate the disk failure identification and classification system.
10. The disk failure automatic prediction method based on artificial intelligence according to claim 9, characterized in that the method by which the disk failure identification and classification system obtains the early warning fault time of a disk comprises:
in response to the probability of the disk failing within any time period exceeding a set fault probability threshold, outputting that time period as the early warning fault time of the disk.
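Claims 9 and 10 together describe a probabilistic classifier whose output is compared against a threshold. The Python sketch below trains a small logistic regression by batch gradient descent and then applies a threshold to its output. The training data, the learning rate, the threshold of 0.5, and all function names are invented, and the sketch does not show how a separate probability is obtained for each time period, since the claims do not specify that.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss.

    labels are 1 (disk failed within the period) or 0 (it did not).
    """
    dim = len(samples[0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def fault_probability(w, b, x):
    """Predicted probability that the disk fails within the period."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Invented training set: two features already scaled to the range 0..1.
samples = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
           [0.8, 0.9], [0.9, 0.7], [1.0, 1.0], [0.7, 0.8]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic(samples, labels)

FAULT_PROBABILITY_THRESHOLD = 0.5  # assumed value, tuned in practice

if __name__ == "__main__":
    p = fault_probability(w, b, [0.9, 0.9])
    if p > FAULT_PROBABILITY_THRESHOLD:
        print("output this period as the early warning fault time")
```

Given one probability per time period, the first period whose probability exceeds the threshold would be output as the early warning fault time.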

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810228937.2A CN108446734A (en) 2018-03-20 2018-03-20 Disk failure automatic prediction method based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810228937.2A CN108446734A (en) 2018-03-20 2018-03-20 Disk failure automatic prediction method based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN108446734A true CN108446734A (en) 2018-08-24

Family

ID=63195864

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810228937.2A Pending CN108446734A (en) 2018-03-20 2018-03-20 Disk failure automatic prediction method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN108446734A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109240867A (en) * 2018-09-18 2019-01-18 鸿秦(北京)科技有限公司 Hard disk failure prediction technique
CN109491850A (en) * 2018-11-21 2019-03-19 北京北信源软件股份有限公司 A kind of disk failure prediction technique and device
CN109828869A (en) * 2018-12-05 2019-05-31 中兴通讯股份有限公司 Predict the method, apparatus and storage medium of hard disk failure time of origin
CN111258788A (en) * 2020-01-17 2020-06-09 上海商汤智能科技有限公司 Disk failure prediction method, device and computer readable storage medium
CN111459692A (en) * 2019-01-18 2020-07-28 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for predicting drive failure
CN111489539A (en) * 2019-01-29 2020-08-04 珠海格力电器股份有限公司 Household appliance system fault early warning method, system and device
CN112596964A (en) * 2020-12-15 2021-04-02 中国建设银行股份有限公司 Disk failure prediction method and device
CN113076217A (en) * 2021-04-21 2021-07-06 扬州万方电子技术有限责任公司 Disk fault prediction method based on domestic platform
CN113434088A (en) * 2021-06-28 2021-09-24 中国建设银行股份有限公司 Disk identification method and device
CN113778791A (en) * 2021-08-19 2021-12-10 苏州浪潮智能科技有限公司 Fault early warning method and system for distributed storage disk
CN114063881A (en) * 2020-07-31 2022-02-18 阿里巴巴集团控股有限公司 Disk management method and device of distributed system
CN116680114A (en) * 2023-08-04 2023-09-01 浙江鹏信信息科技股份有限公司 LVM fault data quick recovery method, system and computer readable storage medium
CN114063881B (en) * 2020-07-31 2024-07-26 阿里巴巴集团控股有限公司 Disk management method and device for distributed system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129397A (en) * 2010-12-29 2011-07-20 深圳市永达电子股份有限公司 Method and system for predicating self-adaptive disk array failure
CN104102773A (en) * 2014-07-05 2014-10-15 山东鲁能软件技术有限公司 Equipment fault warning and state monitoring method
CN105260279A (en) * 2015-11-04 2016-01-20 四川效率源信息安全技术股份有限公司 Method and device of dynamically diagnosing hard disk failure based on S.M.A.R.T (Self-Monitoring Analysis and Reporting Technology) data
CN107392320A (en) * 2017-07-28 2017-11-24 郑州云海信息技术有限公司 A kind of method that hard disk failure is predicted using machine learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20180824