CN117540176B - Data recovery analysis method and system based on solid state disk - Google Patents

Data recovery analysis method and system based on solid state disk

Info

Publication number
CN117540176B
Authority
CN
China
Prior art keywords
data
data information
deleted
solid state
state disk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410027052.1A
Other languages
Chinese (zh)
Other versions
CN117540176A (en
Inventor
张培栋
李根祥
周水平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Lingdechuang Technology Co ltd
Original Assignee
Shenzhen Lingdechuang Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Lingdechuang Technology Co ltd
Priority to CN202410027052.1A
Publication of CN117540176A
Application granted
Publication of CN117540176B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data recovery analysis method and a system based on a solid state disk, which belong to the field of electronic digital data processing.

Description

Data recovery analysis method and system based on solid state disk
Technical Field
The invention belongs to the technical field of electronic digital data processing, and particularly relates to a data recovery analysis method and system based on a solid state disk.
Background
Data storage technology plays an important role in the computer field. Storing data is one of the prerequisites for a computer to perform tasks, and the hard disk is one of a computer's main storage media. A hard disk is a data storage device that uses magnetic media; the data is stored on several magnetic platters sealed inside the clean inner cavity of the hard disk drive. Data operations on a hard disk include reading, writing and deleting files. When the deleted data of a hard disk is managed, all data deleted within a given period is usually cleaned up on a timer. The recovery probability of the data to be deleted therefore cannot be analyzed accurately, either qualitatively or quantitatively, and a large amount of useful data is deleted along with the rest. These problems exist in the prior art.
For example, Chinese patent publication No. CN116661706B discloses a method and system for cache cleaning analysis of a solid state disk, relating to the technical field of intelligent identification. The method comprises: reading cache data in the solid state disk according to first user request information and obtaining an identification return value; when the identification return value is empty, judging whether the read data has expired in the solid state disk; if it has expired, restoring the read data to the solid state disk after a recovery instruction is obtained; storing the real-time write data and the recovered data of the solid state disk in a data structure whose outputs, a write cache database and a clean cache database, are connected; identifying the cleaning time limit of the cache data written to the write cache database; and storing cache data that reaches the preset cleaning time limit in the clean cache database. That patent addresses the high cleaning error rate of cache data caused by improper operation during data cleaning in the prior art, provides multi-dimensional analysis of data cleaning, and reduces the cleaning error rate of cache data.
The problems described in the background remain in the above patent: when the deleted data of a hard disk is managed, all data deleted within a given period is usually cleaned up on a timer, so the recovery probability of the data to be deleted cannot be analyzed accurately, either qualitatively or quantitatively, and a large amount of useful data is deleted along with the rest. To solve these problems, the present application designs a data recovery analysis method and system based on a solid state disk.
Disclosure of Invention
In view of the deficiencies of the prior art, the invention provides a data recovery analysis method and system based on a solid state disk. The invention reads cache data in the solid state disk, acquires the data information deleted from the solid state disk, and extracts the data type, data information title and data characteristics of that information. The extracted values are imported into a constructed importance calculation strategy to calculate the importance of the deleted data, and into a constructed data security calculation strategy to calculate the data security of the deleted data. The calculated importance and data security are substituted into a data recovery coefficient calculation formula to obtain the recovery coefficient of the deleted data, which is compared with a set recovery coefficient threshold. If the recovery coefficient is greater than or equal to the threshold, the data garbage collection module compresses the deleted data and collects it in reserved storage particles in order of recovery coefficient; if the recovery coefficient is smaller than the threshold, the deleted data is deleted directly. The recovery probability of the data to be deleted can thus be analyzed accurately, both qualitatively and quantitatively, which prevents a large amount of useful data from being deleted.
In order to achieve the above purpose, the present invention provides the following technical solutions:
A data recovery analysis method based on a solid state disk comprises the following specific steps:
S1, reading cache data in the solid state disk, acquiring the data information deleted from the solid state disk, and extracting the data type, data information title and data characteristics of the deleted data information;
S2, importing the extracted data type, data information title and data characteristics of the deleted data information into a constructed importance calculation strategy to calculate the importance of the deleted data;
S3, importing the extracted data type, data information title and data characteristics of the deleted data information into a constructed data security calculation strategy to calculate the data security of the deleted data;
S4, acquiring the calculated importance and data security of the deleted data, and substituting them into a data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data;
S5, comparing the recovery coefficient of the deleted data with a set recovery coefficient threshold; if the recovery coefficient is greater than or equal to the threshold, the data garbage collection module compresses the deleted data and collects it in reserved storage particles in order of recovery coefficient; if the recovery coefficient is smaller than the threshold, the deleted data is deleted directly.
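The threshold decision in S5 can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `DeletedItem` structure and the `plan_recovery` function are hypothetical names, and the compression and garbage collection steps are represented only by the ordering of the retained items.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record for one deleted item and its recovery coefficient. */
typedef struct {
    int id;
    double recovery_coeff;
} DeletedItem;

/* qsort comparator: larger recovery coefficients come first. */
static int by_coeff_desc(const void *a, const void *b) {
    double x = ((const DeletedItem *)a)->recovery_coeff;
    double y = ((const DeletedItem *)b)->recovery_coeff;
    return (x < y) - (x > y);
}

/*
 * Partition items in place. Items whose coefficient is >= threshold are
 * moved to the front and sorted in descending order (candidates for the
 * reserved area); the remaining items follow and would be deleted directly.
 * Returns the number of retained items.
 */
size_t plan_recovery(DeletedItem *items, size_t count, double threshold) {
    assert(items != NULL || count == 0);
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (items[i].recovery_coeff >= threshold) {
            DeletedItem tmp = items[kept];
            items[kept] = items[i];
            items[i] = tmp;
            kept++;
        }
    }
    if (kept > 1) {
        qsort(items, kept, sizeof(items[0]), by_coeff_desc);
    }
    return kept;
}
```

A caller would pass the array of deleted items with their computed coefficients; the first `kept` entries are then handed to the collection step in order, and the rest are discarded.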
Specifically, step S1 includes the following specific steps:
S11, reading cache data in the solid state disk, acquiring the data information to be deleted from the solid state disk, and identifying its data type, data information title and data characteristics. The data type is the type of the deleted data information and is one of picture, document and video. The data information title is the title of the deleted data information. The data characteristics depend on the type: for a picture, they are the pixel values of each pixel of the picture scaled to a standard size; for a document, they are the Chinese character features of the document; for a video, they are the image features of each video frame. The data type, data information title and data characteristics of the deleted data are transmitted as a first dimension vector;
S12, reading the data type, data information title and data characteristics of the encrypted data information in the solid state disk, and transmitting them as a second dimension vector;
S13, reading the data type, data information title, data characteristics and number of recoveries of the deleted data information retrieved by the user, and transmitting them as a third dimension vector.
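The three vectors described in S11 to S13 could be represented by C structures along the following lines. This is only a sketch: the type names, field names, buffer sizes and the use of a flat `float` array for the data characteristics are assumptions made for illustration and do not come from the patent text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TITLE_MAX 64    /* assumed title buffer size */
#define FEATURE_MAX 16  /* assumed number of feature slots */

typedef enum { TYPE_PICTURE, TYPE_DOCUMENT, TYPE_VIDEO } DataType;

/* Fields shared by all three vectors: type, title and characteristics. */
typedef struct {
    DataType type;
    char title[TITLE_MAX];
    float features[FEATURE_MAX];
    size_t feature_count;
} DataRecord;

/* First vector (S11): data about to be deleted. */
typedef DataRecord DeletedVector;

/* Second vector (S12): encrypted data on the disk. */
typedef DataRecord EncryptedVector;

/* Third vector (S13): deleted data the user retrieved, plus a count. */
typedef struct {
    DataRecord record;
    unsigned recovery_count;
} RetrievedVector;

/* Fill a record, truncating the title and features to the buffer sizes. */
void record_init(DataRecord *r, DataType type, const char *title,
                 const float *features, size_t n) {
    assert(r != NULL && title != NULL);
    r->type = type;
    strncpy(r->title, title, TITLE_MAX - 1);
    r->title[TITLE_MAX - 1] = '\0';
    r->feature_count = n < FEATURE_MAX ? n : FEATURE_MAX;
    for (size_t i = 0; i < r->feature_count; i++) {
        r->features[i] = features[i];
    }
}
```

Sharing one `DataRecord` layout keeps the similarity calculations of S2 and S3 independent of which vector a record came from.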
Specifically, the importance calculation strategy in S2 includes the following specific contents:
S21, acquiring the data information to be deleted from the solid state disk and identifying its data type, data information title and data characteristics, and at the same time acquiring the data type, data information title, data characteristics and number of recoveries of the deleted data information retrieved by the user;
S22, substituting the data type, data information title and data characteristics of the identified deleted data information, together with those of one item of deleted data information retrieved by the user, into a first-level similarity calculation formula to calculate the first-level similarity between the two [formula image not reproduced in this text]. The terms of the formula are defined as follows: the result is the first-level similarity between the identified deleted data information and the i-th item of deleted data information retrieved by the user; the first-level judgment factor takes 1 if the two items have the same data type and 0 if their data types differ; the title weight coefficient and the data characteristic weight coefficient weight the title term and the data characteristic term; m() is the number of elements in the parentheses; z is the data information title of the identified deleted data information; $z_i$ is the data information title of the i-th item retrieved by the user; $z \cap z_i$ is the set of characters common to the two titles; $z \cup z_i$ is the union of the characters of the two titles; s is the data characteristics of the identified deleted data information; and $s_i$ is the data characteristics of the i-th item retrieved by the user;
S23, obtaining the calculated first-level similarity between the identified deleted data information and each item of deleted data information retrieved by the user, extracting the number of times the user retrieved each item, and substituting these values into an importance calculation formula to calculate the importance of the identified deleted data information [formula image not reproduced in this text]. In the formula, the retrieval count is the number of times the user retrieved the i-th item of deleted data information, and n is the total number of items of deleted data information retrieved by the user.
It should be noted that the encrypted data information and the recovered deleted data information are extracted and used only inside the system. Unless an outside party mounts a network attack, this information cannot be obtained externally, so confidentiality is not considered further here.
Specifically, the data security calculation strategy in S3 includes the following specific steps:
S31, identifying the data type, data information title and data characteristics of the deleted data information, and at the same time extracting the data type, data information title and data characteristics of the encrypted data information in the solid state disk;
S32, importing the data type, data information title and data characteristics of the identified deleted data information, together with those of one item of encrypted data information in the solid state disk, into a second-level similarity calculation formula to calculate the second-level similarity between the two [formula image not reproduced in this text]. The terms of the formula are defined as follows: the result is the second-level similarity between the identified deleted data information and the j-th item of encrypted data information in the solid state disk; the second-level judgment factor takes 1 if the two items have the same data type and 0 if their data types differ; $z_j$ is the data information title of the j-th item of encrypted data information; and $s_j$ is the data characteristics of the j-th item of encrypted data information;
S33, acquiring the second-level similarity between the identified deleted data information and every item of encrypted data information in the solid state disk, and substituting these values into a data security calculation formula to calculate the data security [formula image not reproduced in this text], where N is the number of items of encrypted data information in the solid state disk.
Specifically, S4 includes the following specific steps:
acquiring the calculated importance and data security of the deleted data, and substituting them into the data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data [formula image not reproduced in this text]. The formula uses an importance weight coefficient and a security weight coefficient.
It should be noted that the four weight coefficients (title, data characteristic, importance and security) and the recovery coefficient threshold are determined as follows: 5000 groups of historically deleted data that needed to be recovered and historically deleted data that did not need to be recovered are extracted and input into the data recovery coefficient calculation formula to calculate their recovery coefficients; the data that needed recovery is distinguished from the data that did not; and the calculated coefficients and the distinguishing results are imported into fitting software to obtain the optimal values of the four weight coefficients and the recovery coefficient threshold.
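The text does not say how the fitting software chooses its values. As one simple illustration, the C sketch below selects only the recovery coefficient threshold, given already computed coefficients and labels for historical samples, by trying each observed coefficient as a candidate and keeping the one that classifies the most samples correctly. The accuracy criterion, the function names and the restriction to the threshold alone (the weight coefficients are not fitted) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Count samples classified correctly when "recover" means coeff >= threshold. */
static size_t count_correct(const double *coeff, const int *should_recover,
                            size_t n, double threshold) {
    size_t ok = 0;
    for (size_t i = 0; i < n; i++) {
        int predicted = coeff[i] >= threshold;
        if (predicted == (should_recover[i] != 0)) {
            ok++;
        }
    }
    return ok;
}

/*
 * Try every observed coefficient as a candidate threshold and return the
 * one with the highest accuracy (the first such candidate on ties).
 */
double fit_threshold(const double *coeff, const int *should_recover, size_t n) {
    assert(coeff != NULL && should_recover != NULL && n > 0);
    double best = coeff[0];
    size_t best_ok = count_correct(coeff, should_recover, n, best);
    for (size_t i = 1; i < n; i++) {
        size_t ok = count_correct(coeff, should_recover, n, coeff[i]);
        if (ok > best_ok) {
            best_ok = ok;
            best = coeff[i];
        }
    }
    return best;
}
```

With 5000 samples this exhaustive search performs 25 million comparisons, which is still fast; a real fitting tool would also tune the weight coefficients jointly.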
A data recovery analysis system based on a solid state disk is implemented on the basis of the above data recovery analysis method and comprises a data acquisition module, an importance calculation module, a security calculation module, a data recovery coefficient calculation module, a data comparison module and a control module. The data acquisition module reads cache data in the solid state disk, acquires the data information deleted from the solid state disk, and extracts the data type, data information title and data characteristics of the deleted data information. The importance calculation module imports the extracted data type, data information title and data characteristics into the constructed importance calculation strategy to calculate the importance of the deleted data. The security calculation module imports the extracted data type, data information title and data characteristics into the constructed data security calculation strategy to calculate the data security of the deleted data.
Specifically, the data recovery coefficient calculation module acquires the calculated importance and data security of the deleted data and substitutes them into the data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data. The data comparison module compares the recovery coefficient of the deleted data with the set recovery coefficient threshold. The control module controls the operation of the data acquisition module, the importance calculation module, the security calculation module, the data recovery coefficient calculation module and the data comparison module.
An electronic device comprises a processor and a memory, wherein the memory stores a computer program that the processor can call;
the processor executes the data recovery analysis method based on a solid state disk described above by calling the computer program stored in the memory.
A computer readable storage medium storing instructions that, when executed on a computer, cause the computer to perform a solid state disk-based data recovery analysis method as described above.
Compared with the prior art, the invention has the following beneficial effects:
The method reads cache data in the solid state disk, acquires the data information deleted from the solid state disk, and extracts the data type, data information title and data characteristics of the deleted data information. It imports the extracted values into the constructed importance calculation strategy to calculate the importance of the deleted data, and into the constructed data security calculation strategy to calculate the data security of the deleted data. It substitutes the calculated importance and data security into the data recovery coefficient calculation formula to obtain the recovery coefficient of the deleted data and compares that coefficient with the set recovery coefficient threshold. If the recovery coefficient is greater than or equal to the threshold, the data garbage collection module compresses the deleted data and collects it in reserved storage particles in order of recovery coefficient; if the recovery coefficient is smaller than the threshold, the deleted data is deleted directly. The recovery probability of the data to be deleted is thus analyzed accurately, both qualitatively and quantitatively, which prevents a large amount of useful data from being deleted.
Drawings
FIG. 1 is a schematic flow chart of the data recovery analysis method based on a solid state disk of the present invention;
FIG. 2 is a schematic diagram of the overall framework of the data recovery analysis system based on a solid state disk of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention.
Example 1
Referring to FIG. 1, an embodiment of the present invention is provided: a data recovery analysis method based on a solid state disk comprises the following specific steps:
S1, reading cache data in the solid state disk, acquiring the data information deleted from the solid state disk, and extracting the data type, data information title and data characteristics of the deleted data information;
S2, importing the extracted data type, data information title and data characteristics of the deleted data information into a constructed importance calculation strategy to calculate the importance of the deleted data;
S3, importing the extracted data type, data information title and data characteristics of the deleted data information into a constructed data security calculation strategy to calculate the data security of the deleted data;
S4, acquiring the calculated importance and data security of the deleted data, and substituting them into a data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data;
S5, comparing the recovery coefficient of the deleted data with a set recovery coefficient threshold; if the recovery coefficient is greater than or equal to the threshold, the data garbage collection module compresses the deleted data and collects it in reserved storage particles in order of recovery coefficient; if the recovery coefficient is smaller than the threshold, the deleted data is deleted directly;
In this embodiment, it should be noted that S1 includes the following specific steps:
S11, reading cache data in the solid state disk, acquiring the data information to be deleted from the solid state disk, and identifying its data type, data information title and data characteristics. The data type is the type of the deleted data information and is one of picture, document and video. The data information title is the title of the deleted data information. The data characteristics depend on the type: for a picture, they are the pixel values of each pixel of the picture scaled to a standard size; for a document, they are the Chinese character features of the document; for a video, they are the image features of each video frame. The data type, data information title and data characteristics of the deleted data are transmitted as a first dimension vector;
S12, reading the data type, data information title and data characteristics of the encrypted data information in the solid state disk, and transmitting them as a second dimension vector;
S13, reading the data type, data information title, data characteristics and number of recoveries of the deleted data information retrieved by the user, and transmitting them as a third dimension vector;
The following is a simple C language example that reads cache data in a solid state disk (SSD) and acquires deleted data information. The example assumes that the cache data is stored in a file named 'cache_data.txt' and the deleted data information in a file named 'deleted_data.txt', where each line of the latter holds a data type, a data information title and a numeric data feature separated by whitespace:
#include <stdio.h>

void read_cache_data(const char *filename);
void extract_deleted_data(const char *filename);

int main(void) {
    read_cache_data("cache_data.txt");
    extract_deleted_data("deleted_data.txt");
    return 0;
}

/* Print the cache data file line by line. */
void read_cache_data(const char *filename) {
    FILE *file = fopen(filename, "r");
    if (file == NULL) {
        printf("Unable to open file: %s\n", filename);
        return;
    }
    char line[256];
    while (fgets(line, sizeof(line), file)) {
        printf("%s", line);
    }
    fclose(file);
}

/* Parse each line as "<type> <title> <feature>" and print the fields. */
void extract_deleted_data(const char *filename) {
    FILE *file = fopen(filename, "r");
    if (file == NULL) {
        printf("Unable to open file: %s\n", filename);
        return;
    }
    char line[256];
    while (fgets(line, sizeof(line), file)) {
        char data_type[32];
        char data_title[64];
        float data_feature;
        /* Field widths guard the buffers; malformed lines are skipped. */
        if (sscanf(line, "%31s %63s %f", data_type, data_title, &data_feature) == 3) {
            printf("data type: %s\n", data_type);
            printf("data information title: %s\n", data_title);
            printf("data feature: %.2f\n", data_feature);
        }
    }
    fclose(file);
}
This example code defines two functions: 'read_cache_data' reads the cache data, and 'extract_deleted_data' extracts the deleted data information from the 'deleted_data.txt' file. The main function first calls 'read_cache_data' to read the cache data and then calls 'extract_deleted_data' to extract the information of the deleted data;
This example code is for demonstration only. In a practical application the code would need to be adapted, because the way an SSD stores cache data and deleted data differs between hardware manufacturers and models;
in this embodiment, it should be noted that the importance calculating strategy in S2 includes the following specific contents:
s21, acquiring data information to be deleted from the solid state disk, identifying the data type, the data information title and the data characteristics of the data information of the deleted data information, and simultaneously acquiring the data type, the data information title and the data characteristics and the recovery times of the data information of the deleted data information, which are read by a user;
S22, substituting the data type, data information title and data characteristics of the identified deleted data information, together with those of the deleted data information retrieved by the user, into a similarity calculation formula to calculate the first-level similarity between the identified deleted data information and each item of deleted data information retrieved by the user. The first-level similarity calculation formula is: [formula not reproduced in this text], where the terms are: the first-level similarity between the identified deleted data information and the i-th item of deleted data information retrieved by the user; the first-level judgment factor, which takes 1 if the data type of the identified deleted data information is the same as that of the i-th retrieved item and 0 if the data types differ; the title ratio coefficient; the data characteristic ratio coefficient; M( ), the number of elements in the parentheses; z, the data information title of the identified deleted data information; the data information title of the i-th retrieved item; the characters common to the two titles; the union of the characters of the two titles; s, the data characteristics of the identified deleted data information; and the data characteristics of the i-th retrieved item. For a document, the data characteristics are its characters; for pictures, the common part is the area of the identical portion of the two pictures and the union is the union area of the two pictures;
S23, obtaining the calculated first-level similarity between the identified deleted data information and each item of deleted data information retrieved by the user, extracting the number of times each item was retrieved, and substituting these values into an importance calculation formula to calculate the importance of the identified deleted data information. The importance calculation formula is: [formula not reproduced in this text], where the terms are the number of times the i-th item of deleted data information was retrieved by the user, and n, the total number of times the user has retrieved deleted data information;
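As a rough illustration of steps S21 to S23, the Python sketch below computes a first-level similarity and an importance value. The formulas themselves are not reproduced in this text, so the combination rule is an assumption: the similarity is taken as the judgment factor multiplied by a weighted sum of two intersection-over-union ratios (titles and data characteristics), and the importance as the average of the similarities weighted by each item's share of all retrievals. The names `Item`, `overlap_ratio`, `first_level_similarity`, `importance`, `title_weight` and `feature_weight` are hypothetical, and the sample values are toy data.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One piece of data information (hypothetical container)."""
    data_type: str        # e.g. "document" or "picture"
    title: str            # data information title
    features: frozenset   # data characteristics, e.g. the characters of a document
    retrievals: int = 0   # number of times the user retrieved this item


def overlap_ratio(a, b) -> float:
    """M(a ∩ b) / M(a ∪ b); defined here as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def first_level_similarity(deleted: Item, retrieved: Item,
                           title_weight: float = 0.5,
                           feature_weight: float = 0.5) -> float:
    # Judgment factor: 1 if the data types match, otherwise 0.
    factor = 1.0 if deleted.data_type == retrieved.data_type else 0.0
    title_part = overlap_ratio(deleted.title, retrieved.title)
    feature_part = overlap_ratio(deleted.features, retrieved.features)
    # ASSUMPTION: weighted sum of the two ratios, gated by the judgment factor.
    return factor * (title_weight * title_part + feature_weight * feature_part)


def importance(deleted: Item, retrieved_items: list) -> float:
    # ASSUMPTION: each similarity is weighted by that item's share of all retrievals.
    total = sum(item.retrievals for item in retrieved_items)
    if total == 0:
        return 0.0
    return sum(item.retrievals / total * first_level_similarity(deleted, item)
               for item in retrieved_items)


# Toy data, for illustration only.
deleted = Item("document", "report", frozenset("quarterly sales"))
history = [
    Item("document", "report", frozenset("quarterly sales"), retrievals=3),
    Item("picture", "report", frozenset("quarterly sales"), retrievals=1),
]
print(importance(deleted, history))
```

In this toy run the picture item contributes nothing because its data type differs, so only the matching document, which accounts for three of the four retrievals, raises the importance.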
It should be noted that the encrypted data information and the retrieved deleted data information are extracted and used only within the system and cannot be obtained by an outside party unless a network attack is carried out, so confidentiality issues are not specifically considered here;
In this embodiment, it should be noted that the data security calculation strategy in S3 includes the following specific steps:
S31, identifying the data type, data information title and data characteristics of the deleted data information, and simultaneously extracting the data type, data information title and data characteristics of the encrypted data information in the solid state disk;
S32, importing the data type, data information title and data characteristics of the identified deleted data information, together with those of the encrypted data information in the solid state disk, into a secondary similarity calculation formula to calculate the secondary similarity between the identified deleted data information and each item of encrypted data information in the solid state disk. The secondary similarity calculation formula is: [formula not reproduced in this text], where the terms are: the secondary similarity between the identified deleted data information and the j-th item of encrypted data information in the solid state disk; the second-level judgment factor, which takes 1 if the data type of the j-th item of encrypted data information is the same as that of the identified deleted data information and 0 if the data types differ; the data information title of the j-th item of encrypted data information; and the data characteristics of the j-th item of encrypted data information;
S33, acquiring the secondary similarity between the identified deleted data information and every item of encrypted data information in the solid state disk, and substituting these values into a data security calculation formula to calculate the data security. The data security calculation formula is: [formula not reproduced in this text], where N is the number of items of encrypted data information in the solid state disk;
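Steps S31 to S33 can be sketched in the same spirit. The secondary similarity and data security formulas are not reproduced in this text, so this sketch assumes that the secondary similarity has the same shape as the first-level one (judgment factor times a weighted sum of title and data characteristic overlap ratios) and that the data security is the mean of the secondary similarities over the N encrypted items. `Record`, `secondary_similarity`, `data_security` and the two weights are hypothetical names, and the records are toy data.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Data type, title and data characteristics of one item (hypothetical)."""
    data_type: str
    title: str
    features: str   # for a document, its characters


def overlap_ratio(a, b) -> float:
    """Size of the intersection divided by the size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def secondary_similarity(deleted: Record, encrypted: Record,
                         title_weight: float = 0.5,
                         feature_weight: float = 0.5) -> float:
    # Second-level judgment factor: 1 if the data types match, otherwise 0.
    factor = 1.0 if deleted.data_type == encrypted.data_type else 0.0
    # ASSUMPTION: same weighted-overlap form as the first-level similarity.
    return factor * (title_weight * overlap_ratio(deleted.title, encrypted.title)
                     + feature_weight * overlap_ratio(deleted.features, encrypted.features))


def data_security(deleted: Record, encrypted_items: list) -> float:
    # ASSUMPTION: average of the secondary similarities over the N encrypted items.
    if not encrypted_items:
        return 0.0
    total = sum(secondary_similarity(deleted, e) for e in encrypted_items)
    return total / len(encrypted_items)


# Toy data, for illustration only.
deleted = Record("document", "keys", "abc")
encrypted_items = [
    Record("document", "keys", "abc"),
    Record("document", "keys", "abd"),
]
print(data_security(deleted, encrypted_items))
```

Under these assumptions, a deleted item that closely resembles the encrypted items already on the disk receives a higher data security value.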
In this embodiment, it should be noted that S4 includes the following specific step:
obtaining the calculated importance and data security of the deleted data and substituting them into a data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data. The data recovery coefficient calculation formula is: [formula not reproduced in this text], where the terms are the importance ratio coefficient and the security ratio coefficient;
It should be noted here that the ratio coefficients in the above formulas and the recovery coefficient threshold are determined as follows: 5000 groups of historical deleted data that needed to be recovered and deleted data that did not need to be recovered are extracted and input into the data recovery coefficient calculation formula to calculate their data recovery coefficients; the data that needed to be recovered is distinguished from the data that did not; the calculated coefficients and the distinguishing results are then imported into fitting software to obtain the values of the ratio coefficients and of the recovery coefficient threshold that give the best judgment accuracy;
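The fitting step can be pictured with a small grid search. This is only a stand-in for the unspecified fitting software, and it rests on two assumptions that the text does not confirm: the recovery coefficient is a linear combination of importance and data security, and the two ratio coefficients sum to 1. For brevity it fits only the importance ratio coefficient and the threshold, not the title and data characteristic ratio coefficients. All names and the four labelled samples are hypothetical toy values, not the 5000 historical groups.

```python
from itertools import product


def recovery_coefficient(importance: float, security: float,
                         w_importance: float) -> float:
    # ASSUMPTION: linear combination whose two ratio coefficients sum to 1.
    return w_importance * importance + (1.0 - w_importance) * security


def accuracy(samples, w_importance: float, threshold: float) -> float:
    """Share of samples whose keep/delete decision matches the label."""
    correct = sum(
        (recovery_coefficient(imp, sec, w_importance) >= threshold) == needed
        for imp, sec, needed in samples
    )
    return correct / len(samples)


def fit(samples, steps: int = 20):
    """Grid search over the importance ratio coefficient and the threshold."""
    grid = [k / steps for k in range(steps + 1)]
    return max(product(grid, grid), key=lambda p: accuracy(samples, p[0], p[1]))


# Toy samples: (importance, data security, needed recovery?)
samples = [
    (0.9, 0.8, True),
    (0.7, 0.9, True),
    (0.2, 0.1, False),
    (0.3, 0.2, False),
]
w, t = fit(samples)
print(w, t, accuracy(samples, w, t))
```

With real historical data, several parameter pairs may tie on accuracy; `max` simply returns the first one found, so a practical fit would need an explicit tie-breaking rule or a held-out validation set.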
The method reads cache data in the solid state disk, simultaneously acquires the data information deleted from the solid state disk, and extracts the data type, data information title and data characteristics of that data information. The extracted data type, data information title and data characteristics are imported into the constructed importance calculation strategy to calculate the importance of the deleted data, and into the constructed data security calculation strategy to calculate the data security of the deleted data. The calculated importance and data security are substituted into the data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data, which is compared with the set recovery coefficient threshold. If the recovery coefficient is greater than or equal to the threshold, the data garbage collection module compresses the deleted data and collects it in reserved storage particles in order of recovery coefficient; if the recovery coefficient is smaller than the threshold, the deleted data is deleted directly. Deleted data is thereby analyzed accurately and quantitatively, which helps prevent data that may still need to be recovered from being removed.
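The final comparison step can be sketched as follows. The sketch assumes that "in order of recovery coefficient" means largest first, and it uses `zlib` compression and an in-memory list as stand-ins for the garbage collection module and the reserved storage particles. `triage` and the sample items are hypothetical.

```python
import zlib


def triage(deleted_items, threshold: float):
    """Split deleted items into retained (compressed) and directly deleted.

    deleted_items: list of (name, payload_bytes, recovery_coefficient).
    """
    keep = [item for item in deleted_items if item[2] >= threshold]
    # ASSUMPTION: items are collected with the largest recovery coefficient first.
    keep.sort(key=lambda item: item[2], reverse=True)
    # Stand-in for compressing data into the reserved storage area.
    reserved = [(name, zlib.compress(payload)) for name, payload, _ in keep]
    discarded = [name for name, _, coeff in deleted_items if coeff < threshold]
    return reserved, discarded


# Toy data, for illustration only.
items = [
    ("a.txt", b"aaaa" * 10, 0.4),
    ("b.txt", b"bbbb" * 10, 0.9),
    ("c.txt", b"cccc" * 10, 0.7),
]
reserved, discarded = triage(items, threshold=0.6)
print([name for name, _ in reserved], discarded)
```

Retained payloads can later be restored with `zlib.decompress`, which mirrors the idea that data above the threshold stays recoverable while data below it is dropped.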
Example 2
As shown in fig. 2, a data recovery analysis system based on a solid state disk is implemented on the basis of the above data recovery analysis method based on the solid state disk and includes a data acquisition module, an importance calculation module, a security calculation module, a data recovery coefficient calculation module, a data comparison module and a control module. The data acquisition module is used for reading cache data in the solid state disk, simultaneously acquiring the data information deleted from the solid state disk, and extracting the data type, data information title and data characteristics of that data information. The importance calculation module is used for importing the extracted data type, data information title and data characteristics of the data information deleted from the solid state disk into the constructed importance calculation strategy to calculate the importance of the deleted data. The security calculation module is used for importing the extracted data type, data information title and data characteristics of the data information deleted from the solid state disk into the constructed data security calculation strategy to calculate the data security of the deleted data.
In this embodiment, the data recovery coefficient calculation module is configured to obtain the calculated importance degree of the deleted data and the calculated data security degree of the deleted data, and substitute the importance degree and the calculated data security degree into the data recovery coefficient calculation formula to perform calculation of the recovery coefficient of the deleted data, and the data comparison module is configured to compare the recovery coefficient of the deleted data with a set recovery coefficient threshold, and the control module is configured to control operations of the data acquisition module, the importance degree calculation module, the security degree calculation module, the data recovery coefficient calculation module, and the data comparison module.
Example 3
The present embodiment provides an electronic device including: a processor and a memory, wherein the memory stores a computer program for the processor to call;
the processor executes the data recovery analysis method based on the solid state disk by calling the computer program stored in the memory.
Electronic devices may differ considerably in configuration or performance. The electronic device can include one or more processors (Central Processing Units, CPU) and one or more memories, where at least one computer program is stored in the memories, and the computer program is loaded and executed by the processors to implement the data recovery analysis method based on a solid state disk provided by the above method embodiment. The electronic device can also include other components for implementing the functions of the device; for example, it can have wired or wireless network interfaces, input-output interfaces, and the like, for inputting and outputting data. These are not described further in this embodiment.
Example 4
The present embodiment proposes a computer-readable storage medium having stored thereon an erasable computer program;
When the computer program runs on the computer equipment, the computer equipment is caused to execute the data recovery analysis method based on the solid state disk.
For example, the computer readable storage medium can be Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), compact disk Read-Only Memory (Compact Disc Read-Only Memory, CD-ROM), magnetic tape, floppy disk, optical data storage device, etc.
It should be understood that, in various embodiments of the present application, the sequence numbers of the foregoing processes do not mean the order of execution, and the order of execution of the processes should be determined by the functions and internal logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present application.
It should be understood that determining B from A does not mean determining B from A alone; B can also be determined from A and/or other information.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. When the computer instructions or computer programs are loaded or executed on a computer, the processes or functions in accordance with embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another over wired and/or wireless networks. The computer-readable storage medium can be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that contains one or more collections of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the partitioning of units is merely one way of partitioning, and there may be other ways of partitioning in actual implementation; multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices or units, and may be in electrical, mechanical or other form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
In the description of the present specification, the descriptions of the terms "one embodiment," "example," "specific example," and the like, mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The preferred embodiments of the invention disclosed above are intended only to assist in the explanation of the invention. The preferred embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. The invention is limited only by the claims and the full scope and equivalents thereof.

Claims (9)

1. The data recovery analysis method based on the solid state disk is characterized by comprising the following specific steps of:
S1, reading cache data in a solid state disk, simultaneously acquiring data information deleted from the solid state disk, and extracting the data type, the data information title and the data characteristics of the data information deleted from the solid state disk;
S2, importing the extracted data type, data information title and data characteristics of the data information deleted from the solid state disk into a constructed importance calculation strategy to calculate the importance of the deleted data;
S3, importing the extracted data type, data information title and data characteristics of the data information deleted from the solid state disk into a constructed data security calculation strategy to calculate the data security of the deleted data;
S4, acquiring the calculated importance of the deleted data and the calculated data security of the deleted data, substituting them into a data recovery coefficient calculation formula, and calculating the recovery coefficient of the deleted data;
S5, comparing the recovery coefficient of the deleted data with a set recovery coefficient threshold; if the recovery coefficient of the deleted data is greater than or equal to the set recovery coefficient threshold, collecting the deleted data in reserved storage particles through compression by a data garbage collection module according to the order of the recovery coefficient of the deleted data; and if the recovery coefficient of the deleted data is smaller than the set recovery coefficient threshold, directly deleting the deleted data;
the data security calculation strategy in S3 comprises the following specific steps:
S31, identifying the data type, data information title and data characteristics of the deleted data information, and simultaneously extracting the data type, data information title and data characteristics of the encrypted data information in the solid state disk;
S32, importing the data type, data information title and data characteristics of the identified deleted data information, together with those of the encrypted data information in the solid state disk, into a secondary similarity calculation formula to calculate the secondary similarity between the identified deleted data information and each item of encrypted data information in the solid state disk, wherein the secondary similarity calculation formula is: [formula not reproduced in this text], where the terms are: the secondary similarity between the identified deleted data information and the j-th item of encrypted data information in the solid state disk; the second-level judgment factor, which takes 1 if the data type of the j-th item of encrypted data information is the same as that of the identified deleted data information and 0 if the data types differ; the data information title of the j-th item of encrypted data information; and the data characteristics of the j-th item of encrypted data information;
S33, acquiring the secondary similarity between the identified deleted data information and every item of encrypted data information in the solid state disk, and substituting these values into a data security calculation formula to calculate the data security, wherein the data security calculation formula is: [formula not reproduced in this text], where N is the number of items of encrypted data information in the solid state disk.
2. The method for analyzing data recovery based on solid state disk as claimed in claim 1, wherein the step S1 comprises the following specific steps:
S11, reading cache data in the solid state disk, acquiring the data information to be deleted from the solid state disk, and identifying the data type, data information title and data characteristics of the deleted data information, wherein the data type is the type of the deleted data information, and the data type, data information title and data characteristics of the deleted data information are transmitted in the form of a first dimension vector;
S12, reading the data type, data information title and data characteristics of the encrypted data information in the solid state disk, and transmitting them in the form of a second dimension vector;
S13, reading the data type, data information title, data characteristics and number of retrievals of the deleted data information retrieved by the user, and transmitting them in the form of a third dimension vector.
3. The method for analyzing data recovery based on solid state disk as claimed in claim 2, wherein the importance calculation strategy in S2 comprises the following specific contents:
S21, acquiring the data information to be deleted from the solid state disk and identifying its data type, data information title and data characteristics, and simultaneously acquiring the data type, data information title, data characteristics and number of retrievals of each item of deleted data information that the user has retrieved;
S22, substituting the data type, data information title and data characteristics of the identified deleted data information, together with those of the deleted data information retrieved by the user, into a similarity calculation formula to calculate the first-level similarity between the identified deleted data information and each item of deleted data information retrieved by the user, wherein the first-level similarity calculation formula is: [formula not reproduced in this text], where the terms are: the first-level similarity between the identified deleted data information and the i-th item of deleted data information retrieved by the user; the first-level judgment factor, which takes 1 if the data type of the identified deleted data information is the same as that of the i-th retrieved item and 0 if the data types differ; the title ratio coefficient; the data characteristic ratio coefficient; M( ), the number of elements in the parentheses; z, the data information title of the identified deleted data information; the data information title of the i-th retrieved item; the characters common to the two titles; the union of the characters of the two titles; s, the data characteristics of the identified deleted data information; and the data characteristics of the i-th retrieved item.
4. The method for data recovery analysis based on solid state disk as claimed in claim 3, wherein the step S2 further comprises the following specific steps:
S23, obtaining the calculated first-level similarity between the identified deleted data information and each item of deleted data information retrieved by the user, extracting the number of times each item was retrieved, and substituting these values into an importance calculation formula to calculate the importance of the identified deleted data information, wherein the importance calculation formula is: [formula not reproduced in this text], where the terms are the number of times the i-th item of deleted data information was retrieved by the user, and n, the total number of times the user has retrieved deleted data information.
5. The method for analyzing data recovery based on the solid state disk as claimed in claim 4, wherein the specific content of S4 comprises the following specific steps:
obtaining the calculated importance and data security of the deleted data and substituting them into a data recovery coefficient calculation formula to calculate the recovery coefficient of the deleted data, wherein the data recovery coefficient calculation formula is: [formula not reproduced in this text], where the terms are the importance ratio coefficient and the security ratio coefficient.
6. A solid state disk-based data recovery analysis system, which is realized based on the solid state disk-based data recovery analysis method according to any one of claims 1 to 5, characterized by comprising a data acquisition module, an importance calculation module, a security calculation module, a data recovery coefficient calculation module, a data comparison module and a control module, wherein the data acquisition module is used for reading cache data in the solid state disk, simultaneously acquiring data information deleted from the solid state disk, and extracting the data type, the data information title and the data characteristics of the data information deleted from the solid state disk; the importance calculation module is used for importing the extracted data type, data information title and data characteristics of the data information deleted from the solid state disk into the constructed importance calculation strategy to calculate the importance of the deleted data; and the security calculation module is used for importing the extracted data type, data information title and data characteristics of the data information deleted from the solid state disk into the constructed data security calculation strategy to calculate the data security of the deleted data.
7. The system of claim 6, wherein the data recovery coefficient calculation module is configured to obtain the calculated importance of the deleted data and the calculated data security of the deleted data, and substitute the importance of the deleted data and the calculated data security into the data recovery coefficient calculation formula to perform calculation of the recovery coefficient of the deleted data, the data comparison module is configured to compare the recovery coefficient of the deleted data with a set recovery coefficient threshold, and the control module is configured to control operations of the data acquisition module, the importance calculation module, the security calculation module, the data recovery coefficient calculation module, and the data comparison module.
8. An electronic device, comprising: a processor and a memory, wherein the memory stores a computer program for the processor to call;
characterized in that the processor executes the data recovery analysis method based on the solid state disk according to any one of claims 1 to 5 by calling the computer program stored in the memory.
9. A computer readable storage medium storing instructions which, when executed on a computer, cause the computer to perform a solid state disk based data recovery analysis method according to any one of claims 1 to 5.
CN202410027052.1A 2024-01-09 2024-01-09 Data recovery analysis method and system based on solid state disk Active CN117540176B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410027052.1A CN117540176B (en) 2024-01-09 2024-01-09 Data recovery analysis method and system based on solid state disk

Publications (2)

Publication Number Publication Date
CN117540176A CN117540176A (en) 2024-02-09
CN117540176B true CN117540176B (en) 2024-04-02

Family

ID=89796196

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410027052.1A Active CN117540176B (en) 2024-01-09 2024-01-09 Data recovery analysis method and system based on solid state disk

Country Status (1)

Country Link
CN (1) CN117540176B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant