CN105790768B - Data storage method and system

Info

Publication number: CN105790768B
Application number: CN201410809119.3A
Authority: CN
Inventors: 向宏卫; 陈小川
Original assignee: Allwinner Technology Co Ltd
Current assignee: Allwinner Technology Co Ltd
Priority date: 2014-12-19
Filing date: 2014-12-19
Publication date: 2018-12-25
Anticipated expiration: 2034-12-19
Also published as: CN105790768A

Abstract

The invention discloses a data storage method and system. The method includes the following steps: parsing a media file, obtaining media information and first code stream data of the media file, and saving the media information to a first preset file; compressing the first code stream data using a hash algorithm compression mode to obtain a first compressed code stream, and saving the first compressed code stream to the first preset file; reading the first code stream data and outputting it to a decoder for decoding, obtaining image information and second code stream data of the media file, and saving the image information to a second preset file; and compressing the second code stream data using the hash algorithm compression mode to obtain a second compressed code stream, and saving the second compressed code stream to the second preset file. Because the media information and image information of the media file are left uncompressed while the vast majority of the code stream data is compressed using the hash algorithm, storage space is saved, which solves the problem that files compressed by existing data storage compression methods still occupy a large storage space.

Description

Data storage method and system

Technical Field

The present invention relates to the field of storage, and in particular, to a data storage method and system.

Background

In the media automation test, various media information needs to be processed and stored, especially audio and video code streams, and the data volume is very large. These large amounts of data exceed the storage and processing capabilities of the computer and are beyond the current file transfer rates. How to store and transmit these data efficiently and how to match quickly to the required information from these data is a challenge for the user. Therefore, in order to store, process and transmit the data, an efficient data storage method must be adopted to compress and store the data.

Existing compression methods suitable for text file data are mainly lossless. However, current lossless compression algorithms work by finding identical, repeated sequences in a file. Even for highly repetitive data such as text files, lossless compression generally achieves a maximum compression rate of only about 70%, that is, the compressed file is about 30% of the size of the original file. The compressed file therefore still occupies a large storage space.

Disclosure of Invention

Therefore, it is necessary to provide a data storage method and system for solving the problem that a file compressed by the existing data storage compression method still occupies a large storage space.

The data storage method provided for realizing the aim of the invention comprises the following steps:

analyzing a media file, acquiring media information and first code stream data of the media file, and storing the media information to a first preset file;

compressing the first code stream data by adopting a Hash algorithm compression mode to obtain a first compressed code stream, and storing the first compressed code stream to the first preset file;

reading and outputting the first code stream data to a decoder for decoding, acquiring image information and second code stream data of the media file, and storing the image information to a second preset file;

and compressing the second code stream data by adopting the Hash algorithm compression mode to obtain a second compressed code stream, and storing the second compressed code stream to the second preset file.

In one embodiment, the method further comprises the following steps:

and respectively comparing the first compressed code stream with a first preset reference file and the second compressed code stream with a second preset reference file according to the hash value obtained by adopting the hash algorithm compression mode, obtaining a test result, and determining whether the first compressed code stream and the second compressed code stream are correct.

In one embodiment, the compressing the first code stream data by using a hash algorithm compression method to obtain a first compressed code stream includes the following steps:

setting an initialization value for hash value calculation;

padding each code stream in the first code stream data, and appending the original length value of each code stream to that code stream after padding;

dividing each code stream to obtain a plurality of first data blocks M₀ to M_n of the same size, wherein n is a positive integer;

and processing the first data block of each code stream in groups by adopting a compression function until all code streams in the first code stream data are completely processed to obtain the first compressed code stream.

In one embodiment, processing the first data blocks of each code stream in groups by using a compression function includes the following steps:

identifying a first buffer, a second buffer and a third buffer required for calculating the hash value;

wherein the first buffer is a buffer of five 32-bit words; the second buffer is a buffer of 80 32-bit words; the third buffer is a buffer of one 32-bit word; and,

the first buffer is identified as H₀, H₁, H₂, H₃, H₄; the second buffer is identified as W₀ to W₇₉; the third buffer is identified as TEMP;

dividing each first data block M_i into 16 words W₀ to W₁₅, wherein W₀ is the leftmost word and i is greater than or equal to 0 and less than or equal to n;

for i greater than or equal to 16 and less than or equal to 79, letting W_i = S₁(W_{i-3} xor W_{i-8} xor W_{i-14} xor W_{i-16});

letting A = H₀, B = H₁, C = H₂, D = H₃, E = H₄;

for i greater than or equal to 0 and less than or equal to 79, executing the loop TEMP = S₅(A) + f_i(B, C, D) + E + W_i + K_i; E = D; D = C; C = S₃₀(B); B = A; A = TEMP;

letting H₀ = H₀ + A, H₁ = H₁ + B, H₂ = H₂ + C, H₃ = H₃ + D, H₄ = H₄ + E;

wherein f_i(B, C, D) is the compression function; K_i is a constant word; A is a first intermediate variable, B is a second intermediate variable, C is a third intermediate variable, D is a fourth intermediate variable, and E is a fifth intermediate variable.

In one embodiment, the compression function f_i(B, C, D) is:

the constant word K_i is:

In one embodiment, compressing the second code stream data by using the hash algorithm to obtain a second compressed code stream includes the following steps:

setting an initialization value for hash value calculation;

padding each code stream in the second code stream data, and adding the original length value of each code stream to each code stream after padding;

dividing each code stream to obtain a plurality of second data blocks M₀' to M_n' of the same size, wherein n is a positive integer;

and processing the second data block in each code stream in groups by adopting a compression function until all the code streams in the second code stream data are completely processed to obtain a second compressed code stream.

Correspondingly, based on the same invention concept, the invention also provides a data storage system, which comprises an analysis storage module, a first compression module, a reading and storing module, a decoder and a second compression module; wherein,

the analysis storage module is used for analyzing the media file, acquiring the media information and the first code stream data of the media file, and storing the media information to a first preset file;

the first compression module is used for compressing the first code stream data by adopting a Hash algorithm compression mode to obtain a first compressed code stream, and storing the first compressed code stream to the first preset file;

the reading and storing module is used for reading and outputting the first code stream data to the decoder;

the decoder is used for decoding the first code stream data to obtain the image information and the second code stream data of the media file;

the reading and storing module is also used for storing the image information to a second preset file;

the second compression module is used for compressing the second code stream data by adopting the Hash algorithm compression mode to obtain a second compressed code stream, and storing the second compressed code stream to the second preset file.

In one embodiment, the system further comprises a comparison module;

the comparison module is used for respectively comparing the first compressed code stream with a corresponding first preset reference file and the second compressed code stream with a corresponding second preset reference file according to the hash value obtained by the hash algorithm compression mode, obtaining a test result and determining whether the first compressed code stream and the second compressed code stream are correct.

In one embodiment, the first compression module includes a first initialization setting unit, a first padding unit, a first additional length value unit, a first splitting unit, and a first processing unit; wherein,

the first initialization setting unit is used for setting an initialization value for the hash value calculation;

the first padding unit is configured to pad each code stream in the first code stream data;

the first additional length value unit is used for appending the original length value of each code stream to that code stream after padding;

the first splitting unit is used for dividing each code stream to obtain a plurality of first data blocks M₀ to M_n of the same size, wherein n is a positive integer;

the first processing unit is configured to process the first data block in each code stream in a grouping manner by using a compression function until all code streams in the first code stream data are completely processed, so as to obtain the first compressed code stream.

In one embodiment, the first processing unit comprises an identification subunit, a word-dividing subunit, a first calculation subunit, an assignment subunit, a loop subunit and a second calculation subunit; wherein,

the identification subunit is configured to identify a first buffer, a second buffer, and a third buffer required for calculating the hash value;

the word-dividing subunit is used for dividing each first data block M_i into 16 words W₀ to W₁₅, wherein W₀ is the leftmost word and i is greater than or equal to 0 and less than or equal to n;

the first calculation subunit is used for letting W_i = S₁(W_{i-3} xor W_{i-8} xor W_{i-14} xor W_{i-16}) for i greater than or equal to 16 and less than or equal to 79;

the assignment subunit is used for letting A = H₀, B = H₁, C = H₂, D = H₃, E = H₄;

the loop subunit is configured to execute, for i greater than or equal to 0 and less than or equal to 79, the loop TEMP = S₅(A) + f_i(B, C, D) + E + W_i + K_i; E = D; D = C; C = S₃₀(B); B = A; A = TEMP;

the second calculation subunit is used for letting H₀ = H₀ + A, H₁ = H₁ + B, H₂ = H₂ + C, H₃ = H₃ + D, H₄ = H₄ + E.

The data storage method has the beneficial effects that:

in the media automation test, after the media information of the media file and the first code stream data of the media file are acquired by analyzing the media file, the media information is directly stored without being compressed. Meanwhile, the first compressed code stream is obtained after the first code stream data is compressed by adopting a Hash algorithm and is stored in a first preset file, so that the first compression of the media file is realized. And then, inputting the first code stream data into a decoder for decoding to obtain image information of the media file and second code stream data, wherein the image information is directly stored without being compressed, and the second code stream data is compressed by adopting a Hash algorithm to obtain a second compressed code stream and is stored to a second preset file. The media file is compressed twice, and the two times of compression are realized by adopting a Hash algorithm, so that the media file only has a small amount of media information and image information which are not compressed, and most of other code stream data are compressed, thereby effectively saving the storage space. Finally, the problem that files compressed by the existing data storage compression method still occupy a large storage space is effectively solved.

Drawings

FIG. 1 is a flow chart of a data storage method according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating compressing each code stream in the first code stream data by using a hash algorithm according to an embodiment of the data storage method of the present invention;

FIG. 3 is a schematic diagram of a data storage system according to an embodiment of the present invention;

FIG. 4 is a block diagram of a first compression module according to an embodiment of the present invention.

Detailed Description

In order to make the technical scheme of the invention clearer, the invention is further described in detail with reference to the accompanying drawings and specific embodiments.

Referring to fig. 1, as an embodiment of the data storage method provided by the present invention, the method for processing a media file in a media automation test includes the following steps:

step S100, analyzing the media file, after acquiring the media information and the first code stream data of the media file, executing step S200, and storing the media information to a first preset file.

Meanwhile, step S300 is executed, the first code stream data is compressed by using a hash algorithm compression method to obtain a first compressed code stream, and the first compressed code stream is stored in the first preset file.

In addition, step S400 is also executed at the same time, the first code stream data is read and output to a decoder for decoding, and after the image information and the second code stream data of the media file are acquired, step S500 is executed, and the image information is saved to a second preset file.

Meanwhile, step S600 is executed, the second code stream data is compressed by using a hash algorithm compression method to obtain a second compressed code stream, and the second compressed code stream is stored in a second preset file.

In the media automatic test process, the media file is analyzed to obtain media information (including information such as sizes and time stamps of audio, video and subtitles in the media file) and first code stream data (namely data before the media file is compressed). The media information is directly stored without any compression, and the first code stream data is compressed by adopting a Hash algorithm to obtain a first compressed code stream and then stored. Meanwhile, the first code stream data is output to a decoder for decoding, and image information (including information such as the size, pixel ratio, width, height, output pixel format and the like of the image) and second code stream data (namely code stream data of each frame of image of the media file) of the media file are obtained. The image information is directly stored without being compressed, and the second code stream data is compressed by adopting a Hash algorithm to obtain a second compressed code stream and then stored. That is, some basic information of the media file, such as media information and image information, is not compressed, and most data streams of the media file are compressed by a hash algorithm and then stored.
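The flow of steps S200 to S600 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the initialization values given later in the description coincide with those of SHA-1, so `hashlib.sha1` stands in for the hash algorithm; the function `store_media`, the `decode` callback, and the one-entry-per-line file layout are hypothetical and do not come from the patent.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Iterable


def digest(stream: bytes) -> str:
    """Map a code stream of any length to a 160-bit value (40 hex digits)."""
    return hashlib.sha1(stream).hexdigest()


def store_media(
    media_info: dict,
    packets: Iterable[bytes],
    decode: Callable[[bytes], tuple[dict, bytes]],
    first_file: Path,
    second_file: Path,
) -> None:
    """Write uncompressed metadata plus one digest per code stream."""
    with first_file.open("w") as f1, second_file.open("w") as f2:
        f1.write(json.dumps(media_info) + "\n")        # S200: media info, uncompressed
        for packet in packets:
            f1.write(digest(packet) + "\n")            # S300: first compressed code stream
            image_info, frame = decode(packet)         # S400: hypothetical decoder callback
            f2.write(json.dumps(image_info) + "\n")    # S500: image info, uncompressed
            f2.write(digest(frame) + "\n")             # S600: second compressed code stream
```

Each code stream contributes a fixed 40 hexadecimal characters to its file regardless of how large the packet or decoded frame was.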

When the hash algorithm is used for data compression, the code stream data to be compressed is always treated as a bit string. For example, the code stream "8ab" is converted to the binary bit string 001110000110000101100010, which is 0x386162 in hexadecimal. The first code stream data and the second code stream data of the media file are each compressed with the hash algorithm, so that both the first compressed code stream and the second compressed code stream have a small number of bits. In other words, compressing the media file with the hash algorithm yields compressed code streams of a small bit count, so that a large amount of code stream data is stored in a small number of bits and storage space is saved. This effectively solves the problem that, in existing data storage, data still occupies a large space even after being compressed.
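The bit-string example above can be checked directly. The short Python sketch below reproduces the binary and hexadecimal forms and then shows that streams of very different lengths map to values of the same length; `hashlib.sha1` is used as a stand-in on the assumption that the hash algorithm is SHA-1, which the description does not name explicitly.

```python
import hashlib

stream = "8ab".encode("ascii")                      # the three-character example stream
bits = "".join(f"{byte:08b}" for byte in stream)
print(bits)          # 001110000110000101100010
print(stream.hex())  # 386162

# Streams of very different lengths map to digests of the same fixed length.
short_digest = hashlib.sha1(stream).digest()
long_digest = hashlib.sha1(stream * 100_000).digest()
print(len(short_digest) * 8, len(long_digest) * 8)  # 160 160
```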

Compared with existing compressed data storage, when the data storage method provided by the invention compresses with the hash algorithm, the code stream data of each frame, whatever its length, can be stored as a bit string of fixed length, and characteristics of the hash algorithm such as collision resistance and uniform hash distribution can be fully exploited. The probability that any two different pieces of code stream data have the same hash value approaches 1/2^N, where N is the bit length of the hash value. That is, the code stream data of any two frames can hardly have the same hash value, which improves the reliability of data compression and storage and ensures the correctness of the media automation test result.

When the hash algorithm is used for data compression, there is still some probability that two generated hash values are the same; this is an unavoidable consequence of mapping a large space into a limited small space. However, 2^N is still a very large number, so the probability that the hash values of two different pieces of code stream data collide (i.e. are the same) remains very small. Even if a collision occurs, the original code stream data can be compared bit by bit for further confirmation. Moreover, different hash values always indicate different code stream data. Therefore, in the media automation test, the media file is compressed and stored using the hash algorithm, the hash value generated by the hash algorithm serves as the preliminary basis for judgment in the test, and the compressed media file does not need to be decompressed. This speeds up comparison during the test, making the method both efficient and reliable.
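To give a sense of scale, the following Python sketch evaluates the collision probability for N = 160, the hash length stated later in the description. It assumes an idealized hash whose outputs are uniformly distributed and inputs that are ordinary test data rather than deliberately crafted collisions; the count of 12647 streams is taken from the worked example in this description, and the variable names are illustrative.

```python
from fractions import Fraction

N = 160  # hash length in bits, as stated for the compressed code stream

# Chance that two specific, different streams share a hash value under an ideal hash.
pair_probability = Fraction(1, 2**N)

# Union bound on any collision among `streams` different code streams.
streams = 12647
pairs = streams * (streams - 1) // 2
any_collision_bound = pairs * pair_probability

print(f"{float(pair_probability):.2e}")
print(f"{float(any_collision_bound):.2e}")
```

Even the bound over all pairs is many orders of magnitude below any practical failure rate, which is why a hash mismatch can serve as the preliminary test criterion.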

It should be noted that, in the media automation test process, the compressed media file needs to be compared with a preset reference file to detect whether the compressed media file is accurate. Therefore, as a specific embodiment of the data storage method of the present invention, it further includes a step of comparing and detecting. That is, in step S700, according to the hash value obtained by using the hash algorithm compression method, comparing the first compressed code stream stored in the first preset file with the corresponding first preset reference file, and comparing the second compressed code stream stored in the second preset file with the corresponding second preset reference file, thereby obtaining a test result to determine whether the first compressed code stream and the second compressed code stream are correct.
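The comparison of step S700 can be sketched as a line-by-line comparison of a stored result file with its reference file. The sketch assumes, hypothetically, that both files hold one metadata entry or hash value per line; the function name `compare_with_reference` and this layout are illustrative and are not specified by the patent.

```python
from pathlib import Path


def compare_with_reference(result_file: Path, reference_file: Path) -> list[int]:
    """Return the 1-based line numbers where the stored result differs from the reference."""
    result = result_file.read_text().splitlines()
    reference = reference_file.read_text().splitlines()
    mismatches = [
        number
        for number, (got, expected) in enumerate(zip(result, reference), start=1)
        if got != expected
    ]
    # Lines present in only one of the two files also count as mismatches.
    shorter, longer = sorted((len(result), len(reference)))
    mismatches.extend(range(shorter + 1, longer + 1))
    return mismatches
```

An empty list means the test passed; a non-empty list points at the frames whose hash values disagree, without decompressing anything.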

Specifically, the method comprises the following steps:

when a 45MB AVI video file is input, the media information (including file size, time stamp and the like) is acquired through analyzing the media format, the acquired media information is used for initializing a decoder, and the acquired media information is directly written into a first preset file without being compressed by a Hash algorithm and is stored for comparing test results. And then, reading each analyzed code stream data and outputting the code stream data to a decoder for decoding, compressing each analyzed code stream data through a Hash algorithm to obtain a first compressed code stream, and storing the first compressed code stream into a first preset file for comparison of test results.

And then, directly writing the image information (including image width, height, output pixel format and the like) of each frame obtained by decoding into a second preset file without hash algorithm compression for storage and comparison of test results. And then, compressing the decoded code stream data (namely second code stream data) by adopting a Hash algorithm to obtain a second compressed code stream, and then storing the second compressed code stream into a second preset file for comparison of the test result.

And finally, comparing the stored result file with the corresponding reference file, namely comparing the first compressed code stream with the corresponding first preset reference file, and comparing the second compressed code stream with the corresponding second preset reference file, so as to obtain a test result.

Furthermore, the obtained test result can be saved into an XML file so as to facilitate subsequent viewing and review.

In the above embodiment, only the first code stream data and the second code stream data need to be compressed, and no decompression is needed to recover the original code stream. In the media automation test, the media information of the media file to be stored and 12647 pieces of code stream data (i.e. all code streams in the first code stream data) occupy 2 MB of storage space after compression; the image information of the media file to be stored and the corresponding 12467 frames of code stream data (i.e. all code streams in the second code stream data) occupy 1.3 MB of storage space after compression. Compared with 93 MB and 92 MB before compression, the required storage space is only about 1.78% of the original. Therefore, the data storage method provided by the invention greatly saves storage space. In addition, the time needed to compare files depends on file size; because the compressed code streams in this data storage method are very small, file comparison time is reduced accordingly and the test process is accelerated.
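The 1.78% figure follows from the four sizes quoted in this embodiment, as the short calculation below shows (variable names are illustrative).

```python
compressed_mb = 2.0 + 1.3    # first and second preset files after compression
original_mb = 93.0 + 92.0    # the same data before compression
ratio = compressed_mb / original_mb
print(f"{ratio:.2%}")        # 1.78%
```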

It should be noted that, when the hash algorithm is used to compress the first code stream data, step S310 is first executed to preprocess the first code stream data; the preprocessing includes operations such as segmentation, padding, appending a length value, and initialization. The first code stream data is preprocessed by applying the corresponding preprocessing to each code stream in it. Moreover, when each code stream in the first code stream data is preprocessed, the order in which segmentation, padding, appending the length, and initialization are executed may be interchanged. The order described here does not represent the only order in which the invention may be practiced.

Furthermore, the second code stream data obtained after decoding the first code stream data by the decoder is also compressed by using the hash algorithm, and the execution steps when compressing by using the hash algorithm are the same as the steps when compressing the first code stream data by using the hash algorithm, so that the following only takes the case of compressing the first code stream data by using the hash algorithm as an example to describe in detail the data compression process by using the hash algorithm.

Specifically, referring to fig. 2, step S311 is first executed to set an initialization value for hash value calculation. The initialization values are set to: H₀ = 0x67452301, H₁ = 0xEFCDAB89, H₂ = 0x98BADCFE, H₃ = 0x10325476, H₄ = 0xC3D2E1F0. That is, the compressed code stream of each code stream in the first code stream data is initialized to five 32-bit words written in hexadecimal. In other words, an initial hash value is first set for each code stream to be compressed, and this initial hash value is stored in the first buffer required for calculating the hash value. The first buffer consists of five 32-bit words H₀, H₁, H₂, H₃, H₄.

Then, padding and adding a length value to each code stream in the first code stream data.

Here, before padding each code stream and appending a length value, step S312 may first be performed to segment each code stream according to a preset unit, obtaining a plurality of first data blocks M₀ to M_n of the same size, where n is a positive integer. Each first data block M_i obtained by the division (i greater than or equal to 0 and less than or equal to n) is 512 bits in size, so that each code stream is finally divided into n groups of 512 bits. Therefore, when each code stream is processed, the first data blocks in that code stream can be processed one by one to obtain the compressed code stream data.

Then, step S313 can be executed to pad each code stream with bits. When the hash algorithm is used for data compression, the code stream must be padded so that the length of the padded code stream, taken modulo 512, is 448. Padding is performed even if the length already leaves a remainder of 448 modulo 512. The padding operation is as follows: a single 1 bit is appended first, followed by 0 bits, until the length of the code stream leaves a remainder of 448 modulo 512. Consequently, the number of padding bits for each code stream is at least 1 and at most 512.

Then, step S314 is executed to append the original length value of each code stream to that code stream after padding. The appended length value is the length of the code stream before padding; that is, the length of the unpadded code stream is appended to the end of the padded code stream. If the length of the original (i.e. unpadded) code stream exceeds 512 bits, the code stream is filled to a multiple of 512 bits.
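Steps S312 to S314 can be sketched as follows. The description does not state the width of the appended length field; the sketch assumes a 64-bit big-endian bit count, which is what fills the remaining 512 - 448 bits of the last block, and it assumes code streams that are whole numbers of bytes. The function names `pad_stream` and `split_blocks` are illustrative.

```python
def pad_stream(stream: bytes) -> bytes:
    """Append a 1 bit, then 0 bits up to 448 mod 512, then the original bit length."""
    bit_length = len(stream) * 8
    padded = stream + b"\x80"                            # a 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)   # 56 bytes = 448 bits
    # Assumption: the length field is 64 bits, big-endian.
    return padded + bit_length.to_bytes(8, "big")


def split_blocks(padded: bytes) -> list[bytes]:
    """Divide a padded stream into 512-bit (64-byte) data blocks."""
    return [padded[i:i + 64] for i in range(0, len(padded), 64)]
```

A 56-byte stream already has length 448 modulo 512, yet it still receives a full 512 bits of padding and therefore occupies two blocks, matching the rule that padding is always performed.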

Then, step S320 is executed, each code stream is processed in groups by using a compression function, and a first compressed code stream is obtained until all code streams in the first code stream data are completely processed.

In addition, when each code stream is processed in groups using the compression function, each code stream has already been divided into a plurality of first data blocks M_i. The compressed character string of each code stream is therefore obtained by processing each first data block M_i of that code stream in turn.

Each code stream is divided into a plurality of first data blocks M_i, each with a data length of 512 bits. Each code stream is processed 512 bits at a time, and the processing requires a series of compression functions f_i(B, C, D), where i is greater than or equal to 0 and less than or equal to 79, the upper limit of the index being 79. Each compression function f_i(B, C, D) operates on the 32-bit words B, C, D and produces a 32-bit word as output. In addition, calculating the hash value of each code stream requires three buffers: a first buffer, a second buffer and a third buffer. The first buffer is the buffer of five 32-bit words mentioned above, identified as H₀, H₁, H₂, H₃, H₄. The second buffer is a buffer of 80 32-bit words, identified as W₀, W₁, W₂, ..., W₇₉. The third buffer is a buffer of one 32-bit word, identified as TEMP. A series of constant words is also used, identified as K_i.

Further, the compression function f_i(B, C, D) may be defined as follows:

The constant word K_i, in hexadecimal notation, is:
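The two formulas themselves are not reproduced in this text. The initialization values H₀ to H₄ and the 80-step loop described here coincide with those of SHA-1 as specified in FIPS 180. Assuming the embodiment follows that standard, which is an inference and not something this text states, the definitions would read:

```latex
f_i(B, C, D) =
\begin{cases}
(B \land C) \lor (\lnot B \land D), & 0 \le i \le 19 \\
B \oplus C \oplus D, & 20 \le i \le 39 \\
(B \land C) \lor (B \land D) \lor (C \land D), & 40 \le i \le 59 \\
B \oplus C \oplus D, & 60 \le i \le 79
\end{cases}
\qquad
K_i =
\begin{cases}
\texttt{0x5A827999}, & 0 \le i \le 19 \\
\texttt{0x6ED9EBA1}, & 20 \le i \le 39 \\
\texttt{0x8F1BBCDC}, & 40 \le i \le 59 \\
\texttt{0xCA62C1D6}, & 60 \le i \le 79
\end{cases}
```

Here the logical operators act bitwise on 32-bit words.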

In order to obtain the compressed 160-bit code stream of each code stream, the 512-bit first data blocks M₁, M₂, ..., M_n must be processed in sequence; processing each first data block M_i comprises 80 steps. Before any first data block M_i is processed, the first buffer {H_i} is initialized to H₀, H₁, H₂, H₃, H₄.

Specifically, processing M₁, M₂, ..., M_n requires the following steps:

Step S321, divide each first data block M_i into 16 words W₀, W₁, W₂, W₃, ..., W₁₅, where W₀ is the leftmost word.

Step S322, for i from 16 to 79, let W_i = S₁(W_{i-3} xor W_{i-8} xor W_{i-14} xor W_{i-16}).

Step S323, let A = H₀, B = H₁, C = H₂, D = H₃, E = H₄.

Step S324, for i from 0 to 79, execute the following loop: TEMP = S₅(A) + f_i(B, C, D) + E + W_i + K_i; E = D; D = C; C = S₃₀(B); B = A; A = TEMP.

Step S325, let H₀ = H₀ + A, H₁ = H₁ + B, H₂ = H₂ + C, H₃ = H₃ + D, H₄ = H₄ + E. After all the first data blocks are processed, each code stream is compressed into a 160-bit character string.
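Steps S321 to S325 can be sketched in Python as below. Several points are assumptions rather than statements of the patent: S₁, S₅ and S₃₀ are read as circular left shifts of a 32-bit word, all additions are taken modulo 2^32, the compression function and constants are the SHA-1 ones (the text omits them), and the length field used in padding is 64 bits. The names `rotl`, `f_and_k`, `process_block` and `compress_stream` are illustrative.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
H_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(word: int, n: int) -> int:
    """S_n in the text, read here as a circular left shift of a 32-bit word by n bits."""
    return ((word << n) | (word >> (32 - n))) & MASK


def f_and_k(i: int, b: int, c: int, d: int) -> tuple[int, int]:
    """Assumed SHA-1 round function and constant; the source text omits both."""
    if i < 20:
        return (b & c) | (~b & d), 0x5A827999
    if i < 40:
        return b ^ c ^ d, 0x6ED9EBA1
    if i < 60:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, 0xCA62C1D6


def process_block(h: tuple[int, ...], block: bytes) -> tuple[int, ...]:
    w = list(struct.unpack(">16I", block))                  # S321: W0..W15, W0 leftmost
    for i in range(16, 80):                                 # S322: expand to W16..W79
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = h                                       # S323
    for i in range(80):                                     # S324: the 80-step loop
        f, k = f_and_k(i, b, c, d)
        temp = (rotl(a, 5) + f + e + w[i] + k) & MASK
        e, d, c, b, a = d, c, rotl(b, 30), a, temp
    return tuple((x + y) & MASK for x, y in zip(h, (a, b, c, d, e)))  # S325


def compress_stream(stream: bytes) -> str:
    # Padding as in steps S313-S314, assuming a 64-bit big-endian length field.
    padded = stream + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += (len(stream) * 8).to_bytes(8, "big")
    h = H_INIT
    for offset in range(0, len(padded), 64):
        h = process_block(h, padded[offset:offset + 64])
    return "".join(f"{word:08x}" for word in h)


# Under these assumptions the result agrees with the standard library's SHA-1.
assert compress_stream(b"8ab") == hashlib.sha1(b"8ab").hexdigest()
```

In practice a library routine such as `hashlib.sha1` would be used directly; the explicit loop is shown only to map the code onto the numbered steps.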

Referring to fig. 2, when each code stream is processed using the compression function, two checks determine whether the code stream has been completely processed. First, step S330 is executed to determine in real time whether the first data block M_i in the currently processed code stream has been completely processed. If not, the flow returns to step S320 and continues processing the current first data block M_i in the code stream. If so, step S340 is executed to further determine whether all n groups of first data blocks in the currently processed code stream have been processed; if so, the processing ends. If not, the flow returns to step S310 and continues to segment and process the currently processed code stream in the first code stream data.

Further, when step S600 is executed and the second code stream data is compressed by using the hash algorithm to obtain the second compressed code stream, the specific processing procedure is the same as that of step S300. Namely, the method specifically comprises the following steps:

Step S611, setting an initialization value for hash value calculation, where the initialization values are: H₀ = 0x67452301, H₁ = 0xEFCDAB89, H₂ = 0x98BADCFE, H₃ = 0x10325476, H₄ = 0xC3D2E1F0.

Step S612, padding each code stream in the second code stream data, and adding the original length value of each code stream to each code stream after padding.

Step S613, dividing each code stream according to a predetermined unit to obtain a plurality of second data blocks M₀' to M_n' of the same size, where n is a positive integer.

Step S620, each code stream is processed in groups by adopting a compression function, and a second compressed code stream is obtained until all code streams in the second code stream data are completely processed.

It should be noted that, the specific steps of compressing the second code stream data by using the hash algorithm to obtain the second compressed code stream are the same as the process of compressing the first code stream data by using the hash algorithm. Therefore, the description thereof is omitted.

Correspondingly, based on the same inventive concept, the invention also provides a data storage system. Since the working principle of the data storage system provided by the invention is the same as or similar to that of the data storage method provided by the invention, repeated descriptions are omitted.

Referring to fig. 3, the data storage system according to the present invention, which processes a media file in a media automation test, includes a parsing storage module 100, a first compression module 200, a reading and saving module 300, a decoder 400, and a second compression module 500, wherein:

The parsing storage module 100 is configured to parse the media file, obtain media information and first code stream data of the media file, and store the media information to a first preset file.

The first compression module 200 is configured to compress the first code stream data by using a hash algorithm to obtain a first compressed code stream, and store the first compressed code stream to a first preset file.

The reading and saving module 300 is configured to read the first code stream data and output it to the decoder 400.

The decoder 400 is configured to decode the first code stream data to obtain image information and second code stream data of the media file.

The reading and saving module 300 is further configured to save the image information to a second preset file.

The second compression module 500 is configured to compress the second code stream data by using a hash algorithm to obtain a second compressed code stream, and store the second compressed code stream to a second preset file.

The data storage system further includes a comparison module 600. The comparison module 600 is configured to compare the first compressed code stream with a corresponding first preset reference file, and to compare the second compressed code stream with a corresponding second preset reference file, to obtain a test result.
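For illustration, the comparison against a reference file might look like the sketch below. It assumes that each code stream is reduced to a SHA-1 hex digest and that the reference file holds one digest per code stream in the same order; this description does not specify the file layout, and `compress_streams` and `mismatched_streams` are hypothetical names.

```python
import hashlib
from typing import List


def compress_streams(streams: List[bytes]) -> List[str]:
    """Reduce each code stream to a fixed-size digest (40 hex characters)."""
    return [hashlib.sha1(s).hexdigest() for s in streams]


def mismatched_streams(compressed: List[str], reference: List[str]) -> List[int]:
    """Return the indices of code streams whose digest differs from the reference."""
    if len(compressed) != len(reference):
        raise ValueError("compressed and reference lists differ in length")
    return [i for i, (got, want) in enumerate(zip(compressed, reference))
            if got != want]


# Usage: an unchanged set of streams passes, a corrupted stream is reported.
frames = [b"frame-0", b"frame-1", b"frame-2"]
reference = compress_streams(frames)
corrupted = [frames[0], b"frame-X", frames[2]]
failures = mismatched_streams(compress_streams(corrupted), reference)
```

Because only digests are stored and compared, the reference file stays small even when the decoded code streams are large.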

Referring to fig. 4, as one possible implementation, the first compression module 200 includes a first initialization setting unit 210, a first padding unit 220, a first additional length value unit 230, a first splitting unit 240, and a first processing unit 250.

The first initialization setting unit 210 is configured to set an initialization value for hash value calculation, wherein the initialization values are: H₀ = 0x67452301, H₁ = 0xEFCDAB89, H₂ = 0x98BADCFE, H₃ = 0x10325476, H₄ = 0xC3D2E1F0.

The first padding unit 220 is configured to pad each code stream in the first code stream data.

The first additional length value unit 230 is configured to append the original length value of each code stream to that code stream after padding.

The first splitting unit 240 is configured to divide each code stream to obtain a plurality of first data blocks M₀ to M_n of the same size, wherein n is a positive integer.

The first processing unit 250 is configured to process each code stream in groups by using a compression function, and obtain a first compressed code stream until all code streams in the first code stream data are completely processed.

It should be noted that the first processing unit 250 includes an identifier subunit 251, a word dividing subunit 252, a first calculation subunit 253, an assignment subunit 254, a loop subunit 255, and a second calculation subunit 256.

The identifier subunit 251 is configured to identify a first buffer, a second buffer, and a third buffer required for calculating the hash value.

The first buffer is a buffer of five 32-bit words, the second buffer is a buffer of eighty 32-bit words, and the third buffer is a buffer of one 32-bit word. The first buffer is identified as H₀, H₁, H₂, H₃, H₄; the second buffer is identified as W₀ to W₇₉; and the third buffer is identified as TEMP.

The word dividing subunit 252 is configured to divide each first data block M_i into 16 words W₀ to W₁₅, wherein W₀ is the leftmost word and 0 ≤ i ≤ n.

The first calculation subunit 253 is configured to let W_i = S₁(W_{i-3} xor W_{i-8} xor W_{i-14} xor W_{i-16}) for 16 ≤ i ≤ 79.

The assignment subunit 254 is configured to let A = H₀, B = H₁, C = H₂, D = H₃, E = H₄.

The loop subunit 255 is configured to execute, for 0 ≤ i ≤ 79, the loop: TEMP = S₅(A) + f_i(B, C, D) + E + W_i + K_i; E = D; D = C; C = S₃₀(B); B = A; A = TEMP.

The second calculation subunit 256 is configured to let H₀ = H₀ + A, H₁ = H₁ + B, H₂ = H₂ + C, H₃ = H₃ + D, H₄ = H₄ + E.

Wherein f_i(B, C, D) is a compression function; K_i is a constant word; A is a first intermediate variable, B is a second intermediate variable, C is a third intermediate variable, D is a fourth intermediate variable, and E is a fifth intermediate variable.
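The operations of subunits 251 to 256 on a single data block can be sketched in Python as follows. The initialization values and the round structure above coincide with SHA-1, so the sketch assumes the SHA-1 definitions of f_i and K_i from RFC 3174, which are not spelled out in this passage; the names `rotl`, `f`, and `compress_block` are hypothetical, and S_n is taken to be a circular left shift by n bits.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF  # keep every word within 32 bits
H_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)  # assumed: SHA-1 constants


def rotl(x, n):
    """S_n: circular left shift of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK


def f(i, b, c, d):
    """f_i(B, C, D) as defined for SHA-1 (assumed, per RFC 3174)."""
    if i < 20:
        return (b & c) | (~b & d)
    if 40 <= i < 60:
        return (b & c) | (b & d) | (c & d)
    return b ^ c ^ d  # rounds 20-39 and 60-79


def compress_block(h, block):
    """Process one 64-byte data block and return the updated H0..H4."""
    w = list(struct.unpack(">16I", block))           # W0..W15, W0 leftmost
    for i in range(16, 80):                          # W16..W79
        w.append(rotl(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = h                                # A=H0, ..., E=H4
    for i in range(80):
        temp = (rotl(a, 5) + f(i, b, c, d) + e + w[i] + K[i // 20]) & MASK
        e, d, c, b, a = d, c, rotl(b, 30), a, temp
    return tuple((x + y) & MASK for x, y in zip(h, (a, b, c, d, e)))


# Usage: the message "abc", padded by hand into one 64-byte block.
block = b"abc" + b"\x80" + b"\x00" * 52 + (24).to_bytes(8, "big")
digest = struct.pack(">5I", *compress_block(H_INIT, block))
```

For a code stream of several blocks, the tuple returned for one block would be passed as `h` when processing the next block, and the final tuple is the 160-bit result.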

In addition, it should be noted that the working principle by which the second compression module 500 compresses the second code stream data by using the hash algorithm to obtain the second compressed code stream is the same as or similar to that of the first compression module 200. Accordingly, the second compression module 500 includes a second initialization setting unit, a second padding unit, a second additional length value unit, a second splitting unit, and a second processing unit (none of which are shown in the figure). Since the compression process of the second compression module 500 is the same as that of the first compression module 200, it is not described herein again.

The above embodiments express only several implementations of the present invention. Their description is specific and detailed, but should not be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, and these fall within the scope of the present invention. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A method of storing data, comprising the steps of:

analyzing a media file, acquiring media information and first code stream data of the media file, and storing the media information to a first preset file; the media information comprises audio size, video size, subtitle size and timestamp in the media file;

reading and outputting the first code stream data to a decoder for decoding, acquiring image information and second code stream data of the media file, and storing the image information to a second preset file; the image information comprises the size, pixel ratio, width, height and output pixel format of the image;

2. The data storage method of claim 1, further comprising the steps of:

3. The data storage method according to claim 1, wherein the compressing the first code stream data by using a hash algorithm compression method to obtain a first compressed code stream comprises the following steps:

setting an initialization value for hash value calculation;

4. The data storage method according to claim 1, wherein the compressing the second code stream data by using the hash algorithm to obtain a second compressed code stream comprises:

setting an initialization value for hash value calculation;

5. A data storage system is characterized by comprising an analysis storage module, a first compression module, a reading and storing module, a decoder and a second compression module; wherein,

the analysis storage module is used for analyzing the media file, acquiring the media information and the first code stream data of the media file, and storing the media information to a first preset file; the media information comprises audio size, video size, subtitle size and timestamp in the media file;

the decoder is used for decoding the first code stream data to obtain the image information and the second code stream data of the media file; the image information comprises the size, pixel ratio, width, height and output pixel format of the image;

6. The data storage system of claim 5, further comprising a comparison module;

7. The data storage system of claim 5, wherein the first compression module comprises a first initialization setting unit, a first padding unit, a first additional length value unit, a first splitting unit, and a first processing unit; wherein,