CN112486915A

CN112486915A - Data storage method and device

Info

Publication number: CN112486915A
Application number: CN202011511668.4A
Authority: CN
Inventors: 郑志升; 李天烨
Original assignee: Shanghai Bilibili Technology Co Ltd
Current assignee: Shanghai Bilibili Technology Co Ltd
Priority date: 2020-12-18
Filing date: 2020-12-18
Publication date: 2021-03-12
Anticipated expiration: 2040-12-18
Also published as: CN112486915B

Abstract

The embodiment of the application provides a data storage method, which comprises the following steps: acquiring data to be stored; judging the type of the data to be stored, and determining the index mode of the data to be stored according to the judged type; creating a storage index file of the data to be stored in a data bucket according to the determined index mode; and storing the data to be stored into a corresponding storage file according to the storage index file. The method and the device can reduce the time spent on updating the data.

Description

Data storage method and device

Technical Field

The embodiment of the application relates to the technical field of data processing, in particular to a data storage method and device.

Background

With the rapid development of network technology, the volume of all kinds of data has grown sharply. Faced with this growth, different types of data need to be persisted and distributed in different ways; for example, the data needs to be scattered and then stored in different data storage containers. In the prior art, when data is stored into a HUDI database, an index for the stored data is generally established according to the data time, and the data is then stored according to the established index. However, the inventor finds that, when creating an index in the prior art, a message digest is first calculated for the data, a hash is then computed from the obtained message digest, and the index of the data is stored into a different data bucket (bucket) according to the hash value. For example, when storing comment data in a comment scene, if 1000 pieces of comment data need to be stored in the latest 1 minute, indexes of the 1000 pieces of comment data are created first and are then evenly distributed and stored in 1000 buckets. Thus, when the data is updated, all the buckets need to be read, and the HFile (file) in each bucket needs to be read to find the index of the data, after which the data is found according to the index and updated. With an index created in this manner, updating multiple pieces of data requires reading multiple buckets and their HFiles to find each piece of data, which results in a long time for updating the data.
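To make the cost described above concrete, the following is a minimal sketch and is not taken from the application. The record keys, the bucket count and the helper name `prior_art_bucket` are hypothetical. It hashes the digest of each record into a bucket, as the prior-art scheme is described, and counts how many distinct buckets an update of one minute of comments would have to read.

```python
import hashlib

NUM_BUCKETS = 1000  # assumed bucket count, matching the example above


def prior_art_bucket(record_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Bucket chosen from the digest of the record itself (prior-art style)."""
    digest = hashlib.md5(record_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_buckets


# 1000 hypothetical comment keys produced within the same minute
keys = [f"comment-{i}" for i in range(1000)]
touched = {prior_art_bucket(k) for k in keys}
print(f"buckets to read for one update batch: {len(touched)}")
```

Because the bucket depends only on each record's own digest, records created together end up spread over many buckets, which is the behavior the application sets out to avoid.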

Disclosure of Invention

An embodiment of the present application provides a data storage method, an apparatus, a computer device, and a computer-readable storage medium, which can solve the problem that it takes a long time to update data in the prior art.

One aspect of an embodiment of the present application provides a data storage method, where the method includes:

acquiring data to be stored;

judging the type of the data to be stored, and determining the index mode of the data to be stored according to the judged type;

creating a storage index file of the data to be stored in a data bucket according to the determined index mode;

and storing the data to be stored into a corresponding storage file according to the storage index file.

Optionally, the determining the type of the data to be stored and determining the index mode of the data to be stored according to the determined type includes:

judging the type of the data to be stored to determine a preset scene corresponding to the data to be stored;

and determining the index mode of the data to be stored according to the preset scene information.

Optionally, the determining, according to the preset scene information, an index manner of the data to be stored includes:

when the type of the data to be stored is judged to be data of a first preset scene, determining the indexing mode of the data to be stored as indexing based on the timestamp of the data to be stored;

the creating of the storage index file of the data to be stored in the data bucket according to the determined index mode comprises:

determining a data bucket for storing the storage index file according to the creation time stamp of the data to be stored;

and creating the storage index file in the determined data bucket, wherein the storage index file comprises a storage path of the data to be stored and file information of the data to be stored.

Optionally, the determining, according to the creation timestamp of the data to be stored, a data bucket storing the storage index file includes:

calculating a message digest of the creation timestamp;

and carrying out hash operation on the calculated message digest to obtain a data bucket for storing the index file.

when the storage type of the data to be stored is judged to be data of a second preset scene, determining the indexing mode of the data to be stored as indexing based on user identification and a time interval;

determining a data bucket for storing the storage index file according to the user identification corresponding to the data to be stored;

and creating the storage index file in the determined data bucket according to the time interval corresponding to the data to be stored, wherein the storage index file comprises a storage path of the data to be stored and file information stored by the data to be stored.

Optionally, the creating the storage index file in the determined data bucket according to the time interval corresponding to the data to be stored includes:

and creating different storage index files in the determined data bucket according to different time intervals corresponding to the data to be stored.

Optionally, the determining, according to the user identifier corresponding to the data to be stored, a data bucket storing the index file includes:

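calculating the message digest of the user identification;

and carrying out hash operation on the calculated message digest to obtain a data bucket for storing the index file.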

Optionally, the data storage method further includes:

when a data updating instruction is received, determining a storage index file of data to be updated according to the data updating instruction;

determining a storage path and storage file information of the data to be updated according to the determined storage index file;

and reading the data to be updated according to the storage path and the storage file information of the data to be updated, and updating the data to be updated.

Yet another aspect of an embodiment of the present application provides a data storage apparatus including:

the acquisition module is used for acquiring data to be stored;

the judging module is used for judging the type of the data to be stored and determining the index mode of the data to be stored according to the judged type;

the creating module is used for creating a storage index file of the data to be stored in a data bucket according to the determined index mode;

and the storage module is used for storing the data to be stored into the corresponding storage file according to the storage index file.

Yet another aspect of embodiments of the present application provides a computer device, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the data storage method as described in any one of the above when executing the computer program.

Yet another aspect of embodiments of the present application provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, is adapted to carry out the steps of the data storage method according to any one of the above.

According to the data storage method, the data storage apparatus, the computer device and the computer-readable storage medium provided by the embodiments of the present application, data to be stored is acquired; the type of the data to be stored is judged, and the index mode of the data to be stored is determined according to the judged type; a storage index file of the data to be stored is created in a data bucket according to the determined index mode; and the data to be stored is stored into a corresponding storage file according to the storage index file. Because the index is created in different modes for different types of data, the created index matches the data type, the storage position of the data can be found conveniently from the created index, subsequent updates are easier, and the time spent on updating the data is reduced.

Drawings

Fig. 1 schematically shows a schematic diagram of a data transmission system implementing a data storage method of an embodiment of the present application;

FIG. 2 schematically illustrates a flow diagram of a data storage method according to an embodiment of the present application;

FIG. 3 is a flow chart that schematically illustrates an embodiment of a detailed process of creating a storage index file of the data to be stored in a data bucket according to a determined indexing manner;

FIG. 4 is a flowchart illustrating a detailed process of determining a data bucket for storing the storage index file according to the creation timestamp of the data to be stored;

FIG. 5 is a flowchart schematically illustrating a detailed step of creating a storage index file of the data to be stored in a data bucket according to a determined indexing manner according to another embodiment;

FIG. 6 is a flowchart illustrating a detailed process of determining a data bucket for storing the index file according to the user identifier corresponding to the data to be stored;

FIG. 7 schematically illustrates a flow chart of a data storage method of another embodiment;

FIG. 8 schematically illustrates a block diagram of a data storage device according to an embodiment of the present application; and

FIG. 9 schematically shows a hardware architecture diagram of a computer device suitable for implementing the data storage method according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.

Fig. 1 schematically shows a schematic diagram of a data transmission system implementing an embodiment of the present application, which may be composed of the following parts: data source layer, network routing layer 2, data buffer layer 3, data distribution layer 4, data storage layer 3, etc.

The data source layer may comprise an internal data source, and may also be a data interface connected with an external data source. The data source layer may hold data in multiple formats; for example, the data reported by APP and Web is in HTTP (HyperText Transfer Protocol) format, and the internal communication data of the server is in RPC (Remote Procedure Call) format. As shown in Fig. 1, the data of the data source layer may be log data reported by mobile terminals and received by one or more edge nodes, or may be data provided by various systems or devices, such as a database (e.g., MySQL) or a log agent (Log Agent).

Via the gateway and messaging system, the data source layer may transmit data to Collector 2. Wherein:

The gateway is used for forwarding the data provided by the data source layer to the message system. The gateway may be adapted to a variety of different service scenarios and data protocols; for example, it may be configured to compatibly parse APP and Web data of the HTTP (HyperText Transfer Protocol) protocol and internal communication data of the gRPC protocol.

The message system may be composed of one or more Kafka clusters and is used for publishing the data in the data source layer to the corresponding topic. Data of different importance, priority and data throughput can be distributed to different Kafka clusters, so that the value of different types of data is guaranteed and a system fault is prevented from affecting the data as a whole.

The Collector 2 is a streaming distribution node based on Flink. The Collector 2 may consume data from the corresponding topic of the message system, and convert and distribute the data for storage; that is, it ensures that the data is obtained from the message system and written into a corresponding storage terminal in the data storage layer 3, for example, HDFS, Kafka, HBase, ES (Elasticsearch), and the like.

The data storage layer 3 is used to store data and may be composed of different forms of databases. The data storage layer 3 may be a HUDI (Apache Hudi) database, which may be used to manage large analytical data sets stored on a DFS (HDFS or cloud storage) and supports update operations on the current data table.

Embodiment one

Fig. 2 schematically shows a flowchart of a data storage method according to a first embodiment of the present application. The data storage method may be applied to a HUDI database, and it should be understood that the flow chart in the embodiment of the method is not used to limit the order of executing the steps. The following description is made by taking a data storage device as an execution subject. As shown in fig. 2, the data storage method may include steps S20 to S23, in which:

step S20, data to be stored is acquired.

Specifically, the data to be stored may be a single piece of data or a batch of data. In this embodiment, the data to be stored is preferably a batch of data, for example, the comment data issued by all users within 1 minute. The data to be stored may be of various types, for example, comment data in a comment scene, or data related to a UP master (for example, fan data of the UP master) in a relationship chain scene.

And step S21, judging the type of the data to be stored, and determining the index mode of the data to be stored according to the judged type.

Specifically, when storing the data to be stored, in order to facilitate subsequent updating or searching of the data to be stored, in this embodiment, before storing the data, an Index (Index) of the data to be stored needs to be created first, so that when storing the data, the data may be stored according to the created Index, and when subsequently updating or querying the stored data, the data may be quickly searched according to the Index.

In this embodiment, in order to improve the performance of subsequent updating or querying of data, when an index is established, indexes of data may be established in different indexing manners according to different data types. Specifically, the type of the data to be stored may be determined by using a preset determination rule, and then, after the type of the data to be stored is determined, the index manner corresponding to the current type of data to be stored is determined according to a pre-created mapping table of data types and index manners.

In an exemplary embodiment, the step S21 may include the following steps: judging the type of the data to be stored to determine a preset scene corresponding to the data to be stored; and determining the index mode of the data to be stored according to the preset scene information.

Specifically, the index modes corresponding to the data to be stored corresponding to different scenes are different, and in order to determine the index mode of the data to be stored, in this embodiment, a preset determination rule may be first adopted to determine the type of the data to be stored, so as to determine which preset scene the data to be stored belongs to, and after the preset scene is determined, the index mode of the current data to be stored may be determined according to a mapping table between the preset scene and the index mode.

In an exemplary embodiment, the determining, according to the preset scene information, an index manner of the data to be stored may include the following steps: and when the type of the data to be stored is data of a first preset scene, determining the indexing mode of the data to be stored as indexing based on the timestamp of the data to be stored.

Specifically, the first preset scene is a scene in which the generation time of the data follows a relatively obvious pattern, for example, a comment data scene. Generally, for each manuscript or video, more comment data is generated within one day or one week after the manuscript or video is published, and the longer the time since publication, the less comment data is generated; that is, the comment data scene is a scene in which the data generation time follows an obvious pattern.

As an example, when it is determined that the type of the data to be stored is comment data, the data to be stored may be indexed based on a time stamp of the comment data. For example, comment data posted in the last 1 minute is stored as one dimension in one data bucket, and comment data posted in the last 2 minutes to the last 1 minute is stored as another dimension in another data bucket.

In an exemplary embodiment, the determining, according to the preset scene information, an index manner of the data to be stored may further include: and when the storage type of the data to be stored is judged to be the data of a second preset scene, determining the indexing mode of the data to be stored as indexing based on the user identification and the time interval.

Specifically, the second preset scene is a scene in which data generation is strongly associated with the relationship chain of a user and with a time interval, for example, a UP master relationship chain scene. Generally, after a user becomes a UP master, other users who follow the user become fans of the user, so that the fans are associated with the UP master. The user typically gains a large number of fans within a certain period of time, after which the growth in the number of fans slows down or even stops. That is, the UP master relationship chain scene is a scene in which data generation is strongly associated with the relationship chain of a user and with a time interval.

As an example, when it is determined that the type of the data to be stored is fan data of a UP master, the data to be stored may be indexed based on the user identifier (i.e., the UP master identifier) corresponding to the fan data. For example, the fan data of UP master A is stored, as one dimension, into one data bucket, such as data bucket 1, and the fan data of UP master B is stored, as one dimension, into another data bucket, such as data bucket 2. Meanwhile, because the fan data is strongly associated with the time interval, the time interval can further be used for indexing the data to be stored. For example, the fan data generated for UP master A on the 1st is stored, as another dimension, into storage index file 1 of data bucket 1, such as HFile1, and the fan data generated for UP master A on the 2nd is stored into storage index file 2 of data bucket 1, such as HFile2; the fan data generated for UP master B on the 1st is stored into storage index file 1 of data bucket 2, such as HFile1, and the fan data generated for UP master B on the 2nd is stored into storage index file 2 of data bucket 2, such as HFile2.
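Step S21 as a whole can be sketched as two table lookups, one from data type to preset scene and one from preset scene to index mode. This is only an illustrative sketch: the enum names, the `type` field on the record and the function `choose_index_mode` are assumptions, since the application does not specify how the judgment rule or the mapping table is represented.

```python
from enum import Enum


class Scene(Enum):
    COMMENT = "first preset scene"          # generation time follows a clear pattern
    RELATION_CHAIN = "second preset scene"  # tied to a user and a time interval


class IndexMode(Enum):
    BY_TIMESTAMP = "timestamp"
    BY_USER_AND_INTERVAL = "user identifier + time interval"


# Hypothetical judgment rule: each record carries a "type" field.
TYPE_TO_SCENE = {"comment": Scene.COMMENT, "fan": Scene.RELATION_CHAIN}

# Mapping table between preset scene and index mode.
SCENE_TO_INDEX_MODE = {
    Scene.COMMENT: IndexMode.BY_TIMESTAMP,
    Scene.RELATION_CHAIN: IndexMode.BY_USER_AND_INTERVAL,
}


def choose_index_mode(record: dict) -> IndexMode:
    """Judge the data type, resolve the preset scene, then the index mode."""
    scene = TYPE_TO_SCENE[record["type"]]
    return SCENE_TO_INDEX_MODE[scene]


print(choose_index_mode({"type": "comment"}))
print(choose_index_mode({"type": "fan"}))
```

Keeping the two mappings separate mirrors the text, where the type is first resolved to a preset scene and the scene is then resolved to an index mode.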

And step S22, creating a storage index file of the data to be stored in the data bucket according to the determined index mode.

Specifically, the data bucket (bucket) is used for storing index files. One data bucket may include multiple HFiles, and each HFile is a storage index file used for storing the index entry information of each piece of data. Each piece of index entry information is recorded in the form of a Recordkey, and each Recordkey includes the storage path information of the currently indexed data and the stored file information.

As an example, the format of each index entry information is as follows:

Recordkey X -> path, file, where path represents the storage path of the data, which is the pointer offset address (offset) in the data storage container, and file represents the stored file information.
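The entry format above can be modeled, for illustration only, as a small record type keyed by Recordkey. The class name `IndexEntry`, the use of strings for both fields and the dictionary standing in for a storage index file are assumptions; the application does not define a concrete encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    path: str  # storage path of the data (an offset in the storage container)
    file: str  # information identifying the storage file


# A storage index file modeled as a mapping from Recordkey to index entry.
index_file: dict[str, IndexEntry] = {}
index_file["Recordkey 1"] = IndexEntry(path="path X", file="file a")

entry = index_file["Recordkey 1"]
print(entry.path, entry.file)
```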

In an exemplary embodiment, as shown in fig. 3, when the type of the data to be stored is data of a first preset scene, the step S22 may further include steps S30-S31, wherein: step S30, determining a data bucket for storing the storage index file according to the creation time stamp of the data to be stored; step S31, creating the storage index file in the determined data bucket, where the storage index file includes a storage path of the data to be stored and file information of the data to be stored.

Specifically, when creating the storage index file, a creation timestamp of data to be stored may be extracted, then a specific data bucket in which the storage index file is created may be determined according to the extracted timestamp, and after determining a data bucket, the storage index file may be created directly in the determined data bucket. The creation time stamp may also be referred to as a generation time stamp of the data or a release time stamp of the data, and is used to indicate when the data is generated.

In an exemplary embodiment, as shown in fig. 4, step S30 may further include steps S40-S41, wherein: step S40, calculating the message digest of the creation timestamp; and step S41, performing hash operation on the calculated message digest to obtain a data bucket for storing the index file.

Specifically, a message digest, also called a digital digest, is a short, fixed-length value obtained by transforming a message of arbitrary length; it is produced by a one-way Hash function that takes the message as its argument. The digital digest, also called a digital fingerprint, is a fixed-length string (128 bits in the case of MD5) obtained by applying the one-way Hash function to the plaintext. Different plaintexts yield different digests with overwhelming probability, and the same plaintext always yields the same digest.

In this embodiment, the message digest calculation may be performed on the creation timestamp through a message digest algorithm to obtain a message digest corresponding to the creation timestamp. The message digest algorithm may be MD2, MD4, MD5, SHA-1, SHA-256, RIPEMD-128, RIPEMD-160, and the like, and in this embodiment, the message digest of the creation timestamp is preferably calculated by using the MD5 algorithm.

As an example, the message digest obtained by applying the MD5 algorithm to the creation timestamp is a 128-bit ciphertext. To determine the data bucket that stores the storage index file, a hash operation needs to be performed on the 128-bit ciphertext. In this embodiment, the hash algorithm used may be a remainder operation: the number corresponding to the 128-bit ciphertext is divided by the number of buckets, and the remainder determines the data bucket. For example, if the number corresponding to the 128-bit ciphertext is 505 and the total number of preset data buckets is 100, the hash operation yields data bucket 5 as the data bucket storing the index file.
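Steps S40 and S41 can be sketched with Python's standard `hashlib` module. The function names, the choice to serialize the creation timestamp as a string, and the big-endian interpretation of the digest as an integer are assumptions made for illustration; the application only states that the digest is reduced modulo the number of buckets.

```python
import hashlib


def bucket_from_digest_value(digest_value: int, num_buckets: int) -> int:
    """Step S41: remainder of the digest value divided by the bucket count."""
    return digest_value % num_buckets


def bucket_for_timestamp(created_at: str, num_buckets: int = 100) -> int:
    """Step S40 followed by S41 for a creation timestamp given as text."""
    digest = hashlib.md5(created_at.encode("utf-8")).digest()  # 16 bytes = 128 bits
    return bucket_from_digest_value(int.from_bytes(digest, "big"), num_buckets)


# Worked example from the text: digest value 505 with 100 buckets.
print(bucket_from_digest_value(505, 100))  # 5
print(bucket_for_timestamp("2020-12-18 10:00"))
```

Records that share the same serialized timestamp always map to the same bucket, so the granularity at which the timestamp is serialized (here, one minute) decides which records are grouped together.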

In an embodiment, the data bucket may also be determined directly and simply according to the creation timestamp, for example, the storage index file of the data to be stored with the creation timestamp of the latest one minute is stored in the data bucket 1, the storage index file of the data to be stored with the creation timestamp of the latest two minutes to one minute is stored in the data bucket 2, and the storage index file of the data to be stored with the creation timestamp of the latest three minutes to two minutes is stored in the data bucket 3.
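The direct assignment described above, without a digest, amounts to bucketing by the age of the data in whole minutes. The following sketch assumes a hypothetical function `bucket_by_age` and assumes that data exactly 60 seconds old falls into bucket 2; the application does not state how the boundaries are handled.

```python
from datetime import datetime, timedelta


def bucket_by_age(created_at: datetime, now: datetime) -> int:
    """Bucket 1 for the latest minute, bucket 2 for the minute before, and so on."""
    age = now - created_at
    if age < timedelta(0):
        raise ValueError("creation timestamp lies in the future")
    return int(age.total_seconds() // 60) + 1


now = datetime(2020, 12, 18, 12, 0, 0)
print(bucket_by_age(now - timedelta(seconds=30), now))   # 1
print(bucket_by_age(now - timedelta(seconds=90), now))   # 2
print(bucket_by_age(now - timedelta(seconds=150), now))  # 3
```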

In an exemplary embodiment, as shown in fig. 5, when the type of the data to be stored is data of the second preset scene, the step S22 may further include steps S50-S51, wherein: step S50, determining a data bucket for storing the index file according to the user identification corresponding to the data to be stored; step S51, creating the storage index file in the determined data bucket according to the time interval corresponding to the data to be stored, where the storage index file includes a storage path of the data to be stored and file information of the data to be stored.

Specifically, when creating the storage index file, the user identifier of the data to be stored may be extracted first, the specific data bucket in which the storage index file is to be created is then determined according to the user identifier, and after the data bucket is determined, the storage index file may be created directly in the determined data bucket according to the time interval corresponding to the data to be stored. The user identifier is used to distinguish which user the data to be stored is associated with. For example, if the data to be stored is fan data of a UP master, the user identifier may be the account of the UP master, the nickname of the UP master, or other information that can uniquely distinguish different UP masters.

For example, if the data to be stored was generated within the latest 1 day, the time interval corresponding to the data to be stored may be determined to be time interval 1, and if the data to be stored was generated between the latest 2 days and the latest 1 day, the time interval corresponding to the data to be stored may be determined to be time interval 2. It can be understood that, in order to determine the time interval corresponding to the data to be stored, a determination rule of the time interval needs to be set in advance, so that after the data to be stored is subsequently acquired, the time interval corresponding to the data to be stored can be determined according to the determination rule.

In an exemplary embodiment, when the storage index file is created in the determined data bucket according to the time interval corresponding to the data to be stored, different storage index files may be created in the determined data bucket according to different time intervals corresponding to the data to be stored.

As an example, when the determined data bucket is data bucket 1, and the time interval corresponding to the data to be stored is time interval 1, then storage index file1 may be created in the data bucket; when the determined data bucket is the data bucket 1 and the time interval corresponding to the data to be stored is the time interval 2, the storage index file2 may be created in the data bucket, that is, the storage index files created in the data bucket at different time intervals are different.

In an exemplary embodiment, as shown in fig. 6, step S50 may further include steps S60-S61, wherein: step S60, calculating the message abstract of the user identification; and step S61, performing hash operation on the calculated message digest to obtain a data bucket for storing the index file.

In this embodiment, the message digest calculation may be performed on the user identifier through a message digest algorithm to obtain a message digest corresponding to the user identifier. The message digest algorithm may be MD2, MD4, MD5, SHA-1, SHA-256, RIPEMD-128, RIPEMD-160, and the like, and in this embodiment, the message digest of the user identifier is preferably calculated by using the MD5 algorithm.

As an example, the message digest obtained by applying the MD5 algorithm to the user identifier is a 128-bit ciphertext. To determine the data bucket that stores the storage index file, a hash operation needs to be performed on the 128-bit ciphertext. In this embodiment, the hash algorithm used may be a remainder operation: the number corresponding to the 128-bit ciphertext is divided by the number of buckets, and the remainder determines the data bucket. For example, if the number corresponding to the 128-bit ciphertext is 615 and the total number of preset data buckets is 100, the hash operation yields data bucket 15 as the data bucket storing the index file.
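Steps S50 to S51 together with steps S60 to S61 can be sketched as follows: the bucket is derived from the user identifier, and the storage index file within that bucket is derived from the time interval. The function names, the file naming pattern `index_file_N` and the one-day interval rule are assumptions chosen to match the examples in the text, not details given by the application.

```python
import hashlib
from datetime import datetime, timedelta

NUM_BUCKETS = 100  # assumed total number of preset data buckets


def bucket_for_user(user_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Steps S60-S61: MD5 digest of the user identifier, then a remainder."""
    digest = hashlib.md5(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_buckets


def time_interval(created_at: datetime, now: datetime) -> int:
    """Assumed rule: interval 1 is the latest day, interval 2 the day before."""
    if created_at > now:
        raise ValueError("creation time lies in the future")
    return (now - created_at).days + 1


def locate_index_file(user_id: str, created_at: datetime, now: datetime):
    """Steps S50-S51: bucket from the user, index file from the interval."""
    return bucket_for_user(user_id), f"index_file_{time_interval(created_at, now)}"


now = datetime(2020, 12, 18, 12, 0)
recent = locate_index_file("up-master-A", now - timedelta(hours=3), now)
older = locate_index_file("up-master-A", now - timedelta(hours=30), now)
print(recent, older)  # same bucket, different index files
```

With this layout, all index entries for one UP master stay in a single bucket, and an update that targets a known time interval only needs to open one index file in that bucket.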

And step S23, storing the data to be stored into a corresponding storage file according to the storage index file.

Specifically, after the storage index file is created, the index information in the storage index file may be used to store the data to be stored into the corresponding storage file. The storage file is the file that is actually located in the data storage container and is used for storing the data to be stored.

As an example, it is assumed that the storage index file includes the index information "Recordkey 1 -> path X, file a", which indicates that the data to be stored needs to be stored in file a (the storage file) under path X.
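Step S23 can be illustrated with an in-memory stand-in for the data storage container. In this sketch the dictionary `storage`, the function `store` and the record contents are hypothetical; a real deployment would write to files managed by the database rather than to Python lists.

```python
from collections import defaultdict

# (path, file) -> list of stored records; stands in for real storage files
storage = defaultdict(list)


def store(record_key: str, record: dict, index_file: dict) -> tuple:
    """Look up the index entry and append the record to the file it names."""
    path, file = index_file[record_key]
    storage[(path, file)].append((record_key, record))
    return path, file


# Index information from the example: Recordkey 1 -> path X, file a
index_file = {"Recordkey 1": ("path X", "file a")}
location = store("Recordkey 1", {"text": "hello"}, index_file)
print(location)
```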

In an exemplary embodiment, as shown in fig. 7, the data storage method may further include: steps S70-S72, wherein: step S70, when a data updating instruction is received, determining a storage index file of data to be updated according to the data updating instruction; step S71, determining the storage path and the storage file information of the data to be updated according to the determined storage index file; step S72, reading the data to be updated according to the storage path and the storage file information of the data to be updated, and updating the data to be updated.

As an example, assuming that data a needs to be updated, after an update instruction of data a is received, a storage index file corresponding to the data a is queried in a data bucket according to the data a, and assuming that the storage index file queried for the data a is a storage index file1 in a data bucket 2, a storage path and storage file information of the data a are then obtained from the storage index file1, and finally, the data a can be found according to the information, read into a memory, updated, and rewritten into a file after the update is completed.
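The update flow of steps S70 to S72 can be sketched as three lookups. This is a simplified illustration: it assumes the update instruction already names the bucket and the storage index file (in practice these would be recomputed from the timestamp or the user identifier, as at storage time), and the nested dictionaries stand in for buckets, index files and storage files.

```python
def apply_update(instruction: dict, buckets: dict, storage: dict) -> tuple:
    """Resolve the index file, read the entry, then update the stored data."""
    # S70: determine the storage index file of the data to be updated
    index_file = buckets[instruction["bucket"]][instruction["index_file"]]
    # S71: determine the storage path and storage file information
    path, file = index_file[instruction["record_key"]]
    # S72: read the data, update it in memory, and write it back
    records = storage[(path, file)]
    records[instruction["record_key"]] = instruction["new_value"]
    return path, file


# Data a is indexed in storage index file 1 of data bucket 2, as in the example.
buckets = {2: {"storage index file 1": {"data a": ("path X", "file a")}}}
storage = {("path X", "file a"): {"data a": "old value"}}

apply_update(
    {"bucket": 2, "index_file": "storage index file 1",
     "record_key": "data a", "new_value": "new value"},
    buckets,
    storage,
)
print(storage[("path X", "file a")]["data a"])
```

Only one bucket and one index file are touched, in contrast to the prior-art flow in which every bucket has to be read.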

Fig. 8 is a block diagram of a data storage device 800 according to an embodiment of the present application. The data storage device 800 may be partitioned into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the embodiment of the present application. The program modules referred to in the embodiments of the present application are series of computer program instruction segments that can perform specific functions, and the functions of the program modules are described in detail below. As shown in Fig. 8, the data storage device 800 may include: an acquisition module 801, a judgment module 802, a creation module 803, and a storage module 804.

The acquisition module 801 is configured to acquire data to be stored.

The judgment module 802 is configured to judge the type of the data to be stored, and to determine the index mode of the data to be stored according to the judged type.

The creation module 803 is configured to create a storage index file of the data to be stored in a data bucket according to the determined index mode.

The storage module 804 is configured to store the data to be stored into a corresponding storage file according to the storage index file.

In an exemplary embodiment, the judgment module 802 is further configured to judge the type of the data to be stored so as to determine the preset scene corresponding to the data to be stored, and to determine the index mode of the data to be stored according to the preset scene information.
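The two-stage selection (data type to preset scene, preset scene to index mode) can be sketched in Python. The sketch is illustrative only: the data type labels `"comment"` and `"watch_history"`, their assignment to scenes, and the function name `index_mode_for` are hypothetical and are not specified by the application.

```python
from enum import Enum


class Scene(Enum):
    FIRST = "first preset scene"
    SECOND = "second preset scene"


class IndexMode(Enum):
    TIMESTAMP = "index by timestamp"
    USER_AND_INTERVAL = "index by user identifier and time interval"


# Hypothetical mapping from data type to preset scene (assumption).
TYPE_TO_SCENE = {
    "comment": Scene.FIRST,
    "watch_history": Scene.SECOND,
}

# Mapping from preset scene to index mode, as in the embodiments.
SCENE_TO_MODE = {
    Scene.FIRST: IndexMode.TIMESTAMP,
    Scene.SECOND: IndexMode.USER_AND_INTERVAL,
}


def index_mode_for(data_type: str) -> IndexMode:
    """Judge the type, resolve its preset scene, then pick the index mode."""
    try:
        scene = TYPE_TO_SCENE[data_type]
    except KeyError:
        raise ValueError(f"no preset scene configured for type {data_type!r}")
    return SCENE_TO_MODE[scene]
```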

In an exemplary embodiment, the judgment module 802 is further configured to determine, when the type of the data to be stored is judged to be data of a first preset scene, that the index mode of the data to be stored is indexing based on the timestamp of the data to be stored.

The creation module 803 is further configured to determine, according to the creation timestamp of the data to be stored, a data bucket for storing the storage index file, and to create the storage index file in the determined data bucket, wherein the storage index file includes the storage path of the data to be stored and the storage file information of the data to be stored.

In an exemplary embodiment, the creation module 803 is further configured to calculate a message digest of the creation timestamp, and to perform a hash operation on the calculated message digest to obtain the data bucket for storing the storage index file.
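The digest-then-hash bucket selection for the first preset scene can be sketched as follows. This is a minimal sketch under stated assumptions: the application does not name a digest algorithm or a hash function here, so SHA-256 and a modulo over the bucket count are used purely for illustration, and `bucket_for_timestamp` and its parameters are hypothetical names.

```python
import hashlib


def bucket_for_timestamp(creation_ts_ms: int, num_buckets: int) -> int:
    """Map a creation timestamp to a bucket number in [0, num_buckets)."""
    if num_buckets <= 0:
        raise ValueError("num_buckets must be positive")
    # Message digest of the creation timestamp (SHA-256 is an assumption).
    digest = hashlib.sha256(str(creation_ts_ms).encode("utf-8")).digest()
    # Hash the digest down to a bucket number (modulo is an assumption).
    return int.from_bytes(digest, "big") % num_buckets
```

Because the bucket depends only on the creation timestamp, records that share a timestamp resolve to the same bucket, so an update touching them needs to read only that bucket.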

In an exemplary embodiment, the judgment module 802 is further configured to determine, when the type of the data to be stored is judged to be data of a second preset scene, that the index mode of the data to be stored is indexing based on a user identifier and a time interval.

The creation module 803 is further configured to determine, according to the user identifier corresponding to the data to be stored, a data bucket for storing the storage index file, and to create the storage index file in the determined data bucket according to the time interval corresponding to the data to be stored, wherein the storage index file includes the storage path of the data to be stored and the storage file information of the data to be stored.

In an exemplary embodiment, the creation module 803 is further configured to create different storage index files in the determined data bucket for the different time intervals corresponding to the data to be stored.

In an exemplary embodiment, the creation module 803 is further configured to calculate a message digest of the user identifier, and to perform a hash operation on the calculated message digest to obtain the data bucket for storing the storage index file.
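For the second preset scene, the bucket is chosen from the user identifier and the storage index file from the time interval. The following Python sketch illustrates one way this could look. It is not the implementation of the application: SHA-256, the modulo step, the one-hour default interval, the `index_<start>` file naming, and all function names are assumptions.

```python
import hashlib
from datetime import datetime, timezone
from typing import Tuple


def bucket_for_user(user_id: str, num_buckets: int) -> int:
    """Digest the user identifier, then hash the digest to a bucket number."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_buckets


def index_file_for_interval(created_at: datetime,
                            interval_minutes: int = 60) -> str:
    """Name the index file after the start of the interval containing created_at."""
    epoch_minutes = int(created_at.timestamp()) // 60
    interval_start = epoch_minutes - epoch_minutes % interval_minutes
    return f"index_{interval_start}"


def locate_index(user_id: str, created_at: datetime,
                 num_buckets: int) -> Tuple[int, str]:
    """Return (bucket number, storage index file name) for one record."""
    return (bucket_for_user(user_id, num_buckets),
            index_file_for_interval(created_at))
```

Under these assumptions, all records of one user fall into a single bucket, and records from different time intervals are separated into different storage index files within that bucket.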

In an exemplary embodiment, the data storage device 800 may further include a receiving module, a determining module, and an updating module.

The receiving module is configured to, when a data update instruction is received, determine the storage index file of the data to be updated according to the data update instruction.

The determining module is configured to determine the storage path and the storage file information of the data to be updated according to the determined storage index file.

The updating module is configured to read the data to be updated according to the storage path and the storage file information of the data to be updated, and to update the data to be updated.

In the embodiments of the present application, data to be stored is acquired; the type of the data to be stored is judged, and the index mode of the data to be stored is determined according to the judged type; a storage index file of the data to be stored is created in a data bucket according to the determined index mode; and the data to be stored is stored into a corresponding storage file according to the storage index file. Because the index is created in a manner that depends on the type of the data, the created index matches the data type. The storage location of the data can therefore be found readily from the index, which facilitates subsequent updates and reduces the time spent on updating the data.

Fig. 9 schematically shows a hardware architecture diagram of a computer device suitable for implementing the data storage method according to an embodiment of the present application. In the present embodiment, the computer device 20 is a device capable of automatically performing numerical calculation and/or information processing according to instructions that are set or stored in advance. For example, it may be a data forwarding device such as a gateway. As shown in fig. 9, the computer device 20 includes at least, but is not limited to, a memory 21, a processor 22, and a network interface 23, which may be communicatively coupled to each other by a system bus. Wherein:

The memory 21 includes at least one type of computer-readable storage medium, including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 21 may be an internal storage module of the computer device 20, such as a hard disk or a memory of the computer device 20. In other embodiments, the memory 21 may also be an external storage device of the computer device 20, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, provided on the computer device 20. Of course, the memory 21 may also include both internal and external storage modules of the computer device 20. In this embodiment, the memory 21 is generally used for storing an operating system installed in the computer device 20 and various types of application software, such as the program code of the data storage method. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 22 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 22 is generally configured to control the overall operation of the computer device 20, such as performing control and processing related to data interaction or communication of the computer device 20. In this embodiment, the processor 22 is configured to execute the program code stored in the memory 21 or to process data.

The network interface 23 may comprise a wireless network interface or a wired network interface, and is typically used to establish a communication connection between the computer device 20 and other computer devices. For example, the network interface 23 is used to connect the computer device 20 with an external terminal through a network, and to establish a data storage channel and a communication connection between the computer device 20 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, or Wi-Fi.

It is noted that fig. 9 only shows a computer device with components 21-23, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.

In this embodiment, the program code of the data storage method stored in the memory 21 may further be divided into one or more program modules and executed by one or more processors (in this embodiment, the processor 22) to carry out the present invention.

The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the data storage method in the embodiments.

In this embodiment, the computer-readable storage medium includes a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer readable storage medium may be an internal storage unit of the computer device, such as a hard disk or a memory of the computer device. In other embodiments, the computer readable storage medium may be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the computer device. Of course, the computer-readable storage medium may also include both internal and external storage devices of the computer device. In the present embodiment, the computer-readable storage medium is generally used for storing an operating system and various types of application software installed in the computer device, for example, the program codes of the data storage method in the embodiment, and the like. Further, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.

It will be apparent to those skilled in the art that the modules or steps of the embodiments of the invention described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented in program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein, or the modules or steps may be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, embodiments of the invention are not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the scope of the present invention. All equivalent structures or equivalent process modifications made by using the contents of the present specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of the present invention.

Claims

1. A method of data storage, the method comprising:

acquiring data to be stored;

2. The data storage method according to claim 1, wherein judging the type of the data to be stored, and determining the index mode of the data to be stored according to the judged type comprises:

3. The data storage method according to claim 2, wherein the determining, according to the preset scene information, the index manner of the data to be stored comprises:

when the type of the data to be stored is data of a first preset scene, determining that the indexing mode of the data to be stored is indexing based on the timestamp of the data to be stored;

4. The data storage method according to claim 3, wherein the determining a data bucket for storing the storage index file according to the creation timestamp of the data to be stored comprises:

calculating a message digest of the creation timestamp;

5. The data storage method according to claim 2, wherein the determining, according to the preset scene information, the index manner of the data to be stored comprises:

when the storage type of the data to be stored is data of a second preset scene, determining that the indexing mode of the data to be stored is based on user identification and time interval to perform indexing;

6. The data storage method according to claim 5, wherein the creating the storage index file in the determined data bucket according to the time interval corresponding to the data to be stored comprises:

7. The data storage method of claim 5, wherein the determining, according to the user identifier corresponding to the data to be stored, the data bucket storing the storage index file comprises:

calculating a message digest of the user identifier;

8. The data storage method of any of claims 1 to 7, further comprising:

9. A data storage device, comprising:

the acquisition module is used for acquiring data to be stored;

10. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, is adapted to carry out the steps of the data storage method according to any of claims 1 to 8.

11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, is adapted to carry out the steps of the data storage method according to any one of claims 1 to 8.