CN112988726B - Method, device, computer equipment and storage medium for data management - Google Patents

Info

Publication number
CN112988726B
Authority
CN
China
Prior art keywords
data
processed
storage unit
format
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110315932.5A
Other languages
Chinese (zh)
Other versions
CN112988726A (en)
Inventor
刘泽宙
林晓斌
金晓波
王铮
黄庆伟
邓俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110315932.5A
Publication of CN112988726A
Application granted
Publication of CN112988726B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to a data management method, a data management device, computer equipment and a storage medium, and relates to the technical field of data processing. The data management method comprises the following steps: receiving a call request containing data information to be processed and corresponding configuration information; acquiring data to be processed based on the data information to be processed; and selectively processing the data to be processed based on the configuration information.

Description

Method, device, computer equipment and storage medium for data management
Technical Field
The present disclosure relates to the field of data processing technology, and in particular, to a method, an apparatus, a computer device, a non-transitory computer readable storage medium, and a computer program product for data management.
Background
With the rapid development of internet technology, businesses generate large amounts of data, which are important for business development and for analyzing business conditions. Users' demands on data storage are therefore growing. However, the data on existing networks is not only extremely large in volume but also varies in format, size, and encoding, and it places differing requirements on the storage system. Storing such data therefore easily leads to duplicated work, low efficiency, and wasted manpower, and the security and reliability of the data storage cannot be guaranteed.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been recognized in any prior art unless otherwise indicated.
Disclosure of Invention
According to a first aspect of the present disclosure, there is provided a method of data management, comprising: receiving a call request containing data information to be processed and corresponding configuration information; acquiring data to be processed based on the data information to be processed; and selectively processing the data to be processed based on the configuration information.
According to a second aspect of the present disclosure, there is provided an apparatus for data management, comprising: a receiving module configured to receive a call request containing data information to be processed and corresponding configuration information; an acquisition module configured to acquire data to be processed based on the data information to be processed; and a processing module configured to selectively process the data to be processed based on the configuration information.
According to a third aspect of the present disclosure, there is provided a computer device comprising: a memory, a processor, and a computer program stored in the memory. The processor is configured to execute the computer program to implement the steps of the method of data management according to the present disclosure.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having a computer program stored thereon. The computer program, when executed by a processor, implements the steps of the method of data management according to the present disclosure.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the steps of the method of data management according to the present disclosure.
According to one or more embodiments of the present disclosure, the data to be processed is given customized processing according to configuration information provided by the calling end. This not only meets the diverse requirements of different service scenarios but also makes the system more convenient for the calling end and improves the usability and security of the storage system.
Drawings
The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an exemplary system in which various methods described herein may be implemented, in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates a flow chart of a method of data management according to some embodiments of the present disclosure;
FIG. 3 shows a flow chart of the selective processing steps of FIG. 2;
FIG. 4 illustrates a block diagram of a data management device according to some embodiments of the present disclosure;
FIG. 5 shows a block diagram of a data management device according to further embodiments of the present disclosure; and
FIG. 6 illustrates a block diagram of an exemplary server and client that can be used to implement embodiments of the present disclosure.
Detailed Description
In the present disclosure, the use of the terms "first," "second," and the like to describe various elements is not intended to limit the positional relationship, timing relationship, or importance relationship of the elements, unless otherwise indicated, and such terms are merely used to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, they may also refer to different instances based on the description of the context.
The terminology used in the description of the various illustrated examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. Furthermore, the term "and/or" as used in this disclosure encompasses any and all possible combinations of the listed items.
The data on existing networks is extremely large in volume, varies in format, size, and encoding, and places differing requirements on the storage system. Storing such data easily leads to duplicated work, low efficiency, and wasted manpower, and the security and reliability of the data storage cannot be guaranteed.
In the present disclosure, the data to be processed is given customized processing according to the configuration information provided by the calling end. This not only meets the diverse requirements of different service scenarios but also makes the system more convenient for the calling end and improves the usability and security of the storage system.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
Fig. 1 illustrates a schematic diagram of an exemplary system 100 in which various methods and apparatus described herein may be implemented, in accordance with an embodiment of the present disclosure. Referring to fig. 1, the system 100 includes one or more client devices 101, 102, 103, 104, 105, and 106, a server 120, and one or more communication networks 110 coupling the one or more client devices to the server 120. Client devices 101, 102, 103, 104, 105, and 106 may be configured to execute one or more applications.
In an embodiment of the present disclosure, the server 120 may run one or more services or software applications that enable the method of data management according to the present disclosure.
In some embodiments, server 120 may also provide other services or software applications that may include non-virtual environments and virtual environments. In some embodiments, these services may be provided as web-based services or cloud services, for example, provided to users of client devices 101, 102, 103, 104, 105, and/or 106 under a software as a service (SaaS) model.
In the configuration shown in fig. 1, server 120 may include one or more components that implement the functions performed by server 120. These components may include software components, hardware components, or a combination thereof that are executable by one or more processors. A user operating client devices 101, 102, 103, 104, 105, and/or 106 may in turn utilize one or more client applications to interact with server 120 to utilize the services provided by these components. It should be appreciated that a variety of different system configurations are possible, which may differ from system 100. Accordingly, FIG. 1 is one example of a system for implementing the various methods described herein and is not intended to be limiting.
The user may upload or download data using client devices 101, 102, 103, 104, 105, and/or 106. The client device may provide an interface that enables a user of the client device to interact with the client device. The client device may also output information to the user via the interface. Although fig. 1 depicts only six client devices, those skilled in the art will appreciate that the present disclosure may support any number of client devices.
Client devices 101, 102, 103, 104, 105, and/or 106 may include various types of computer devices, such as portable handheld devices, general purpose computers (such as personal computers and laptop computers), workstation computers, wearable devices, gaming systems, thin clients, various messaging devices, sensors or other sensing devices, and the like. These computer devices may run various types and versions of software applications and operating systems, such as Microsoft Windows, Apple iOS, UNIX-like operating systems, Linux, or Linux-like operating systems (e.g., Chrome OS); or include various mobile operating systems such as Microsoft Windows Mobile OS, iOS, Windows Phone, and Android. Portable handheld devices may include cellular telephones, smart phones, tablet computers, personal digital assistants (PDAs), and the like. Wearable devices may include head mounted displays and other devices. The gaming system may include various handheld gaming devices, internet-enabled gaming devices, and the like. The client device is capable of executing a variety of different applications, such as various internet-related applications, communication applications (e.g., email applications), and Short Message Service (SMS) applications, and may use a variety of communication protocols. An application executed by a client device is capable of making calls to a data management system for implementing a method according to the present disclosure.
Network 110 may be any type of network known to those skilled in the art that may support data communications using any of a number of available protocols, including but not limited to TCP/IP, SNA, IPX, etc. For example only, the one or more networks 110 may be a Local Area Network (LAN), an Ethernet-based network, a token ring, a Wide Area Network (WAN), the internet, a virtual network, a Virtual Private Network (VPN), an intranet, an extranet, a Public Switched Telephone Network (PSTN), an infrared network, a wireless network (e.g., Bluetooth, Wi-Fi), and/or any combination of these and/or other networks.
The server 120 may include one or more general-purpose computers, special-purpose server computers (e.g., PC (personal computer) servers, UNIX servers, midrange servers), mainframe computers, server clusters, or any other suitable arrangement and/or combination. The server 120 may include one or more virtual machines running a virtual operating system, or other computing architecture that involves virtualization (e.g., one or more flexible pools of logical storage devices that may be virtualized to maintain virtual storage devices of the server). In various embodiments, server 120 may run one or more services or software applications that provide the functionality described below.
The computing units in server 120 may run one or more operating systems, including any of the operating systems described above as well as any commercially available server operating system. Server 120 may also run any of a variety of additional server applications and/or middle-tier applications, including HTTP servers, FTP servers, CGI servers, Java servers, database servers, etc.
In some implementations, the server 120 can include one or more applications to analyze and incorporate data feeds and/or event updates sent from the calling ends of the client devices 101, 102, 103, 104, 105, and 106. Server 120 may also include one or more applications to display data feeds and/or real-time events via one or more display devices of client devices 101, 102, 103, 104, 105, and 106.
The system 100 may also include one or more databases 130. In some embodiments, these databases may be used to store data and other information. For example, one or more of the databases 130 may be used to store information such as audio files and video files. The databases 130 may reside in a variety of locations. For example, a database used by the server 120 may be local to the server 120, or may be remote from the server 120 and communicate with the server 120 via a network-based or dedicated connection. The databases 130 may be of different types. In some embodiments, the database used by the server 120 may be, for example, a relational database. One or more of these databases may store, update, and retrieve data in response to commands.
In some embodiments, one or more of the databases 130 may also be used by applications to store application data. The databases used by an application may be of different types, such as key-value stores, object stores, or conventional stores backed by a file system.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
Fig. 2 illustrates a flow chart of a method of data management according to some embodiments of the present disclosure. The method 2000 comprises the following steps: step S202, receiving a call request containing data information to be processed and corresponding configuration information; step S204, acquiring the data to be processed based on the data information to be processed; and step S206, selectively processing the data to be processed based on the configuration information. In this way, the method not only meets the diverse requirements of different service scenarios but also makes the system more convenient for the calling end and improves the usability and security of the storage system.
In step S202, a call request containing data information to be processed and corresponding configuration information is received.
According to some embodiments, the data information to be processed includes at least one of the data to be processed itself and index information for downloading the data to be processed. The index information includes download address information or identification information (e.g., a file name, MD5 value, number, etc.) that uniquely identifies the data to be processed.
According to some embodiments, the configuration information is used to indicate the processing operations that need to be performed on the uploaded or downloaded data to be processed. In some examples, the configuration information includes at least one of data format requirements, storage unit information, data deduplication requests, and data availability times, among others.
According to some embodiments, the call request includes at least one of a request to upload the data to be processed (i.e., an upload operation), a request to download the data to be processed (i.e., a download operation), a request to migrate the data to be processed (i.e., a data migration), and a request to back up the data to be processed (i.e., a data backup).
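The shape of a call request described above can be pictured with a small data model. The following is only a sketch: the class and field names (`CallRequest`, `DataInfo`, `Configuration`, `target_format`, and so on) are hypothetical and do not come from the disclosure, which does not specify a concrete schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RequestType(Enum):
    UPLOAD = "upload"
    DOWNLOAD = "download"
    MIGRATE = "migrate"
    BACKUP = "backup"


@dataclass
class DataInfo:
    # At least one of the two should be set: the data itself (upload)
    # or index information used to locate it (download).
    payload: Optional[bytes] = None
    index: Optional[str] = None  # e.g. download address, file name, or MD5 value


@dataclass
class Configuration:
    # Every field is optional; an unset field means "skip that processing step".
    target_format: Optional[str] = None      # data format requirement
    storage_unit: Optional[str] = None       # storage unit information
    deduplicate: bool = False                # data deduplication request
    available_seconds: Optional[int] = None  # data availability time


@dataclass
class CallRequest:
    request_type: RequestType
    data_info: DataInfo
    config: Configuration = field(default_factory=Configuration)

    def validate(self) -> None:
        # The data information must carry the data or a way to find it.
        if self.data_info.payload is None and self.data_info.index is None:
            raise ValueError("data information needs a payload or index information")


request = CallRequest(
    RequestType.UPLOAD,
    DataInfo(payload=b"audio-bytes"),
    Configuration(target_format="wav", deduplicate=True),
)
request.validate()
```

Leaving every configuration field optional mirrors the "at least one of" wording: the calling end states only the operations it wants, and the rest are skipped.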
According to some embodiments, the call request is issued by a calling end that accesses a data management system for implementing method 2000. Herein, a "calling end" may be any device, system, application, or the like. In this case, step S204 may include selectively connecting the calling end to the data management system through at least one of a remote procedure call (RPC) and a software development kit (SDK), so that more kinds of calling ends can be connected and the access mode can be chosen freely.
In step S204, the data to be processed is acquired based on the data information to be processed.
According to some embodiments, the data to be processed may include audio files, video files, pictures, document files, and the like.
According to some embodiments, when the data information to be processed includes the data to be processed and the call request indicates a request to upload the data to be processed, step S204 includes acquiring the data to be processed directly from the data information to be processed. When the data information to be processed includes index information for downloading the data to be processed and the call request indicates a request to download the data to be processed, step S204 includes acquiring the data to be processed from at least one storage unit based on the data information to be processed (e.g., the index information). In that case, the data to be processed that corresponds to the data information to be processed is stored in one of the at least one storage unit.
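The two acquisition paths of step S204 can be sketched as follows. This is an illustration under stated assumptions: storage units are modelled as in-memory dictionaries keyed by index information, and the function name `acquire_data` is hypothetical.

```python
from typing import Dict, Optional


def acquire_data(request_type: str,
                 payload: Optional[bytes],
                 index: Optional[str],
                 storage_units: Dict[str, Dict[str, bytes]]) -> bytes:
    """Return the data to be processed for an upload or download request."""
    if request_type == "upload":
        # Upload: the data travels inside the data information itself.
        if payload is None:
            raise ValueError("upload request carries no data")
        return payload
    if request_type == "download":
        if index is None:
            raise ValueError("download request carries no index information")
        # Download: the data is held by one of the connected storage units,
        # so look the index up in each unit in turn.
        for unit in storage_units.values():
            if index in unit:
                return unit[index]
        raise KeyError(f"no storage unit holds {index!r}")
    raise ValueError(f"unsupported request type: {request_type}")


units = {"nfs": {}, "local": {"song.mp3": b"\x00\x01"}}
uploaded = acquire_data("upload", b"new-data", None, units)
downloaded = acquire_data("download", None, "song.mp3", units)
```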
In step S206, the data to be processed is selectively processed based on the configuration information.
The data to be processed is selectively processed according to the configuration information, which indicates the processing operations to be performed on the uploaded or downloaded data. A user can thus use method 2000 to customize the uploaded or downloaded data, which meets the diverse requirements of different service scenarios. For example, by having method 2000 convert the format of the audio to be downloaded, the user can download the same audio as a file in any format. In some examples, the data to be processed may be selectively processed using the chain-of-responsibility pattern. That is, a plurality of functional modules that process the data are connected to form a chain, and the call request is passed along the chain so that one or more functional modules on the chain process the data. The calling end that issued the call request does not know which processing module or modules on the chain ultimately handled it, which allows the data management system implementing method 2000 to dynamically reorganize and reassign responsibilities without affecting the calling end.
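The chained processing described above can be sketched as a small chain of handlers. This is a minimal sketch rather than the disclosed implementation: the class names, the configuration keys `target_format` and `target_encoding`, and the placeholder "conversions" (which merely prefix a tag) are all assumptions made for illustration.

```python
from typing import Dict, List, Optional


class Handler:
    """One link in the chain; subclasses decide whether and how to act."""

    def __init__(self, successor: Optional["Handler"] = None) -> None:
        self.successor = successor

    def applies(self, config: Dict[str, str]) -> bool:
        raise NotImplementedError

    def process(self, data: bytes, config: Dict[str, str]) -> bytes:
        raise NotImplementedError

    def handle(self, data: bytes, config: Dict[str, str], trace: List[str]) -> bytes:
        # Act only when the configuration asks for this step, then pass on.
        if self.applies(config):
            data = self.process(data, config)
            trace.append(type(self).__name__)
        if self.successor is not None:
            return self.successor.handle(data, config, trace)
        return data


class FormatHandler(Handler):
    def applies(self, config: Dict[str, str]) -> bool:
        return "target_format" in config

    def process(self, data: bytes, config: Dict[str, str]) -> bytes:
        # Placeholder for a real format conversion: prefix the target format.
        return config["target_format"].encode() + b"|" + data


class EncodingHandler(Handler):
    def applies(self, config: Dict[str, str]) -> bool:
        return "target_encoding" in config

    def process(self, data: bytes, config: Dict[str, str]) -> bytes:
        # Placeholder for a real transcoding step.
        return config["target_encoding"].encode() + b"|" + data


chain = FormatHandler(EncodingHandler())
trace: List[str] = []
result = chain.handle(b"raw", {"target_format": "wav"}, trace)
```

The caller only ever talks to the head of the chain, so links can be added, removed, or reordered without changing the calling code.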
According to some embodiments, when the call request indicates a request to upload the data to be processed, method 2000 may further include, before step S206: determining whether at least one of the format, encoding, and size of the data to be processed meets a specific requirement; and in response to determining that at least one of them does not meet the specific requirement, sending a data unavailability notification. This ensures that the data is usable, avoids wasting computing and storage resources, and provides preliminary information for subsequent data processing. For example, when the data to be processed is an audio file, the format may be mp3, wav, m4a, etc., the size may be the audio duration or file size, and the encoding may be pulse code modulation (PCM) encoding, OGG encoding, etc. In some examples, the size, format, and encoding of the data, as well as identification information (e.g., an MD5 value) for identifying the data to be processed, may be detected by the multimedia stream analysis tool FFprobe, and it may then be determined whether at least one of the size, format, and encoding meets the specific requirement.
According to some embodiments, after the size, format, encoding, and identification information (e.g., MD5 value) of the data are detected, at least one of them may also be stored.
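The pre-upload check can be sketched as below. This sketch does not call FFprobe; it assumes the format has already been detected and is passed in as a string, and it computes the MD5 value with Python's standard `hashlib`. The whitelist `ALLOWED_FORMATS`, the limit `MAX_SIZE_BYTES`, and the function name `precheck` are hypothetical values chosen for illustration.

```python
import hashlib
from typing import Dict, List

ALLOWED_FORMATS = {"mp3", "wav", "m4a"}  # assumed whitelist, not from the source
MAX_SIZE_BYTES = 10 * 1024 * 1024        # assumed size limit, not from the source


def precheck(data: bytes, fmt: str) -> Dict[str, object]:
    """Check format and size, and collect metadata for later steps."""
    problems: List[str] = []
    if fmt not in ALLOWED_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    if len(data) > MAX_SIZE_BYTES:
        problems.append("data too large")
    return {
        # When "available" is False, the caller would send the
        # data unavailability notification instead of processing further.
        "available": not problems,
        "problems": problems,
        "format": fmt,
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
    }


report = precheck(b"hello", "mp3")
```

The returned dictionary doubles as the stored metadata mentioned above, so later steps such as deduplication can reuse the MD5 value without recomputing it.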
Fig. 3 shows a flow chart of the selective processing step of fig. 2. In fig. 3, the call request indicates a request to upload the data to be processed. According to some embodiments, when the configuration information includes a data format requirement, step S206 may include: step S2062, determining whether the format of the data to be processed meets the data format requirement; and step S2063, in response to determining that the format does not meet the data format requirement, converting the format of the data to be processed based on the data format requirement. Step S2061 determines whether the configuration information includes a data format requirement: if it does, steps S2062 and S2063 are performed; if it does not, steps S2062 and S2063 are skipped and processing moves on to the next step. The data format requirement is the calling end's instruction to convert the format of the data to be processed, and it may include target format information. For example, when the data to be processed is an audio file, the target format information may be any one of mp3, wav, m4a, and the like. Suppose data A is an mp3 audio file and the target format in the calling end's data format requirement is wav: the format of data A is determined not to meet the requirement, so data A is converted from mp3 to wav. Converting the data to the format required by the calling end before storing it makes the data convenient for the calling end to use. Determining in advance whether the format already meets the requirement avoids unnecessary conversions and reduces the use of computing resources.
In some examples, the format of the data to be processed may be converted directly, without first determining whether it meets the data format requirement. In some examples, the format of the data to be processed may be converted by the multimedia processing tool FFmpeg.
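The decision logic of steps S2061 to S2063 can be sketched as a function that either returns nothing (no conversion needed) or an FFmpeg command line to run. The sketch makes two simplifying assumptions: the current format is read from the file extension rather than detected from the content, and the basic `ffmpeg -i input output` invocation is used, in which FFmpeg picks the output format from the output file's extension. The function name `plan_conversion` is hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def plan_conversion(source: str, target_format: Optional[str]) -> Optional[List[str]]:
    """Return an ffmpeg command, or None when no conversion is needed."""
    if target_format is None:
        # S2061: the configuration carries no data format requirement.
        return None
    path = Path(source)
    current = path.suffix.lstrip(".").lower()
    if current == target_format.lower():
        # S2062: the format already meets the requirement.
        return None
    # S2063: convert; the command is only built here, not executed.
    output = path.with_suffix("." + target_format.lower())
    return ["ffmpeg", "-i", str(path), str(output)]


command = plan_conversion("a.mp3", "wav")
```

The returned list could be passed to `subprocess.run` on a machine where FFmpeg is installed; building the command separately keeps the decision logic testable without it.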
According to some embodiments, the configuration information may also include a data encoding requirement, in which case step S206 may further include: determining whether the encoding of the data to be processed meets the data encoding requirement; and in response to determining that the encoding does not meet the data encoding requirement, converting the encoding of the data to be processed based on the data encoding requirement. The data encoding requirement is the calling end's instruction to transcode the data to be processed, and it may include target encoding information. Encoding conversion works in the same way as the format conversion described above and is not described again here.
According to some embodiments, the configuration information may also include an input/output (IO) stream conversion request, in which case step S206 may further include performing IO stream conversion on the data to be processed based on that request. The IO stream conversion request is the calling end's instruction to perform IO stream conversion on the data to be processed.
According to some embodiments, as shown in fig. 3, when the configuration information includes a data deduplication request, step S206 may further include: step S2065, determining whether data with the same content as the data to be processed exists in at least one storage unit; and step S2066, in response to determining that such data exists in the at least one storage unit, sending a data repetition notification, which prevents files from being stored repeatedly and storage resources from being wasted. Step S2064 determines whether the configuration information includes a data deduplication request: if it does, steps S2065 and S2066 are performed; if it does not, they are skipped. The data deduplication request instructs the system to deduplicate repeatedly stored data. The at least one storage unit is a storage unit connected to the data management system that implements method 2000, i.e., the unit in which the data to be processed is stored. In some examples, whether the content is the same may be determined from identification information that identifies the data to be processed, for example an MD5 value. In some examples, the data information to be processed may further include the identification information, and the identification information may be stored together with the data to be processed; whether data with the same content exists can then be determined by checking whether the same identification information is already present in the storage unit.
In some examples, method 2000 may further include detecting the identification information of the data to be processed and storing it together with the data, so that whether data with the same content exists can be determined by checking whether the same identification information is already present in the storage unit. In some examples, when the data repetition notification is sent, the system may decide, according to the calling end's requirements, whether to continue uploading the data to be processed, or it may directly reject the upload request.
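Deduplication by identification information can be sketched as a lookup of MD5 values. In this sketch the stored identification information is modelled as a dictionary mapping each storage unit name to a set of MD5 digests, and a duplicate upload is rejected outright, which is one of the two behaviours the text allows. The function name `upload` and the returned message strings are hypothetical.

```python
import hashlib
from typing import Dict, Set


def upload(data: bytes, unit: str, md5_index: Dict[str, Set[str]],
           deduplicate: bool) -> str:
    """Store data in `unit` unless a deduplication request finds a duplicate."""
    digest = hashlib.md5(data).hexdigest()
    if deduplicate:
        # S2065: look for the same identification information in every unit.
        for name, digests in md5_index.items():
            if digest in digests:
                # S2066: report the duplicate instead of storing again.
                return f"duplicate: already stored in {name}"
    # Record the identification information alongside the stored data.
    md5_index.setdefault(unit, set()).add(digest)
    return f"stored in {unit}"


index: Dict[str, Set[str]] = {}
first = upload(b"song", "local", index, deduplicate=True)
second = upload(b"song", "nfs", index, deduplicate=True)
third = upload(b"song", "nfs", index, deduplicate=False)
```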
According to some embodiments, as shown in fig. 3, when the configuration information includes storage unit information, step S206 may further include: step S2068, storing the data to be processed in the storage unit, among the at least one storage unit, that corresponds to the storage unit information, so that the data is stored in the specific storage unit the calling end requires and is ready for the calling end to use. If it is determined in step S2067 that the configuration information includes storage unit information, step S2068 is performed. If it is determined in step S2067 that the configuration information does not include storage unit information, step S2068 is not performed and the data to be processed is stored directly in any storage unit. In some examples, the at least one storage unit may fall into two major classes, file storage and object storage, and may include, for example, at least one of the Baidu Object Storage (BOS) system, a Network File System (NFS), the distributed file system Ceph, and local storage. This allows multiple underlying storage modes to be connected and increases the diversity of data storage.
According to some embodiments, to increase the reliability and security of the storage, step S2068 may include: determining whether the storage unit corresponding to the storage unit information is available; and in response to determining that it is not available, storing the data to be processed in an available storage unit among the at least one storage unit. The data to be processed is thus stored in an available storage unit whenever the storage unit required by the calling end is found to be unavailable. This dynamic switching of storage units is transparent to the calling end, which ensures that the calling end's requests continue to be served. A storage unit may be unavailable because, for example, communication with the storage unit is interrupted, the storage unit itself has failed, or the storage unit is being upgraded.
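The storage unit selection with fallback can be sketched as follows. Availability is modelled as a dictionary of booleans, which stands in for whatever health check a real system would perform, and the fallback simply takes the first available unit in dictionary order. The unit names and the function name `choose_storage_unit` are hypothetical.

```python
from typing import Dict, Optional


def choose_storage_unit(requested: Optional[str],
                        availability: Dict[str, bool]) -> str:
    """Pick the requested unit if it is available, else any available unit."""
    # S2067/S2068: honour the storage unit information when possible.
    if requested is not None and availability.get(requested, False):
        return requested
    # Fallback: switch to an available unit without involving the caller.
    for name, is_available in availability.items():
        if is_available:
            return name
    raise RuntimeError("no storage unit is available")


availability = {"bos": False, "nfs": True, "local": True}
fallback = choose_storage_unit("bos", availability)
honoured = choose_storage_unit("local", availability)
unspecified = choose_storage_unit(None, availability)
```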
Although the operations are depicted in the drawings in a particular order, this should not be understood as requiring that the operations be performed in the particular order shown or in sequential order, nor should it be understood that all illustrated operations must be performed in order to achieve desirable results. For example, steps S2064 to S2066 may be performed before step S2061 or concurrently with step S2061. For another example, S2064 to S2066 may even be omitted.
According to some embodiments, when the call request indicates a request for downloading the data to be processed, and the configuration information includes a data format requirement, step S206 further includes performing format conversion on the data to be processed based on the data format requirement, so as to realize downloading of data in a specific format according to the requirement of the call end, thereby meeting the requirements of various service scenarios. Features in this step are the same as those of the step of performing format conversion on the data to be processed in the uploading flow of fig. 3, and will not be described here again. In some examples, before converting the format of the data to be processed, it may be determined whether the format of the data to be processed meets the data format requirement. And when the format of the data to be processed does not meet the data format requirement, converting the format of the data to be processed. In some examples, the data to be processed may be directly converted in format without determining in advance whether the format of the data to be processed meets the data format requirement.
According to some embodiments, when the call request indicates a request for downloading the data to be processed, and the configuration information includes a data encoding requirement, step S206 further includes performing transcoding on the data to be processed based on the data encoding requirement, so as to achieve downloading of the data with specific encoding according to the requirements of the call end, thereby meeting the requirements of multiple service scenarios. Features in this step are the same as those of the step of performing code conversion on the data to be processed in the uploading flow of fig. 3, and will not be described here again.
According to some embodiments, when the call request indicates a request to download the data to be processed, it may be determined whether the configuration information includes a data available time. When the configuration information includes a data available time, step S206 may include generating an anti-theft link for the data to be processed based on the data available time, identification information for identifying the data to be processed, and a download address of the data to be processed. That is, based on the data available time, the identification information, and the download address, a random anti-theft link that is valid only within the data available time is generated through a customized encryption algorithm, so that the data is prevented from being maliciously stolen. The identification information may include a data ID. In some examples, an anti-theft link that is valid for a period of time may also be generated for the data to be processed based on the timestamp of the received call request, the identification information, and the download address, without depending on the configuration information.
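As a rough illustration of a time-limited anti-theft link, the sketch below signs the download path, the data ID, and an expiry time. The disclosure only mentions a "customized encryption algorithm"; HMAC-SHA256 is substituted here purely as an assumption, and `SECRET_KEY`, the query parameter names, and the example URL are hypothetical. The sketch also assumes the download address carries no query string of its own.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET_KEY = b"example-secret"  # hypothetical; kept private on the server side


def _sign(path: str, data_id: str, expires: int) -> str:
    message = f"{path}|{data_id}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_antitheft_link(download_address: str, data_id: str,
                        available_seconds: int,
                        now: Optional[float] = None) -> str:
    """Build a link that is only valid until now + available_seconds."""
    now = time.time() if now is None else now
    expires = int(now) + available_seconds
    path = urlsplit(download_address).path
    query = urlencode({"id": data_id, "expires": expires,
                       "sig": _sign(path, data_id, expires)})
    return f"{download_address}?{query}"


def verify_antitheft_link(link: str, now: Optional[float] = None) -> bool:
    """Reject links that have expired or whose parameters were altered."""
    now = time.time() if now is None else now
    parts = urlsplit(link)
    params = parse_qs(parts.query)
    try:
        data_id = params["id"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    return hmac.compare_digest(sig, _sign(parts.path, data_id, expires))
```

A link built with `make_antitheft_link("https://files.example.com/d/report.pdf", "42", 60)` would verify for 60 seconds and fail afterwards, or as soon as the `id` or `expires` parameter is changed.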
According to some embodiments, when the call request indicates a request for migration or backup of the data to be processed, the corresponding processing operation is performed on the data to be processed; the details are not repeated here.
According to some embodiments, the method 2000 is used in a data management system that comprises a storage module interacting with at least one storage unit through a unified interface, and the method 2000 may further comprise: counting the number of communication interruptions between the storage module and each of the at least one storage unit over a period of time; determining whether the number of communication interruptions between the storage module and each of the at least one storage unit is greater than a number threshold; and in response to determining that the number of communication interruptions between the storage module and a certain storage unit of the at least one storage unit is greater than the number threshold, sending a connection exception notification. In this way, the availability of the storage units is monitored and alarms are raised automatically, which helps ensure the reliability or security of data uploading or downloading.
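The counting-and-threshold logic above might be sketched as follows. The class name `InterruptionMonitor`, the sliding-window interpretation of "over a period of time", and the `notify` callback are all assumptions for illustration; the disclosure does not specify how the period is measured.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, Optional


class InterruptionMonitor:
    """Counts communication interruptions per storage unit in a time window."""

    def __init__(self, window_seconds: float, threshold: int,
                 notify: Callable[[str, int], None]) -> None:
        self.window_seconds = window_seconds
        self.threshold = threshold
        self.notify = notify  # hypothetical hook for the exception notification
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def record_interruption(self, unit: str,
                            now: Optional[float] = None) -> int:
        """Record one interruption and return the count inside the window."""
        now = time.monotonic() if now is None else now
        events = self._events[unit]
        events.append(now)
        # Drop interruptions that fall outside the counting window.
        while events and now - events[0] > self.window_seconds:
            events.popleft()
        if len(events) > self.threshold:
            self.notify(unit, len(events))
        return len(events)
```

With a threshold of 2 and a 60-second window, the third interruption of the same unit inside the window triggers the callback, while an interruption that arrives much later starts the count again.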
According to some embodiments, the availability of the storage unit may also be monitored according to feedback on the abnormal uploading or downloading of data at the calling end. At this time, the method 2000 may further include transmitting a status exception notification in response to receiving feedback on the abnormal uploading or downloading of the data from the calling end, thereby ensuring reliability or security of the data uploading or downloading.
According to some embodiments, the method 2000 may further include, in response to receiving at least one of the connection exception notification and the status exception notification, migrating data in the corresponding exception storage unit to other storage units of the at least one storage unit other than the exception storage unit, to increase the security and reliability of data storage and thereby ensure that call requests are served normally. In some examples, data in the at least one storage unit may be backed up periodically to increase the security of the stored data even if no connection exception notification or status exception notification is received.
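A minimal sketch of the migration step is given below. Storage units are modeled as plain dictionaries, and choosing the least-filled remaining unit as the target is an arbitrary illustrative policy; neither is taken from the disclosure.

```python
from typing import Dict


def migrate_from(exception_unit: str, units: Dict[str, Dict[str, bytes]]) -> str:
    """Move every object out of the exception unit into another unit.

    Returns the name of the unit that received the data.
    """
    targets = [name for name in units if name != exception_unit]
    if not targets:
        raise RuntimeError("no other storage unit to migrate to")
    # Illustrative policy only: pick the remaining unit holding the fewest objects.
    target = min(targets, key=lambda name: len(units[name]))
    # Note: objects with the same key in the target would be overwritten here.
    units[target].update(units[exception_unit])
    units[exception_unit].clear()
    return target
```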
According to some embodiments, in the data uploading process, when the data to be processed is stored, if the connection exception notification or the state exception notification is received about a storage unit where the data to be processed needs to be stored, the data to be processed is stored in another storage unit in at least one storage unit.
According to some embodiments, upon receiving the call request, the method 2000 may further comprise: determining whether the call request is granted the corresponding right; and in response to determining that the call request is not granted the corresponding rights, sending a request rejection notification. Wherein determining whether the call request is granted the corresponding rights includes at least one of: whether a calling end sending a calling request is granted with calling authority, whether a calling request containing an uploading request is granted with uploading authority, whether a calling request containing a downloading request is granted with downloading authority, whether a calling request containing a backup request is granted with backup authority, whether a calling request containing a migration request is granted with migration authority, and whether a calling frequency used by the calling request is granted with specific calling frequency authority. The security of the data management system is increased by detecting whether the sender of the call request and corresponding operations are granted corresponding rights, so that illegal and malicious calls are prevented.
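The permission checks listed above can be sketched as a single lookup against a grant table. The `Grants` record, the caller names, and the rejection strings are hypothetical and serve only to show the order of checks (caller, operation, calling frequency).

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Grants:
    """Hypothetical record of what one calling end is allowed to do."""
    operations: Set[str] = field(default_factory=set)
    max_calls_per_minute: int = 0


def check_request(caller: str, operation: str, calls_per_minute: int,
                  grants: Dict[str, Grants]) -> str:
    """Return "accepted" or a request rejection notification text."""
    granted = grants.get(caller)
    if granted is None:
        return "rejected: caller has no calling permission"
    if operation not in granted.operations:
        return f"rejected: no {operation} permission"
    if calls_per_minute > granted.max_calls_per_minute:
        return "rejected: calling frequency not permitted"
    return "accepted"
```

For example, a caller granted only `upload` and `download` would be rejected when it sends a migration request, even though the caller itself is known to the system.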
According to some embodiments, the method 2000 may further comprise: determining whether the traffic occupied by the operation of the calling end exceeds a traffic threshold; and in response to determining that the traffic exceeds the traffic threshold, limiting the operation of the calling end or sending a request rejection notification. If the call requests from one calling end occupy a large amount of traffic in a short time, other calling ends will be unable to use the data management system that implements the method 2000. Therefore, to ensure the reliability and security of the data management system, the occupied traffic may be appropriately limited or the corresponding call request may be directly discarded.
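One way to picture the traffic check is a per-caller byte budget over a sliding time window, as in the sketch below. The `TrafficLimiter` class, the byte-based measure of traffic, and the window length are assumptions; the disclosure does not say how traffic is measured or limited.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional, Tuple


class TrafficLimiter:
    """Allows each calling end at most max_bytes within window_seconds."""

    def __init__(self, max_bytes: int, window_seconds: float) -> None:
        self.max_bytes = max_bytes
        self.window_seconds = window_seconds
        self._usage: Dict[str, Deque[Tuple[float, int]]] = defaultdict(deque)

    def allow(self, caller: str, size: int,
              now: Optional[float] = None) -> bool:
        """Return True if the operation fits the budget, else False."""
        now = time.monotonic() if now is None else now
        usage = self._usage[caller]
        # Forget traffic that is older than the window.
        while usage and now - usage[0][0] > self.window_seconds:
            usage.popleft()
        used = sum(n for _, n in usage)
        if used + size > self.max_bytes:
            return False  # the caller of allow() would send a rejection notice
        usage.append((now, size))
        return True
```

Because the budget is tracked per calling end, one heavy caller being throttled does not affect the others.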
According to some embodiments, the method 2000 may further include collecting statistics on the size of the data to be processed, the number of call requests, various exceptions, and the like, so that corresponding reporting and early warning can be carried out, for example in the form of a report, to help keep the data management system secure.
Fig. 4 illustrates a block diagram of a data management device according to some embodiments of the present disclosure. As shown in fig. 4, the apparatus 4000 for data management may include: a receiving module 410, an acquisition module 420 and a processing module 430. The receiving module 410 is configured to receive a call request containing data information to be processed and corresponding configuration information. The acquisition module 420 is configured to acquire the data to be processed based on the data information to be processed. The processing module 430 is configured to selectively process the data to be processed based on the configuration information.
It should be appreciated that the various modules of the apparatus 4000 shown in fig. 4 may correspond to the various steps in the method 2000 described with reference to fig. 2 and 3. Thus, the operations, features and advantages described above with respect to method 2000 apply equally to apparatus 4000 and the modules that it comprises. For brevity, certain operations, features and advantages are not described in detail herein.
Fig. 5 shows a block diagram of a data management apparatus according to further embodiments of the present disclosure. As shown in fig. 5, the apparatus 5000 may include: an access layer 510, a processing layer 520, a storage layer 530, and a monitoring layer 550.
The access layer 510 includes a receiving module 511, a rights authentication module 512, and a current limiting module 513.
Wherein the receiving module 511 is configured to receive a call request containing data information to be processed and corresponding configuration information;
rights authentication module 512 is configured to determine whether the call request is granted the corresponding rights; and sending a request rejection notification in response to determining that the invocation request is not granted the corresponding rights; and
the current limiting module 513 is configured to determine whether the traffic occupied by the operation of the calling end exceeds a traffic threshold; and limiting the operation of the calling end or sending a request refusal notice in response to the fact that the flow exceeds the flow threshold.
The processing layer 520 includes a detection module 521, a conversion module 522, a deduplication module 523, a dynamic switching module 524, an address generation module 525 and an anti-theft module 526.
Wherein the detection module 521 is configured to determine whether at least one of the format, encoding, size, etc. of the data to be processed meets a specific requirement; and in response to determining that the at least one of the format, encoding, size, etc. of the data to be processed does not meet the specific requirement, send a data unavailability notification;
the conversion module 522 is configured to convert the format of the data to be processed based on the data format requirement; determine whether the encoding of the data to be processed meets the data encoding requirement; convert the encoding of the data to be processed based on the data encoding requirement; and perform IO stream conversion on the data to be processed based on the IO stream conversion request;
The deduplication module 523 is configured to determine whether there is data in at least one storage unit that is identical to the content of the data to be processed; and transmitting a data repetition notification in response to determining that there is data in the at least one storage unit that is the same as the content of the data to be processed;
the dynamic switching module 524 is configured to determine whether a storage unit corresponding to the storage unit information is available; in response to determining that a storage unit corresponding to the storage unit information is not available, storing the data to be processed in an available storage unit of the at least one storage unit; and in response to receiving at least one of the connection exception notification and the status exception notification, migrating data in the corresponding exception storage unit to other storage units of the at least one storage unit other than the exception storage unit;
The address generation module 525 is configured to generate a download address for the data to be processed; and
the anti-theft module 526 is configured to generate an anti-theft link for the data to be processed based on the data availability time, identification information for identifying the data to be processed, and a download address of the data to be processed.
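The content comparison performed by the deduplication module 523 described above could, for instance, be based on content digests. The sketch below assumes SHA-256 hashing and an in-memory index; the disclosure does not state how identical content is detected, so both are illustrative assumptions, as are the names `DedupIndex` and `content_digest`.

```python
import hashlib
from typing import Dict, Optional


def content_digest(data: bytes) -> str:
    """Digest used as a proxy for 'same content' (an assumption of this sketch)."""
    return hashlib.sha256(data).hexdigest()


class DedupIndex:
    """Maps content digests to the key under which the content is stored."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, str] = {}

    def find_duplicate(self, data: bytes) -> Optional[str]:
        """Return the key of already stored identical content, if any."""
        return self._by_digest.get(content_digest(data))

    def register(self, key: str, data: bytes) -> None:
        """Remember that this content is stored under key (first key wins)."""
        self._by_digest.setdefault(content_digest(data), key)
```

When `find_duplicate` returns a key, the module would send the data repetition notification instead of storing the content a second time.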
The storage layer 530 includes a storage module 531. The storage module 531 interacts with at least one storage unit 541, 542 … n via a unified interface. The storage module 531 defines a single interface standard, thereby providing a unified upload interface, download interface, migration interface, and backup interface for the at least one storage unit 541, 542 … n. The storage module 531 may be configured to obtain the data to be processed from the at least one storage unit, for example, based on the data information to be processed (e.g., index information). The at least one storage unit 541, 542 … n corresponds to the at least one storage unit described in the method 2000 above, and will not be described again here.
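The idea of a single interface standard in front of heterogeneous storage units can be sketched with an abstract base class. The class and method names below (`StorageUnit`, `InMemoryUnit`, `StorageModule`, `upload`, `download`, `delete`, `backup`, `migrate`) are hypothetical, and the in-memory backend merely stands in for real adapters such as BOS, NFS, CEPH or local storage.

```python
from abc import ABC, abstractmethod
from typing import Dict


class StorageUnit(ABC):
    """Interface standard every backend adapter is assumed to implement."""

    @abstractmethod
    def upload(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def download(self, key: str) -> bytes: ...

    @abstractmethod
    def delete(self, key: str) -> None: ...


class InMemoryUnit(StorageUnit):
    """Toy backend used here in place of a real storage system."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def upload(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def download(self, key: str) -> bytes:
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]


class StorageModule:
    """Exposes the same four operations regardless of the backend behind them."""

    def __init__(self, units: Dict[str, StorageUnit]) -> None:
        self._units = units

    def upload(self, unit: str, key: str, data: bytes) -> None:
        self._units[unit].upload(key, data)

    def download(self, unit: str, key: str) -> bytes:
        return self._units[unit].download(key)

    def backup(self, source: str, target: str, key: str) -> None:
        # Copy the object; the source keeps its copy.
        self._units[target].upload(key, self._units[source].download(key))

    def migrate(self, source: str, target: str, key: str) -> None:
        # Copy the object, then remove it from the source.
        self.backup(source, target, key)
        self._units[source].delete(key)
```

Adding a new storage system in this sketch only requires a new `StorageUnit` subclass; the code that calls `StorageModule` does not change.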
The monitoring layer 550 includes a monitoring module 551 and a statistics module 552.
Wherein the monitoring module 551 is configured to count the number of communication interruptions between the storage module and each of the at least one storage unit over a period of time; determine whether the number of communication interruptions between the storage module and each of the at least one storage unit is greater than a number threshold; in response to determining that the number of communication interruptions between the storage module and a certain one of the at least one storage unit is greater than the number threshold, send a connection exception notification; and send a status exception notification in response to receiving feedback from the calling end regarding abnormal data uploading or downloading;
The statistics module 552 is configured to count the size of the data to be processed, the number of call requests, various exceptions, and the like.
It should be appreciated that the various modules of the apparatus 5000 shown in fig. 5 may correspond to the various steps in the method 2000 described with reference to fig. 2 and 3. For example, the receiving module 511 may be used to perform step S202 in the method 2000; the rights authentication module 512 may, for example, perform the rights verification step in method 2000; the current limiting module 513 may, for example, perform the traffic verification step in method 2000; the processing layer 520 may be used, for example, to perform step S206 in the method 2000; the detection module 521 may be used, for example, to perform the step of detecting the format and size of the data in the method 2000; the conversion module 522 may be used, for example, to perform steps S2061 to S2063 in the method 2000, format conversion in the download flow, transcoding steps in the upload and download flows, and IO stream conversion steps, etc.; the deduplication module 523 may be used, for example, to perform steps S2064 to S2066 in the method 2000; the dynamic switching module 524 may be used, for example, to perform steps S2067 to S2068 and the migration step in the method 2000; the anti-theft module 526 may be used, for example, to perform the anti-theft link generation step in method 2000; the storage module 531 corresponds to the storage module mentioned in the method 2000, and may, for example, perform step S204 in the method 2000; the monitoring module 551 may, for example, be used to perform the connection and status exception determination steps in the method 2000; and the statistics module 552 may be used, for example, to perform the statistics steps in method 2000. The operations, features and advantages described above with respect to method 2000 thus apply equally to apparatus 5000 and the modules it comprises. For brevity, certain operations, features and advantages are not described in detail herein.
According to another aspect of the present disclosure, there is also provided a computer device comprising a memory, a processor and a computer program stored on the memory, the processor being configured to execute the computer program to implement the steps of the method 2000 of data management described above.
According to yet another aspect of the present disclosure, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method 2000 of data management described above.
According to yet another aspect of the present disclosure, there is also provided a computer program product comprising a computer program which, when executed by a processor, implements the steps of the method 2000 of data management described above.
Referring to fig. 6, a block diagram of a computer device 6000, which may be a server or a client of the present disclosure, will now be described, which is an example of a hardware device that may be applied to aspects of the present disclosure.
Computer device 6000 may include elements that are connected to bus 6002 or that communicate with bus 6002 (possibly via one or more interfaces). For example, computer device 6000 may include a bus 6002, one or more processors 6004, one or more input devices 6006 and one or more output devices 6008. The one or more processors 6004 may be any type of processor and may include, but are not limited to, one or more general purpose processors and/or one or more special purpose processors (e.g., special processing chips). The processor 6004 may process instructions executed within the computer device 6000, including instructions stored in or on memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to an interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories and multiple types of memory. Also, multiple computer devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). In fig. 6, one processor 6004 is illustrated as an example.
Input device 6006 may be any type of device capable of inputting information to computer device 6000. The input device 6006 may receive entered numeric or character information and generate key signal inputs related to user settings and/or functional control of a computer device for implementing the method of data management, and may include, but is not limited to, a mouse, keyboard, touch screen, trackpad, trackball, joystick, microphone, and/or remote control. The output device 6008 may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers.
Computer device 6000 may also include, or be connected to, non-transitory storage device 6010, which may be any storage device that is non-transitory and that may enable data storage, and may include, but is not limited to, magnetic disk drives, optical storage devices, solid state memory, floppy diskettes, flexible disks, hard disks, magnetic tape, or any other magnetic medium, optical disks or any other optical medium, ROM (read-only memory), RAM (random access memory), cache memory, and/or any other memory chip or cartridge, and/or any other medium from which a computer may read data, instructions, and/or code. The non-transitory storage device 6010 may be detachable from the interface. The non-transitory storage device 6010 may have data/program (including instructions)/code/modules (e.g., the receiving module 410, the obtaining module 420, and the processing module 430 shown in fig. 4) for implementing the methods and steps described above.
Computer device 6000 may also include communication device 6012. The communication device 6012 may be any type of device or system that enables communication with external devices and/or with a network and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication devices, and/or chipsets, such as Bluetooth(TM) devices, 802.11 devices, WiFi devices, WiMax devices, cellular communication devices, and/or the like.
Computer device 6000 may also include a working memory 6014, which may be any type of working memory that may store programs (including instructions) and/or data useful for the operation of processor 6004 and may include, but is not limited to, random access memory and/or read-only memory devices.
Software elements (programs) may reside in the working memory 6014 including, but not limited to, an operating system 6016, one or more application programs 6018, drivers, and/or other data and code. Instructions for performing the above-described methods and steps may be included in one or more applications 6018, and the above-described methods may be implemented by instructions of the one or more applications 6018 being read and executed by the processor 6004. Executable code or source code for instructions of software elements (programs) may also be downloaded from a remote location.
It should also be understood that various modifications may be made according to specific requirements. For example, custom hardware may also be used, and/or particular elements may be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. For example, some or all of the disclosed methods and apparatus may be implemented by programming hardware (e.g., programmable logic circuits including Field Programmable Gate Arrays (FPGAs) and/or Programmable Logic Arrays (PLAs)) in an assembly language or a hardware programming language such as Verilog, VHDL, or C++, using logic and algorithms according to the present disclosure.
It should also be appreciated that the foregoing method may be implemented in a server-client mode. For example, a client may receive data entered by a user and send the data to a server. The client may also receive data input by the user, perform a part of the foregoing processing, and send the processed data to the server. The server may receive data from the client, perform the aforementioned method or another part of the aforementioned method, and return the execution result to the client. The client may receive the result of the execution of the method from the server and may present it to the user, for example, via an output device. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computer devices and having a client-server relationship to each other. The server may be a server of a distributed system or a server that incorporates a blockchain. The server can also be a cloud server, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology. A cloud server is a host product in a cloud computing service system that addresses the drawbacks of difficult management and weak service scalability found in traditional physical host and virtual private server (VPS) services.
It should also be appreciated that components of computer device 6000 may be distributed over a network. For example, some processes may be performed using one processor while other processes may be performed by another processor remote from the one processor. Other components of computer device 6000 may also be similarly distributed. As such, computer device 6000 may be interpreted as a distributed computing system which performs processing in a variety of locations.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and apparatus are merely exemplary embodiments or examples, and that the scope of the present invention is not limited by these embodiments or examples but only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalent elements thereof. Furthermore, the steps may be performed in a different order than described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (16)

1. A method of data management, comprising:
receiving a call request containing data information to be processed and corresponding configuration information;
acquiring data to be processed based on the data information to be processed; and
selectively processing the data to be processed based on the configuration information,
wherein the pending data information includes the pending data, the call request indicates a request to upload the pending data, and the method further comprises:
determining whether at least one of the format, the coding and the size of the data to be processed meets specific requirements; and
in response to determining that the at least one of format, encoding, and size of the data to be processed does not meet the particular requirement, sending a data unavailability notification, and
wherein the configuration information includes data format requirements, and the selectively processing the data to be processed includes:
determining whether the format of the data to be processed meets the data format requirement; and
in response to determining that the format of the data to be processed does not meet the data format requirement, converting the format of the data to be processed based on the data format requirement.
2. The method of claim 1, wherein the configuration information includes storage unit information, and the selectively processing the data to be processed includes:
storing the data to be processed into a storage unit corresponding to the storage unit information in at least one storage unit.
3. The method of claim 2, wherein the storing the data to be processed into a storage unit corresponding to the storage unit information in at least one storage unit comprises:
determining whether a storage unit corresponding to the storage unit information is available; and
in response to determining that a storage unit corresponding to the storage unit information is not available, storing the data to be processed in an available storage unit of the at least one storage unit.
4. The method of claim 1, wherein the configuration information comprises a data deduplication request, and the selectively processing the data to be processed comprises:
determining whether data with the same content as the data to be processed exists in at least one storage unit; and
in response to determining that data with the same content as the data to be processed exists in the at least one storage unit, transmitting a data repetition notification.
5. The method of claim 1, wherein the pending data is stored in a certain one of at least one storage unit, the call request indicates a request to download the pending data, the acquiring the pending data includes acquiring the pending data from the at least one storage unit based on the pending data information, the configuration information includes data format requirements, and the selectively processing the pending data includes format converting the pending data based on the data format requirements.
6. The method of claim 5, wherein the configuration information includes a data availability time, and the selectively processing the data to be processed comprises:
generating an anti-theft link for the data to be processed based on the data available time, the identification information for identifying the data to be processed and the download address of the data to be processed.
7. The method of any of claims 1 to 6, wherein the method is used in a data management system comprising a storage module that interacts with at least one storage unit through a unified interface, and the method further comprises:
counting the number of communication interruptions between the storage module and each of the at least one storage unit over a period of time;
determining whether the number of communication interruptions between the storage module and each of the at least one storage unit is greater than a number threshold; and
in response to determining that the number of communication interruptions between the storage module and a certain one of the at least one storage unit is greater than the number threshold, sending a connection exception notification.
8. The method of claim 7, the method further comprising:
in response to receiving feedback from the calling end about abnormal data uploading or downloading, sending a status exception notification.
9. The method of claim 8, wherein the method further comprises:
in response to receiving at least one of the connection exception notification and the status exception notification, migrating data in a corresponding exception storage unit to other storage units of the at least one storage unit than the exception storage unit.
10. The method of claim 7, wherein the receiving a call request containing data information to be processed and corresponding configuration information includes selectively connecting a calling end to the data management system through at least one of an RPC and an SDK.
11. The method of any of claims 2-6, wherein the at least one storage unit comprises at least one of BOS, CEPH, NFS and local storage.
12. The method of any one of claims 1 to 6, further comprising:
determining whether the call request is granted corresponding rights; and
in response to determining that the call request is not granted the corresponding rights, sending a request rejection notification.
13. The method of any one of claims 1 to 6, further comprising:
determining whether the flow occupied by the operation of the calling end exceeds a flow threshold; and
in response to determining that the flow exceeds the flow threshold, limiting the operation of the calling end or sending a request rejection notification.
14. An apparatus for data management, comprising:
the receiving module is configured to receive a call request containing data to be processed and corresponding configuration information;
the acquisition module is configured to acquire data to be processed based on the data information to be processed; and
a processing module configured to selectively process the data to be processed based on the configuration information,
Wherein the pending data information includes the pending data, the call request indicates a request to upload the pending data, and the apparatus further includes:
a determining module configured to determine whether at least one of a format, an encoding, and a size of the data to be processed meets a particular requirement; and
a transmission module configured to transmit a data unavailability notification in response to determining that the at least one of a format, an encoding, and a size of the data to be processed does not meet the particular requirement, an
Wherein the configuration information includes data format requirements, and the selectively processing the data to be processed includes:
determining whether the format of the data to be processed meets the data format requirement; and
converting the format of the data to be processed based on the data format requirement in response to determining that the format of the data to be processed does not meet the data format requirement.
15. A computer device, comprising:
a memory, a processor and a computer program stored on the memory,
wherein the processor is configured to execute the computer program to implement the steps of the method of any one of claims 1 to 13.
16. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the steps of the method of any of claims 1 to 13.
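
As a purely illustrative sketch (not part of the claims), the checks recited in claims 12 to 14 could be arranged as follows in Python. The class names, the use of a cumulative byte count as the "flow", the set of accepted formats, the UTF-8 encoding requirement, and the CSV-to-JSON conversion are all assumptions chosen for this example; the claims do not specify any of them.

```python
import csv
import io
import json
from dataclasses import dataclass, field


class RequestRejected(Exception):
    """Stands in for the request rejection notification (claims 12 and 13)."""


class DataUnavailable(Exception):
    """Stands in for the data unavailability notification (claim 14)."""


@dataclass
class CallRequest:
    caller: str           # hypothetical identifier of the calling end
    data: bytes           # data to be processed (upload case)
    data_format: str      # declared format of the uploaded data
    required_format: str  # data format requirement from the configuration information


@dataclass
class DataManager:
    authorized: set       # calling ends granted the corresponding rights
    flow_threshold: int   # assumed unit: cumulative bytes per calling end
    max_size: int         # assumed size requirement for a single upload
    accepted_formats: tuple = ("csv", "json")
    usage: dict = field(default_factory=dict)

    def handle(self, req: CallRequest) -> bytes:
        # Rights check: reject requests that have not been granted rights.
        if req.caller not in self.authorized:
            raise RequestRejected("rights not granted")
        # Flow check: reject once the calling end would exceed the threshold.
        used = self.usage.get(req.caller, 0) + len(req.data)
        if used > self.flow_threshold:
            raise RequestRejected("flow threshold exceeded")
        self.usage[req.caller] = used
        # Determining module: format, size, and encoding of the upload.
        if req.data_format not in self.accepted_formats:
            raise DataUnavailable("unsupported format")
        if len(req.data) > self.max_size:
            raise DataUnavailable("data too large")
        try:
            text = req.data.decode("utf-8")
        except UnicodeDecodeError as exc:
            raise DataUnavailable("unsupported encoding") from exc
        # Processing module: convert only when the format requirement is not met.
        if req.data_format == req.required_format:
            return req.data
        return convert(text, req.data_format, req.required_format)


def convert(text: str, src: str, dst: str) -> bytes:
    """Convert between formats; only CSV to JSON is sketched here."""
    if (src, dst) == ("csv", "json"):
        rows = list(csv.DictReader(io.StringIO(text)))
        return json.dumps(rows).encode("utf-8")
    raise ValueError(f"conversion {src} -> {dst} is not sketched")


manager = DataManager(authorized={"svc-a"}, flow_threshold=1024, max_size=256)
request = CallRequest("svc-a", b"id,name\n1,x\n", "csv", "json")
print(manager.handle(request))  # b'[{"id": "1", "name": "x"}]'
```

In this sketch the rights and flow checks run before the data is inspected, so a rejected calling end never triggers validation or conversion work. That ordering is a design choice of the example rather than something the claims require.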
CN202110315932.5A 2021-03-24 2021-03-24 Method, device, computer equipment and storage medium for data management Active CN112988726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110315932.5A CN112988726B (en) 2021-03-24 2021-03-24 Method, device, computer equipment and storage medium for data management

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110315932.5A CN112988726B (en) 2021-03-24 2021-03-24 Method, device, computer equipment and storage medium for data management

Publications (2)

Publication Number Publication Date
CN112988726A CN112988726A (en) 2021-06-18
CN112988726B CN112988726B (en) 2024-03-01

Family

ID=76333526

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110315932.5A Active CN112988726B (en) 2021-03-24 2021-03-24 Method, device, computer equipment and storage medium for data management

Country Status (1)

Country Link
CN (1) CN112988726B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108566420A (en) * 2018-03-29 2018-09-21 上海点融信息科技有限责任公司 Data processing method, equipment and computer readable storage medium for block chain
CN110178391A (en) * 2017-01-23 2019-08-27 Oppo广东移动通信有限公司 Wireless communications method, terminal device and the network equipment
CN110247959A (en) * 2019-05-24 2019-09-17 深圳龙图腾创新设计有限公司 A kind of data transmission method and device
CN110740103A (en) * 2019-09-02 2020-01-31 深圳壹账通智能科技有限公司 Service request processing method and device, computer equipment and storage medium
CN110855772A (en) * 2019-11-08 2020-02-28 北京奇艺世纪科技有限公司 Cross-device data storage method, system, device, server and medium
CN112231727A (en) * 2020-10-19 2021-01-15 北京达佳互联信息技术有限公司 Data processing method and device, electronic equipment, server and storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9432238B2 (en) * 2012-12-20 2016-08-30 Dropbox, Inc. Communicating large amounts of data over a network with improved efficiency
US10878031B2 (en) * 2017-05-16 2020-12-29 Walmart Apollo, Llc Web services-based data transfers for item management
US10965771B2 (en) * 2018-09-25 2021-03-30 International Business Machines Corporation Dynamically switchable transmission data formats in a computer system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
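Research on a data management method in JMS based on a data warehouse; Zhang Xiaofang (张小芳); Gu Qingyue (古清月); Computer Engineering and Design (计算机工程与设计), No. 05; full text *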

Also Published As

Publication number Publication date
CN112988726A (en) 2021-06-18

Similar Documents

Publication Publication Date Title
KR100823122B1 (en) Remote management and access of databases, services and devices associated with a mobile terminal
US8296445B1 (en) Software testing harness
JP2018085117A (en) Efficient data compression and analysis as service
CN107436844B (en) Method and device for generating interface use case aggregate
US9667703B1 (en) System, method and computer program product for generating remote views in a virtual mobile device platform
CN110781408B (en) Information display method and device
CN111198797B (en) Operation monitoring method and device and operation analysis method and device
CN112583898A (en) Business process arranging method and device and readable medium
US20150356853A1 (en) Analyzing accelerometer data to identify emergency events
CN112799925A (en) Data acquisition method and device, electronic equipment and readable storage medium
WO2022171127A1 (en) Reducing start latency of serverless microservices
CN109450997B (en) Data cross-terminal migration method and device, computer equipment and storage medium
US11736619B2 (en) Automated indication of urgency using Internet of Things (IoT) data
CN112241362A (en) Test method, test device, server and storage medium
CN112988726B (en) Method, device, computer equipment and storage medium for data management
CN113170517B (en) Short message service linking for active feed communications
CN110930110B (en) Distributed flow monitoring method and device, storage medium and electronic equipment
US10674563B2 (en) Cognitive message dynamic response optimization
CN111343132B (en) File transmission detection method and device and storage medium
CN107168648B (en) File storage method and device and terminal
CN111338899B (en) Monitoring method, terminal and storage medium
WO2022193142A1 (en) Behavior monitoring method and apparatus, terminal device, and computer readable storage medium
CN114285839A (en) File transmission method and device, computer storage medium and electronic equipment
CN112764828B (en) Business logic management method and device, computer equipment and medium
CN114270309A (en) Resource acquisition method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant