CN113590381A - Feature distribution method and device, electronic equipment and computer-readable storage medium

Info

Publication number
CN113590381A
Authority
CN
China
Prior art keywords
image
distributed
image features
distribution
image feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110734720.0A
Other languages
Chinese (zh)
Inventor
张峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kuangshi Technology Co Ltd
Beijing Megvii Technology Co Ltd
Original Assignee
Beijing Kuangshi Technology Co Ltd
Application filed by Beijing Kuangshi Technology Co Ltd
Priority to CN202110734720.0A
Publication of CN113590381A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1464 Management of the backup or restore process for networked environments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G06F11/1469 Backup restoration techniques

Abstract

The invention relates to a feature distribution method, a feature distribution device, electronic equipment and a computer-readable storage medium. In each period T, N image features and the attribute information corresponding to each image feature are acquired from an image feature library to be distributed, the attribute information comprising an object identifier that indicates the object to which the image feature belongs. The N image features are de-duplicated according to the attribute information, and only the latest image feature corresponding to each object identifier is retained, yielding de-duplicated image features. The de-duplicated image features are stored in a bucket to be distributed located in memory, then taken out of the bucket and allocated to a plurality of distribution channels for parallel distribution. This avoids errors caused by image features being processed out of order in subsequent processing.

Description

Feature distribution method and device, electronic equipment and computer-readable storage medium
Technical Field
The application belongs to the field of image processing, and particularly relates to a feature distribution method and device, electronic equipment and a computer-readable storage medium.
Background
Performing feature extraction on a picture yields an image feature. Applications of image features often need to acquire a large number of them and distribute them to a receiving end through thread channels, so that the receiving end can perform feature comparison on them in subsequent processing.
Each image feature corresponds to an object, and one object may correspond to several image features. To improve the accuracy and efficiency of feature comparison at the receiving end, it is generally desirable to compare against a single image feature per object, namely the latest one. Therefore, when image features are distributed to the receiving end, a new image feature overwrites the old image features of the same object, so that the receiving end stores only the latest image feature of each object.
To improve distribution efficiency, image features are distributed over multiple thread channels working in parallel. Although this solves the efficiency problem, parallel distribution over multiple thread channels easily causes image features to be processed out of order and the image features stored at the receiving end to be duplicated.
For example, suppose image feature 1 and image feature 2 correspond to the same object, image feature 1 was generated before image feature 2, and image features 1-100 are distributed in parallel by 10 threads A-J. To avoid duplicated image features during parallel distribution, when the distribution end writes an image feature or the receiving end receives one, it first checks whether the receiving end already holds an image feature of the same object; if so, the received image feature overwrites the existing one, and if not, the received image feature is written. Because thread channels differ in processing speed, load and the like, image feature 1, which was generated first, may reach the receiving end later, while image feature 2, which was generated later, may reach it first, so the old image feature 1 may overwrite the new image feature 2. Avoiding this requires, besides checking whether an image feature of the same object exists at the receiving end, also comparing the generation times of image feature 2 and image feature 1 whenever one does exist, which greatly reduces distribution efficiency.
In another case, if thread A distributes image feature 1 to the receiving end while image feature 2 has already been delivered by thread B but has not yet been fully written, the receiving end may conclude that it holds no image feature for that object and allow image feature 1 to be written. The receiving end then stores both image feature 1 and image feature 2, so the stored image features are duplicated.
These problems are particularly acute when the image features of the same object are modified frequently.
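The first failure mode described above (an old feature overwriting a new one) can be reproduced deterministically. The following is only an illustrative sketch: the receiver is modelled as a plain dictionary, and the function name `apply_in_arrival_order` and the tuple layout are assumptions made for this example, not part of the application.

```python
def apply_in_arrival_order(arrivals):
    """Hypothetical receiver: store each arriving feature under its
    object identifier, overwriting whatever is already there."""
    store = {}
    for object_id, generated_at, payload in arrivals:
        store[object_id] = (generated_at, payload)
    return store


# Feature 1 (older) and feature 2 (newer) belong to the same object.
feature_1 = ("obj-1", 1, "feature 1")
feature_2 = ("obj-1", 2, "feature 2")

# The channel carrying feature 2 happens to be faster, so it arrives first.
store = apply_in_arrival_order([feature_2, feature_1])

# The receiver ends up holding the stale feature 1.
print(store["obj-1"])
```

Because the receiver has no ordering information of its own, the outcome depends only on arrival order, which parallel channels do not guarantee.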
Disclosure of Invention
In view of the above, an object of the present application is to provide a feature distribution method, an apparatus, an electronic device and a computer-readable storage medium, in which before distributing image features, the image features are subjected to de-duplication processing to avoid errors caused by disorder of processing orders of the image features.
The application is realized as follows:
In a first aspect, an embodiment of the present application provides a feature distribution method, comprising: acquiring, with a period T, N image features and the attribute information corresponding to each image feature from an image feature library to be distributed, wherein the attribute information comprises an object identifier indicating the object to which the image feature belongs; de-duplicating the N image features according to the attribute information, retaining the latest image feature corresponding to each object identifier, to obtain de-duplicated image features; storing the de-duplicated image features in a bucket to be distributed located in memory; and taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution.
In the above embodiment, each batch of image features is de-duplicated before distribution and only the latest image feature of each object is retained. After the de-duplicated image features are distributed in parallel through the distribution channels to the processing modules, the processing results obtained in that period therefore contain at most one result per object identifier. Within the period there is thus no problem of image features being processed out of order, and the processing results cannot overwrite one another, which avoids the errors such overwriting would cause.
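One cycle of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: features are dictionaries with hypothetical keys `object_id`, `order` and `payload`, a larger `order` is assumed to mean "later", the receiving end is a dictionary, and a `ThreadPoolExecutor` stands in for the distribution channels.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_cycle(batch, receiver, channels=4):
    """Process one batch: de-duplicate, fill the bucket, distribute in parallel."""
    # De-duplicate: sorting by order means the latest feature of each
    # object is the last one written into the dictionary.
    latest = {f["object_id"]: f for f in sorted(batch, key=lambda f: f["order"])}

    # The in-memory bucket to be distributed.
    bucket = list(latest.values())

    # Hand the bucket to several channels; the lock only protects the
    # toy dictionary receiver used in this sketch.
    lock = threading.Lock()

    def send(feature):
        with lock:
            receiver[feature["object_id"]] = feature["payload"]

    with ThreadPoolExecutor(max_workers=channels) as pool:
        list(pool.map(send, bucket))
    return bucket


batch = [
    {"object_id": "a", "order": 1, "payload": "a-old"},
    {"object_id": "b", "order": 1, "payload": "b-only"},
    {"object_id": "a", "order": 2, "payload": "a-new"},
]
receiver = {}
run_cycle(batch, receiver)
print(receiver)
```

Since the bucket holds at most one feature per object identifier, the order in which the channels finish does not affect what the receiver ends up storing for this batch.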
With reference to the embodiment of the first aspect, in one possible implementation manner, the method further includes: and determining the size of N and/or the size of the bucket to be distributed according to at least one of the speed of newly adding image features in the image feature library to be distributed, the repetition rate of image features in the image feature library to be distributed, the parallel distribution speed of the de-duplicated image features and the size of an allocable memory space.
With reference to the embodiment of the first aspect, in a possible implementation manner, determining the size of N and/or the size of the bucket to be distributed according to at least one of the speed at which image features are newly added to the image feature library to be distributed, the repetition rate of image features in the image feature library to be distributed, the speed at which the de-duplicated image features are distributed in parallel, and the size of the allocable memory space includes: determining the size of the bucket to be distributed according to the minimum of (i) the number of image features to be distributed that are newly added to the image feature library to be distributed per unit time, (ii) the maximum number of image features that can be distributed per unit time × a distribution factor, and (iii) the size of the allocable memory space; wherein the number of image features to be distributed newly added to the library per unit time = the number of image features newly added to the library per unit time × (1 - the repetition rate of image features in the library).
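The sizing rule above can be worked through numerically. The sketch below assumes that the allocable memory space has already been converted into a number of features that fit in it (for example, available bytes divided by the size of one feature); that conversion, the function names and the sample figures are all assumptions made for illustration.

```python
def pending_features_per_unit(new_features_per_unit, repetition_rate):
    """Newly added features per unit time that still need distributing
    once repeated features are discounted."""
    return new_features_per_unit * (1.0 - repetition_rate)


def bucket_size(new_features_per_unit, repetition_rate,
                max_distributable_per_unit, distribution_factor,
                allocable_feature_slots):
    """Take the minimum of the three quantities named in the text."""
    candidates = (
        pending_features_per_unit(new_features_per_unit, repetition_rate),
        max_distributable_per_unit * distribution_factor,
        allocable_feature_slots,  # memory expressed as a feature count (assumption)
    )
    return int(min(candidates))


# 10,000 new features per unit time, 25% of them repeats; at most 6,000
# features distributable per unit time with a factor of 1.5; room for 50,000.
size = bucket_size(10_000, 0.25, 6_000, 1.5, 50_000)
print(size)  # min(7500, 9000, 50000) = 7500
```

In this example the inflow after de-duplication is the limiting term; with a higher inflow or less memory, one of the other two terms would bound the bucket instead.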
With reference to the embodiment of the first aspect, in a possible implementation manner, the method further includes any one of the following steps:
determining T according to N and the speed of parallel distribution of the de-duplicated image features;
determining the size of the bucket to be distributed according to the size of N;
and determining the size of N according to the size of the bucket to be distributed.
With reference to the embodiment of the first aspect, in a possible implementation manner, the acquiring, at a certain period T, N image features and attribute information corresponding to each of the image features from an image feature library to be distributed includes:
determining the image feature pointed to by a mark position of the image feature library to be distributed, wherein the mark position points to the last image feature acquired in the previous period; and, starting from the image feature following the one pointed to by the mark position, acquiring N not-yet-acquired image features from the image feature library to be distributed;
correspondingly, after acquiring, with a period T, N image features and the attribute information corresponding to each image feature from the image feature library to be distributed, the method further includes: updating the mark position to point to the last image feature acquired in the current period.
In this embodiment, setting the mark position prevents the same image feature from being acquired repeatedly when image features to be distributed are acquired in the current period.
With reference to the embodiment of the first aspect, in a possible implementation manner, the attribute information further includes distribution order information characterizing the distribution order of the corresponding image feature, and de-duplicating the N image features according to the attribute information includes: among a plurality of image features having the same object identifier, deleting the image features whose distribution order information is not the latest; wherein the distribution order information is a number configured by a user or the generation time of the corresponding image feature.
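De-duplication by distribution order information can be sketched as a single pass that remembers the best candidate seen so far for each object identifier. The `ImageFeature` class and its field names are hypothetical, and the sketch assumes that a larger `order` value (a user-configured number or a generation timestamp) means a later feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageFeature:
    object_id: str   # object identifier from the attribute information
    order: float     # distribution order information (assumed: larger is later)
    vector: tuple    # the feature data itself


def deduplicate(features):
    """Keep, for each object identifier, only the feature with the latest order."""
    latest = {}
    for feature in features:
        kept = latest.get(feature.object_id)
        if kept is None or feature.order > kept.order:
            latest[feature.object_id] = feature
    return list(latest.values())


batch = [
    ImageFeature("face-7", 100.0, (0.1, 0.2)),
    ImageFeature("car-3", 101.0, (0.5, 0.5)),
    ImageFeature("face-7", 102.0, (0.3, 0.4)),
]
deduped = deduplicate(batch)
print([(f.object_id, f.order) for f in deduped])
```

The pass is linear in the batch size and does not require the batch to be sorted, which suits batches read from the library in roughly, but not strictly, chronological order.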
With reference to the embodiment of the first aspect, in a possible implementation manner, taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution includes: comparing the image features stored at the receiving end with the de-duplicated image features; and, if the receiving end already stores an existing image feature with the same object identifier as a de-duplicated image feature, writing the de-duplicated image feature to the receiving end and deleting the existing image feature from the receiving end.
In the above embodiment, de-duplication is performed again when image features are written to the receiving end, giving a double de-duplication effect and further avoiding errors caused by the processing result of an old image feature overwriting that of a new image feature.
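The write step at the receiving end can be sketched as below. The receiving end is modelled as a list of dictionaries so that "delete the existing feature" is an explicit operation; the function name and record layout are assumptions for this illustration only.

```python
def write_to_receiver(stored, feature):
    """Write a de-duplicated feature and drop any stored feature that
    has the same object identifier."""
    survivors = [f for f in stored if f["object_id"] != feature["object_id"]]
    stored[:] = survivors + [feature]  # update the receiver's list in place
    return stored


stored = [
    {"object_id": "a", "payload": "a-old"},
    {"object_id": "b", "payload": "b-only"},
]
write_to_receiver(stored, {"object_id": "a", "payload": "a-new"})
print(stored)
```

Features from different batches can still refer to the same object, so this second check is what keeps the receiver at one feature per object across periods.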
With reference to the embodiment of the first aspect, in a possible implementation manner, before taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution, the method further includes: backing up the de-duplicated image features to a persistent bucket located in a persistent storage space to form backup image features;
correspondingly, the method further includes: when a power-off restart or an abnormal restart is determined to have occurred, acquiring the backup image features from the persistent bucket and allocating the backup image features to the plurality of distribution channels for parallel distribution.
In this embodiment, after an abnormal restart or a power-off restart, the backup image features are preferentially acquired from the persistent bucket located in the persistent storage space for parallel distribution, which avoids losing image features because the contents of memory do not survive a restart.
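A persistent bucket can be approximated with a file on disk. The sketch below is one possible arrangement, assuming features are JSON-serializable dictionaries; the file name, the JSON format and the write-then-rename step are choices made for this example and are not specified by the application.

```python
import json
import os
import tempfile


def backup_bucket(bucket, path):
    """Write the bucket to persistent storage before distribution starts."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(bucket, handle)
        handle.flush()
        os.fsync(handle.fileno())   # push the data to disk
    os.replace(tmp_path, path)      # swap the finished file into place


def load_backup(path):
    """After a restart, read back whatever was backed up (empty if nothing)."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "persistent_bucket.json")
    bucket = [{"object_id": "a", "payload": "a-new"}]
    backup_bucket(bucket, path)

    bucket = None                   # simulate losing the in-memory bucket
    recovered = load_backup(path)   # features to re-allocate to the channels

print(recovered)
```

Writing to a temporary file and renaming it means a crash during the backup leaves either the previous backup or the new one, never a half-written file.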
In a second aspect, an embodiment of the present application provides a feature distribution apparatus, comprising an obtaining module, a deduplication module, a saving module and an allocation module.
The obtaining module is used for acquiring, with a period T, N image features and the attribute information corresponding to each image feature from an image feature library to be distributed, wherein the attribute information comprises an object identifier indicating the object to which the image feature belongs;
the deduplication module is used for de-duplicating the N image features according to the attribute information and retaining the latest image feature corresponding to each object identifier, to obtain de-duplicated image features;
the saving module is used for storing the de-duplicated image features in a bucket to be distributed located in memory;
and the allocation module is used for taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution.
With reference to the second aspect, in a possible implementation manner, the apparatus further includes a determining module, configured to determine a size of N and/or a size of the bucket to be distributed according to at least one of a speed of adding an image feature in the image feature library to be distributed, a repetition rate of image features in the image feature library to be distributed, a speed of distributing the deduplicated image features in parallel, and a size of an allocable memory space.
With reference to the second aspect, in a possible implementation manner, the determining module is configured to determine the size of the bucket to be distributed according to the minimum of (i) the number of image features to be distributed that are newly added to the image feature library to be distributed per unit time, (ii) the maximum number of image features that can be distributed per unit time × a distribution factor, and (iii) the size of the allocable memory space; wherein the number of image features to be distributed newly added to the library per unit time = the number of image features newly added to the library per unit time × (1 - the repetition rate of image features in the library).
With reference to the embodiment of the second aspect, in a possible implementation manner, the determining module is further configured to perform any one of the following:
determining T according to N and the speed of parallel distribution of the de-duplicated image features;
determining the size of the bucket to be distributed according to the size of N;
and determining the size of N according to the size of the bucket to be distributed.
With reference to the embodiment of the second aspect, in a possible implementation manner, the obtaining module is configured to determine an image feature pointed by a mark position of an image feature library to be distributed, where the mark position is used to point to a last image feature obtained in a previous cycle; starting from the next image feature of the image features pointed by the mark positions, acquiring N image features which are not acquired from the image feature library to be distributed;
correspondingly, the device further comprises an updating module for updating the mark position to point to the last image feature acquired in the current period.
With reference to the embodiment of the second aspect, in a possible implementation manner, the attribute information further includes distribution order information characterizing the distribution order of the corresponding image feature, and the deduplication module is configured to delete, from among a plurality of image features having the same object identifier, the image features whose distribution order information is not the latest;
wherein the distribution order information is a number configured by a user or a generation time of the corresponding image feature.
With reference to the second aspect, in a possible implementation manner, the allocation module is configured to compare the image features stored at the receiving end with the de-duplicated image features;
and, if the receiving end already stores an existing image feature with the same object identifier as a de-duplicated image feature, to write the de-duplicated image feature to the receiving end and delete the existing image feature from the receiving end.
With reference to the second aspect, in a possible implementation manner, the saving module is further configured to back up the de-duplicated image features to a persistent bucket located in a persistent storage space, so as to form backup image features;
correspondingly, the allocation module is further configured to, when a power-off restart or an abnormal restart is determined to have occurred, acquire the backup image features from the persistent bucket and allocate the backup image features to the plurality of distribution channels for parallel distribution.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a memory and a processor, the memory and the processor connected; the memory is used for storing programs; the processor calls a program stored in the memory to perform the method of the first aspect embodiment and/or any possible implementation manner of the first aspect embodiment.
In a fourth aspect, the present application further provides a non-transitory computer-readable storage medium (hereinafter, referred to as a computer-readable storage medium), on which a computer program is stored, where the computer program is executed by a computer to perform the method in the foregoing first aspect and/or any possible implementation manner of the first aspect.
Additional features and advantages of the application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the embodiments of the application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and drawings.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort. The foregoing and other objects, features and advantages of the application will be apparent from the accompanying drawings. Like reference numerals refer to like parts throughout the drawings. The drawings are not necessarily drawn to scale; emphasis is instead placed on illustrating the subject matter of the present application.
Fig. 1 shows a flowchart of a feature distribution method provided in an embodiment of the present application.
Fig. 2 shows a schematic diagram of a change of a mark position provided in an embodiment of the present application.
Fig. 3 shows a block diagram of a feature distribution apparatus according to an embodiment of the present application.
Fig. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
100-an electronic device; 110-a processor; 120-a memory; 400-feature distribution means; 410-an obtaining module; 420-a deduplication module; 430-a saving module; 440-allocation module.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, the term "comprises," "comprising," or any other variation thereof is intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In addition, the defects (which cause disorder of the processing order of the image features) existing in the feature distribution method in the prior art are the results obtained after the applicant has practiced and studied carefully, and therefore, the discovery process of the above defects and the solutions proposed by the embodiments of the present application in the following for the above defects should be considered as contributions of the applicant to the present application.
In order to solve the above problem, embodiments of the present application provide a feature distribution method, an apparatus, an electronic device, and a computer-readable storage medium, where before distributing image features, duplicate removal processing is performed on the image features, so as to implement order preservation of the image features, and further avoid problems occurring in subsequent processing flows.
The technology can be realized by adopting corresponding software, hardware and a combination of software and hardware. The following describes embodiments of the present application in detail.
The following description will be made with respect to a feature distribution method provided by the present application.
Referring to fig. 1, an embodiment of the present application provides a feature distribution method, which may include the following steps.
Step S110: N image features and the attribute information corresponding to each image feature are acquired, with a period T, from an image feature library to be distributed, wherein the attribute information comprises an object identifier indicating the object to which the image feature belongs.
In the embodiment of the application, after the feature extraction end performs feature extraction on an object included in one image, one image feature can be obtained.
The image features may be image features of objects included in the image, and the objects may be human faces, human bodies, motor vehicles, target animals and plants, target objects, and the like.
Optionally, after the feature extraction end obtains image features, it may continuously store them in an image feature library to be distributed (generally located in a persistent storage space, such as a magnetic disk), so that, while storage is in progress or after it has finished, the feature distribution method in the embodiment of the present application can acquire image features from the image feature library to be distributed and distribute them.
In one example, the image features in the image feature library to be distributed are stored substantially in the order of image feature generation, and the image features generated first are written into the bottom of the image feature library to be distributed first.
Of course, it is worth pointing out that the feature extraction end and the execution end that executes the feature distribution method provided in the embodiment of the present application may be different electronic devices, or may be different processes belonging to the same electronic device.
In some embodiments of the present application, when feature distribution is required, image features may be acquired from the image feature library to be distributed in batches. On one hand, in some cases the image features in the library are distributed while they are still being generated, and waiting until all image features have been generated before distributing them together would make feature distribution lag; on the other hand, when the number of image features to be distributed is large, it is difficult to have enough memory to hold all of them. Therefore, image features are acquired in batches.
One batch of image features (N image features) is acquired in each period T.
Specifically, when acquiring image features, in some embodiments, at each cycle, N image features may be randomly acquired directly from the image feature library to be distributed.
In this embodiment, after acquiring N image features from the to-be-distributed image feature library, the N image features need to be deleted from the to-be-distributed image feature library, so as to avoid repeated subsequent acquisition of the same image feature.
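Random acquisition followed by deletion can be sketched as follows. The library is modelled as an in-memory list purely for illustration, and the function name and the injected random generator are assumptions of this example.

```python
import random


def take_random_batch(library, n, rng):
    """Pick up to n features at random and delete them from the library
    so that a later cycle cannot acquire them again."""
    count = min(n, len(library))
    picked = rng.sample(range(len(library)), count)
    batch = [library[i] for i in picked]
    # Delete from the highest index down so earlier deletions do not
    # shift the positions that are still waiting to be deleted.
    for i in sorted(picked, reverse=True):
        del library[i]
    return batch


library = list(range(10))          # ten placeholder features
batch = take_random_batch(library, 4, random.Random(0))
print(len(batch), len(library))
```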
In addition, in some embodiments, there may be a need to log the total number of the image features in the image feature library to be distributed and the total number of the image features extracted by the feature extraction terminal, and in this case, the image features in the image feature library to be distributed cannot be deleted immediately, but a validity period (for example, 1 year) may be set for each image feature, and when the image features expire, the image features are deleted from the image feature library to be distributed.
In this embodiment, the image feature library to be distributed may be regarded as a queue, the image features extracted by the feature extraction end sequentially enter the queue from the top of the queue for buffering, and when the image features need to be acquired from the image feature library to be distributed, the image features are sequentially acquired from the bottom of the queue.
In addition, a mark position is set in the image feature library to be distributed, and the mark position is used for pointing to the last image feature acquired in the last period in the image feature library to be distributed. It is worth noting that the marker position points to the bottom of the library of image features to be distributed when in the initialization phase, i.e. before the first acquisition of an image feature from the library of image features to be distributed. Therefore, in the embodiment of the present application, the acquisition progress of the image features in the image feature library to be distributed can be marked by the marking position.
Subsequently, when a plurality of image features to be distributed need to be acquired from the image feature library to be distributed, in order to avoid repeatedly acquiring the same image feature that has been acquired in the previous cycle in the subsequent cycle, when image acquisition is performed in each cycle, which image features to acquire can be determined according to the current mark position in the image feature library to be distributed.
Specifically, when acquiring N image features to be distributed from the image feature library to be distributed, the image feature pointed to by the mark position of the image feature library to be distributed may be determined first. Then, starting from the image feature immediately following it, image features that have not yet been acquired are acquired from the image feature library to be distributed one at a time in sequence, N times in total, to obtain N image features. These N image features are the N image features to be distributed.
Of course, after the N image features are acquired, the mark position in the image feature library to be distributed needs to be updated so that it points to the last image feature acquired in the current period, that is, the image feature acquired in the N-th acquisition among the N image features acquired in the current period.
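The mark-position mechanism described above can be sketched as follows. This is only an illustrative sketch: the library is modeled as a Python list whose index 0 is the bottom of the queue, the mark position is modeled as an integer index with -1 standing for the initial state, and the names `acquire_batch`, `library` and `mark` are hypothetical and do not come from the application.

```python
from typing import List, Tuple


def acquire_batch(library: List[str], mark: int, n: int) -> Tuple[List[str], int]:
    """Return up to n features located after the mark, plus the updated mark.

    `mark` is the index of the last feature acquired in the previous period;
    -1 (an assumption of this sketch) means nothing has been acquired yet.
    """
    start = mark + 1
    batch = library[start:start + n]
    # The mark now points to the last feature acquired in this period.
    new_mark = mark + len(batch)
    return batch, new_mark


library = ["f0", "f1", "f2", "f3", "f4"]
mark = -1
first, mark = acquire_batch(library, mark, 2)   # first period
second, mark = acquire_batch(library, mark, 2)  # second period
```

Because each period starts right after the mark, a feature acquired in an earlier period is never acquired again, and nothing has to be deleted from the library.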
Taking fig. 2 as an example, it is assumed that there are a large number (typically in the order of one hundred thousand) of image features in the image feature library to be distributed, and in the initial state, the mark position (indicated by an arrow in fig. 2) of the image feature library to be distributed points to the bottom of the image feature library to be distributed.
Of course, it is worth pointing out that, regardless of the manner in which the image features are acquired from the image feature library to be distributed, when the image features are acquired, the attribute information corresponding to each image feature may be acquired.
In the attribute information, an object identification, such as an ID, of the image feature is included.
It is worth pointing out that, in general, an image feature refers to a feature of an object contained in an image, an object identifier of the image feature is used to indicate an object to which the image feature belongs, and if two image features belong to the same object, the object identifiers are the same.
Step S120: performing deduplication processing on the N image features according to the attribute information, and retaining the latest image feature corresponding to each object identifier, to obtain de-duplicated image features.
In the embodiment of the application, the image features stored in the image feature library to be distributed may be duplicated, that is, there is a case where a plurality of image features have the same object identifier.
Taking a face as the object and a portrait feature extracted from a certificate as the image feature, for example, the same person typically obtains a first certificate at the age of 18, a new certificate at the age of 25, and a third certificate at the age of 50. Therefore, when feature extraction is performed on the three certificates of the same person, three image features with the same object identifier are obtained and stored in the image feature library to be distributed. If the person is now 52 years old, the face contained in the certificate issued at age 50 is closer to the current face than the faces contained in the certificates issued at ages 18 and 25. Therefore, in subsequent processing, the image feature corresponding to the certificate issued at age 50 needs to be taken as the processing object; otherwise the subsequent processing flow is prone to error (for example, matching the current face against the face contained in the certificate issued at age 18 is more error-prone).
Furthermore, in some cases, multiple image features with the same object identifier may be caused by information modification. For example, when feature extraction is performed for Zhang San, the attribute information of the obtained image feature has the name Zhang San and the certificate number A. An error is subsequently found in the certificate number, so a new image feature is generated and stored in the image feature library to be distributed; the attribute information of the new image feature has the name Zhang San and the certificate number B, and the object identifier of the new image feature is the same as that of the old image feature. In this example, since the information of the old image feature is wrong, the new image feature needs to be taken as the processing object in subsequent processing; otherwise the wrong information may overwrite the correct information and cause errors in the subsequent processing flow.
Furthermore, in some cases, multiple image features with the same object identification may be caused by data transmission repetition. For example, when the feature extraction side saves the image features to the image feature library to be distributed, after the image features No. 1-10 are transmitted, the transmission is interrupted. When the transmission is resumed, the No. 8-20 image characteristics are transmitted, so that the No. 8-10 image characteristics are repeated. In this case, duplicate image features need to be deleted, otherwise distribution resources are easily wasted. Of course, in general, the image features 8-10 of the image features 1-10 transmitted first need to be deleted, and the image features 8-10 of the image features 8-20 transmitted later need to be retained.
As can be seen from the above example, in the embodiment of the present application, when a plurality of image features having the same object identifier exist at the same time, the latest image feature of the plurality of image features having the same object identifier needs to be used as a processing object, so that the problem caused by disorder of the processing order when the image features are processed and the problem of wasting distribution resources are avoided.
Specifically, after the plurality of image features are obtained, the plurality of image features may be subjected to deduplication processing according to attribute information of each image feature, so that the latest image features corresponding to each object identifier are retained, and the deduplication image features are obtained.
The process of performing the deduplication process may be as follows:
in some embodiments, distribution order information of the image feature corresponding to the attribute information may be further included in the attribute information of the image feature.
The distribution order information may be the time at which the feature extraction end performed feature extraction on the image, that is, the generation time of the image feature; it may be a number configured by a user; or it may be the generation time of the original image corresponding to the image feature.
The distribution order information of an image feature characterizes how new or old the image feature corresponding to the attribute information is, and can be used as the criterion for determining which image features are retained and which are deleted during deduplication. For example, the earlier the generation time of the image feature, the smaller the number, or the earlier the generation time of the original image, the older the image feature.
In a specific embodiment, the image features may be sorted according to their distribution order information. After the sorting is completed, some object identifiers may have a plurality of corresponding image features. In such cases, among the image features with the same object identifier, all image features other than the one that is last in distribution order are deleted, thereby achieving deduplication.
That is, for a set composed of a plurality of image features having the same object identifier, only the image feature in the set that is latest in distribution order is retained, and that image feature is determined as the latest image feature corresponding to the object identifier.
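The in-batch deduplication can be illustrated with the sketch below. It is not the sorting-based procedure of the embodiment: for brevity it makes a single pass with a dictionary keyed by object identifier, which retains the same features. The `ImageFeature` class, its field names, and the use of an integer as distribution order information are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class ImageFeature:
    object_id: str  # object identifier from the attribute information
    order: int      # distribution order information (larger means newer, by assumption)
    payload: str    # stand-in for the actual feature vector


def deduplicate(batch: Iterable[ImageFeature]) -> List[ImageFeature]:
    """Keep, for each object identifier, only the feature latest in distribution order."""
    latest: Dict[str, ImageFeature] = {}
    for feature in batch:
        kept = latest.get(feature.object_id)
        # ">=" lets the later-arriving copy win when order values are equal,
        # which matches the retransmission example (keep the later transmission).
        if kept is None or feature.order >= kept.order:
            latest[feature.object_id] = feature
    return list(latest.values())


batch = [
    ImageFeature("A", 18, "cert18"),
    ImageFeature("B", 1, "b"),
    ImageFeature("A", 50, "cert50"),
    ImageFeature("A", 25, "cert25"),
]
result = deduplicate(batch)
```

In this usage example, object A appears three times in the batch and only the feature with order 50 survives, mirroring the certificate example given earlier.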
Step S130: storing the de-duplicated image features in a bucket to be distributed in a memory.
In some embodiments, after the de-duplicated image features are obtained and before they are distributed in parallel, dynamic memory may be requested from the memory, according to the number of de-duplicated image features currently obtained, to form the bucket to be distributed, and the obtained de-duplicated image features are stored in the bucket to be distributed. Subsequently, the de-duplicated image features are taken out from the bucket to be distributed for distribution until all of the current de-duplicated image features have been distributed, after which the dynamic memory is released and the process is repeated to execute the deduplication and distribution tasks of the next period. This process is therefore serial.
However, since obtaining the image features and deduplicating them both take a certain amount of time, the foregoing manner may cause the distribution channels to wait for the de-duplicated image features, that is, to be in an idle state, so that resource utilization is not maximized and distribution efficiency is affected. In addition, because the dynamic memory is repeatedly created and released, memory utilization is low.
In some embodiments, in order to ensure distribution efficiency, a fixed-size storage area may be requested from the memory in advance to form the bucket to be distributed.
In this embodiment, after the de-duplicated image features are obtained, they may be stored in the bucket to be distributed in the memory. When image feature distribution is needed subsequently, the de-duplicated image features are taken out from the bucket to be distributed and allocated to a plurality of distribution channels for parallel distribution.
In this way, the process of acquiring image features from the image feature library to be distributed and deduplicating them, and the process of distributing the de-duplicated image features in parallel, are two independent processes. The distribution channels are therefore not left idle waiting for de-duplicated image features, and distribution efficiency can be improved.
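The decoupling of the two processes through a fixed-size bucket can be sketched with a bounded queue. This is a minimal sketch under several assumptions: the bucket is modeled with the standard-library `queue.Queue`, a single distribution thread is used to keep the output order deterministic, and the capacity of 4, the `STOP` sentinel and all function names are hypothetical.

```python
import queue
import threading
from typing import List, Optional

BUCKET_CAPACITY = 4  # fixed size chosen in advance (illustrative value)
bucket: "queue.Queue[Optional[str]]" = queue.Queue(maxsize=BUCKET_CAPACITY)
distributed: List[str] = []
STOP = None  # sentinel telling the distribution side that no more data will come


def dedup_stage(batches: List[List[str]]) -> None:
    """Stand-in for acquisition plus deduplication: writes features into the bucket."""
    for batch in batches:
        for feature in batch:
            bucket.put(feature)  # blocks while the bucket is full
    bucket.put(STOP)


def distribution_stage() -> None:
    """Stand-in for a distribution channel: takes features out of the bucket."""
    while True:
        feature = bucket.get()  # blocks while the bucket is empty
        if feature is STOP:
            break
        distributed.append(feature)


producer = threading.Thread(target=dedup_stage, args=([["a", "b", "c"], ["d", "e", "f"]],))
consumer = threading.Thread(target=distribution_stage)
producer.start()
consumer.start()
producer.join()
consumer.join()
```

The bucket is allocated once and reused across periods, so no memory is created and released per batch, and the distribution side can keep working on the previous batch while the next one is being deduplicated.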
Step S140: taking out the de-duplicated image features from the bucket to be distributed, and allocating the de-duplicated image features to a plurality of distribution channels for parallel distribution.
After the de-duplicated image features are obtained, they can be taken out from the bucket to be distributed and allocated to a plurality of distribution channels (that is, distribution threads) working in parallel, thereby achieving rapid distribution.
In some embodiments, each distribution channel has a corresponding status flag that marks how busy the channel is. In this embodiment, the de-duplicated image features can be allocated, according to the busy degree of each distribution channel, to the less busy distribution channels, thereby maximizing resource utilization.
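One possible way to allocate features according to channel busyness is sketched below. The application does not specify how the status flag is encoded; here it is assumed, purely for illustration, to be the number of features already pending on each channel, and a min-heap is used to pick the least busy channel each time. The function and variable names are hypothetical.

```python
import heapq
from typing import Dict, List


def allocate(features: List[str], pending: Dict[str, int]) -> Dict[str, List[str]]:
    """Assign each feature to the channel with the fewest pending features."""
    heap = [(count, name) for name, count in pending.items()]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {name: [] for name in pending}
    for feature in features:
        count, name = heapq.heappop(heap)        # least busy channel right now
        assignment[name].append(feature)
        heapq.heappush(heap, (count + 1, name))  # the channel is now one feature busier
    return assignment


# ch0 is already busy, ch1 is idle, ch2 has one pending feature.
assignment = allocate(["a", "b", "c", "d"], {"ch0": 3, "ch1": 0, "ch2": 1})
```

In this example the busy channel ch0 receives nothing until the other channels have caught up with its load.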
According to the embodiment of the invention, before a batch of image features is distributed, the image features are subjected to deduplication processing and stored in the bucket to be distributed in the memory, and are then taken out from the bucket to be distributed for parallel distribution. In addition, when the image features in the image feature library to be distributed are stored approximately in the order in which they were generated, the generation time of the image features distributed in an earlier batch is very likely earlier than that of the image features distributed in a later batch. Therefore, when the receiving end determines that it already stores an image feature of the same object as the current image feature (which must have been distributed in a batch before the current batch), the stored image feature can be overwritten directly, without further comparing the generation times of the two image features. That is to say, the method provided by the embodiment of the invention performs in-batch deduplication through the bucket to be distributed and inter-batch deduplication through batch distribution, so that the accuracy and efficiency of image feature distribution can be improved when the number of image features to be distributed is large and there are many repeated image features.
In some embodiments, the size of N and/or the size of the capacity of the bucket to be distributed may be determined according to at least one of a speed of adding image features in the library of image features to be distributed, a repetition rate of image features in the library of image features to be distributed, a speed of parallel distribution of the de-duplicated image features, and a size of allocable memory space in a device for saving the de-duplicated image features.
The size of the bucket to be distributed is matched to the size of N. Once either the size of the bucket to be distributed or N has been determined, the other can be determined from it. If the bucket to be distributed is too large relative to N, the bucket is never full and memory space is wasted; if the bucket to be distributed is too small relative to N, the bucket cannot accommodate the de-duplicated image features of one batch, and a memory overflow may result.
In one embodiment, the size of the bucket to be distributed may be determined first. Specifically, the capacity of the bucket to be distributed is determined as the minimum of the following three values: the number of image features to be distributed that are newly added to the image feature library to be distributed per unit time; the maximum number of image features distributable per unit time multiplied by a distribution factor; and the size of the allocable memory space.
It can be understood that the capacity of the bucket to be distributed is mainly limited by the maximum number of image features distributable per unit time and the size of the allocable memory space: if image features are written into the bucket to be distributed faster than they are distributed, more and more image features accumulate in the bucket. Of course, even if the maximum number of image features distributable per unit time is large, a bucket to be distributed with a large capacity cannot be created in the memory when the allocable memory space is small. And even if both the maximum number of image features distributable per unit time and the allocable memory space are large, a large bucket to be distributed is unnecessary if the number of image features to be distributed that are newly added to the image feature library per unit time is small, that is, if there are not many image features to distribute.
The maximum distributable image feature number per unit time is the sum of the distributable maximum image feature numbers of the current distribution channels per unit time. The distribution factor is (x + y), x is an empirical value, y depends on the performance of the device, and the smaller the value the better the performance of the device. Generally, x is 4, and y is not greater than 1.
The number of image features to be distributed that are newly added to the image feature library to be distributed per unit time = the number of image features newly added to the image feature library to be distributed per unit time × (1 − the repetition rate of the image features in the image feature library to be distributed).
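The capacity rule described above can be worked through numerically as follows. This is a sketch under stated assumptions: the allocable memory space is expressed as the number of image features it can hold (the application does not state the unit), the result is truncated to an integer, and all parameter names and sample values are illustrative.

```python
from typing import Sequence


def bucket_capacity(
    added_per_unit_time: float,          # features newly added to the library per unit time
    repetition_rate: float,              # estimated repetition rate, between 0 and 1
    channel_max_rates: Sequence[float],  # max features each channel can distribute per unit time
    x: float,                            # empirical part of the distribution factor
    y: float,                            # device-dependent part of the distribution factor
    memory_slots: int,                   # allocable memory, counted in features (assumption)
) -> int:
    """Capacity = minimum of the three limits named in the description."""
    new_to_distribute = added_per_unit_time * (1 - repetition_rate)
    max_distributable = sum(channel_max_rates) * (x + y)
    return int(min(new_to_distribute, max_distributable, memory_slots))


# 10000 * (1 - 0.2) = 8000; (500 + 500 + 500) * (4 + 0.5) = 6750; memory allows 20000.
capacity = bucket_capacity(10000, 0.2, [500, 500, 500], 4, 0.5, 20000)
```

With these sample values the distribution capability is the binding limit; with a smaller memory budget, the memory term would determine the capacity instead.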
It will be appreciated that N is determined by the repetition rate of the image features in the library of image features to be distributed. If the repetition rate of the image features in the image feature library to be distributed is low, for example, 2 repeated image features appear in 1000 image features, and if the number of the image features taken out from each batch is 50, the probability of the 50 image features having the repeated image features is very low, and the effect of deduplication is difficult to achieve. And, although the number of image features taken out of the image feature library to be distributed per batch is N, the number of image features to be stored in the bucket to be distributed is not N, but is the number of N image features after deduplication. The repetition rate of the image features in the image feature library to be distributed can be pre-judged according to the service scenes, and can be adaptively adjusted according to different service scenes. For example, statistics can be performed on image features in the image feature library to be distributed over a period of time to estimate the repetition rate of the image features in the image feature library to be distributed.
In some embodiments, after the size of the bucket to be distributed is determined, the size of N (the number of image features that need to be acquired from the image feature library to be distributed in each period) may be determined according to the size of the bucket to be distributed, and then the distribution efficiency of the image features is optimized through the values of the two.
Specifically, the value of N is smaller than the size of the bucket to be distributed, so that the duplicate removal image features stored in the bucket to be distributed at each time can be guaranteed not to overflow, and the distribution thread is in a busy state.
In addition, the value of N also needs to be large enough (for example, half the size of the bucket to be distributed), so that before the N image features are acquired from the image feature library to be distributed in the current period for deduplication and the duplicate-removed image features obtained by deduplication are stored in the bucket to be distributed, the duplicate-removed image features obtained in the previous period and not distributed remain in the bucket to be distributed, and the occurrence of the situation that the distribution channel waits for the duplicate-removed image features is avoided.
In some embodiments, the size of N may be determined first, and then the size of the bucket to be distributed may be determined according to the size of N.
In this embodiment, the value of N may be determined according to at least one of the speed of adding image features in the image feature library to be distributed, the repetition rate of image features in the image feature library to be distributed, the speed of distributing the duplicate removal image features in parallel, and the size of the allocable memory space in the device for storing the duplicate removal image features.
Correspondingly, the size of the bucket to be distributed needs to be larger than the value of N; for example, the size of the bucket to be distributed is 2 times the value of N.
In addition, in some embodiments, after the value of N is determined, the value of the period T may be determined according to the value of N and the speed V at which the de-duplicated image features are distributed in parallel. For example, the period T is determined according to N = VT.
Therefore, the balance between the speed of writing the image features in the bucket to be distributed and the speed of taking out the image features can be ensured, and the situation that the image features are accumulated more and more or the distribution thread is idle is avoided.
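The relationship between the bucket size, N and T can be illustrated with a short calculation. The sketch uses the example ratios mentioned in the description (N equal to half the bucket size, and N = VT); the function names and sample numbers are hypothetical, and V is assumed to be measured in image features per second.

```python
def batch_size_for(bucket_size: int) -> int:
    """Example choice from the description: N is half the size of the bucket."""
    return bucket_size // 2


def period_for(n: int, v: float) -> float:
    """Solve N = V * T for the period T."""
    if v <= 0:
        raise ValueError("distribution speed V must be positive")
    return n / v


n = batch_size_for(10000)  # N = 5000 features per period
t = period_for(n, 2500.0)  # T = 5000 / 2500 = 2.0 time units
```

Choosing T this way means that one period's worth of distribution roughly drains what one acquisition adds, which is the balance the description aims for.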
The speed V of parallel distribution of the de-duplicated image features can be estimated and calculated by counting the working conditions of all distribution channels, and the size of V depends on the busyness degree of all the distribution channels.
According to the embodiment of the invention, before a batch of image features are distributed, the image features are subjected to deduplication processing and stored in the bucket to be distributed in the memory, and then the image features are taken out from the distribution bucket for parallel distribution. Therefore, for the situation that the number of the image features to be distributed is large and the number of the repeated image features is large, the method provided by the embodiment of the invention can improve the accuracy and efficiency of image feature distribution.
It should be noted that, if the de-duplicated image features obtained in the previous period include image feature 1 of object A, and the de-duplicated image features obtained in the current period include image feature 2 of object A, then image feature 2 of object A still overwrites image feature 1 of object A as usual when the image features are written to the receiving end. (When the image features in the image feature library to be distributed are stored in the order in which they were generated, image feature 2 is very likely newer than image feature 1; otherwise, the generation time of image feature 2 and the generation time of image feature 1 may be compared to determine whether to overwrite.)
In one implementation, the period T can be ensured to be sufficiently large by setting the value of N to be large and the value of V to be small. After the setting, as T is increased, the number of the image features acquired in T can be increased as much as possible, and at this time, if there are duplicate image features for the same object identifier, the duplicate image features will also be deduplicated and will not be distributed to the processing module of the receiving end for processing, so that the data transmission bandwidth can be saved.
In addition, for the receiving end, the probability that a processing result of the processing module at the receiving end is overwritten can be reduced; that is, the receiving end does not need to change its processing results frequently, which reduces the consumption of computing resources.
In addition, data stored in the memory is not persistent; that is, data stored in the memory is lost after the device is powered off or abnormally restarted. If a power-off restart or an abnormal restart occurs before all of the de-duplicated image features have been successfully distributed, the de-duplicated image features in the memory are lost, and some image features are never distributed.
To address the above issues, in some embodiments, before the de-duplicated image features are allocated to multiple distribution channels for parallel distribution, the de-duplicated image features may also be backed up to a persistent bucket located in a persistent storage space, thereby forming backup image features.
Because the data in the persistent storage space has persistence, when it is determined that a power-off restart or an abnormal restart exists, the backup image features can be preferentially acquired from the persistent bucket, and the backup image features are allocated to a plurality of distribution channels for parallel distribution.
In addition, in order to avoid the waste of distribution resources caused by repeated distribution of the same duplicate removal image feature, on the premise of backing up the duplicate removal image feature to the persistent bucket, after each duplicate removal image feature is successfully issued from the bucket to be distributed, the successfully issued duplicate removal image feature can be deleted from the persistent bucket, so that the situation that the successfully issued duplicate removal image feature is repeatedly received by the same distribution channel or different distribution channels after power failure restart or abnormal restart is avoided.
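The backup, delete-on-success and recovery behavior of the persistent bucket can be sketched as follows. The application does not say how the persistent bucket is implemented; this sketch assumes, for illustration only, a small SQLite database file, and the class name `PersistentBucket`, its methods and the table layout are all hypothetical.

```python
import os
import sqlite3
import tempfile
from typing import List, Tuple


class PersistentBucket:
    """Illustrative persistent bucket backed by a SQLite file."""

    def __init__(self, path: str) -> None:
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS backup (object_id TEXT PRIMARY KEY, payload TEXT)"
        )
        self.conn.commit()

    def backup(self, features: List[Tuple[str, str]]) -> None:
        """Back up de-duplicated features before they are handed to the channels."""
        self.conn.executemany("INSERT OR REPLACE INTO backup VALUES (?, ?)", features)
        self.conn.commit()

    def mark_sent(self, object_id: str) -> None:
        """Delete a feature once it has been successfully distributed."""
        self.conn.execute("DELETE FROM backup WHERE object_id = ?", (object_id,))
        self.conn.commit()

    def pending(self) -> List[Tuple[str, str]]:
        """Features that were backed up but not yet confirmed as distributed."""
        cursor = self.conn.execute("SELECT object_id, payload FROM backup ORDER BY object_id")
        return cursor.fetchall()

    def close(self) -> None:
        self.conn.close()


path = os.path.join(tempfile.mkdtemp(), "persistent_bucket.db")
persistent = PersistentBucket(path)
persistent.backup([("A", "fa"), ("B", "fb"), ("C", "fc")])
persistent.mark_sent("A")  # A was distributed successfully
persistent.close()         # simulate a power-off or abnormal restart

recovered = PersistentBucket(path).pending()  # after restart: only B and C remain
```

Because a feature is removed as soon as its distribution succeeds, the recovery step after a restart redistributes only the features that had not yet been sent.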
In addition, in some embodiments, after the de-duplicated image features are taken out of the bucket to be distributed, or after the backup image features (which are in essence also de-duplicated image features) are taken out of the persistent bucket, and are allocated to the plurality of distribution channels for parallel distribution, the image features stored by the receiving end may also be compared with the de-duplicated image features. If the receiving end already stores an existing image feature having the same object identifier as a de-duplicated image feature, the de-duplicated image feature is written into the receiving end, and the existing image feature of the receiving end is deleted. Of course, if the image features in the image feature library to be distributed are not stored in the order in which they were generated but are stored out of order, then after it is determined that the receiving end already stores an existing image feature having the same object identifier as the de-duplicated image feature, the distribution order information of the existing image feature and that of the de-duplicated image feature are further compared, and the de-duplicated image feature is written into the receiving end only if its distribution order information is newer than that of the existing image feature.
In this way, the effect of double deduplication can be achieved, and errors caused by the processing result of an old image feature overwriting the processing result of a new image feature can be avoided.
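The receiving-end check can be sketched as below. The receiving end is modeled, as an assumption, by an in-memory dictionary keyed by object identifier, and the `library_ordered` flag is a hypothetical switch indicating whether the image feature library to be distributed stores features in generation order.

```python
from typing import Dict, Tuple

Stored = Tuple[int, str]  # (distribution order information, payload)


def write_to_receiver(
    store: Dict[str, Stored],
    object_id: str,
    order: int,
    payload: str,
    library_ordered: bool,
) -> bool:
    """Write a de-duplicated feature to the receiving end; return True if it was written."""
    existing = store.get(object_id)
    if existing is not None and not library_ordered and order <= existing[0]:
        # Out-of-order library: the incoming feature is not newer, so keep the existing one.
        return False
    # Either nothing is stored yet, or the existing feature is replaced.
    store[object_id] = (order, payload)
    return True


store: Dict[str, Stored] = {}
wrote_new = write_to_receiver(store, "A", 2, "new", library_ordered=False)
wrote_old = write_to_receiver(store, "A", 1, "old", library_ordered=False)
```

When the library is stored in generation order, the comparison is skipped and the later batch simply replaces the stored feature, which corresponds to the inter-batch deduplication described earlier.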
Referring to fig. 3, an embodiment of the present application further provides a feature distribution apparatus 400, where the feature distribution apparatus 400 may include: an obtaining module 410, a deduplication module 420, a saving module 430, and an allocating module 440.
The obtaining module 410 is configured to obtain, at a certain period T, N image features and attribute information corresponding to each of the image features from an image feature library to be distributed, where the attribute information includes an object identifier used to indicate an object to which the image feature belongs;
the deduplication module 420 is configured to perform deduplication processing on the N image features according to the attribute information, and retain the latest image feature corresponding to each object identifier, to obtain de-duplicated image features;
the saving module 430 is configured to store the de-duplicated image features in a bucket to be distributed in a memory;
and the allocating module 440 is configured to take out the de-duplicated image features from the bucket to be distributed, and allocate the de-duplicated image features to a plurality of distribution channels for parallel distribution.
The device further comprises a determining module, configured to determine the size of N and/or the size of the bucket to be distributed according to at least one of a speed of adding an image feature in the image feature library to be distributed, a repetition rate of image features in the image feature library to be distributed, a speed of distributing the de-duplicated image features in parallel, and a size of an allocable memory space.
In a possible implementation manner, the determining module is configured to determine the size of the bucket to be distributed as the minimum of: the number of image features to be distributed that are newly added to the image feature library to be distributed per unit time, the maximum number of image features distributable per unit time × the distribution factor, and the size of the allocable memory space; where the number of image features to be distributed that are newly added per unit time = the number of image features newly added to the image feature library to be distributed per unit time × (1 − the repetition rate of the image features in the image feature library to be distributed).
In a possible implementation, the determining module is further configured to perform any one of the following:
determining T according to N and the speed of parallel distribution of the de-duplicated image features;
determining the size of the bucket to be distributed according to the size of N;
and determining the size of N according to the size of the bucket to be distributed.
In a possible implementation manner, the obtaining module 410 is configured to determine an image feature pointed to by a mark position of an image feature library to be distributed, where the mark position is used to point to a last image feature obtained in a previous cycle; starting from the next image feature of the image features pointed by the mark positions, acquiring N image features which are not acquired from the image feature library to be distributed;
correspondingly, the device further comprises an updating module for updating the mark position to point to the last image feature acquired in the current period.
In a possible implementation manner, the attribute information further includes distribution order information for characterizing image features corresponding to the attribute information, and the deduplication module 420 is configured to delete, from among a plurality of image features having the same object identifier, other image features whose distribution order information is not the latest;
wherein the distribution order information is a number configured by a user or a generation time of the corresponding image feature.
In a possible implementation, the allocating module 440 is configured to compare the image features stored at the receiving end with the de-duplicated image features;
if the receiving end has stored the existing image characteristics with the same object identification as the duplication-removing image characteristics, writing the duplication-removing image characteristics into the receiving end, and deleting the existing image characteristics of the receiving end.
In a possible implementation manner, the saving module 430 is further configured to backup the de-duplicated image features to a persistent bucket located in a persistent storage space to form backup image features;
correspondingly, the allocating module 440 is further configured to, when it is determined that there is a power-off restart or an abnormal restart, acquire the backup image features from the persistent bucket, and allocate the backup image features to the multiple distribution channels for parallel distribution.
The feature distribution apparatus 400 provided in the embodiment of the present application has the same implementation principle and the same technical effect as the foregoing method embodiments. For brevity, for any part not mentioned in the apparatus embodiment, reference may be made to the corresponding content in the foregoing method embodiments.
In addition, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a computer, the computer program performs the feature distribution method as described above.
In addition, referring to FIG. 4, an embodiment of the present application further provides an electronic device 100 for implementing the feature distribution method and apparatus of the embodiments of the present application.
Optionally, the electronic device 100 may be, but is not limited to, a personal computer (PC), a tablet computer, a mobile Internet device (MID), a personal digital assistant (PDA), a server, or the like.
The server may be, but is not limited to, a web server, a database server, a cloud server, and the like.
The electronic device 100 may include a processor 110 and a memory 120.
It should be noted that the components and structure of electronic device 100 shown in FIG. 4 are exemplary only, and not limiting, and electronic device 100 may have other components and structures as desired.
The processor 110, memory 120, and other components that may be present in the electronic device 100 are electrically connected to each other, directly or indirectly, to enable the transfer or interaction of data. For example, the processor 110, the memory 120, and other components that may be present may be electrically coupled to each other via one or more communication buses or signal lines.
The memory 120 is used for storing a program, for example, a program corresponding to the above-mentioned feature distribution method or the above-mentioned feature distribution apparatus. Optionally, when the memory 120 stores the feature distribution apparatus, the feature distribution apparatus includes at least one software functional module that can be stored in the memory 120 in the form of software or firmware.
Optionally, the software functional module included in the feature distribution apparatus may also be built into an operating system (OS) of the electronic device 100.
The processor 110 is configured to execute executable modules stored in the memory 120, such as the software functional modules or computer programs included in the feature distribution apparatus. Upon receiving an execution instruction, the processor 110 may execute the computer program, for example, to perform: acquiring, at a period T, N image features and attribute information corresponding to each image feature from an image feature library to be distributed, wherein the attribute information includes an object identifier indicating the object to which the image feature belongs; performing de-duplication processing on the N image features according to the attribute information, and retaining the latest image feature corresponding to each object identifier to obtain de-duplicated image features; storing the de-duplicated image features into a bucket to be distributed in a memory; and taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution.
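The four steps listed above can be sketched as a single period of work. The example below is a simplified illustration under several assumptions: the image feature library is a Python list, a feature is an `(object_id, order, payload)` tuple, the distribution channels are worker threads, and `send` is a hypothetical callback standing in for delivery to a receiving end.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical record: (object_id, order, payload).
Feature = Tuple[str, int, str]


def run_cycle(
    library: List[Feature],
    start: int,
    n: int,
    send: Callable[[Feature], None],
    channels: int = 4,
) -> int:
    """Run one period: fetch up to n features, de-duplicate, distribute.

    Returns the index at which the next period should start.
    """
    batch = library[start:start + n]

    # De-duplicate: keep the latest feature for each object identifier.
    bucket: Dict[str, Feature] = {}     # in-memory "bucket to be distributed"
    for feat in batch:
        kept = bucket.get(feat[0])
        if kept is None or feat[1] >= kept[1]:
            bucket[feat[0]] = feat

    # Take the features out of the bucket and hand them to parallel channels.
    with ThreadPoolExecutor(max_workers=channels) as pool:
        list(pool.map(send, bucket.values()))   # list() surfaces exceptions
    bucket.clear()
    return start + len(batch)
```

Since the bucket holds at most one feature per object identifier, the order in which the worker threads finish does not affect the final state at the receiving end for that period.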
Of course, the method disclosed in any of the embodiments of the present application can be applied to the processor 110, or implemented by the processor 110.
In summary, in the feature distribution method, the feature distribution apparatus, the electronic device, and the computer-readable storage medium according to the embodiments of the present application, de-duplication processing is performed on the image features before they are distributed, and only the latest image feature for each object is retained. Consequently, after the de-duplicated image features are distributed in parallel through the distribution channels to the processing modules, the processing results for the image features acquired in a given period contain no two results for the same object identifier. Within that period, therefore, the processing order of the image features cannot become disordered, and the processing results for the image features acquired in that period cannot overwrite one another, which avoids the errors that such overwriting would cause.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application in essence, or the part thereof that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a notebook computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
The above description covers only specific embodiments of the present application, but the scope of protection of the present application is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in the present application shall fall within the scope of protection of the present application.

Claims (11)

1. A method for feature distribution, the method comprising:
acquiring, at a period T, N image features and attribute information corresponding to each image feature from an image feature library to be distributed, wherein the attribute information comprises an object identifier indicating the object to which the image feature belongs;
performing de-duplication processing on the N image features according to the attribute information, and retaining the latest image feature corresponding to each object identifier to obtain de-duplicated image features;
storing the de-duplicated image features into a bucket to be distributed in a memory;
and taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution.
2. The method of claim 1, further comprising:
and determining the size of N and/or the size of the bucket to be distributed according to at least one of the speed of newly adding image features in the image feature library to be distributed, the repetition rate of image features in the image feature library to be distributed, the parallel distribution speed of the de-duplicated image features and the size of an allocable memory space.
3. The method according to claim 2, wherein determining the size of N and/or the size of the bucket to be distributed according to at least one of a speed of adding an image feature in the image feature library to be distributed, a repetition rate of an image feature in the image feature library to be distributed, a speed of parallel distribution of the de-duplicated image features, and a size of an allocable memory space comprises:
determining the size of the bucket to be distributed as the minimum of: the number of newly added image features to be distributed per unit time in the image feature library to be distributed; the maximum number of image features distributable per unit time multiplied by a distribution factor; and the size of the allocable memory space; wherein the number of newly added image features to be distributed per unit time in the image feature library to be distributed = the number of image features newly added to the image feature library to be distributed per unit time × (1 - the repetition rate of image features in the image feature library to be distributed).
4. A method according to claim 2 or 3, characterized in that the method further comprises any of the following steps:
determining T according to N and the speed of parallel distribution of the de-duplicated image features;
determining the size of the bucket to be distributed according to the size of N;
and determining the size of N according to the size of the bucket to be distributed.
5. The method according to any one of claims 1 to 4, wherein the obtaining of N image features and attribute information corresponding to each image feature from an image feature library to be distributed at a certain period T comprises:
determining an image feature pointed by a mark position of an image feature library to be distributed, wherein the mark position is used for pointing to the last image feature acquired in the previous period;
starting from the image feature immediately following the one pointed to by the mark position, acquiring N image features that have not yet been acquired from the image feature library to be distributed;
correspondingly, after acquiring N image features and attribute information corresponding to each image feature from an image feature library to be distributed at a certain period T, the method further includes:
and updating the mark position to point to the last image feature acquired in the current period.
6. The method according to any one of claims 1 to 5, wherein the attribute information further includes distribution order information for characterizing image features corresponding to the attribute information, and performing deduplication processing on the N image features according to the attribute information includes:
among a plurality of image features having the same object identifier, deleting the image features whose distribution order information is not the latest;
wherein the distribution order information is a number configured by a user or a generation time of the corresponding image feature.
7. The method of any of claims 1-6, wherein taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution comprises:
comparing the image features stored at the receiving end with the de-duplicated image features;
if the receiving end already stores an existing image feature having the same object identifier as a de-duplicated image feature, writing the de-duplicated image feature to the receiving end and deleting the existing image feature from the receiving end.
8. The method of any of claims 1-7, wherein before taking the de-duplicated image features out of the bucket to be distributed and allocating them to a plurality of distribution channels for parallel distribution, the method further comprises:
backing up the de-duplicated image features to a persistent bucket located in a persistent storage space to form backup image features;
correspondingly, the method further comprises:
upon determining that a power-off restart or an abnormal restart has occurred, acquiring the backup image features from the persistent bucket and allocating the backup image features to the plurality of distribution channels for parallel distribution.
9. A feature distribution apparatus, the apparatus comprising:
an acquisition module, configured to acquire, at a period T, N image features and attribute information corresponding to each image feature from an image feature library to be distributed, wherein the attribute information comprises an object identifier indicating the object to which the image feature belongs;
a de-duplication module, configured to perform de-duplication processing on the N image features according to the attribute information, and retain the latest image feature corresponding to each object identifier to obtain de-duplicated image features;
a saving module, configured to store the de-duplicated image features into a bucket to be distributed in a memory;
and an allocation module, configured to take the de-duplicated image features out of the bucket to be distributed and allocate them to a plurality of distribution channels for parallel distribution.
10. An electronic device, comprising: a memory and a processor, the memory and the processor being connected;
the memory is used for storing programs;
the processor calls a program stored in the memory to perform the method of any of claims 1-8.
11. A computer-readable storage medium, on which a computer program is stored which, when executed by a computer, performs the method of any one of claims 1-8.
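As an illustration of the sizing rule recited in claim 3, the bucket size can be computed as the minimum of three bounds. This is a sketch under assumptions: the parameter names are hypothetical, the allocable memory space is expressed as the number of features that fit in it, and the result is truncated to a whole number of features.

```python
def bucket_size(
    new_per_unit_time: float,
    repetition_rate: float,
    max_distributable_per_unit_time: float,
    distribution_factor: float,
    memory_capacity_features: int,
) -> int:
    """Bucket size as the minimum of the three bounds named in claim 3."""
    if not 0.0 <= repetition_rate <= 1.0:
        raise ValueError("repetition_rate must be within [0, 1]")
    # New features per unit time that remain after de-duplication.
    effective_new = new_per_unit_time * (1.0 - repetition_rate)
    # Distribution throughput scaled by the distribution factor.
    distributable = max_distributable_per_unit_time * distribution_factor
    return int(min(effective_new, distributable, memory_capacity_features))
```

For example, with 1000 new features per unit time, a repetition rate of 0.5, a maximum of 400 distributable features per unit time, a distribution factor of 1.5, and room for 10000 features, the three bounds are 500, 600, and 10000, so the bucket holds 500 features.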
CN202110734720.0A 2021-06-30 2021-06-30 Feature distribution method and device, electronic equipment and computer-readable storage medium Pending CN113590381A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110734720.0A CN113590381A (en) 2021-06-30 2021-06-30 Feature distribution method and device, electronic equipment and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110734720.0A CN113590381A (en) 2021-06-30 2021-06-30 Feature distribution method and device, electronic equipment and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN113590381A true CN113590381A (en) 2021-11-02

Family

ID=78245338

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110734720.0A Pending CN113590381A (en) 2021-06-30 2021-06-30 Feature distribution method and device, electronic equipment and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN113590381A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination