CN109685002B - Data set acquisition method and system and electronic device - Google Patents


Info

Publication number
CN109685002B
CN109685002B
Authority
CN
China
Prior art keywords
target
frame
module
information
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811580152.8A
Other languages
Chinese (zh)
Other versions
CN109685002A (en)
Inventor
张发恩
秦永强
赵江华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Cisai Tech Co Ltd
Original Assignee
Alnnovation Guangzhou Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alnnovation Guangzhou Technology Co ltd
Priority to CN201811580152.8A
Publication of CN109685002A
Application granted
Publication of CN109685002B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/56 Extraction of image or video features relating to colour
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the field of data acquisition technologies, and in particular to a data set acquisition method, system, and electronic device. The method comprises the following steps: providing a target; shooting the target to obtain a video; reading each frame of image in the video and obtaining position information and/or characteristic information of the target in each frame; and labeling the target. By standardizing the data acquisition process and parallelizing acquisition tasks, the method greatly improves data acquisition and labeling efficiency, overcomes the high cost, long duration, and inaccuracy of manual labeling, and is well suited to quickly acquiring batches of accurately labeled image data in practical industrial application scenarios.

Description

Data set acquisition method and system and electronic device
[ Technical Field ]
The present invention relates to the field of data acquisition technologies, and in particular, to a data set acquisition method, system, and electronic device.
[ background of the invention ]
Nowadays, industries such as freight transportation, production, and sales all need to classify and detect different kinds of cargo based on its characteristic information. When facing a large volume of cargo that must be detected and classified, multiple kinds of characteristic data usually have to be collected manually for each item, across multiple orientations and scenes, and then used to label the item. This operation is often inefficient, costly, and error-prone.
[ summary of the invention ]
To address the problems of existing data acquisition methods, the invention provides a data set acquisition method, a data set acquisition system, and an electronic device.
The invention provides a data set acquisition method, which comprises the following steps:
s1, providing a target, wherein the target is goods; s2, shooting the target to obtain a video; s3, reading each frame of image in the video, and obtaining the position information and/or the characteristic information of the target in each frame of image; the step S3 specifically includes the following steps: s31, calibrating the initial position of the target in the first frame on the video and obtaining the position information and the characteristic information of the target; s32, reading the next frame of video image, and calculating to obtain a plurality of possible new positions and confidence rates by adopting a target tracking algorithm based on the characteristic information of the frame of video image and combining the position information and the characteristic information of the target in the previous frame; s33, the target tracking algorithm judges whether a target needing to be tracked exists in the plurality of new positions, if yes, the target position and the tracking state are updated, meanwhile, the position information and the characteristic information of the frame target are correspondingly updated, and the step S32 is returned until the video is finished; if not, the target position is reinitialized and the process returns to the step S32; and S4, labeling the targets by using the category information representing the targets to form a data set corresponding to each target.
Preferably, the feature information of the target in each frame in step S3 includes color feature information, and the feature information is represented by a multi-dimensional vector.
Preferably, the target tracking algorithm determines whether the target to be tracked exists at the plurality of new positions by comparing the characteristic information of the regions determined by the new positions, the position information of each new position, and the confidence rate of each new position against the target in the previous frame.
Preferably, the step of determining whether the target exists at a new position from the characteristic information of the regions determined by the new positions, the position information of each new position, the comparison against the target in the previous frame, and the confidence rate of each new position comprises: separately checking whether the difference between the characteristic information of the region determined by the new position and that of the target in the previous frame is within a set threshold range, whether the difference between the position information of each new position and that of the target in the previous frame is within a set threshold range, and whether the confidence rate of the new position is within a set threshold range.
Preferably, in the step S2, the capturing of the target to obtain the video is performed under preset capturing conditions, where the preset capturing conditions include a set capturing angle, a set lighting condition, and a set background condition.
To solve the above technical problems, the invention also provides a data set acquisition system comprising a shooting module and a data reading module. The shooting module is used for video shooting of a target. The data reading module is used for reading each frame of image in the video and obtaining the position information and/or characteristic information of the target in each frame. The data reading module comprises an acquisition module, an operation module, a tracking module, and a first judgment module. The acquisition module obtains the characteristic information and position information of each frame of image. The operation module calculates a plurality of new positions related to the target in a subsequent frame based on the characteristic information of the target in the previous frame and outputs a confidence rate for each new position. The tracking module tracks the target based on the plurality of new positions obtained by the operation module. The first judgment module judges whether the target to be tracked exists at any of the new positions obtained by the operation module.
Preferably, the data collection system further comprises a storage module for storing the labeled image.
The present invention further provides an electronic device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the data set collection method described above through the computer program.
Compared with the prior art, the invention has the following beneficial effects:
According to the data set acquisition method provided by the invention, the target is video-recorded, and a target tracking algorithm reads the position information and/or characteristic information of the target in each frame of the video to build the data set. By standardizing the acquisition process and parallelizing acquisition tasks, data acquisition and labeling efficiency is greatly improved; the high cost, long duration, and inaccuracy of manual labeling are avoided; and the method is well suited to quickly acquiring batches of accurately labeled image data in practical industrial application scenarios. The method is also highly applicable: it is not limited to specific samples and needs no pre-collected data to train a model, since the target can be tracked and data acquired from the correlation of its characteristic and position information across frames, saving that step. Moreover, reading the per-frame position and characteristic information and labeling the target proceed asynchronously, making the acquisition process more flexible: the target may be labeled when the first frame is read, or after the video has been fully read and the target's position in every frame has been computed by the tracking model.
Judging whether the target to be tracked exists at a new position by combining the characteristic information of the regions determined by the new positions, the position information of each new position, the comparison against the target in the previous frame, and the confidence rate of each new position substantially improves tracking accuracy and avoids collecting unqualified data.
The data set acquisition system provided by the invention comprises a shooting module for video shooting of a target and a data reading module for reading each frame of image in the video and obtaining the position information and characteristic information of the target in each frame. The overall workflow is simple: given a target requiring data acquisition, the shooting module records a video of it, and the reading module automatically reads the target's position and characteristic information in every frame to form a data set. The system is likewise highly applicable: it is not limited to specific samples and needs no pre-collected data to train a model, since the target can be tracked and data acquired from the correlation of its characteristic and position information across frames, saving that step.
[ description of the drawings ]
Fig. 1 is a schematic flow-chart structure diagram of a data set acquisition method provided in a first embodiment of the present invention;
FIG. 2 is a schematic diagram of another flow structure of the data set collection method provided in the first embodiment of the present invention;
FIG. 3 is a diagram illustrating the target state in the first frame according to the first embodiment of the present invention;
FIG. 4 is a schematic view of the target after its viewing angle has changed relative to FIG. 3 in the first embodiment of the present invention;
FIG. 5 is a schematic view of the target after its viewing angle and illumination have changed relative to FIG. 4 in the first embodiment of the present invention;
FIG. 6 is a schematic view of the target after its viewing angle, illumination, and orientation have changed relative to FIG. 5 in the first embodiment of the present invention;
FIG. 7 is a diagram illustrating the position information and status of an object in a first frame according to a first embodiment of the present invention;
FIG. 8 is a schematic diagram of a flow structure of a target tracking algorithm according to a first embodiment of the present invention;
FIG. 9 is a diagram illustrating a state of a target position in a first frame according to a first embodiment of the present invention;
FIG. 10 is a schematic structural diagram of a plurality of new positions associated with a target according to a target tracking algorithm in the first embodiment of the present invention;
FIG. 11 is a diagram illustrating a structure of re-initializing a target location when a target is lost according to a first embodiment of the present invention;
FIG. 12 is a diagram illustrating a first embodiment of the present invention after updating the tracking status when the target is not lost;
fig. 13 is a schematic block diagram of a data set acquisition system according to a second embodiment of the present invention;
FIG. 14 is a schematic structural diagram of a data reading module of a data set acquisition system according to a second embodiment of the present invention;
fig. 15 is a schematic block diagram of an electronic device according to a third embodiment of the present invention.
[ Detailed Description of the Embodiments ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Referring to fig. 1, the present invention provides a data set collecting method for collecting data of a target whose characteristic data needs to be acquired, comprising the following steps:
s1, providing a target;
s2, shooting the target to obtain a video;
s3, reading each frame of image in the video, and obtaining the position information and/or the characteristic information of the target in each frame of image;
and S4, labeling the target.
In step S1, the target is an article whose characteristic information and position information need to be acquired to form a data set, including goods such as daily necessities, food, documents, and the like.
In step S2, the target is photographed according to a preset shooting standard to obtain the video. The preset shooting standard means shooting the target under multiple angles, multiple lighting conditions, and a simple background. The background chosen is generally one that contrasts well with the color of the target. Shooting the target from multiple angles and under multiple lighting conditions captures its characteristic data in different environments and scenes, so that the data set collected from the video represents the article more comprehensively; when the collected data is later used to build a model for detecting or classifying products, images captured in a wider variety of scenes can still be recognized.
Referring to fig. 2, the step S3 specifically includes the following steps:
s31, calibrating the initial position of the target in the first frame on the video and obtaining the position information and the characteristic information of the target;
s32, reading the next frame of video image, and calculating to obtain a plurality of possible new positions and confidence rates by adopting a target tracking algorithm based on the characteristic information of the frame of video image and combining the position information and the characteristic information of the target in the previous frame;
s33, the target tracking algorithm judges whether a target needing to be tracked exists in the plurality of new positions, if yes, the target position and the tracking state are updated, meanwhile, the position information and the characteristic information of the frame target are correspondingly updated, and the step S32 is returned until the video is finished; if not, the target position is reinitialized and the process returns to step S32.
Referring to fig. 2 and fig. 3, the step S33 specifically includes the steps of:
S331, judging whether the target is lost; if not, executing step S332, updating the target position; if yes, executing step S333, reinitializing the target position;
after step S332 or step S333 is executed, executing step S334, updating the tracking state;
after step S334 is executed, executing step S335, judging whether the video is finished; if so, executing step S336, uploading the marked data and ending the operation; if not, returning to step S32 until the video acquisition is finished.
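The loop of steps S32/S33 (with the S331-S335 branches) can be sketched as plain control flow. This is only a schematic: the `track_one_frame` and `reinitialize` callbacks below are stand-ins for the patent's tracking algorithm and manual recalibration, not implementations of them.

```python
# Schematic of the S32/S33 loop: advance frame by frame, update the
# tracking state when the target is found, re-initialise when lost.
def run_tracking(frames, init_state, track_one_frame, reinitialize):
    states = [init_state]                 # S31: state from the first frame
    state = init_state
    for frame in frames:
        found, new_state = track_one_frame(frame, state)   # S32 + S331
        if found:
            state = new_state             # S332: update target position
        else:
            state = reinitialize(frame)   # S333: manual re-calibration
        states.append(state)              # S334: tracking state updated
    return states                         # loop ends with the video (S335)

# Toy run: the "state" is just a counter and the target is always
# found, so the history simply counts up across three frames.
history = run_tracking(range(3), 0, lambda f, s: (True, f + 1), lambda f: -1)
```

The callbacks make the found/lost branching concrete without committing to any particular tracker.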
Referring to fig. 4, 5, 6 and 7: fig. 4 shows the position and state of the target in the first frame; fig. 5 shows its position and state when the viewing angle changes in the subsequent adjacent frame; fig. 6 shows its position and state when the illumination additionally changes in the next adjacent frame; and fig. 7 shows its position and state when the viewing angle, illumination, and orientation have all changed in the next adjacent frame. As can be seen, the position and state of the target change gradually and continuously in time; that is, the position information and the characteristic information are temporally correlated. The target can therefore be tracked based on this correlation across consecutive frames of the video, and its position information and/or characteristic information in different frames collected to form a data set; after the video has been read, a model can be built from the collected data for screening and classifying subsequent targets.
Referring to fig. 2 and 8, in step S31 the initial position of the target in the first frame of the video is marked as O: a rectangular frame is drawn by manual calibration, using a mouse, to mark the target to be tracked. The shape is not limited to a rectangular frame; depending on the outline of the object, a circle, a diamond, or another closed figure may be used.
It is understood that after the object is marked with the rectangular frame, the position information of the object in the first frame image can be obtained. Specifically, the position information may be understood as vertex coordinate values of a rectangular frame, such as coordinate values corresponding to the O1 point, the O2 point, the O3 point, and the O4 point in fig. 8, that is, the target position is characterized by the vertex coordinate values of the rectangular frame.
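As a small illustration of this representation, the four vertex coordinates (O1 through O4 in fig. 8) collapse to the usual (x_min, y_min, x_max, y_max) box form; the helper name below is hypothetical.

```python
def box_from_vertices(o1, o2, o3, o4):
    """Reduce four rectangle vertices to (x_min, y_min, x_max, y_max)."""
    xs = [p[0] for p in (o1, o2, o3, o4)]
    ys = [p[1] for p in (o1, o2, o3, o4)]
    return (min(xs), min(ys), max(xs), max(ys))

# e.g. the four corners of a 100 x 60 rectangle anchored at (10, 20)
box = box_from_vertices((10, 20), (110, 20), (110, 80), (10, 80))
```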
In step S31, the position of the target is manually calibrated and the characteristic information of the target is also measured for tracking the target in the subsequent frame. The feature information includes at least one of a color feature, a grayscale feature, an edge feature, or a contour feature. Optionally, the target is tracked by combining multiple kinds of characteristic information, so that the accuracy of tracking the target can be well improved, the accuracy of the acquired data is improved, and the acquired data can be better used for screening and classifying so as to obtain more accurate screening results and classification results. Therefore, it can be understood that after the position information and the feature information of the object in the first frame are determined, the position information and the feature information, i.e. the data representing the object in the first frame, are one of the sub-data in the data set of the object to be collected.
The color characteristics are exemplified in the present invention. Referring to fig. 8 again, after the position information of the target in the first frame image is manually calibrated, the histogram of the RGB value combination of each pixel in the rectangular frame is used as the feature of the target. The features of the object are combined by the R, G, B components of the pixel values:
F=w1R+w2G+w3B,
wherein w1, w2, and w3 are set values, each taken from 2, 1, -1, and 0, so that F is a linear combination of the R, G, and B components with integer coefficients, excluding the three cases (w1, w2, w3) = (0, 0, 0), (w1, w2, w3) = (2, 2, 2), and (w1, w2, w3) = (1, 1, 1).
That is, after the position of the target in the first frame is manually calibrated, the color feature of the target in that frame, expressed in terms of pixel values, can be obtained. The feature information can thus be expressed as a multidimensional vector, i.e. a 1×n vector.
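A minimal sketch of this colour feature, assuming NumPy and an RGB patch: F is formed per pixel as a linear combination of the R, G, B components, and a normalised histogram of F over the target's rectangle serves as the 1×n feature vector. The weight choice (2, -1, 1) and the 16-bin layout are illustrative assumptions, not values fixed by the text.

```python
import numpy as np

def color_feature(patch, w=(2, -1, 1), bins=16):
    """Normalised histogram of F = w1*R + w2*G + w3*B over a patch."""
    r = patch[..., 0].astype(int)
    g = patch[..., 1].astype(int)
    b = patch[..., 2].astype(int)
    f = w[0] * r + w[1] * g + w[2] * b          # F for every pixel
    # for the default weights, F spans [-255, 765] on 8-bit input
    hist, _ = np.histogram(f, bins=bins, range=(-255, 765))
    return hist / hist.sum()                    # 1 x n feature vector

patch = np.zeros((4, 4, 3), dtype=np.uint8)     # a uniform black patch
vec = color_feature(patch)                      # all mass lands in one bin
```

Normalising the histogram lets patches of different sizes be compared directly in later frames.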
In step S32, when the next frame of the video is read, a target tracking algorithm calculates a plurality of possible new positions and their confidence rates based on the characteristic information of that frame and of the target in the previous frame; that is, several regions matching the color characteristic information of the target in the previous frame are found in each subsequent frame. The confidence rate measures the probability that the target to be tracked exists at the corresponding new position; new positions that may contain the target are selected by comparison against a set threshold. The target tracking algorithm used in the invention is one of the AdaBoost, SVM, or DPM algorithms.
Referring to fig. 9 and 10, the plurality of new positions may be determined by searching near the target's position in the previous frame, using the target's region in that frame (i.e. the rectangular frame) as the unit: positions whose characteristic information is close to the target's are found and a confidence rate is computed for each, and a position whose confidence rate exceeds the threshold is considered a possible new position of the target. For example, in fig. 9, the rectangular block in the first frame with centroid (x, y) and size (m, n) is the current block. As shown in fig. 10, in the second frame all possible positions within an (m+Δm, n+Δn) search range centered on (x, y) are taken and their confidence rates are calculated. In effect, this finds multiple possible new positions of the target by checking whether the pixel values of the regions they form are close to those of the target in the first frame. The confidence rates of positions A, B, and C in fig. 10 are 75%, 85%, and 90%, respectively. With the threshold set to 70%, all three exceed it, so the regions formed at positions A, B, and C may contain the target to be tracked; at this point, the pixel values of those regions match the pixel values of the first-frame target.
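A toy rendering of this candidate search: every position inside a window around the previous centre is a candidate, and candidates whose confidence rate exceeds the 70% threshold are kept as possible new positions. The confidence values mirror the fig. 10 example (75%, 85%, 90%, plus one failing position); the scoring function itself and the window size are left abstract here and are assumptions.

```python
def search_positions(center, delta=(2, 2)):
    """Enumerate candidate centres within a (+/-dx, +/-dy) window."""
    x, y = center
    dx, dy = delta
    return [(x + i, y + j)
            for i in range(-dx, dx + 1)
            for j in range(-dy, dy + 1)]

# Confidence rates as in the fig. 10 example, plus one low scorer;
# candidates above the 70% threshold survive as possible new positions.
confidences = {"A": 0.75, "B": 0.85, "C": 0.90, "elsewhere": 0.40}
threshold = 0.70
new_positions = [name for name, conf in confidences.items() if conf > threshold]
```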
It will be appreciated that, because the target's viewing angle changes between frames, when selecting the plurality of new positions in each subsequent frame, the target's rectangular frame from the previous frame may be appropriately enlarged before searching, so as to improve the accuracy of the search. The scaling between the previous and next frames may be 1:1.4.
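The enlargement step might look like the following sketch, scaling the previous-frame box about its centre at the stated 1:1.4 ratio; the (x_min, y_min, x_max, y_max) box format is an assumption.

```python
def enlarge_box(box, scale=1.4):
    """Scale an (x_min, y_min, x_max, y_max) box about its centre."""
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    half_w = (x1 - x0) / 2 * scale
    half_h = (y1 - y0) / 2 * scale
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

# a 20 x 40 previous-frame box widened at the 1:1.4 ratio
search_box = enlarge_box((10, 10, 30, 50))
```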
After obtaining the plurality of new positions whose confidence rates exceed the set threshold, it is determined whether the object in each corresponding region and the target in the previous frame are the same target; that is, the process enters step S331 to judge whether the target has been lost, i.e. whether the target to be tracked exists in any of the regions determined by the new positions.
Specifically, in step S331, whether the target exists at a new position is judged by combining the characteristic information of the regions determined by the new positions, the position information of each new position, the comparison against the target in the previous frame, and the confidence rate of each new position. The characteristic information of the target in two adjacent frames is correlated and changes gradually, so a feature threshold for judging closeness can be set: when the difference in characteristic information between two adjacent frames falls outside the set feature threshold range, no region corresponding to the target can be found and the target is considered lost; when the difference is within the set threshold range, the target to be tracked may exist.
When the difference in the target's characteristic information between two adjacent frames is within the set threshold range, the confidence rate of the new position is checked next: if it falls outside the set confidence threshold range, the target is considered lost; otherwise, whether the target is lost is further judged from the position information of the new position. Note that the confidence threshold used for judging loss differs from the threshold used for selecting new positions and is higher: for example, if new positions are selected with a 70% threshold, the loss-judgment threshold should exceed 70%. In the present invention, the confidence threshold for judging whether the target is lost is defined as greater than 85%.
If the confidence rate of the new position is within the preset confidence threshold range, the position information of the new position is further used to judge whether the target is lost: if the new position information is within the threshold range set for position information, the target is judged not lost; otherwise it is considered lost.
Thus, in judging whether the target is lost in step S331, the target is considered not lost only if the characteristic information of the region determined by the new position, the confidence rate of the new position, and the position information of each new position all match those of the target in the previous frame; otherwise the target is lost. Judging loss in this way substantially improves the accuracy of the acquired data.
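The three-stage check above can be summarised as a sketch: a candidate counts as the tracked target only if its feature difference, its confidence rate, and its positional displacement all pass their thresholds. Only the greater-than-85% confidence bound comes from the text; the feature and position tolerances below are illustrative assumptions.

```python
def target_lost(feat_diff, confidence, pos_diff,
                feat_tol=0.2, conf_min=0.85, pos_tol=30.0):
    """True when a candidate fails any of the three matching checks."""
    if feat_diff > feat_tol:        # feature information drifted too far
        return True
    if confidence <= conf_min:      # below the >85% loss-check bound
        return True
    if pos_diff > pos_tol:          # moved implausibly far between frames
        return True
    return False
```

The checks run in the order the text describes: features first, then confidence, then position.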
Referring to fig. 11, if after step S331 the target is judged lost, step S333 is executed to reinitialize the target position: the target is located again by manual calibration, the characteristic information corresponding to the target in that frame is obtained, and tracking continues in subsequent frames based on that frame's characteristic and position information. In the figure, position D, where the target to be tracked is located, is calibrated manually, and the position information and characteristic information of the target in that frame are associated with it so that tracking can continue in subsequent frames.
Referring to fig. 12, if the target is not lost, for example the object tracked at new position A is detected to be the same target as in the previous frame, the target position and the tracking state are updated automatically: the target's position information in that frame becomes the position information at A, and the tracking state corresponds to A. The position and characteristic information at A then represent the target in that frame and join the data set, and the target in subsequent frames continues to be tracked and collected based on this frame's position and characteristic information.
Referring again to fig. 1, in step S4 the target is labeled; that is, a label value is attached to the target, usually by manual input. The label value contains the target's category information, such as cola, biscuits, or milk. Labeling may be performed when the target in the first frame is manually calibrated, after the video has been fully read, or while subsequent frames are being read. The whole data acquisition process is therefore more flexible, with detection and labeling able to run asynchronously.
After video acquisition is finished, all targets appearing in the video are labeled, and before the labeled data is uploaded, image enhancement is applied to each collected picture. Image enhancement includes, but is not limited to, noise processing and distortion processing. Training the model on the enhanced images yields better detection and classification performance and makes the model applicable to a wider range of scenes; for example, even if pictures are taken with different shooting devices or under different illumination, the model can still distinguish them well and detect or classify the samples correctly.
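The two enhancement families mentioned here, noise processing and distortion processing, can be sketched as below. The noise level and the shear factor are illustrative assumptions; the patent does not fix specific parameters.

```python
import numpy as np

def augment(image, rng=None):
    """Return noise- and distortion-augmented copies of an H x W x C uint8 image."""
    rng = rng or np.random.default_rng(0)
    # Noise processing: additive Gaussian noise, clipped back to valid range.
    noisy = image.astype(np.float32) + rng.normal(0, 10, image.shape)
    noisy = np.clip(noisy, 0, 255).astype(np.uint8)
    # Distortion processing: a simple horizontal shear, shifting each row
    # proportionally to its vertical position.
    h = image.shape[0]
    sheared = np.zeros_like(image)
    for y in range(h):
        shift = int(0.1 * y)
        sheared[y] = np.roll(image[y], shift, axis=0)
    return noisy, sheared
```

In practice each augmented copy keeps the label of its source picture, so the data set grows without extra annotation work.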
Referring to fig. 13, a second embodiment of the present invention provides a data set acquisition system, which includes: a shooting module 11 and a data reading module 12. The shooting module 11 is used for shooting a video of a target according to a preset shooting standard. The data reading module 12 is configured to read each frame of image in the video and obtain position information and feature information of the target in each frame of image.
The storage module 14 is configured to store the labeled data to form a data set.
Referring to fig. 14, the data reading module 12 includes: an acquisition module 121, an operation module 122, a tracking module 123, a first judgment module 124, and a second judgment module 125. The acquisition module 121 is configured to obtain feature information and position information, namely the position information and feature information of the target in the first frame and the feature information of each subsequent frame. The operation module 122 computes a plurality of candidate new positions and their confidence rates for the target in a subsequent frame based on the feature information of the target in the previous frame. The tracking module 123 is configured to track the target based on the new positions obtained by the operation module, that is, to update the target position and the tracking state. The first judgment module 124 is configured to judge whether the object at any of the new positions is the same target as in the previous frame, that is, whether the target is lost. The second judgment module 125 is configured to judge whether the video has ended.
Referring to fig. 15, a third embodiment of the present invention provides an electronic device, which includes a memory 31 and a processor 32, wherein the memory 31 stores a computer program, and the processor 32 is configured to execute any step of the data set acquisition method in the first embodiment through the computer program.
Compared with the prior art, the invention has the following beneficial effects:
According to the data set acquisition method provided by the invention, a video of the target for which a data set is to be collected is shot, and a target tracking algorithm reads the position information and/or feature information of the target in each frame of the video to build the data set. Standardizing the data acquisition process allows acquisition tasks to be parallelized, which greatly improves the efficiency of data collection and labeling, alleviates the high cost, long duration, and inaccuracy of manual labeling, and makes the method suitable for quickly acquiring a batch of accurately labeled image data in real industrial application scenarios. The method is also broadly applicable: it is not limited to specific samples and requires no model trained on previously collected data, since the target can be tracked and data collected based on the correlation between the feature information and position information of the target across frames, which saves an entire preparation step. Finally, because obtaining the position and feature information of each frame and labeling the target proceed asynchronously, the acquisition process is flexible: the target can be labeled when the first frame is read, or after video reading has finished and the position of the target in each frame has been computed by the tracking model.
Whether the target to be tracked exists at a new position is judged by combining the feature information of the regions determined by the candidate new positions, the position information of each new position, a comparison with the target in the previous frame, and the confidence rate of each new position. This markedly improves tracking accuracy and avoids collecting unqualified data.
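The threshold-based judgment described above (and spelled out in claim 3) can be sketched as follows. The three thresholds are illustrative assumptions; the claims only require each quantity to lie "within a set threshold range".

```python
import math

def target_present(new_positions, prev_box, prev_feat,
                   feat_thresh=0.2, pos_thresh=50.0, conf_thresh=0.6):
    """Decide whether the tracked target exists among candidate new positions.

    Each candidate is (box, feature_vector, confidence); box is the (x, y)
    of the region. A candidate is accepted only if its feature difference,
    position difference, and confidence rate all pass their thresholds.
    """
    for box, feat, conf in new_positions:
        # Mean absolute difference between the candidate's feature vector
        # and the target's feature vector from the previous frame.
        feat_diff = sum(abs(a - b) for a, b in zip(feat, prev_feat)) / len(feat)
        # Euclidean distance between the candidate and previous positions.
        pos_diff = math.hypot(box[0] - prev_box[0], box[1] - prev_box[1])
        if feat_diff <= feat_thresh and pos_diff <= pos_thresh and conf >= conf_thresh:
            return True, box
    return False, None  # no candidate qualifies: the target is lost
```

If no candidate passes all three checks the target is treated as lost and the reinitialization branch of step S33 is taken.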
The data set acquisition system provided by the invention comprises a shooting module for shooting a video of the target and a data reading module for reading each frame of the video and obtaining the position information and feature information of the target in each frame. The system has a simple structure and workflow: given a target for which data is needed, the shooting module records a video of it, and the reading module automatically extracts the position and feature information of the target in every frame to form a data set. The system is also broadly applicable: it is not limited to specific samples and requires no model trained on previously collected data, since the target can be tracked and data collected based on the correlation between the feature information and position information of the target across frames, which saves an entire preparation step.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. A method of data set acquisition, characterized by: the method comprises the following steps:
s1, providing a target, wherein the target is goods;
s2, shooting the target to obtain a video;
s3, reading each frame of image in the video, and obtaining the position information and/or the characteristic information of the target in each frame of image;
the step S3 specifically includes the following steps:
s31, calibrating the initial position of the target in the first frame on the video and obtaining the position information and the characteristic information of the target;
s32, reading the next frame of video image, and calculating to obtain a plurality of possible new positions and confidence rates by adopting a target tracking algorithm based on the characteristic information of the frame of video image and combining the position information and the characteristic information of the target in the previous frame; and
s33, the target tracking algorithm judges whether a target needing to be tracked exists in the plurality of new positions, if yes, the target position and the tracking state are updated, meanwhile, the position information and the characteristic information of the frame target are correspondingly updated, and the step S32 is returned until the video is finished; if not, the target position is reinitialized and the process returns to the step S32;
the specific steps of judging whether the target to be tracked exists in the plurality of new positions by the target tracking algorithm are as follows: judging whether the target exists in the new position or not by combining the characteristic information of the regions determined by the plurality of new positions, the position information of each new position, the target comparison in the previous frame, the confidence rate of the new position and the confidence rate of whether the target is lost or not; the confidence rate of whether the target is lost is higher than the confidence rate of the new position; and
and S4, labeling the targets by using the category information representing the targets to form a data set corresponding to each target.
2. The data set acquisition method of claim 1, characterized in that: the feature information of the target in each frame in step S3 includes color feature information, and the feature information is represented by a multi-dimensional vector.
3. The data set acquisition method of claim 2, characterized in that: the step in which the target tracking algorithm combines the characteristic information of the regions determined by a plurality of new positions, the position information of each new position, the target comparison in the previous frame and the confidence rate of the new position to judge whether the target exists in the new position comprises the following steps: respectively comparing whether the difference value between the characteristic information of the area determined by the new position and the characteristic information of the target in the previous frame is within a set threshold range, whether the difference value between the position information of each new position and the position information of the target in the previous frame is within the set threshold range, and whether the confidence rate of the new position is within the set threshold range.
4. The data set acquisition method of claim 1, characterized in that: in the step S2, the shooting of the target to obtain the video is performed under preset shooting conditions, where the preset shooting conditions include a set shooting angle, a set lighting condition, and a set background condition.
5. A data set acquisition system for performing the data set acquisition method of claim 1, characterized by: the system comprises a shooting module and a data reading module, wherein the shooting module is used for carrying out video shooting on a target; the data reading module is used for reading each frame of image in the video and obtaining the position information and/or the characteristic information of the target in each frame of image; the data reading module comprises: an acquisition module, an operation module, a tracking module and a first judgment module, wherein the acquisition module is used for acquiring the characteristic information and the position information of each frame of image; the operation module is used for calculating a plurality of new positions related to the target in a subsequent frame based on the characteristic information of the target in the previous frame and outputting a confidence rate corresponding to each new position; the tracking module is used for tracking the target based on the plurality of new positions obtained by the operation module; and the first judgment module is used for judging whether a target to be tracked exists in the plurality of new positions obtained by the operation module.
6. The data set acquisition system as set forth in claim 5, wherein: the data set acquisition system further comprises a storage module for storing the data of the labeled image.
7. An electronic device comprising a memory and a processor, characterized in that: the memory has stored therein a computer program by which the processor is arranged to execute the data set acquisition method of any of claims 1 to 4.
CN201811580152.8A 2018-12-21 2018-12-21 Data set acquisition method and system and electronic device Active CN109685002B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811580152.8A CN109685002B (en) 2018-12-21 2018-12-21 Data set acquisition method and system and electronic device

Publications (2)

Publication Number Publication Date
CN109685002A CN109685002A (en) 2019-04-26
CN109685002B true CN109685002B (en) 2020-12-15

Family

ID=66188771

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811580152.8A Active CN109685002B (en) 2018-12-21 2018-12-21 Data set acquisition method and system and electronic device

Country Status (1)

Country Link
CN (1) CN109685002B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110177256B (en) * 2019-06-17 2021-12-14 北京影谱科技股份有限公司 Tracking video data acquisition method and device
CN110751836A (en) * 2019-09-26 2020-02-04 武汉光庭信息技术股份有限公司 Vehicle driving early warning method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106710240A (en) * 2017-03-02 2017-05-24 公安部交通管理科学研究所 Passing vehicle tracking and speed measuring method integrating multiple-target radar and video information
CN107707810A (en) * 2017-08-21 2018-02-16 广州紫川电子科技有限公司 Thermal source method for tracing, apparatus and system based on thermal infrared imager

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104103081A (en) * 2014-07-14 2014-10-15 西安电子科技大学 Virtual multi-camera target tracking video material generation method
CN106650668A (en) * 2016-12-27 2017-05-10 上海葡萄纬度科技有限公司 Method and system for detecting movable target object in real time
CN107133569B (en) * 2017-04-06 2020-06-16 同济大学 Monitoring video multi-granularity labeling method based on generalized multi-label learning
CN108875730B (en) * 2017-05-16 2023-08-08 中兴通讯股份有限公司 Deep learning sample collection method, device, equipment and storage medium
CN108446585B (en) * 2018-01-31 2020-10-30 深圳市阿西莫夫科技有限公司 Target tracking method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN109685002A (en) 2019-04-26

Similar Documents

Publication Publication Date Title
CN108549870B (en) Method and device for identifying article display
US20210319582A1 (en) Method(s) and System(s) for Vehicular Cargo Management
US9836635B2 (en) Systems and methods for tracking optical codes
TWI528019B (en) System and method for estimating three-dimensional packaging size of an object
TWI505201B (en) Object recognition device, object recognition method and program product
US20120027288A1 (en) Methods, Systems and Apparatus for Defect Detection
US20140341421A1 (en) Method for Detecting Persons Using 1D Depths and 2D Texture
JP5262705B2 (en) Motion estimation apparatus and program
CN107705293A (en) A kind of hardware dimension measurement method based on CCD area array cameras vision-based detections
CN109977824B (en) Article taking and placing identification method, device and equipment
CN108573471B (en) Image processing apparatus, image processing method, and recording medium
JP2012501011A (en) Image analysis method and system
CN109685002B (en) Data set acquisition method and system and electronic device
WO2020121866A1 (en) List generation device, photographic subject identification device, list generation method, and program
CN111461101A (en) Method, device and equipment for identifying work clothes mark and storage medium
JP6116765B1 (en) Object detection apparatus and object detection method
CN115115825B (en) Method, device, computer equipment and storage medium for detecting object in image
CN109743497B (en) Data set acquisition method and system and electronic device
US20070206093A1 (en) Image processing apparatus, method for image processing, computer readable medium, and computer data signal
CN111160450A (en) Fruit and vegetable weighing method based on neural network, storage medium and device
WO2020043351A1 (en) Method(s) and system(s) for vehicular cargo management
CN117218633A (en) Article detection method, device, equipment and storage medium
JP6403207B2 (en) Information terminal equipment
Achakir et al. An automated AI-based solution for out-of-stock detection in retail environments
Choo et al. Scene mapping-based video registration using frame similarity measurement and feature tracking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20221214

Address after: 17-1 #, No. 7, Huasheng Road, Yuzhong District, Chongqing 400,043

Patentee after: Chongqing qisaidi Technology Co.,Ltd.

Address before: 510000 Room 101, building 1, No. 232 Kezhu Road, Science City, high tech Industrial Development Zone, Guangzhou, Guangdong

Patentee before: ALNNOVATION (GUANGZHOU) TECHNOLOGY Co.,Ltd.