CN110851630A - Management system and method for deep learning labeled samples - Google Patents

Publication number
CN110851630A
Authority
CN
China
Prior art keywords
sample
labeled
format
instruction
receiving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910974814.8A
Other languages
Chinese (zh)
Inventor
廖剑锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Huirun Tiancheng Information Technology Co Ltd
Original Assignee
Wuhan Huirun Tiancheng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Huirun Tiancheng Information Technology Co Ltd
Priority to CN201910974814.8A
Publication of CN110851630A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/5866 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to a management system and method for deep learning labeled samples. The system comprises an application layer, a data processing layer and a data storage layer. The application layer is used for inputting operation instructions to the data processing layer. The data processing layer comprises a sample renaming module, a sample management module and a format conversion module. The sample renaming module renames an original labeled sample according to the basic information of that sample in the data storage layer. The format conversion module receives a format conversion instruction input through the user interaction interface and converts the labeled sample into a data set in a target format. Because the user can input an operation instruction through the application layer to rename the original labeled samples in the data storage layer, the user can quickly learn the basic information of a labeled sample without developing code again to analyze its type, which improves the management efficiency of labeled samples. Converting the labeled samples into a data set in the target format also facilitates their use in deep learning.

Description

Management system and method for deep learning labeled samples
Technical Field
The invention relates to the field of computer information management, in particular to a system and a method for managing deep learning labeled samples.
Background
AI is being applied ever more widely, and its social impact keeps growing. However, behind rising application fields such as "face recognition", "automatic driving" and "voice recognition", the core requirement remains data: not only larger volumes of it but also more precise data. "Data annotation" is therefore the key step that turns the most primitive data into data usable by an algorithm, and it is the basis of the whole AI industry.
Although many public data sets are available at home and abroad, the sample data differ in how they are expressed and stored. The data are managed manually and the management approach is not uniform, which wastes a large amount of human resources; manual data entry is also slow and inaccurate.
Because the name of the output document that an existing labeling tool produces for a picture is associated with the name of the picture, labeling the same picture with several different labeling tools yields outputs with the same name. When the names of multiple labeled samples are similar, the existing manual management approach easily confuses the types of the labeled samples, and labeled data of the same type cannot be searched accurately and efficiently in a deep learning labeled sample database for training. Furthermore, different deep learning algorithms have different requirements on the format of the training set, and the format of the labeled sample differs from the format required by the training set.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a system and a method for managing deep learning labeled samples.
The technical scheme for solving the technical problems is as follows:
In a first aspect, the invention provides a management system for deep learning labeled samples, which comprises an application layer, a data processing layer and a data storage layer;
the application layer comprises a user interaction interface used for inputting an operation instruction to the data processing layer;
the data processing layer comprises: a sample renaming module, used for receiving a sample renaming instruction input through the user interaction interface and renaming an original labeled sample according to basic information of the original labeled sample in the data storage layer; the basic information at least comprises an annotation object, an image width × height, an image scene, an image number, an image format and a renaming time;
a sample management module, used for receiving a sample management instruction input through the user interaction interface and performing a management operation on a labeled sample in the data storage layer; the management operation at least comprises query, analysis, download and deletion;
and a format conversion module, used for receiving a format conversion instruction input through the user interaction interface and converting the labeled sample into a data set in a target format;
the data storage layer is used for storing the labeled samples.
Further, the sample management module comprises:
the sample query submodule is used for receiving a sample query instruction and querying a labeled sample in the data storage layer;
the sample analysis submodule is used for receiving a sample analysis instruction and analyzing the marked sample;
the sample downloading submodule is used for receiving a sample downloading instruction and downloading a specified marked sample;
and the sample deleting submodule is used for receiving a sample deleting instruction and deleting the specified marked sample.
Further, the data processing layer further comprises: a sample quality inspection module, used for receiving a quality inspection instruction input through the user interaction interface, checking the mapping relation between each labeled sample and its sample image, and deleting or modifying erroneous labeled samples according to the check result.
Further, the sample quality inspection module comprises:
the mapping relation checking submodule is used for checking whether the quantity and the name of the annotated sample are consistent with those of the sample images, and if the quantity or the name of the annotated sample is not consistent with those of the sample images, deleting the corresponding annotated sample or the corresponding sample image according to the checking result;
the file attribute checking submodule is used for checking the file attribute of each labeled sample and modifying or deleting the labeled sample with the wrong file attribute according to the checking result;
and the marking information checking submodule is used for checking the marking information of each marking sample and modifying or deleting the marking sample with the wrong marking information according to the checking result.
Further, the data processing layer further comprises: a data amplification module, used for receiving a data amplification instruction input through the user interaction interface and performing data amplification (augmentation) on the labeled sample by pixel inversion, salt-and-pepper noise, Gaussian filtering or rotation transformation.
Further, the format conversion module is specifically configured to:
receiving a format conversion instruction, and extracting a conversion format type from the format conversion instruction;
and selecting a corresponding conversion model based on the conversion format type, and converting the labeled sample into a data set in a target format based on the conversion model.
In a second aspect, the present invention provides a method for managing deep learning labeled samples, including:
receiving a sample renaming instruction, and renaming an original labeled sample according to prestored basic information of the original labeled sample; the basic information at least comprises an annotation object, an image width multiplied by a height, an image scene, an image number, an image format and renaming time;
receiving a sample management instruction, and performing management operation on the renamed labeled sample; the management operation at least comprises inquiry, analysis, downloading and deletion;
and receiving a format conversion instruction input by the user interactive interface, and converting the labeled sample into a data set in a target format.
Further, the receiving a format conversion instruction input by a user interaction interface, and converting the labeled sample into a data set in a target format specifically includes:
receiving a format conversion instruction, and extracting a conversion format type from the format conversion instruction;
and selecting a corresponding conversion model based on the conversion format type, and converting the labeled sample into a data set in a target format based on the conversion model.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a bus, where the processor, the communication interface, and the memory complete communication with each other through the bus, and the processor may call a logic instruction in the memory to perform the steps of the method as provided in the second aspect.
In a fourth aspect, the invention provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method as provided in the second aspect.
The management system for the deep learning labeled sample comprises an application layer, a data processing layer and a data storage layer, wherein a user can input an operation instruction through a user interaction interface of the application layer, the data processing layer receives the operation instruction and renames an original labeled sample in the data storage layer, so that the user can quickly know basic information of the labeled sample, code development is not needed again to analyze the type of the labeled sample, and the management efficiency of the labeled sample is improved. The invention converts the labeled sample into the data set with the target format, thereby facilitating the application of deep learning.
Drawings
Fig. 1 is a schematic structural diagram of a management system for deep learning labeled samples according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a management method for deep learning labeled samples according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Existing sample data management generally relies on manual data entry. When the names of multiple labeled samples are similar, the types corresponding to the labeled samples are easily confused, and it is difficult to search a deep learning labeled sample database accurately and efficiently for labeled data of the same type for training. Furthermore, different deep learning algorithms have different requirements on the format of the training set, and the format of the labeled sample differs from the format required by the training set.
In view of the foregoing problems in the prior art, an embodiment of the present invention provides a management system for deep learning labeled samples. Fig. 1 is a schematic structural diagram of a management system for deep learning annotation samples according to an embodiment of the present invention, and referring to fig. 1, the system includes an application layer, a data processing layer, and a data storage layer;
the application layer comprises a user interaction interface used for inputting an operation instruction to the data processing layer;
the data processing layer comprises: a sample renaming module, used for receiving a sample renaming instruction input through the user interaction interface and renaming an original labeled sample according to basic information of the original labeled sample in the data storage layer; the basic information at least comprises an annotation object, an image width × height, an image scene, an image number, an image format and a renaming time. Here, the original labeled sample refers to the labeled sample before renaming; it is obtained by labeling the sample image with a labeling tool before any renaming takes place.
A sample management module, used for receiving a sample management instruction input through the user interaction interface and performing a management operation on a labeled sample in the data storage layer; the management operation at least comprises query, analysis, download and deletion;
and a format conversion module, used for receiving a format conversion instruction input through the user interaction interface and converting the labeled sample into a data set in a target format;
the data storage layer is used for storing the labeled samples.
Specifically, as shown in fig. 1, the management system provided by this embodiment includes an application layer, a data processing layer, and a data storage layer, which are sequentially connected in a communication manner. The application layer comprises a user interaction interface used for inputting operation instructions to the data processing layer. The operation instructions herein include at least a sample renaming instruction, a sample management instruction, and a format conversion instruction.
The data processing layer comprises a sample renaming module, a sample management module and a format conversion module. The sample renaming module is used for receiving a sample renaming instruction input through the user interaction interface and renaming the original labeled sample according to the basic information of the original labeled sample in the data storage layer. It should be noted that, before an original labeled sample is renamed, a labeling tool is first used to label the sample image. In this embodiment, labelme or labelImg is used to label the sample image, the original labeled sample is output in VOC format, and the original labeled sample is stored in the data storage layer.
After the sample image has been labeled by the labeling tool, the sample renaming module receives a sample renaming instruction input through the user interaction interface and renames the original labeled sample according to the basic information of the original labeled sample in the data storage layer. The basic information comprises an annotation object, an image width × height, an image scene, an image number, an image format and a renaming time. In this embodiment, each piece of basic information corresponds to one field, so the 6 pieces of basic information form 6 fields, and the order of the fields can be determined according to actual requirements.
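The six-field naming scheme described above can be sketched as follows. This is a minimal illustration only: the patent does not specify the separator, the field order, the zero-padding of the image number or the timestamp layout, so the underscore separator, the order used here and the names `BasicInfo` and `build_sample_name` are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BasicInfo:
    """The six pieces of basic information named in the description."""
    label_object: str      # annotation object, e.g. "car"
    width: int             # image width in pixels
    height: int            # image height in pixels
    scene: str             # image scene, e.g. "highway"
    image_number: int      # running number of the image
    image_format: str      # image format, e.g. "jpg"
    renamed_at: datetime   # renaming time


def build_sample_name(info: BasicInfo, extension: str = "xml") -> str:
    """Join the six fields into one file name (separator and order are assumed)."""
    fields = [
        info.label_object,
        f"{info.width}x{info.height}",
        info.scene,
        f"{info.image_number:06d}",                # assumed zero-padding
        info.image_format,
        info.renamed_at.strftime("%Y%m%d%H%M%S"),  # assumed timestamp layout
    ]
    return "_".join(fields) + "." + extension


if __name__ == "__main__":
    info = BasicInfo("car", 1920, 1080, "highway", 42, "jpg",
                     datetime(2019, 10, 14, 9, 30, 0))
    print(build_sample_name(info))
    # prints car_1920x1080_highway_000042_jpg_20191014093000.xml
```

With such a name, the type of a labeled sample can be read from the file name alone, which is the benefit the description attributes to renaming.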
Furthermore, a user can input a sample management instruction on the user interaction interface, and the sample management module receives the sample management instruction input by the user interaction interface and manages the labeled sample in the data storage layer. Here, the management operation includes at least query, analysis, download, and deletion.
Because different deep learning algorithms have different requirements on the format of the training set, a format conversion instruction is input through the user interaction interface before model training is performed with the labeled samples; the format conversion module receives the format conversion instruction and converts the labeled samples into a data set in a target format. In this embodiment, the labeled samples are in VOC format, and the format conversion module converts the VOC-format labeled samples into a COCO-format data set and/or a mask data set.
The management system for the deep learning labeled sample comprises an application layer, a data processing layer and a data storage layer, wherein a user can input an operation instruction through a user interaction interface of the application layer, the data processing layer receives the operation instruction and renames an original labeled sample in the data storage layer, so that the user can quickly know basic information of the labeled sample, code development is not needed again to analyze the type of the labeled sample, and the management efficiency of the labeled sample is improved. The invention converts the labeled sample into the data set with the target format, thereby facilitating the application of deep learning.
Based on the content of the foregoing embodiment, as an optional embodiment, the sample management module includes:
the sample query submodule is used for receiving a sample query instruction and querying a labeled sample in the data storage layer;
the sample analysis submodule is used for receiving a sample analysis instruction and analyzing the marked sample;
the sample downloading submodule is used for receiving a sample downloading instruction and downloading a specified marked sample;
and the sample deleting submodule is used for receiving a sample deleting instruction and deleting the specified marked sample.
Specifically, in this embodiment, the data storage layer includes a database, and the same type of labeled sample is stored in each folder of the database. In the embodiment, a display window is added to the user interaction interface, so that the user interaction interface can display the annotation sample and a user can conveniently and quickly check the annotation file of the data storage layer.
Taking the query of labeled samples as an example: a user may input the name of a certain field of the labeled samples to be queried on the user interaction interface. After the user clicks to start the query, the system queries the relevant labeled samples in the database and displays the query result on the client file management interface.
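A field-based query of this kind might look like the sketch below. It assumes that sample names consist of six underscore-separated fields and that the user's input is matched exactly against one or more field values; the field names, the separator and the functions `parse_sample_name` and `query_samples` are hypothetical and not taken from the patent.

```python
# Assumed field order of a renamed sample, e.g.
# "car_1920x1080_highway_000042_jpg_20191014093000.xml"
FIELD_NAMES = ("label_object", "size", "scene",
               "image_number", "image_format", "renamed_at")


def parse_sample_name(file_name):
    """Split a renamed sample's file name into its six fields."""
    stem = file_name.rsplit(".", 1)[0]
    values = stem.split("_")
    if len(values) != len(FIELD_NAMES):
        raise ValueError(f"{file_name!r} does not follow the six-field scheme")
    return dict(zip(FIELD_NAMES, values))


def query_samples(file_names, **criteria):
    """Return the names whose fields equal every given criterion."""
    unknown = set(criteria) - set(FIELD_NAMES)
    if unknown:
        raise KeyError(f"unknown field(s): {sorted(unknown)}")
    matches = []
    for name in file_names:
        try:
            fields = parse_sample_name(name)
        except ValueError:
            continue  # skip files that were never renamed
        if all(fields[key] == value for key, value in criteria.items()):
            matches.append(name)
    return matches


if __name__ == "__main__":
    stored = [
        "car_1920x1080_highway_000042_jpg_20191014093000.xml",
        "person_640x480_street_000007_png_20191014093100.xml",
    ]
    print(query_samples(stored, label_object="car"))
```

Note that this simple parser breaks if a field value itself contains an underscore; a real system would need a safer separator or a database index.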
Further, the analyzing process of the labeled sample specifically includes: the user can input an instruction for analyzing a certain type of labeling sample data set on the user interaction interface, and the sample analysis submodule analyzes the labeling sample data set and sends the distribution condition of the labeling information to the user interaction interface after receiving the instruction. The distribution of the labeling sample data set is mainly presented in the modes of a histogram and a pie chart.
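As an illustration of the analysis step, the sketch below counts how often each object class occurs across a set of VOC-style XML annotations; such counts are the kind of data a histogram or pie chart would be drawn from. The patent does not say which statistics are computed, so the per-class count and the function names `label_distribution` and `as_percentages` are assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def label_distribution(voc_xml_documents):
    """Count objects per class name over an iterable of VOC XML strings."""
    counts = Counter()
    for xml_text in voc_xml_documents:
        root = ET.fromstring(xml_text)
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name.strip()] += 1
    return counts


def as_percentages(counts):
    """Convert absolute counts into percentage shares (pie-chart input)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in counts.items()}


if __name__ == "__main__":
    docs = [
        "<annotation><object><name>car</name></object>"
        "<object><name>person</name></object></annotation>",
        "<annotation><object><name>car</name></object></annotation>",
    ]
    counts = label_distribution(docs)
    for label, n in counts.most_common():
        print(f"{label:10s} {'#' * n} ({n})")  # crude text histogram
```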
Furthermore, the user can select a proper labeled sample from a display window of the user interaction interface through the sample downloading submodule to download the labeled sample to the local computer. The user can also input a deleting instruction for designating the marked sample on the user interaction interface, so that the sample deleting submodule deletes the corresponding marked sample according to the sample deleting instruction.
Based on the content of the foregoing embodiment, as an optional embodiment, the data processing layer further includes: a sample quality inspection module, used for receiving a quality inspection instruction input through the user interaction interface, checking the mapping relation between each labeled sample and its sample image, and deleting or modifying erroneous labeled samples according to the check result.
The sample quality inspection module includes:
the mapping relation checking submodule is used for checking whether the quantity and the name of the annotated sample are consistent with those of the sample images, and if the quantity or the name of the annotated sample is not consistent with those of the sample images, deleting the corresponding annotated sample or the corresponding sample image according to the checking result;
the file attribute checking submodule is used for checking the file attribute of each labeled sample and modifying or deleting labeled samples with wrong file attributes according to the check result. Here, a file attribute is a unique property defined for a file, by which files are divided into different types for storage and transmission. Common file attributes are the system, hidden, read-only and archive attributes.
And the labeling information checking submodule is used for checking the labeling information of each labeled sample and modifying or deleting labeled samples with wrong labeling information according to the check result.
Specifically, because the labeled samples in the data storage layer may have the problems of inconsistent file attributes of the labeled samples or wrong labeled information, the embodiment provides the sample quality inspection module to clean the labeled samples which do not meet the requirements. In this embodiment, the mapping relation checking submodule checks whether the number and the name of the annotated sample are consistent with those of the sample images, and if the number or the name of the annotated sample is determined to be inconsistent with those of the sample images, the corresponding annotated sample or the corresponding sample image is deleted according to the checking result. And finally, ensuring that the sample image and the labeled sample are in one-to-one corresponding mapping relation.
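The one-to-one mapping check can be sketched as a comparison of file names with their extensions removed. The assumption here, which the patent does not state explicitly, is that an image and its labeled sample share the same base name; `check_mapping` is a hypothetical helper that only reports mismatches and leaves the deletion decision to the caller.

```python
from pathlib import PurePath


def check_mapping(image_files, annotation_files):
    """Compare images and labeled samples by count and by base name."""
    image_stems = {PurePath(f).stem for f in image_files}
    annotation_stems = {PurePath(f).stem for f in annotation_files}
    images_without_annotation = sorted(image_stems - annotation_stems)
    annotations_without_image = sorted(annotation_stems - image_stems)
    return {
        "image_count": len(image_files),
        "annotation_count": len(annotation_files),
        "images_without_annotation": images_without_annotation,
        "annotations_without_image": annotations_without_image,
        # Consistent only if the counts agree and nothing is left unmatched.
        "consistent": (
            len(image_files) == len(annotation_files)
            and not images_without_annotation
            and not annotations_without_image
        ),
    }


if __name__ == "__main__":
    report = check_mapping(["a.jpg", "b.jpg"], ["a.xml", "c.xml"])
    print(report)
```

Files listed under either "without" key are the candidates for deletion that the description mentions.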
When the number of targets to be labeled is large, labeling personnel may confuse attribute classification of the targets, so that labeled samples with different file attributes appear in the same folder. In this embodiment, the file attribute of each labeled sample is checked by the file attribute checking submodule, and the labeled sample with the wrong file attribute is modified or deleted according to the checking result. And ensuring that the labeled samples with the same file attribute are stored in the same folder.
A labeled sample may be wrongly labeled because of an operating error by the labeling personnel or a deviation in their understanding of the labeling standard. In this embodiment, the labeling information checking submodule checks the labeling information of each labeled sample, and labeled samples with wrong labeling information are modified or deleted according to the check result.
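The patent does not define what counts as wrong labeling information. Purely as an assumed example, the sketch below flags two common problems in a VOC-style annotation: a class name outside an allowed list, and a bounding box that is degenerate or extends beyond the image. The function name `find_annotation_errors` and both rules are illustrative choices, not the patent's method.

```python
import xml.etree.ElementTree as ET


def find_annotation_errors(xml_text, allowed_labels):
    """Return human-readable problems found in one VOC XML annotation."""
    root = ET.fromstring(xml_text)
    width = int(root.findtext("size/width", default="0"))
    height = int(root.findtext("size/height", default="0"))
    errors = []
    for index, obj in enumerate(root.iter("object")):
        name = (obj.findtext("name") or "").strip()
        if name not in allowed_labels:
            errors.append(f"object {index}: unknown label {name!r}")
        box = obj.find("bndbox")
        if box is None:
            errors.append(f"object {index}: missing bndbox")
            continue
        try:
            xmin, ymin, xmax, ymax = (
                int(float(box.findtext(tag)))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            )
        except (TypeError, ValueError):
            errors.append(f"object {index}: non-numeric box coordinate")
            continue
        inside = 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height
        if not inside:
            errors.append(f"object {index}: box degenerate or outside image")
    return errors
```

An empty result means the sample passed these assumed checks; a non-empty result marks it for modification or deletion.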
Based on the content of the foregoing embodiment, as an optional embodiment, the data processing layer further includes: a data amplification module, used for receiving a data amplification instruction input through the user interaction interface and performing data amplification on the labeled sample by pixel inversion, salt-and-pepper noise, Gaussian filtering or rotation transformation.
Specifically, when the deep learning algorithm requires a large number of training samples, data amplification (that is, data augmentation) needs to be performed on the existing labeled samples. The amplification modes provided by the embodiment of the invention include pixel inversion, salt-and-pepper noise, Gaussian filtering and rotation transformation, and a user can select one or more of these modes as required to amplify the labeled samples.
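The four amplification modes can be illustrated on a small grayscale image stored as a list of rows with values from 0 to 255. This is a dependency-free sketch under assumptions the patent does not fix: a 3x3 Gaussian kernel, a 90-degree clockwise rotation, and hypothetical function names. A production system would more likely use an image library.

```python
import random


def invert_pixels(image):
    """Pixel inversion: map every value p to 255 - p."""
    return [[255 - p for p in row] for row in image]


def add_salt_and_pepper(image, amount, rng):
    """Set roughly `amount` of the pixels to 0 (pepper) or 255 (salt)."""
    noisy = [row[:] for row in image]
    for row in noisy:
        for x in range(len(row)):
            r = rng.random()
            if r < amount / 2:
                row[x] = 0
            elif r < amount:
                row[x] = 255
    return noisy


def gaussian_filter_3x3(image):
    """Smooth with the 3x3 kernel [[1,2,1],[2,4,2],[1,2,1]] / 16."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    sy = min(max(y + ky - 1, 0), h - 1)  # clamp at borders
                    sx = min(max(x + kx - 1, 0), w - 1)
                    acc += kernel[ky][kx] * image[sy][sx]
            out[y][x] = acc // 16
    return out


def rotate_90_clockwise(image):
    """Rotation transformation by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


if __name__ == "__main__":
    img = [[0, 64], [128, 255]]
    print(invert_pixels(img))
    print(rotate_90_clockwise(img))
    print(add_salt_and_pepper(img, 0.5, random.Random(0)))
```

For labeled samples, a geometric transform such as rotation also requires transforming the bounding-box coordinates in the annotation, whereas inversion, noise and filtering leave the annotation unchanged.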
Based on the content of the foregoing embodiment, as an optional embodiment, the format conversion module is specifically configured to:
receiving a format conversion instruction, and extracting a conversion format type from the format conversion instruction.
Here, the conversion format type is the format type of the data set obtained by the conversion, such as a COCO-format data set or a mask data set. The embodiment of the present invention is not particularly limited thereto.
A corresponding conversion model is then selected based on the conversion format type, and the labeled sample is converted into a data set in the target format based on the conversion model. The conversion models are preset, and each conversion model implements one type of format conversion. In this embodiment, the labeled samples are in VOC format; a corresponding conversion model is selected based on the required conversion format type, and the VOC-format labeled samples are converted into a COCO-format data set and/or a mask data set.
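The two steps above (extract the conversion format type, then select a preset conversion model) can be sketched as a lookup table of converter functions. Treating a "conversion model" as a plain function is an interpretation, and the instruction layout `{"target_format": ...}`, the registry `CONVERTERS` and the function names are assumptions. Only a VOC-to-COCO bounding-box converter is shown; a mask converter would be registered in the same way.

```python
import xml.etree.ElementTree as ET


def voc_to_coco(voc_xml_documents):
    """Convert VOC XML strings into a COCO-style detection dictionary."""
    images, annotations, category_ids = [], [], {}
    for image_id, xml_text in enumerate(voc_xml_documents, start=1):
        root = ET.fromstring(xml_text)
        images.append({
            "id": image_id,
            "file_name": root.findtext("filename", default=""),
            "width": int(root.findtext("size/width", default="0")),
            "height": int(root.findtext("size/height", default="0")),
        })
        for obj in root.iter("object"):
            label = obj.findtext("name", default="").strip()
            category_id = category_ids.setdefault(label, len(category_ids) + 1)
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": category_id,
                "bbox": [xmin, ymin, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
    categories = [{"id": cid, "name": name} for name, cid in category_ids.items()]
    return {"images": images, "annotations": annotations, "categories": categories}


# Preset "conversion models": one converter per target format type.
CONVERTERS = {"coco": voc_to_coco}


def convert(instruction, voc_xml_documents):
    """Extract the target format from the instruction and dispatch."""
    target = instruction["target_format"].lower()
    try:
        converter = CONVERTERS[target]
    except KeyError:
        raise ValueError(f"no conversion model registered for {target!r}") from None
    return converter(voc_xml_documents)
```

Adding support for another target format then only requires registering one more function in `CONVERTERS`, without touching the dispatch logic.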
Based on the foregoing embodiment, fig. 2 is a schematic flow chart of a method for managing deep learning labeled samples according to an embodiment of the present invention, and referring to fig. 2, an embodiment of the present invention provides a method for managing deep learning labeled samples, including:
Step 201: receiving a sample renaming instruction, and renaming an original labeled sample according to prestored basic information of the original labeled sample; the basic information at least comprises an annotation object, an image width × height, an image scene, an image number, an image format and a renaming time;
Step 202: receiving a sample management instruction, and performing a management operation on the renamed labeled sample; the management operation at least comprises query, analysis, download and deletion;
Step 203: receiving a format conversion instruction input through the user interaction interface, and converting the labeled sample into a data set in a target format.
Specifically, the management method for the deep learning labeled sample provided in the embodiment of the present invention is specifically executed by the management system for the deep learning labeled sample, and since the functional modules of the management system for the deep learning labeled sample are described in detail in the embodiment, the management method for the deep learning labeled sample is not described in detail here.
According to the management method for the deep learning labeled sample, provided by the embodiment of the invention, the user inputs the operation instruction through the user interaction interface of the application layer, the data processing layer receives the operation instruction and renames the original labeled sample in the data storage layer, so that the user can quickly know the basic information of the labeled sample, code development is not needed again to analyze the type of the labeled sample, and the management efficiency of the labeled sample is improved. The invention converts the labeled sample into the data set with the target format, thereby facilitating the application of deep learning.
Based on the contents of the above embodiments, as an alternative embodiment, in step 203, receiving a format conversion instruction input through the user interaction interface and converting the labeled sample into a data set in a target format specifically includes:
receiving a format conversion instruction, and extracting a conversion format type from the format conversion instruction;
and selecting a corresponding conversion model based on the conversion format type, and converting the labeled sample into a data set in a target format based on the conversion model.
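The two sub-steps above (extract the conversion format type, then select the matching conversion model) can be sketched as a simple lookup table of converter functions. In this Python sketch the instruction layout (`{"target_format": ...}`), the sample layout and the two target formats (YOLO-style text lines and a COCO-style dictionary) are assumptions chosen for illustration; the text does not say which target formats are supported.

```python
from typing import Callable, Dict, List

# Assumed in-memory sample layout:
# {"file": str, "width": int, "height": int,
#  "boxes": [(label, xmin, ymin, xmax, ymax), ...]}


def to_yolo(samples: List[dict], labels: List[str]) -> Dict[str, List[str]]:
    """One 'class x_center y_center width height' line per box, normalized to [0, 1]."""
    out = {}
    for s in samples:
        lines = []
        for label, xmin, ymin, xmax, ymax in s["boxes"]:
            xc = (xmin + xmax) / 2 / s["width"]
            yc = (ymin + ymax) / 2 / s["height"]
            w = (xmax - xmin) / s["width"]
            h = (ymax - ymin) / s["height"]
            lines.append(f"{labels.index(label)} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
        out[s["file"]] = lines
    return out


def to_coco(samples: List[dict], labels: List[str]) -> dict:
    """COCO-style dictionary with [x, y, width, height] boxes in pixels."""
    images, annotations = [], []
    for image_id, s in enumerate(samples, start=1):
        images.append({"id": image_id, "file_name": s["file"],
                       "width": s["width"], "height": s["height"]})
        for label, xmin, ymin, xmax, ymax in s["boxes"]:
            annotations.append({
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": labels.index(label) + 1,
                "bbox": [xmin, ymin, xmax - xmin, ymax - ymin],
            })
    categories = [{"id": i, "name": n} for i, n in enumerate(labels, start=1)]
    return {"images": images, "annotations": annotations, "categories": categories}


# Conversion format type -> conversion model (here: a plain function).
CONVERTERS: Dict[str, Callable[[List[dict], List[str]], object]] = {
    "yolo": to_yolo,
    "coco": to_coco,
}


def handle_conversion(instruction: dict, samples: List[dict], labels: List[str]):
    """Extract the format type from the instruction and run the matching converter."""
    fmt = instruction["target_format"].lower()
    if fmt not in CONVERTERS:
        raise ValueError(f"unsupported conversion format type: {fmt}")
    return CONVERTERS[fmt](samples, labels)
```

Keeping the converters in a table means that supporting a further target format only requires registering one more function, without changing the instruction handling.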
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 3, the electronic device may include a processor 301, a communication interface 302, a memory 303 and a communication bus 304, wherein the processor 301, the communication interface 302 and the memory 303 communicate with one another through the communication bus 304. The processor 301 may call a computer program that is stored in the memory 303 and executable on the processor 301 to perform the management method for deep learning labeled samples provided by the above embodiments, for example: receiving a sample renaming instruction, and renaming an original labeled sample according to pre-stored basic information of the original labeled sample, the basic information comprising at least the annotation object, the image size (width × height), the image scene, the image number, the image format and the renaming time; receiving a sample management instruction, and performing a management operation on the renamed labeled sample, the management operations comprising at least query, analysis, download and deletion; and receiving a format conversion instruction input through the user interaction interface, and converting the labeled sample into a data set in a target format.
An embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, performs the management method for deep learning labeled samples provided in the foregoing embodiments, for example: receiving a sample renaming instruction, and renaming an original labeled sample according to pre-stored basic information of the original labeled sample, the basic information comprising at least the annotation object, the image size (width × height), the image scene, the image number, the image format and the renaming time; receiving a sample management instruction, and performing a management operation on the renamed labeled sample, the management operations comprising at least query, analysis, download and deletion; and receiving a format conversion instruction input through the user interaction interface, and converting the labeled sample into a data set in a target format.
The apparatus embodiments described above are merely illustrative. Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without inventive effort.
From the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, or alternatively by hardware. On this understanding, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as a ROM/RAM, a magnetic disk or an optical disk, and which includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A management system for deep learning labeled samples, characterized by comprising an application layer, a data processing layer and a data storage layer;
the application layer comprises a user interaction interface used for inputting an operation instruction to the data processing layer;
the data processing layer comprises: a sample renaming module, used for receiving a sample renaming instruction input through the user interaction interface and renaming an original labeled sample according to basic information of the original labeled sample in the data storage layer; the basic information comprises at least the annotation object, the image size (width × height), the image scene, the image number, the image format and the renaming time;
a sample management module, used for receiving a sample management instruction input through the user interaction interface and performing a management operation on a labeled sample in the data storage layer; the management operations comprise at least query, analysis, download and deletion;
a format conversion module, used for receiving a format conversion instruction input through the user interaction interface and converting the labeled sample into a data set in a target format;
and the data storage layer is used for storing the labeled samples.
2. The management system of claim 1, wherein the sample management module comprises:
a sample query submodule, used for receiving a sample query instruction and querying a labeled sample in the data storage layer;
a sample analysis submodule, used for receiving a sample analysis instruction and analyzing the labeled sample;
a sample download submodule, used for receiving a sample download instruction and downloading a specified labeled sample;
and a sample deletion submodule, used for receiving a sample deletion instruction and deleting the specified labeled sample.
3. The management system of claim 1, wherein the data processing layer further comprises: a sample quality inspection module, used for receiving a quality inspection instruction input through the user interaction interface, checking the mapping relation between each labeled sample and its sample image, and deleting or modifying erroneous labeled samples according to the check result.
4. The management system of claim 3, wherein the sample quality inspection module comprises:
a mapping relation checking submodule, used for checking whether the number and names of the labeled samples are consistent with those of the sample images and, if the numbers or names are inconsistent, deleting the corresponding labeled sample or the corresponding sample image according to the check result;
a file attribute checking submodule, used for checking the file attributes of each labeled sample and modifying or deleting any labeled sample with incorrect file attributes according to the check result;
and an annotation information checking submodule, used for checking the annotation information of each labeled sample and modifying or deleting any labeled sample with incorrect annotation information according to the check result.
5. The management system of claim 1, wherein the data processing layer further comprises: a data augmentation module, used for receiving a data augmentation instruction input through the user interaction interface and augmenting the labeled samples by means of pixel inversion, salt-and-pepper noise, Gaussian filtering or rotation transformation.
6. The management system according to claim 1, wherein the format conversion module is specifically configured to:
receiving a format conversion instruction, and extracting a conversion format type from the format conversion instruction;
and selecting a corresponding conversion model based on the conversion format type, and converting the labeled sample into a data set in a target format based on the conversion model.
7. A management method for deep learning labeled samples, characterized by comprising the following steps:
receiving a sample renaming instruction, and renaming an original labeled sample according to pre-stored basic information of the original labeled sample; the basic information comprises at least the annotation object, the image size (width × height), the image scene, the image number, the image format and the renaming time;
receiving a sample management instruction, and performing a management operation on the renamed labeled sample; the management operations comprise at least query, analysis, download and deletion;
and receiving a format conversion instruction input through the user interaction interface, and converting the labeled sample into a data set in a target format.
8. The management method according to claim 7, wherein the receiving a format conversion instruction input through the user interaction interface and converting the labeled sample into a data set in a target format specifically includes:
receiving a format conversion instruction, and extracting a conversion format type from the format conversion instruction;
and selecting a corresponding conversion model based on the conversion format type, and converting the labeled sample into a data set in a target format based on the conversion model.
9. An electronic device, comprising a processor, a communication interface, a memory and a bus, wherein the processor, the communication interface and the memory communicate with each other via the bus, and the processor can call logic instructions in the memory to execute the method according to claim 7 or 8.
10. A non-transitory computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to claim 7 or 8.
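For illustration only, the mapping-relation check described in claim 4 (comparing the number and names of labeled samples with those of the sample images) might be sketched in Python as follows. The directory layout, the file suffix sets and the rule "same file stem means same sample" are assumptions of this sketch and are not specified in the claims.

```python
from pathlib import Path
from typing import Dict, List

# Assumed suffixes; a real deployment would configure these.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}
LABEL_SUFFIXES = {".xml", ".json", ".txt"}


def _index_by_stem(folder: Path, suffixes: set) -> Dict[str, Path]:
    """Map file stem -> path for the files in folder with an accepted suffix."""
    return {
        p.stem: p
        for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    }


def find_orphans(label_dir: Path, image_dir: Path) -> Dict[str, List[Path]]:
    """Report labeled samples without an image and images without a labeled sample."""
    labels = _index_by_stem(label_dir, LABEL_SUFFIXES)
    images = _index_by_stem(image_dir, IMAGE_SUFFIXES)
    return {
        "labels_without_image": sorted(labels[s] for s in labels.keys() - images.keys()),
        "images_without_label": sorted(images[s] for s in images.keys() - labels.keys()),
    }


def delete_orphans(report: Dict[str, List[Path]], dry_run: bool = True) -> List[Path]:
    """Delete the unmatched files; with dry_run=True only list what would go."""
    targets = report["labels_without_image"] + report["images_without_label"]
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

When both lists in the report are empty, the counts and names match. The `dry_run` default is a design choice of this sketch so that deletion has to be requested explicitly.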
CN201910974814.8A 2019-10-14 2019-10-14 Management system and method for deep learning labeled samples Pending CN110851630A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910974814.8A CN110851630A (en) 2019-10-14 2019-10-14 Management system and method for deep learning labeled samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910974814.8A CN110851630A (en) 2019-10-14 2019-10-14 Management system and method for deep learning labeled samples

Publications (1)

Publication Number Publication Date
CN110851630A true CN110851630A (en) 2020-02-28

Family

ID=69597401

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910974814.8A Pending CN110851630A (en) 2019-10-14 2019-10-14 Management system and method for deep learning labeled samples

Country Status (1)

Country Link
CN (1) CN110851630A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815182A (en) * 2020-07-10 2020-10-23 积成电子股份有限公司 Power grid power failure maintenance planning method based on deep learning
CN113220920A (en) * 2021-06-01 2021-08-06 中国电子科技集团公司第五十四研究所 Satellite remote sensing image sample labeling system and method based on micro-service architecture
CN113554146A (en) * 2020-04-26 2021-10-26 华为技术有限公司 Method for verifying labeled data, method and device for model training

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156025A (en) * 2015-03-25 2016-11-23 阿里巴巴集团控股有限公司 The management method of a kind of data mark and device
CN106339479A (en) * 2016-08-30 2017-01-18 深圳市金立通信设备有限公司 Picture naming method and terminal
WO2018107777A1 (en) * 2016-12-15 2018-06-21 威创集团股份有限公司 Method and system for annotating video image
CN109165623A (en) * 2018-09-07 2019-01-08 北京麦飞科技有限公司 Rice scab detection method and system based on deep learning
CN109766916A (en) * 2018-12-17 2019-05-17 新绎健康科技有限公司 A kind of method and system determining tongue picture sample database based on deep learning model

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113554146A (en) * 2020-04-26 2021-10-26 华为技术有限公司 Method for verifying labeled data, method and device for model training
WO2021218226A1 (en) * 2020-04-26 2021-11-04 华为技术有限公司 Method for verifying labeled data, method and device for model training
CN111815182A (en) * 2020-07-10 2020-10-23 积成电子股份有限公司 Power grid power failure maintenance planning method based on deep learning
CN113220920A (en) * 2021-06-01 2021-08-06 中国电子科技集团公司第五十四研究所 Satellite remote sensing image sample labeling system and method based on micro-service architecture

Similar Documents

Publication Publication Date Title
CN110851630A (en) Management system and method for deep learning labeled samples
CN109800354B (en) Resume modification intention identification method and system based on block chain storage
CN109635120A (en) Construction method, device and the storage medium of knowledge mapping
CN110737689B (en) Data standard compliance detection method, device, system and storage medium
CN109840087B (en) Interface design system and method, computer readable storage medium
CN116881430B (en) Industrial chain identification method and device, electronic equipment and readable storage medium
CN112331348A (en) Analysis method and system for set marking, data, project management and non-programming modeling
CN111061733B (en) Data processing method, device, electronic equipment and computer readable storage medium
US10459942B1 (en) Sampling for preprocessing big data based on features of transformation results
CN113157978B (en) Data label establishing method and device
US10360208B2 (en) Method and system of process reconstruction
CN111858236B (en) Knowledge graph monitoring method and device, computer equipment and storage medium
CN116126790B (en) Railway engineering archive archiving method and device, electronic equipment and storage medium
CN108268488A (en) The recognition methods of webpage master map and device
CN116431828A (en) Construction method of power grid center data asset knowledge graph database constructed based on neural network technology
CN107491530B (en) Social relationship mining analysis method based on file automatic marking information
CN113407678B (en) Knowledge graph construction method, device and equipment
CN110532224A (en) A kind of file management system and method for deep learning mark sample
US9632990B2 (en) Automated approach for extracting intelligence, enriching and transforming content
CN115905371A (en) Data trend analysis method, device and equipment and computer readable storage medium
CN116303641A (en) Laboratory report management method supporting multi-data source visual configuration
CN108205564B (en) Knowledge system construction method and system
CN111143356B (en) Report retrieval method and device
CN113704650A (en) Information display method, device, system, equipment and storage medium
CN112131379A (en) Method, device, electronic equipment and storage medium for identifying problem category

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination