CN111125389A - Data classification cleaning system and cleaning method based on dynamic progressive sampling


Info

Publication number
CN111125389A
CN111125389A CN201911305676.0A CN201911305676A CN111125389A CN 111125389 A CN111125389 A CN 111125389A CN 201911305676 A CN201911305676 A CN 201911305676A CN 111125389 A CN111125389 A CN 111125389A
Authority
CN
China
Prior art keywords
data
label
classification cleaning
pseudo
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911305676.0A
Other languages
Chinese (zh)
Inventor
秦永强
张发恩
李素莹
纪双西
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ainnovation Hefei Technology Co ltd
Original Assignee
Ainnovation Hefei Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ainnovation Hefei Technology Co ltd
Priority to CN201911305676.0A
Publication of CN111125389A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data classification cleaning system and method. The system comprises: a label sample graph placing module, used for placing a label sample graph into each type of data subset in the sample data set; an iterative model training module, used for training a data classification cleaning model by taking a label data set L formed by the label sample graphs as training samples; a data pseudo label generating module, used for performing data classification cleaning on the data set to be cleaned based on the data classification cleaning model and pseudo-labeling each piece of unlabeled data obtained by the cleaning; and a data screening module, used for screening the resulting pseudo label data set to obtain a pseudo label candidate set S. The iterative model training module is further used for iteratively training the data classification cleaning model by taking the pseudo label candidate set S and the label data set L as training samples.

Description

Data classification cleaning system and cleaning method based on dynamic progressive sampling
Technical Field
The invention relates to the technical field of data cleaning, in particular to a data classification cleaning system and a cleaning method based on dynamic progressive sampling.
Background
At present, data cleaning of a picture data set relies mainly either on manual cleaning or on recognition and cleaning by a model trained on a large number of labeled picture samples. Manual cleaning is inefficient and often requires repeated checking to reach acceptable accuracy, so it cannot meet users' demand for automatic cleaning of picture data sets. Model-based cleaning, in turn, requires the training pictures to be labeled manually; the labeling cost is high, the labeling period is long, and the labeling quality is difficult to guarantee, which lowers the accuracy of the resulting data classification.
Disclosure of Invention
The invention aims to provide a data classification cleaning system and method based on dynamic progressive sampling to solve the technical problems.
In order to achieve the purpose, the invention adopts the following technical scheme:
The invention provides a data classification cleaning system based on dynamic progressive sampling, which includes:
the label sample graph placing module, used for allowing a user to place a labeled label sample graph into each type of data subset in the sample data set, each label sample graph representing one data type;
the iterative model training module, connected with the label sample graph placing module and used for initially training a data classification cleaning model by taking a label data set L formed by the placed label sample graphs as training samples;
the data pseudo label generating module is connected with the iterative model training module and used for inputting a data set to be cleaned into the data classification cleaning model, predicting the data type of unmarked data in the data set through the data classification cleaning model, and performing pseudo labeling on each unmarked data obtained through prediction to obtain a pseudo labeled data set;
the data screening module is connected with the data pseudo label generating module and used for screening the data of the pseudo label data set to obtain a pseudo label candidate set S;
the iterative model training module is further connected with the data screening module, and is further used for iteratively training the data classification cleaning model by taking an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample;
and the data pseudo label generation module further cleans the data of the data set based on the data classification cleaning model obtained by iterative training until the classification cleaning process of the data set is completed.
As a preferable aspect of the present invention, the data classification cleaning system further includes:
an index data marking module, connected to the data pseudo label generating module and configured to mark each remaining unlabeled data item in the data set as index label data after the data pseudo label generating module completes pseudo-labeling of the unlabeled data in the data set;
the index data marking module is also connected with the iterative model training module, and the iterative model training module is used for updating the data classification cleaning model through iterative training by taking the extended training data set D and each index label data as training samples;
and the data pseudo label generating module carries out data classification cleaning on the data set according to the data classification cleaning model updated iteratively until the data classification cleaning process of all data in the data set is completed.
The invention also provides a data classification cleaning method based on dynamic progressive sampling, which is realized by applying the data classification cleaning system and comprises the following steps:
step S1, the data classification cleaning system acquires the label sample maps and correspondingly places each acquired label sample map into each type of data subset of the sample data set;
step S2, the data classification cleaning system takes a label data set L formed by each label sample diagram as a training sample, and the data classification cleaning model is formed by initial training;
step S3, the data classification cleaning system inputs the data set to be cleaned into the data classification cleaning model, predicts the data type of each unmarked data in the data set through the data classification cleaning model, and performs pseudo marking on each unmarked data obtained through prediction to obtain a pseudo marked data set;
step S4, the data classification cleaning system performs data screening on the data in the pseudo label data set to obtain a pseudo label candidate set S;
step S5, the data classification cleaning system takes an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample, and iteratively trains the data classification cleaning model;
and step S6, the data classification cleaning system continuously performs data classification cleaning on the data set based on the data classification cleaning model obtained through iterative training until the data classification cleaning process is completed.
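Steps S1 to S6 can be sketched as a small self-training loop. The following is only a minimal illustration under stated assumptions, not the patented implementation: the text leaves the model, the pseudo-labeling and the screening method open, so a nearest-centroid classifier, a distance-margin confidence score and a fixed `threshold` are swapped in here, and the function names `train`, `predict` and `clean` are hypothetical.

```python
from math import dist  # Python 3.8+


def train(samples):
    """Fit a nearest-centroid model: one mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(model, x):
    """Return (label, confidence). Needs at least two classes.

    Confidence is an assumed margin score in [0.5, 1.0]: 1.0 when x sits on
    the nearest centroid, 0.5 when the two nearest centroids are equidistant.
    """
    ranked = sorted(((dist(x, c), y) for y, c in model.items()),
                    key=lambda t: t[0])
    (d1, label), (d2, _) = ranked[0], ranked[1]
    conf = 1.0 - d1 / (d1 + d2) if d1 + d2 > 0 else 0.5
    return label, conf


def clean(labeled, unlabeled, threshold=0.8, max_rounds=10):
    L = list(labeled)            # S1: one labeled sample per class
    model = train(L)             # S2: initial training on L
    accepted = {}                # position in `unlabeled` -> pseudo label
    for _ in range(max_rounds):
        # S3: pseudo-label every item that has not been accepted yet
        preds = {i: predict(model, x)
                 for i, x in enumerate(unlabeled) if i not in accepted}
        # S4: screening yields the candidate set S for this round
        S = {i: y for i, (y, c) in preds.items() if c >= threshold}
        if not S:
            break                # S6: stop when nothing new is admitted
        accepted.update(S)
        # S5: retrain on the extended set D = L plus accepted pseudo labels
        D = L + [(unlabeled[i], y) for i, y in accepted.items()]
        model = train(D)
    return model, accepted


labeled = [((0.0, 0.0), "a"), ((10.0, 10.0), "b")]
unlabeled = [(1.0, 1.0), (9.0, 9.0), (2.0, 1.0), (8.0, 9.0), (5.0, 5.0)]
model, accepted = clean(labeled, unlabeled)
print(accepted)
```

In this toy run the four points close to a labeled sample are admitted in the first round, while the equidistant point (5, 5) never reaches the threshold and stays unlabeled.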
The invention also provides a data classification cleaning method based on dynamic progressive sampling, which is realized by applying the data classification cleaning system and comprises the following steps:
step L1, the data classification cleaning system acquires the label sample graphs and correspondingly places each acquired label sample graph into each type of data subset of the sample data set;
step L2, the data classification cleaning system takes a label data set L formed by each label sample diagram as a training sample, and initially trains to form the data classification cleaning model;
step L3, the data classification cleaning system inputs the data set to be cleaned into the data classification cleaning model, predicts the data type of each unlabeled data in the data set through the data classification cleaning model, and performs pseudo labeling on each unlabeled data obtained through prediction to obtain a pseudo-labeled data set;
step L4, the data classification cleaning system performs data screening on the pseudo label data set to obtain a pseudo label candidate set S;
step L5, the data classification cleaning system iteratively trains the data classification cleaning model by taking an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample;
step L6, after completing the pseudo-labeling of the unlabeled data in the data set, the data classification cleaning system marks each unlabeled data item remaining in the data set as index label data;
step L7, the data classification cleaning system takes the extended training data set D and each index label data as training samples, and iteratively trains and updates the data classification cleaning model;
and step L8, the data classification cleaning system continuously performs data cleaning on the data set based on the data classification cleaning model obtained by iterative training until the classification cleaning process of all data is completed.
With the data classification cleaning system based on dynamic progressive sampling provided by the invention, only one picture needs to be manually labeled in each type of data subset of the data set to be classified. The system trains a model on the labeled sample pictures, performs data classification cleaning on the data set with the trained data classification cleaning model, and automatically pseudo-labels each piece of unlabeled data obtained by the cleaning. It then iteratively updates the data classification cleaning model by taking the automatically labeled data together with the labeled sample pictures as training samples, until the classification cleaning of the data in the data set is completed. The invention thereby greatly reduces the time cost of manual labeling, and improves the accuracy of data classification cleaning by repeatedly classifying, cleaning and labeling the data set.
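The text does not spell out what "dynamic progressive sampling" means in terms of how many pseudo-labeled items are admitted per round. One possible reading, assumed here purely for illustration, is that the size of the candidate set grows with each iteration, so that early rounds only admit the most confident items. The names `progressive_quota`, `select_candidates` and the `step_percent` parameter are hypothetical.

```python
def progressive_quota(iteration, total_unlabeled, step_percent=10):
    """Assumed schedule: admit step_percent more of the data each round.

    `iteration` is 1-based; the quota is capped at the data set size.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return min(total_unlabeled,
               iteration * step_percent * total_unlabeled // 100)


def select_candidates(scored, quota):
    """Pick the `quota` most confident (item_id, label, confidence) triples."""
    return sorted(scored, key=lambda t: t[2], reverse=True)[:quota]


scored = [("img1", "cat", 0.60), ("img2", "dog", 0.95), ("img3", "cat", 0.80)]
for it in (1, 2, 3):
    quota = progressive_quota(it, len(scored), step_percent=34)
    print(it, [item for item, _, _ in select_candidates(scored, quota)])
```

A fixed confidence threshold would be an equally valid choice; the growing quota simply makes the "progressive" aspect explicit.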
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
FIG. 1 is a schematic structural diagram of a data classification and cleaning system based on dynamic progressive sampling according to a first embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a data classification and cleaning system based on dynamic progressive sampling according to a second embodiment of the present invention;
FIG. 3 is a diagram of the steps of a method for implementing data classification cleaning by using the data classification cleaning system according to the first embodiment of the present invention;
FIG. 4 is a diagram of the steps of a method for implementing data classification cleaning by using the data classification cleaning system according to the second embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further explained by the specific implementation mode in combination with the attached drawings.
The drawings are for illustration only; they are schematic rather than depictions of an actual product, and are not to be construed as limiting the present patent. To better illustrate the embodiments of the present invention, some parts of the drawings may be omitted, enlarged or reduced, and do not represent the size of an actual product. It will be understood by those skilled in the art that certain well-known structures in the drawings, and descriptions thereof, may be omitted.
The same or similar reference numerals in the drawings of the embodiments of the present invention correspond to the same or similar components. In the description of the present invention, terms such as "upper", "lower", "left", "right", "inner" and "outer", where used to indicate an orientation or positional relationship, are based on the orientation or positional relationship shown in the drawings. They are used only for convenience and simplicity of description, and do not indicate or imply that the referred device or element must have a specific orientation or be constructed and operated in a specific orientation. The terms describing positional relationships in the drawings are therefore illustrative only and are not to be construed as limiting the present patent; their specific meanings can be understood by those skilled in the art according to the specific situation.
In the description of the present invention, unless otherwise explicitly specified or limited, the term "connected" and the like, where used to indicate a connection relationship between components, is to be understood broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intervening medium; and it may be a connection through one or more other components or an interactive relationship between components. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific case.
Embodiment one
The first embodiment of the present invention provides a data classification cleaning system based on dynamic progressive sampling which, referring to FIG. 1, includes:
the label sample graph placing module 1, used for allowing a user to place a labeled label sample graph into each type of data subset in the sample data set, each label sample graph representing one data type;
the iterative model training module 2, connected with the label sample graph placing module 1 and used for initially training a data classification cleaning model by taking a label data set L formed by the placed label sample graphs as training samples;
the data pseudo label generating module 3 is connected with the iterative model training module 2 and is used for inputting a data set to be cleaned into the data classification cleaning model, predicting the data type of unmarked data in the data set through the data classification cleaning model, and carrying out pseudo labeling on each unmarked data obtained through prediction to obtain a pseudo labeled data set;
the data screening module 4 is connected with the data pseudo label generating module 3 and is used for screening the data of the pseudo label data set to obtain a pseudo label candidate set S;
the iterative model training module 2 is also connected with the data screening module 4, and the iterative model training module 2 is also used for iteratively training the data classification cleaning model by taking an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample;
and the data pseudo label generating module 3 further cleans the data of the data set based on the data classification cleaning model obtained by iterative training until the classification cleaning process of the data set is completed.
In the above technical solution, the method for iteratively training the data classification cleaning model is an existing model training method, and since the model training method is not within the scope of the claimed invention, the specific training process of the data classification cleaning model is not described herein.
The labeling process for the pseudo label data set is briefly as follows:
the system assigns a pseudo label to each piece of data according to the data type predicted by the data classification cleaning model, and all pseudo-labeled data is treated as suspected label data. The pseudo-labeling of data by the model may be implemented by existing algorithms and, since it is not within the scope of the claimed invention, the pseudo-labeling process is not elaborated herein.
In addition, the pseudo label candidate set S may be obtained by calculating, for each piece of pseudo label data, the confidence that it is label data; the confidence may be calculated by an existing method. Other existing screening methods may of course also be used to screen the pseudo label data to obtain the pseudo label candidate set S.
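As one illustration of confidence-based screening, the sketch below converts per-class raw scores into softmax probabilities and keeps only items whose top probability reaches a threshold. The text does not name a specific confidence measure, so the use of softmax, the `threshold` value and the input layout (a list of `(item_id, {class: score})` pairs) are all assumptions made for this example.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def screen(pseudo_labeled, threshold=0.9):
    """Return the candidate set S as (item_id, label, confidence) triples.

    An item is kept only if the probability of its most likely class
    is at least `threshold`.
    """
    S = []
    for item_id, scores in pseudo_labeled:
        probs = softmax(list(scores.values()))
        label, conf = max(zip(scores.keys(), probs), key=lambda t: t[1])
        if conf >= threshold:
            S.append((item_id, label, conf))
    return S


pseudo_labeled = [
    ("img1", {"cat": 5.0, "dog": 0.0}),   # clear winner, kept
    ("img2", {"cat": 1.0, "dog": 0.8}),   # ambiguous, dropped
]
print(screen(pseudo_labeled))
```

Raising `threshold` trades coverage for label purity: fewer items enter S, but those that do are less likely to carry a wrong pseudo label.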
The first embodiment of the present invention further provides a data classification cleaning method based on dynamic progressive sampling, which is implemented by applying the data classification cleaning system provided in the first embodiment and, referring to FIG. 3, includes the following steps:
step S1, the data classification cleaning system acquires the label sample graphs and correspondingly places each acquired label sample graph into each type of data subset of the sample data set; the data type of each label sample graph is the same as the data type of the data subset into which it is placed;
step S2, the data classification cleaning system takes a label data set L formed by each label sample diagram as a training sample, and a data classification cleaning model is formed by initial training;
step S3, inputting the data set to be cleaned into a data classification cleaning model by the data classification cleaning system, predicting the data type of each unmarked data in the data set by the data classification cleaning model, and carrying out pseudo marking on each unmarked data obtained by prediction to obtain a pseudo marked data set;
step S4, the data classification cleaning system performs data screening on the data in the pseudo label data set to obtain a pseudo label candidate set S;
step S5, the data classification cleaning system takes an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample, and iteratively trains the data classification cleaning model;
and step S6, the data classification cleaning system continues to perform data classification cleaning on the data set based on the data classification cleaning model obtained by iterative training until the data classification cleaning process is completed.
Embodiment two
The difference between the second embodiment and the first embodiment is that, referring to fig. 2, the data classification and cleaning system based on dynamic progressive sampling provided in the second embodiment further includes:
the index data marking module 5 is connected with the data pseudo label generating module 3 and is used for marking the remaining unlabeled data in the data set as index label data after the data pseudo label generating module 3 finishes pseudo-labeling the unlabeled data in the data set;
the index data marking module 5 is also connected with the iterative model training module 2, and the iterative model training module 2 is used for updating the data classification cleaning model through iterative training by taking the extended training data set D and each index label data as training samples;
and the data pseudo label generating module 3 performs data classification cleaning on the data set according to the data classification cleaning model updated iteratively until the data classification cleaning process of all data in the data set is completed.
By additionally using the index label data in iterative training, the data classification cleaning system provided by the second embodiment can classify and clean the data more thoroughly, and the data classification cleaning effect is better.
The second embodiment further provides a data classification cleaning method based on dynamic progressive sampling, which is implemented by applying the data classification cleaning system provided in the second embodiment and, referring to FIG. 4, includes the following steps:
step L1, the data classification cleaning system acquires the label sample graphs and correspondingly places each acquired label sample graph into each type of data subset of the sample data set;
step L2, the data classification cleaning system takes a label data set L formed by the label sample graphs as a training sample, and initially trains the data classification cleaning model;
step L3, inputting the data set to be cleaned into a data classification cleaning model by the data classification cleaning system, predicting the data type of each unmarked data in the data set by the data classification cleaning model, and performing pseudo marking on each unmarked data obtained by prediction to obtain a pseudo marked data set;
step L4, the data classification cleaning system performs data screening on the pseudo label data set to obtain a pseudo label candidate set S;
step L5, the data classification cleaning system takes an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample, and iteratively trains the data classification cleaning model;
step L6, after completing the pseudo labeling of each unmarked data in the data set, the data classification cleaning system marks each unmarked data remaining in the data set as index label data;
step L7, the data classification cleaning system takes the extended training data set D and each index label data as training samples, and iteratively trains and updates the data classification cleaning model;
and step L8, the data classification cleaning system continuously performs data cleaning on the data set based on the data classification cleaning model obtained by iterative training until the classification cleaning process of all data is completed.
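Steps L6 and L7 can be illustrated by how the training set for the next round might be assembled. The text does not define "index label data"; the sketch below assumes that each leftover item is simply tagged with its own position in the data set as a unique label, so that it can still take part in training alongside L and S. The function name `build_training_set` and the `("class" | "pseudo" | "index", value)` tagging scheme are hypothetical.

```python
def build_training_set(L, S, unlabeled):
    """Combine L, S and index-labeled leftovers into one tagged sample list.

    L:         list of (features, class_label) from the label sample graphs
    S:         dict {position in `unlabeled`: pseudo class_label}
    unlabeled: list of feature vectors of the data set to be cleaned
    """
    samples = [(x, ("class", y)) for x, y in L]
    samples += [(unlabeled[i], ("pseudo", y)) for i, y in sorted(S.items())]
    # Step L6 (assumed reading): every item without a pseudo label is
    # marked with its own index, which acts as a per-item label.
    samples += [(x, ("index", i))
                for i, x in enumerate(unlabeled) if i not in S]
    return samples


L = [((0.0, 0.0), "a")]
S = {1: "a"}
unlabeled = [(5.0, 5.0), (1.0, 1.0), (6.0, 6.0)]
for features, tag in build_training_set(L, S, unlabeled):
    print(features, tag)
```

How a model consumes the index-tagged samples in step L7 (for example, with a separate loss term) is left open by the text and is not shown here.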
It should be understood that the above-described embodiments are merely preferred embodiments of the invention and the technical principles applied thereto. It will be understood by those skilled in the art that various modifications, equivalents, changes, and the like can be made to the present invention. However, such variations are within the scope of the invention as long as they do not depart from the spirit of the invention. In addition, certain terms used in the specification and claims of the present application are not limiting, but are used merely for convenience of description.

Claims (4)

1. A data classification cleaning system based on dynamic progressive sampling, comprising:
the label sample graph placing module, used for allowing a user to place a labeled label sample graph into each type of data subset in the sample data set, each label sample graph representing one data type;
the iterative model training module, connected with the label sample graph placing module and used for initially training a data classification cleaning model by taking a label data set L formed by the placed label sample graphs as training samples;
the data pseudo label generating module is connected with the iterative model training module and used for inputting a data set to be cleaned into the data classification cleaning model, predicting the data type of unmarked data in the data set through the data classification cleaning model, and performing pseudo labeling on each unmarked data obtained through prediction to obtain a pseudo labeled data set;
the data screening module is connected with the data pseudo label generating module and used for screening the data of the pseudo label data set to obtain a pseudo label candidate set S;
the iterative model training module is further connected with the data screening module, and is further used for iteratively training the data classification cleaning model by taking an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample;
and the data pseudo label generation module further cleans the data of the data set based on the data classification cleaning model obtained by iterative training until the classification cleaning process of the data set is completed.
2. The data classification cleaning system of claim 1, further comprising:
an index data marking module, connected to the data pseudo label generating module and configured to mark each remaining unlabeled data item in the data set as index label data after the data pseudo label generating module completes pseudo-labeling of the unlabeled data in the data set;
the index data marking module is also connected with the iterative model training module, and the iterative model training module is used for updating the data classification cleaning model through iterative training by taking the extended training data set D and each index label data as training samples;
and the data pseudo label generating module carries out data classification cleaning on the data set according to the data classification cleaning model updated iteratively until the data classification cleaning process of all data in the data set is completed.
3. A data classification cleaning method based on dynamic progressive sampling, implemented by applying the data classification cleaning system according to claim 1, characterized by comprising the following steps:
step S1, the data classification cleaning system acquires the label sample maps and correspondingly places each acquired label sample map into each type of data subset of the sample data set;
step S2, the data classification cleaning system takes a label data set L formed by each label sample diagram as a training sample, and the data classification cleaning model is formed by initial training;
step S3, the data classification cleaning system inputs the data set to be cleaned into the data classification cleaning model, predicts the data type of each unmarked data in the data set through the data classification cleaning model, and performs pseudo marking on each unmarked data obtained through prediction to obtain a pseudo marked data set;
step S4, the data classification cleaning system performs data screening on the data in the pseudo label data set to obtain a pseudo label candidate set S;
step S5, the data classification cleaning system takes an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample, and iteratively trains the data classification cleaning model;
and step S6, the data classification cleaning system continuously performs data classification cleaning on the data set based on the data classification cleaning model obtained through iterative training until the data classification cleaning process is completed.
4. A data classification cleaning method based on dynamic progressive sampling, implemented by applying the data classification cleaning system according to claim 2, characterized by comprising the following steps:
step L1, the data classification cleaning system acquires the label sample graphs and correspondingly places each acquired label sample graph into each type of data subset of the sample data set;
step L2, the data classification cleaning system takes a label data set L formed by each label sample diagram as a training sample, and initially trains to form the data classification cleaning model;
step L3, the data classification cleaning system inputs the data set to be cleaned into the data classification cleaning model, predicts the data type of each unlabeled data in the data set through the data classification cleaning model, and performs pseudo labeling on each unlabeled data obtained through prediction to obtain a pseudo-labeled data set;
step L4, the data classification cleaning system performs data screening on the pseudo label data set to obtain a pseudo label candidate set S;
step L5, the data classification cleaning system iteratively trains the data classification cleaning model by taking an extended training data set D formed by the pseudo label candidate set S and the label data set L as a training sample;
step L6, after completing the pseudo-labeling of the unlabeled data in the data set, the data classification cleaning system marks each unlabeled data item remaining in the data set as index label data;
step L7, the data classification cleaning system takes the extended training data set D and each index label data as training samples, and iteratively trains and updates the data classification cleaning model;
and step L8, the data classification cleaning system continuously performs data cleaning on the data set based on the data classification cleaning model obtained by iterative training until the classification cleaning process of all data is completed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911305676.0A CN111125389A (en) 2019-12-18 2019-12-18 Data classification cleaning system and cleaning method based on dynamic progressive sampling

Publications (1)

Publication Number Publication Date
CN111125389A true CN111125389A (en) 2020-05-08

Family

ID=70498379

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911305676.0A Pending CN111125389A (en) 2019-12-18 2019-12-18 Data classification cleaning system and cleaning method based on dynamic progressive sampling

Country Status (1)

Country Link
CN (1) CN111125389A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784391A (en) * 2019-01-04 2019-05-21 杭州比智科技有限公司 Sample mask method and device based on multi-model
US20190340507A1 (en) * 2017-01-17 2019-11-07 Catchoom Technologies, S.L. Classifying data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
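好文: "《半监督学习之self-training》" [Self-training in semi-supervised learning], 《HTTPS://WWW.MATOOLS.COM/BLOG/190181674》 *
竹席: "《【译文】伪标签学习导论-一种半监督学习方法》" [(Translation) An introduction to pseudo-label learning: a semi-supervised learning method], 《HTTPS://ZHUANLAN.ZHIHU.COM/P/29886875》 *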

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111899254A (en) * 2020-08-12 2020-11-06 华中科技大学 Method for automatically labeling industrial product appearance defect image based on semi-supervised learning
CN112860676A (en) * 2021-02-06 2021-05-28 高云 Data cleaning method applied to big data mining and business analysis and cloud server
CN112800151A (en) * 2021-04-06 2021-05-14 中译语通科技股份有限公司 Interactive unsupervised label classification system, method, medium and terminal
CN112800151B (en) * 2021-04-06 2021-08-13 中译语通科技股份有限公司 Interactive unsupervised label classification system, method, medium and terminal

Similar Documents

Publication Publication Date Title
CN108197664B (en) Model acquisition method and device, electronic equipment and computer readable storage medium
CN111125389A (en) Data classification cleaning system and cleaning method based on dynamic progressive sampling
CN109086756A (en) Text detection analysis method, device and equipment based on deep neural network
CN108416003A (en) Picture classification method and device, terminal, storage medium
CN110569856B (en) Sample labeling method and device, and damage category identification method and device
CN109448005B (en) Network model segmentation method and equipment for coronary artery
CN110931112B (en) Brain medical image analysis method based on multi-dimensional information fusion and deep learning
CN112613569B (en) Image recognition method, training method and device for image classification model
CN111008576B (en) Pedestrian detection and model training method, device and readable storage medium
CN111444850B (en) Picture detection method and related device
CN113052295B (en) Training method of neural network, object detection method, device and equipment
CN109857878B (en) Article labeling method and device, electronic equipment and storage medium
CN110727816A (en) Method and device for determining interest point category
CN115205727A (en) Experiment intelligent scoring method and system based on unsupervised learning
CN111126486A (en) Test statistical method, device, equipment and storage medium
CN112381840A (en) Method and system for marking vehicle appearance parts in loss assessment video
CN107291774A (en) Error sample recognition methods and device
CN113191362B (en) Transformer equipment oil leakage defect detection device and method
CN113159146A (en) Sample generation method, target detection model training method, target detection method and device
CN111126493A (en) Deep learning model training method and device, electronic equipment and storage medium
CN110046666B (en) Mass picture labeling method
CN116958512A (en) Target detection method, target detection device, computer readable medium and electronic equipment
CN116309343A (en) Defect detection method and device based on deep learning and storage medium
CN111199050A (en) System for automatically desensitizing medical records and application
CN114842492A (en) Key information extraction method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200508)