CN109816019A

CN109816019A - Automated auxiliary annotation method for image data

Info

Publication number: CN109816019A
Application number: CN201910075655.8A
Authority: CN
Inventors: 龚飞
Original assignee: Shanghai Xiaomeng Technology Co Ltd
Current assignee: Shanghai Xiaomeng Technology Co Ltd
Priority date: 2019-01-25
Filing date: 2019-01-25
Publication date: 2019-05-28

Abstract

The invention discloses an automated auxiliary annotation method for image data, comprising the following steps: A, collecting image data; B, dividing the data into a large-scale data set and a small-batch data set; C, applying manual annotation and 3D-rendering annotation to the small-batch data set, to obtain the processed small-batch data set; D, performing annotation-model processing on the processed small-batch data set and, after processing, combining it with the large-scale data set; E, performing batch automatic auxiliary annotation on the data set combined in step D; F, manually fine-tuning the initial automatic annotation data obtained in step E, to obtain a large-scale, fully annotated data set. The automated data annotation scheme described in this patent on the one hand saves time, reduces manual work, optimizes the annotation rules and improves annotation accuracy, and on the other hand helps guarantee the accuracy of the algorithm and improves the stability of the smart retail cabinet.

Description

Automated auxiliary annotation method for image data

Technical field

The present invention relates to the technical field of image processing, and specifically to an automated auxiliary annotation method for image data.

Background art

Image data annotation has always been a very difficult problem in the field of artificial intelligence. For a model to perform stably, it must be trained on a large amount of annotated data; in practical artificial intelligence projects, at least tens of thousands of annotated samples, or even orders of magnitude more, are usually needed to train a feasible model. Existing image data annotation is done entirely by hand, and research institutions and companies at home and abroad spend considerable manpower and time on this work, which increases project development cost.

Annotating massive amounts of data is a major difficulty in practical artificial intelligence projects. To improve the time efficiency of image data annotation, save annotation labor cost and optimize the annotation rules, this design proposes an automated auxiliary annotation scheme for image data.

Summary of the invention

The purpose of the present invention is to provide an automated auxiliary annotation method for image data, so as to solve the problems raised in the background art above.

To achieve the above purpose, the present invention provides the following technical solution:

An automated auxiliary annotation method for image data, comprising the following steps:

A, collecting image data;

B, dividing the data into a large-scale data set and a small-batch data set;

C, applying manual annotation and 3D-rendering annotation to the small-batch data set, to obtain the processed small-batch data set;

D, performing annotation-model processing on the processed small-batch data set and, after processing, combining it with the large-scale data set;

E, performing batch automatic auxiliary annotation on the data set combined in step D;

F, manually fine-tuning the initial automatic annotation data obtained in step E, to obtain a large-scale, fully annotated data set.
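Step B can be illustrated with a minimal Python sketch. The function name `split_dataset`, the default `small_fraction` of 0.1 and the fixed random seed are assumptions made for illustration only; the patent does not specify how the split is performed.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Item = TypeVar("Item")


def split_dataset(
    items: Sequence[Item],
    small_fraction: float = 0.1,  # hypothetical default, not from the patent
    seed: int = 0,                # fixed seed so the split is reproducible
) -> Tuple[List[Item], List[Item]]:
    """Step B: split collected images into a small batch and a large set.

    Returns (small_batch, large_set). The small batch is the part that is
    annotated manually; the large set is annotated automatically later.
    """
    if not 0.0 < small_fraction < 1.0:
        raise ValueError("small_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one item in the small batch.
    n_small = max(1, round(len(shuffled) * small_fraction))
    return shuffled[:n_small], shuffled[n_small:]
```

Shuffling before the split is a design choice of this sketch: it avoids the small batch being dominated by whatever was captured first.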

As a further technical solution of the present invention: step A is carried out with a camera.

As a further technical solution of the present invention: the amount of manually annotated data is greater than the amount of 3D-rendered annotated data.

As a further technical solution of the present invention: step D is specifically: the small batch of manually annotated data and the 3D-rendered annotated data together form a small training set, which is then used to train a model that automatically annotates the large batch of data.

As a further technical solution of the present invention: the annotation model is a low-complexity neural network model.

As a further technical solution of the present invention: the annotation model is the network model used in the actual project.

As a further technical solution of the present invention: the 3D-rendered annotated data requires 3D models of the products to be built in advance; once the annotation rules are defined, a program annotates the 3D-rendered data automatically.
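One plausible way a program could annotate rendered images automatically is sketched below: because the product geometry and the camera are known at render time, a 2D annotation box can be computed by projecting the model's vertices. The pinhole camera model, the function name `project_bbox` and the parameters `focal`, `cx` and `cy` are all assumptions of this sketch; the patent does not describe the rendering or projection procedure.

```python
from typing import Iterable, List, Tuple

Point3D = Tuple[float, float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def project_bbox(vertices: Iterable[Point3D], focal: float, cx: float, cy: float) -> Box:
    """Project camera-space vertices with a pinhole model and return their 2D box.

    Assumes vertices are already expressed in the camera frame, with z
    pointing away from the camera (hypothetical convention).
    """
    us: List[float] = []
    vs: List[float] = []
    for x, y, z in vertices:
        if z <= 0:
            raise ValueError("every vertex must lie in front of the camera (z > 0)")
        us.append(focal * x / z + cx)  # horizontal pixel coordinate
        vs.append(focal * y / z + cy)  # vertical pixel coordinate
    if not us:
        raise ValueError("at least one vertex is required")
    return min(us), min(vs), max(us), max(vs)
```

For example, `project_bbox([(-1, -1, 2), (1, 1, 2)], focal=100, cx=320, cy=240)` yields the box spanned by the two projected corners; the class label would come from the identity of the 3D model being rendered.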

Compared with the prior art, the beneficial effects of the present invention are as follows: the automated data annotation scheme described in this patent on the one hand saves time, reduces manual work, optimizes the annotation rules and improves annotation accuracy, and on the other hand helps guarantee the accuracy of the algorithm and improves the stability of the smart retail cabinet.

Brief description of the drawings

Fig. 1 is a schematic diagram of the automated auxiliary annotation method for image data.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.

Embodiment 1: referring to Fig. 1, an automated auxiliary annotation method for image data comprises the following steps:

A, collecting image data;

B, dividing the data into a large-scale data set and a small-batch data set;

C, applying manual annotation and 3D-rendering annotation to the small-batch data set, to obtain the processed small-batch data set. A small-batch manually annotated data set is built first, and some 3D-rendered data carrying annotation information is then added to it. The manually annotated data and the 3D-rendered annotated data are shown in Fig. 1. The manually annotated data is obtained by manually annotating the data collected by the camera. The 3D-rendered annotated data requires 3D models of the products to be built in advance; once the annotation rules are defined, a program annotates the 3D-rendered data automatically. Building the small-sample database from these two sources improves the efficiency of auxiliary annotation in two ways. First, the small batch of manually annotated data makes the annotation rules explicit. Second, the 3D-rendered annotated data makes the data more complete by covering variations such as multiple viewing angles and multiple poses. Using both to train the annotation model helps improve annotation accuracy.

In practice, how the small batch of manually annotated data and the 3D-rendered annotated data are selected can depend on the specific requirements. For image data, let T be the total amount of data that finally needs to be annotated, and let N be the total amount of data used to train the annotation model, made up of the small batch of manually annotated data and the 3D-rendered annotated data. Typically N is about 0.1 T, of which about 0.7 N is manually annotated and the remaining 0.3 N is 3D-rendered, so the manually annotated small batch accounts for only 7.0% (0.7 × 0.1 × 100%) of the total data. In practice there is no strict requirement on the proportions of the different kinds of data, but the manually annotated data used to train the annotation model should, as far as possible, outnumber the 3D-rendered annotated data. In addition, the amount of training data should not be too small; otherwise the trained automatic annotation model will have low accuracy, the accuracy of automatic annotation cannot be ensured, and the manual workload will increase instead.
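The arithmetic behind the 7.0% figure can be checked with a short Python sketch. It assumes the ratios implied by the expression 0.7 × 0.1 × 100%, namely a training set of 0.1 of the total and a manual share of 0.7 of the training set; the function and parameter names are hypothetical.

```python
from typing import Tuple


def training_set_sizes(
    total: int,
    train_ratio: float = 0.1,   # assumed share of the total used for training
    manual_ratio: float = 0.7,  # assumed share of the training set annotated by hand
) -> Tuple[int, int, int]:
    """Return (n_train, n_manual, n_rendered) for a given total data amount."""
    if total <= 0:
        raise ValueError("total must be positive")
    n_train = round(total * train_ratio)
    n_manual = round(n_train * manual_ratio)
    n_rendered = n_train - n_manual  # the rest of the training set is 3D-rendered
    return n_train, n_manual, n_rendered


n_train, n_manual, n_rendered = training_set_sizes(100_000)
# Manual share of the whole data set, as a percentage.
manual_percent = 100.0 * n_manual / 100_000
```

With a total of 100,000 images these ratios give 10,000 training images, 7,000 of them manually annotated and 3,000 rendered, so the manual share of the total is 7%.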

D, performing annotation-model processing on the processed small-batch data set and, after processing, combining it with the large-scale data set. To annotate large batches of data quickly, the present invention provides an automatic annotation scheme that assists in completing the annotation task. In practice, the small batch of manually annotated data and the 3D-rendered annotated data together form a small training set, which is then used to train a model that automatically annotates the large batch of data.

F, manually fine-tuning the automatic auxiliary annotation data obtained in step E, to obtain a large-scale, fully annotated data set. Handling annotation information on the order of hundreds of thousands or millions of items purely by hand consumes a great deal of time, and annotating in that way can seriously hold back the progress of an actual project.

The present invention therefore solves the above problem by first training a small model on the small-batch data and then using that small model to automatically assist in annotating the large batch of image data.
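The train-then-annotate loop can be sketched in Python as follows. To stay self-contained, the sketch swaps the neural network for a nearest-centroid classifier over precomputed feature vectors; this stand-in model, the feature representation and all names are assumptions for illustration and are not the model described in the patent.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]
Sample = Tuple[Vector, str]  # (feature vector, class label)


def train_centroids(samples: List[Sample]) -> Dict[str, List[float]]:
    """Train the stand-in model: one mean feature vector per class label."""
    grouped: Dict[str, List[Vector]] = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in grouped.items()
    }


def auto_annotate(model: Dict[str, List[float]], unlabeled: List[Vector]) -> List[Sample]:
    """Assign each unlabeled vector the label of its nearest class centroid."""
    return [
        (features, min(model, key=lambda label: dist(features, model[label])))
        for features in unlabeled
    ]


# The small training set combines manually annotated and 3D-rendered samples.
manual = [((0.0, 0.0), "cola"), ((0.0, 2.0), "cola")]
rendered = [((10.0, 10.0), "chips")]
model = train_centroids(manual + rendered)
initial_labels = auto_annotate(model, [(1.0, 1.0), (9.0, 9.0)])
```

The resulting `initial_labels` play the role of the initial automatic annotations that annotators then fine-tune by hand.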

In practice, the trained auxiliary annotation model usually annotates well, with an annotation accuracy of 95.00% or more, so annotators generally only need to adjust the size or position of annotation boxes, or correct a very small number of wrong labels. In some cases the manually defined annotation rules are not reasonable enough, which causes noticeable errors in the automatic annotations; the annotation rules can then be adjusted and thereby optimized.
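The patent does not say how annotation accuracy is measured. One common convention, sketched below in Python as an assumption, counts an automatic annotation as correct when its label matches the manually corrected label and the two boxes overlap with an intersection-over-union (IoU) of at least 0.5. The threshold and all names are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Annotation = Tuple[Box, str]             # (box, class label)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def annotation_accuracy(
    auto: List[Annotation],
    corrected: List[Annotation],
    iou_threshold: float = 0.5,  # assumed threshold, not from the patent
) -> float:
    """Share of automatic annotations that needed no substantial correction."""
    if len(auto) != len(corrected):
        raise ValueError("auto and corrected must be paired one to one")
    if not auto:
        return 0.0
    hits = sum(
        1
        for (auto_box, auto_label), (fixed_box, fixed_label) in zip(auto, corrected)
        if auto_label == fixed_label and iou(auto_box, fixed_box) >= iou_threshold
    )
    return hits / len(auto)
```

A persistently low value on a review sample would be one signal that the annotation rules themselves, rather than individual boxes, need adjusting.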

Automatic auxiliary annotation and manual fine-tuning work together to improve both annotation efficiency and annotation quality. In the end this saves time and labor cost, optimizes the annotation rules, improves annotation accuracy, and to some extent guarantees the accuracy of the algorithm.

Embodiment 2: on the basis of Embodiment 1, how the annotation model is designed when training it can depend on the actual situation. There are usually two options:

Option one: design and train a low-complexity neural network model. This option is more flexible: the model structure can be adjusted at any time according to the training results, which reduces training time. However, the data annotated by a model trained this way may differ somewhat from what the actual project requires.

Option two: train the network model used in the actual project. Because the amount of data is small, this option can reduce training time to some extent; the data annotated by a model trained this way better matches the needs of the actual project, and it can also help analyze the performance of the project's network model.

In the automated auxiliary annotation scheme of the present invention, option two is preferred for training the model and performing automatic annotation.

It is obvious to a person skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the present invention can be implemented in other specific forms without departing from its spirit or essential attributes. The embodiments should therefore be regarded in every respect as illustrative and not restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and scope of equivalents of the claims are intended to be included in the present invention. Any reference signs in the claims should not be construed as limiting the claims concerned.

In addition, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way merely for clarity; a person skilled in the art should consider the specification as a whole, and the technical solutions in the various embodiments may also be suitably combined to form other embodiments that a person skilled in the art can understand.

Claims

1. An automated auxiliary annotation method for image data, characterized by comprising the following steps:

A, collecting image data;

B, dividing the data into a large-scale data set and a small-batch data set;

manually fine-tuning the initial automatic annotation data obtained in step E, to obtain a large-scale, fully annotated data set.

2. The automated auxiliary annotation method for image data according to claim 1, characterized in that step A is carried out with a camera.

3. The automated auxiliary annotation method for image data according to claim 1, characterized in that the amount of manually annotated data is greater than the amount of 3D-rendered annotated data.

4. The automated auxiliary annotation method for image data according to claim 1, characterized in that step E is specifically: the small batch of manually annotated data and the 3D-rendered annotated data together form a small training set, which is then used to train a model that automatically annotates the large batch of data.

5. The automated auxiliary annotation method for image data according to claim 1, characterized in that the annotation model is a low-complexity neural network model.

6. The automated auxiliary annotation method for image data according to claim 1, characterized in that the annotation model is the network model used in the actual project.

7. The automated auxiliary annotation method for image data according to any one of claims 1 to 4, characterized in that the 3D-rendered annotated data requires 3D models of the products to be built in advance; once the annotation rules are defined, a program annotates the 3D-rendered data automatically.