CN111860565A - Workflow for training classifiers for quality inspection in measurement technology


Info

Publication number: CN111860565A
Application number: CN202010331750.2A
Authority: CN (China)
Other languages: Chinese (zh)
Prior art keywords: data set, training data, machine learning, anomaly, classification
Priority date: 2019-04-25
Filing date: 2020-04-24
Legal status: Pending
Inventors: C.沃耶克, A.施里坎塔, W.阿尔卡尔迪, A.弗雷塔格
Current Assignee: Carl Zeiss Industrielle Messtechnik GmbH
Original Assignee: Carl Zeiss Industrielle Messtechnik GmbH
Application filed by Carl Zeiss Industrielle Messtechnik GmbH
Publication of CN111860565A

Classifications

    • G06F18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148 Generating training patterns characterised by the process organisation or structure, e.g. boosting cascade
    • G06F18/24 Classification techniques
    • G06F18/2433 Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G06N20/20 Ensemble learning
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G06T7/0004 Industrial image inspection
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30108 Industrial image inspection
    • G06T2207/30164 Workpiece; Machine component

Abstract

Workflow for training a classifier for quality checking in measurement technology. A computer-implemented method for extending a training data set for machine learning is presented. The method comprises providing a data set of the object to be investigated, wherein the data set contains coordinate values and a measured value for each coordinate. The method further comprises identifying anomalies in a partition of the data set corresponding to a sub-region of the object under study, and classifying the anomalies into a plurality of predefined classification classes by a first machine learning model trained with a first training data set. The method further comprises determining a difference value of the anomaly compared to the trained first machine learning model, and adding data relating to the identified anomaly to the first training data set when the difference value is above a first predefined threshold.

Description

Workflow for training classifiers for quality inspection in measurement technology
Technical Field
The present invention relates to workflows for annotating training data sets, and in particular to a computer-implemented method for extending training data sets for machine learning, and to a corresponding workflow system.
Background
Industrial processes require constant quality control in order to continuously meet the demands of customers (industrial companies and private individuals). To maintain a consistently high quality, even slight defects cannot be tolerated on the market. In order to meet the continuing high quality requirements for manufactured products, automated quality assurance systems are often used. Some of these systems are based on purely optical methods, while others rely on mechanical/electromechanical measurement of product parameters. Another type of quality measurement relies on the evaluation of data from electron microscopes, fluorescence microscopes, light microscopes, optical coherence tomography systems, interferometers, spectrometers, or computer tomography systems.
Artificial intelligence techniques, particularly machine learning systems, can be used to assess the measured values. For these systems to function as desired, a large amount of training data is usually needed so that anomalies in the produced workpieces can be detected by the subsequently trained machine learning system. All training data must be annotated manually, which requires a significant investment of time by specialists. Annotation is understood here to mean the process of assigning each anomaly to be classified to a desired, predetermined anomaly class, such as porosity, core cracking or wall displacement.
From many perspectives, this manual process is problematic: collecting data is very time consuming, as is the manual annotation process. One problem with collecting data is that the frequency distribution of the defect classes is usually very uneven. To ensure that even relatively rare defects are represented by enough examples for training purposes, an extremely large amount of data often has to be recorded. However, this is technically very complex and expensive. In particular, it is generally not possible to store all anomalies.
With regard to data annotation, another problem is that in some cases very similar or even redundant data are recorded and stored. If these redundant data are annotated without a suitable selection, the classification performance can hardly be expected to improve further, since redundantly annotated data contribute little additional information. Especially in the context of (feasibility) studies, it is often possible to record large amounts of data; however, if the data are not sampled appropriately, a large number of similar data points may be collected unnecessarily and annotated in a time-consuming manner.
The set of defect characteristics is typically incomplete. Rather, new defect classes often have to be added to an existing catalog over time. Possible reasons are changed parameters of the monitored process, or the finding that the originally selected training data set was too small and therefore does not fully represent the overall quality assurance task. In this respect, it is therefore necessary to prevent previously unknown defects from being assigned to a known defect class during classification, and to prevent features of a known class from being incorrectly classified as non-defects or as defects of another class. When selecting new examples, irrelevant outliers should not be selected for annotation. Furthermore, the selection and annotation process should allow the operator/visual inspector to influence the annotation process, for example because a selected example results from an erroneous recording and cannot be assigned to any relevant defect type.
Third, if the selected defects are annotated not by an application specialist but by a trained layperson, mismatches or errors in the annotation should be avoided.
Furthermore, in practical applications, incremental adaptation of the classifier after each added annotation is generally not feasible, since training takes a relatively long time compared to the sample examination time and the application specialist therefore wishes to annotate several items of data on a workpiece (e.g. 30-300) at once. It is therefore expedient to train the classifier in an extended way with only a few new data points at a time. Thus, when selecting several examples for annotation, it must be ensured not only that each individual example provides as much information as possible, but also that the combination of these examples (i.e., the training data) is not redundant.
In addition, annotating three-dimensional data is difficult because viewing such data on a 2D display is challenging. Classifying the data is likewise challenging because of the sheer amount of data: if there are a large number of sub-volumes to be classified, a correspondingly large amount of computing power is required.
The basic goals of the concepts presented herein are therefore to optimize the annotation process elegantly, to relieve the burden on the responsible personnel, to enable continuous additions to the training data set, and to give an appropriate opportunity to respond only to those anomalies that cannot be classified correctly.
Disclosure of Invention
This object is achieved by a computer-implemented method for extending a training data set for machine learning and a corresponding workflow system for extending a training data set for machine learning according to the independent claims presented herein. Further developments are described by the corresponding dependent claims.
According to a first aspect of the present invention, a computer-implemented method for extending a training data set for machine learning is presented. The method comprises providing a data set of the object to be investigated, wherein the data set has coordinate values and a measured value for each coordinate.
The method further comprises the following steps: the method further comprises identifying anomalies in a partition in the data set corresponding to a sub-region of the object under study, and classifying the anomalies into a plurality of predefined classification classes by a first machine learning model trained with a first training data set.
The method further comprises the following steps: determining a difference value of the anomaly compared to the trained first machine learning model, and adding data relating to the identified anomaly to the first training data set when the difference value is above a first predefined threshold.
According to a second aspect of the invention, a corresponding workflow system is proposed, having:
A receiving unit, which is designed for receiving a data set of the object to be investigated, wherein the data set has coordinate values and a measured value for each coordinate,
a detection module designed for identifying anomalies in a region of the dataset corresponding to a sub-region of the object under investigation,
a classification system designed for classifying the anomaly into a plurality of predefined classification classes by means of a first machine learning model trained with a first training data set,
a difference determination module designed to determine a difference of the anomaly compared to the trained first machine learning model, an
-an adding unit designed for adding data relating to the identified anomaly to the first training data set when the difference is above a first predefined threshold.
Furthermore, embodiments can be implemented in a corresponding computer program product accessible from a computer-usable or computer-readable medium containing program code for use by or in connection with a computer or instruction execution system. In the context of this specification, a computer-usable or computer-readable medium may be any apparatus that can store, communicate, forward, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
The computer-implemented method of expanding a training data set for machine learning has several advantages and technical effects:
first, the proposed concept allows annotation costs and workload to be greatly reduced. To reduce the amount of data and thus the complexity, two-dimensional XY, YZ and XZ sections through the center of the anomaly can be used for annotation and analysis/classification instead of the full 3D volume. The relevant sections can be obtained from a probability map of the anomaly detection. As experiments have shown, the cross-sectional images contain enough information to distinguish between the defect classes, yet are easier to process than the full 3D volume information. Alternatively, other 2D projections of the 3D volume (e.g., local min/max projections) may also be used.
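To illustrate the reduction from a full 3D volume to the three orthogonal sections, the following minimal sketch (an assumption for illustration only, not part of the patent) extracts the XY, YZ and XZ slices through the center of a detected anomaly from a volume held as a NumPy array; all names and array shapes are hypothetical.

```python
import numpy as np

def orthogonal_sections(volume, center):
    """Return the XY, YZ and XZ sections through `center` of a 3D volume indexed (z, y, x)."""
    z, y, x = center
    xy = volume[z, :, :]   # section perpendicular to the z axis
    yz = volume[:, :, x]   # section perpendicular to the x axis
    xz = volume[:, y, :]   # section perpendicular to the y axis
    return xy, yz, xz

# Example: a synthetic 128^3 CT volume and an anomaly center taken from the detection step.
volume = np.random.rand(128, 128, 128)
xy, yz, xz = orthogonal_sections(volume, center=(64, 40, 90))
```

The three sections (or, alternatively, local min/max projections) can then be fed to the classifier in place of the full sub-volume.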
In addition, the training time can be reduced. Since the targeted selection of data means that less data carries the same overall information content, less time is needed overall for training the classifier. This directly results in faster turnaround, which in turn allows interactive switching between training, data selection and annotation, so that a solution can be reached more quickly.
Furthermore, the concepts presented herein result in fewer classification errors. Since defects that are difficult to detect are preferentially selected for annotation, the defect classification enables reliable and fast detection even of difficult anomalies. The classification accuracy is thus increased in a sustainable way, which improves the result of the quality check and hence of the associated product.
Furthermore, manual intervention may be performed during the automated or partially automated annotation process in order to thereby prevent and accordingly adjust incorrect annotation parameters.
Overall, the annotation process is thus elegantly optimized, can relieve the burden of responsible personnel, can be continuously added to the training data set, and can provide an appropriate opportunity to respond only to anomalies that cannot be correctly classified.
Further embodiments of the inventive concept of the method are presented below, which can be used in the same way for the respective workflow system:
according to one embodiment of the method, the anomaly may be one of a plurality of anomalies, and the adding may comprise: adding data of the selected identified anomaly to the training data set when the difference value is above a second predefined threshold. In this case, the first threshold value can draw the operator's attention to the fact that an anomaly cannot be unambiguously assigned to a class and that manual intervention in the annotation process appears necessary.
However, if the determined difference value is above the second, e.g. higher, threshold value, a clearly different anomaly can be assumed to be present, and this anomaly can then also be used automatically, with a corresponding annotation, to extend the training data set.
It should further be noted that the data about an anomaly need not consist of only one scan data set, but may also combine a large amount of data from several scans. This can greatly improve the detection possibilities.
According to another embodiment of the method, the difference value may be determined by a degree of novelty method for the selected anomaly. The novelty degree method may be selected from the group consisting of: novelty detection, estimated error reduction, estimated entropy minimization, expected model output change (EMOC), MC-dropout, OpenMax (the latter two for neural networks), SVM margin, neural networks, ratio of the highest classification probability to the second highest classification probability, entropy of the classification probabilities, GP variance, variance of the individual classification probabilities of an aggregated classifier (e.g., bagging), and variance of the classification probabilities in the case of perturbation of the input signal (i.e., of the data set of the object to be investigated).
This provides a broad library of methods for determining the difference value, from which the person responsible for the workflow can select freely. Those skilled in the art can supplement this list with further methods for determining the difference.
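As a purely illustrative sketch (an assumption, not prescribed by the patent), three of the listed criteria can be computed from per-class probabilities as follows; the example values are hypothetical.

```python
import numpy as np

def one_vs_two_ratio(probs):
    """Ratio of highest to second-highest classification probability (a low ratio indicates uncertainty)."""
    top = np.sort(probs)[::-1]
    return float(top[0] / max(top[1], 1e-12))

def probability_entropy(probs):
    """Entropy of the classification probabilities (high entropy indicates uncertainty)."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-np.sum(p * np.log(p)))

def ensemble_variance(member_probs):
    """Variance of the individual classification probabilities of an aggregated (bagged) classifier."""
    return float(np.mean(np.var(member_probs, axis=0)))

probs = np.array([0.45, 0.40, 0.15])            # output of a single classifier
members = np.array([[0.50, 0.30, 0.20],         # outputs of three bagged ensemble members
                    [0.20, 0.60, 0.20],
                    [0.40, 0.40, 0.20]])
print(one_vs_two_ratio(probs), probability_entropy(probs), ensemble_variance(members))
```

Any of these values, or a combination of them, can serve as the difference value compared against the predefined thresholds.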
According to a further possible embodiment of the method, a combination of novelty degree methods may be determined using a method selected from the group consisting of: multi-armed bandit formulation, success-driven selection among multiple criteria, linear combination based on reward functions, and reinforcement learning. These variants make it possible to take further, more sophisticated novelty degree methods into account and thus to match the combination to the anomaly identification and classification used.
According to a further embodiment, when determining the difference value, the method may further comprise determining the difference value or the combined difference value by a second machine learning model trained with a second training data set. In particular, this may be one of the methods of random forest regression, linear regression, Gaussian processes, neural networks, etc. This additional check can detect more accurately whether an additional anomaly class that needs to be considered is actually involved.
An advantageous embodiment of the method may comprise creating the first training data set and the second training data set by randomly selecting (e.g. by bagging) data sets from the overall training data set. This can also reduce a bias towards considering or ignoring certain anomalies and thus make the method more robust against incorrect interpretations.
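A minimal sketch of such a random selection, under the assumption that the overall training data are available as feature vectors with class labels (all names and sizes are illustrative), could look as follows; see also the retraining sketch after the next paragraph.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def bagged_subset(X, y, fraction=0.8):
    """Draw a bootstrap subset (sampling with replacement) from the overall training data."""
    n = int(fraction * len(X))
    idx = rng.choice(len(X), size=n, replace=True)
    return X[idx], y[idx]

X_all = rng.random((1000, 32))               # feature vectors of annotated anomalies
y_all = rng.integers(0, 4, size=1000)        # anomaly class labels

X_train1, y_train1 = bagged_subset(X_all, y_all)   # first training data set (classification model)
X_train2, y_train2 = bagged_subset(X_all, y_all)   # second training data set (difference model)
```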
Further advantageous embodiments of the method may further comprise retraining the first machine learning model with the extended training data set in order to generate a third machine learning model. In this case, the parameters of the first machine learning model can be used as starting values for the training. Alternatively, the machine learning model can also be restarted completely in order to create an unbiased machine learning model.
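The following sketch shows one possible way to realize such a warm start with scikit-learn (an assumption; the patent does not prescribe any library): with `warm_start=True`, a second call to `fit` continues from the parameters of the already trained model instead of restarting from scratch.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X_old, y_old = rng.random((200, 16)), rng.integers(0, 3, 200)   # first training data set
X_new, y_new = rng.random((30, 16)),  rng.integers(0, 3, 30)    # newly annotated anomalies

# First machine learning model, trained with the first training data set.
model = MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, warm_start=True)
model.fit(X_old, y_old)

# Third model: continue training from the first model's weights on the extended data set.
X_ext, y_ext = np.vstack([X_old, X_new]), np.concatenate([y_old, y_new])
model.fit(X_ext, y_ext)
```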
According to a particular embodiment of the method, the classification may be performed by a classifier selected from the group consisting of: neural networks (including deep neural networks), random forests, logistic regression, support vector machines (SVMs), and Gaussian regression. In principle, all known classifiers can be used, from which a person skilled in the art selects a suitable classifier or classifier system depending on the anomaly and the object to be investigated.
According to a further useful embodiment of the method, the data set may be derived from an image recording method. In this case, the image recording method may be based on the following: electron microscopy (other charged particles may also be used in particular), fluorescence microscopy, light microscopy, optical coherence tomography systems, interferometers, spectrometers, surgical microscopes and computed tomography systems. The person skilled in the art can perform this method using comparable image recording methods.
According to one possible embodiment, the method may further comprise: receiving a selection signal for an anomaly, for example from a visual inspector, wherein the selection signal amplifies the difference value such that the difference value is above the first predefined threshold or above the second predefined novelty threshold. With this technique of overriding the determined difference value, an "uncertain candidate" can be turned into a "confirmed candidate", and the annotation can thus be supported by experts.
Drawings
It is noted that exemplary embodiments of the present invention may be described with reference to different implementation classes. In particular, some exemplary embodiments are described with reference to methods, while other exemplary embodiments may be described in the context of corresponding devices. In any case, if not otherwise indicated, a person skilled in the art will be able to recognize and combine, from the above and the following description, possible combinations of features of the method and of features of the corresponding system, even if these features belong to different claim categories.
The aspects that have been described above and further aspects of the invention will become apparent from the exemplary embodiments that have been described and from further specific modifications described with reference to the accompanying drawings.
Preferred exemplary embodiments of the present invention are described by way of example and with reference to the following drawings:
FIG. 1 illustrates a block diagram of one exemplary embodiment of a computer-implemented method of the present invention for extending a training data set for machine learning.
Fig. 2 shows a block diagram of a conventional training sequence for a classifier.
Fig. 3 shows a block diagram of a flow chart of a more detailed embodiment of the proposed method.
FIG. 4 illustrates a block diagram of an exemplary embodiment of a workflow system.
FIG. 5 illustrates a block diagram of a computer system that additionally has a workflow system.
Detailed Description
In the context of this description, conventions, terms, and/or expressions should be understood as follows:
the term "machine learning" is a fundamental term or function in artificial intelligence, in which a computer system is given the ability to "learn", for example, using static methods. By way of example, in this case, certain behavior patterns within a particular task scope are optimized. The method used gives the machine learning system the ability to analyze data without explicit programming for this purpose. As an example of a machine learning system, a CNN (convolutional neural network) is, for example, a typical network of nodes that act as artificial neurons and artificial links between them, wherein the artificial links can be assigned parameters, such as weight parameters of the links. In training the neural network, the weight parameters of the links are automatically adjusted based on the input signals to produce the desired results. In the case of supervised learning, metadata (annotation) is added to an image (generally input data) provided as an input value (training data) to display a desired output value. Such annotations are not necessary in the case of unsupervised learning. It is generally considered that the mapping of input data to output data is learned.
In this respect, reference should also be made to a Recurrent Neural Network (RNN), which also constitutes a deep neural network, in which weight adjustments are performed recursively, so that structured predictions about variable-size input data can be generated. Such RNNs are typically used for sequential input of data. In this case, in the same manner as the CNN, a back propagation function is used in addition to the forward weight adjustment. RNNs can also be used for image analysis.
The term "training data set" in connection with this document essentially describes image data that can be obtained using the method from one or more abnormalities in which an annotation already exists in the object to be investigated, for example a workpiece.
The term "extended training data set" describes the process of extending an existing training data set with new image data having a corresponding classification, which can then be used for further or extended training of the classifier.
In this case, the term "object to be investigated" generally describes a workpiece to be subjected to a quality inspection.
The term "abnormality" describes in principle an undesired abnormality in the volume or surface of the object to be investigated. This may be, for example, porosity, inclusions, core cracking, wall displacement, or other anomalies in or on the workpiece.
The term "classification" describes the process of assigning detected anomalies to anomaly classes by a classifier operating on the principle of machine learning.
The term "classifier" (also referred to as classifier system in the context of machine learning) describes a machine learning system which is given the ability to assign input data, here in particular image data of anomalies of an object to be investigated, to a specific class, in particular anomalies, by training with training data.
It should also be noted in this case that a classifier typically classifies into a predefined number of classes. Typically, this is achieved by the classification value determined for the input data for each class and a WTA (winner take all) filter that selects the class with the highest classification value as the classified class. In a classifier, the difference from the 100% classification value is typically used as a quality parameter of the classification or as a probability that the classification is correct.
In the context of this document, the term "difference value" describes a value determined by the method presented herein or a corresponding workflow system, which specifies the magnitude of the difference between the detected (i.e., identified) anomaly and the 100% classification. In this case, the highest classification value (see above) may be used as an indicator; however, other classification values of other classes may also be used in addition. In the simplest case, the classification probability parameters of the classifier will thus be able to be applied simply. This difference is applied in the decision as to whether or not to assign the identified anomalies in a reasonable way to the already known specific classes and, therefore, the corresponding pixel data is automatically annotated accordingly for the purpose of adding to the training data set. These annotation values for the pixels are then used to add the parameter "annotation value" to the data set consisting of coordinate values and measured values.
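A minimal sketch of this winner-take-all step and of a simple difference value derived from the highest classification value could look as follows (all class names and threshold values are illustrative assumptions, not taken from the patent).

```python
import numpy as np

CLASSES = ["porosity", "inclusion", "core_crack", "wall_displacement"]

def classify_wta(class_values):
    """Winner-take-all: select the class with the highest classification value."""
    winner = int(np.argmax(class_values))
    difference = 1.0 - float(class_values[winner])   # distance from a 100% classification
    return CLASSES[winner], difference

label, difference = classify_wta(np.array([0.15, 0.70, 0.10, 0.05]))
if difference > 0.5:   # first predefined threshold, set per application
    print("candidate for annotation and training-set extension:", label, difference)
```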
If no unambiguous classification is possible, the operator can authorize a new anomaly class or initiate a further or new round of training of the classifier to be trained.
The term "training classifier" refers to adapting a machine learning system, for example, by evaluating several example image parameters, e.g., in a neural network, after a training phase, in part repeatedly, to assign even unknown images to one or more classes on which the learning system has been trained. The example images typically have annotations (i.e., are provided with metadata) to produce desired results based on the input images, such as information about maintenance measures to be taken.
As an example of a classifier, the term "convolutional neural network" describes a class of artificial neural networks based on feed-forward techniques. They are commonly used for image analysis using images as input data. The main component of the convolutional neural network is in this case the convolutional layer (hence the name) that enables efficient evaluation through parameter sharing.
The following fact should also be mentioned: a deep neural network consists of several layers with different functions, for example an input layer, an output layer and one or more intermediate layers, e.g. for convolution operations, the application of non-linear functions, dimensionality reduction, normalization functions, etc. These functions can be "executed in software", or specific hardware components can take over the calculation of the corresponding function values. Combinations of hardware elements and software elements are also known.
The term "novelty detection" describes a mechanism by which an "intelligent" system (e.g., a smart creature) can classify arriving sensor patterns as previously unknown patterns. The principles may also be applied to artificial neural networks or other classifiers. An arriving sensor pattern (e.g., a digital image) may be classified as an image with new content (novelty) if the arriving sensor pattern does not generate an output signal with a probability of detection above a predefined (or dynamically adjusted) threshold, or generates several output signals with approximately the same probability of detection.
On the other hand, if a neural network trained with known example images (and configured, for example, as an auto-encoder) is fed with a new example image and that image passes through the auto-encoder, the auto-encoder should be able to reconstruct the new example image again at the output. This is possible because the new example image is compressed to a large extent within the auto-encoder and is subsequently re-expanded/reconstructed by the neural network. If the new input image matches the expanded/reconstructed image (to a large extent), the new example image corresponds to a known pattern or has a high similarity to the content of the training image database. If the comparison of the new example image with the expanded/reconstructed image shows a significant difference, i.e. if the reconstruction error is large, the example image is an image with a previously unknown distribution (image content). That is, if the auto-encoder is unable to map the data distribution, this is an anomaly compared to the known database that is assumed to be normal.
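The following sketch illustrates this reconstruction-error criterion; as an assumption for illustration, a small scikit-learn MLPRegressor fitted on its own inputs stands in for a real auto-encoder, and all data are synthetic.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X_known = rng.normal(0.0, 1.0, size=(500, 64))        # flattened patches of known patterns

# 64 -> 16 -> 64: the bottleneck forces a compressed representation.
autoencoder = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500)
autoencoder.fit(X_known, X_known)

def reconstruction_error(x):
    return float(np.mean((autoencoder.predict(x.reshape(1, -1)) - x) ** 2))

# Threshold from the errors on known data; a new pattern above it counts as novel.
threshold = np.quantile([reconstruction_error(x) for x in X_known], 0.99)
x_new = rng.normal(3.0, 1.0, size=64)                  # pattern with a different distribution
print("novel" if reconstruction_error(x_new) > threshold else "known")
```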
A detailed description of the drawings is given below. In this case, it should be understood that all the details and information in the figures are shown schematically. First, presented is a block diagram of one exemplary embodiment of a computer-implemented method for extending a training data set for machine learning according to the present invention. Further exemplary embodiments or exemplary embodiments for a corresponding system are described below:
FIG. 1 illustrates a block diagram of one exemplary embodiment of a computer-implemented method 100 of the present invention for extending a training data set for machine learning. The method 100 comprises providing 102 a data set of the object to be studied. In this case, the data set has coordinate values and a measured value for each coordinate. This can be a 2D (two-dimensional) or 3D data set with a gray value for each pixel (in the 3D case also commonly referred to as a voxel) as the measured value, or with other color values. Other measured values may also exist for each pixel or group of pixels. The data of the data set are recorded using the methods described above and may be pre-processed (e.g., normalized).
The method 100 further comprises identifying 104 anomalies in a region of the dataset corresponding to a sub-region of the object under study (also referred to in literature as "region of interest"). In one variant, this is therefore a partition; in the case of 3D image data, this is a sub-volume.
The method 100 further includes classifying 106 the identified anomalies into a plurality of predefined classification classes by a first machine learning model trained with a first training data set. This makes it possible to classify anomalies in the object to be investigated. The first machine learning model is typically formed in a classifier used for the classification; this classifier has been trained with the first training data set prior to use (i.e., prior to the anomaly classification).
The method 100 also includes determining 108 a difference value of the anomaly compared to the trained first machine learning model. The difference value is not necessarily the classification probability value of the classifier, but can be derived from it, i.e. it can be a function thereof.
The method 100 further comprises: when the difference value is above a first predefined threshold, adding 110 data relating to the identified anomaly to the first training data set. In this case, the threshold for the difference value can be set individually according to predefined quality criteria, the production process and the expected anomalies. The annotation is typically performed on a pixel-by-pixel basis.
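Put together, steps 102 to 110 can be sketched as the following schematic loop (an illustrative assumption: `detector`, `classifier` and `novelty_score` are placeholders for the components described above, not interfaces defined by the patent).

```python
def extend_training_data(dataset, detector, classifier, novelty_score,
                         training_data, threshold):
    for region in detector(dataset):                 # 104: anomalies in sub-regions of the data set
        label, probs = classifier(region)            # 106: assignment to predefined classes
        difference = novelty_score(region, probs)    # 108: difference vs. the trained model
        if difference > threshold:                   # 110: extend the first training data set
            training_data.append({"region": region,
                                  "proposed_label": label,
                                  "difference": difference})
    return training_data
```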
Fig. 2 shows a block diagram of a conventional training sequence 200 for a classifier. In this case, potential errors, i.e. anomalies, are first determined, 202. These anomalies are annotated, 204, according to the defect class. In particular, those pixels which belong to the respective anomaly are provided manually, and thus in a time-consuming manner, with a specific annotation mark for the respective anomaly class. This process is performed interactively by the user. A training data set is thereby created. It is well known that a large number of example anomalies are required for this purpose.
In the next step, the classifier is trained with the complete training data set, 206. The trained classifier is then used in an inspection process in a quality assurance system, 208. This process is known from the prior art to be time-consuming and requires special expertise in assessing anomalies and thus creating training data with corresponding annotations.
The method presented herein may use this procedure to create a basic training data set in order to obtain basic training data, in particular a first training data set.
Fig. 3 shows a block diagram of a flow chart 300 of a more detailed embodiment of the proposed method. According to the workflow of the proposed method, potential errors, i.e. anomalies, are also first determined in this case, 302. To form a first training data set for training the first machine learning model, the corresponding data are typically first annotated manually, 304, according to defect classes and subsequently used for training, 306, with a growing amount of training data. The classifier trained in this way can also be used during the normal inspection process, 308.
However, if, during an ongoing check, the difference value of a classification is found to be too large compared to a first predefined threshold (i.e., simply greater than the threshold) on the basis of the probability of classification correctness or the corresponding quality parameter of the classification, the training data set can be extended; a new anomaly class may then also be taken into account. At the same time, pixels belonging to an identified anomaly that cannot be classified unambiguously (i.e., whose difference value is above the predefined threshold) can be automatically annotated with a new annotation tag belonging to the new anomaly class.
Retraining the classifier with the extended training data set then also includes learning the additional classification class, i.e., the machine learning model of the classifier is also trained for the additional classification class by means of the extended training data set.
As described above, the degree of certainty (i.e. the threshold value) is determined, 310, according to predefined parameters of the input data used (typically image data), the material of the object to be studied and/or the expected anomalies or defects. The additional annotation, 312, of further data selected by means of this degree, and the retraining, 306, with the now extended data set, have likewise been explained above.
This degree can then be used in different ways in the inspection process: (1) to make the data set more targeted, only data relating to sub-volumes with a higher estimated value for the newly obtained information are stored. This saves storage space, and a large number of relevant sub-volumes can be stored in a memory of predefined size. Only those sub-volumes that provide particularly useful information are then considered for annotation, so that the classifier can be expected to improve in subsequent training.
A high estimated information content, i.e. a high difference value (e.g., due to high uncertainty about the classification), indicates at the same time that the associated sub-volume may contain a previously unknown defect. This should therefore also be taken into account when extending the training data set. If such a defect pattern occurs frequently, the set of defect classes in the defect catalog can be extended by the new class. That is, it is not necessary to introduce a new class the first time an anomaly with a difference value greater than the threshold occurs. In this case, the new model can be initialized using the model parameters of the previously trained model. This can significantly reduce the training time and the amount of additional training data required.
When new data are examined, the degree can be evaluated immediately for each local finding in order to decide directly during the inspection process whether it should be submitted to an expert for subsequent examination (stream-based setting). Alternatively, a sufficient number of local defects (i.e., anomalies) can first be collected, and the most relevant examples can then be selected for annotation from this pool by evaluating their information degree (pool-based setting).
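As a schematic illustration (an assumption, not a prescribed implementation), the two selection modes can be sketched as follows; `info_degree` is a hypothetical callable that returns the information degree of a candidate sub-volume.

```python
import heapq

def stream_select(candidates, info_degree, threshold):
    """Stream-based setting: forward a sub-volume for expert review as soon as it arrives."""
    return [c for c in candidates if info_degree(c) > threshold]

def pool_select(candidates, info_degree, k):
    """Pool-based setting: collect first, then keep the k candidates with the highest degree."""
    return heapq.nlargest(k, candidates, key=info_degree)

pool = [{"id": i, "score": s} for i, s in enumerate([0.2, 0.9, 0.4, 0.7, 0.1])]
print(pool_select(pool, info_degree=lambda c: c["score"], k=2))
```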
To prevent irrelevant data points from being selected, additional factors can be included in the information degree: for example, a factor estimating whether visually similar examples appear when new data are collected (novel clusters), or whether an outlier is involved (for example, due to a brief recording error that results in an image with a completely different appearance).
To avoid an increase in the number of rejections by the visual inspector, the degree of information should be adapted to assess similar examples with lower information values later.
Since the selected example is annotated (multiple annotations) by multiple people, the impact of incorrect annotations by non-application experts can be reduced very easily. Alternatively, it is also possible to already estimate the errors caused by the visual inspector in the information degree, so that different effects of the annotation are submitted to different visual inspectors. This operation can be performed regardless of the reliability of the respective visual inspector.
To ensure that the combination of several examples also provides useful information, a degree of diversity may be added to the selection. This can be performed by directly assessing the similarity of the examples, or by assessing based on the variance of the resulting classifier.
For completeness, FIG. 4 illustrates a block diagram of an exemplary embodiment of a workflow system 400. The workflow system 400 has a receiving unit 402 which is designed to receive a data set of an object to be investigated. In this case, the data set has coordinate values and a measured value for each coordinate.
The workflow system 400 also has a detection module 404 designed to identify anomalies in a partition of the dataset corresponding to a sub-region of the object under study, and a classification system 406 designed to classify the anomalies into a plurality of predefined classification classes by means of a first machine learning model trained with a first training dataset.
The difference determination module 408, which is part of the workflow system 400, is designed to determine the difference of the anomaly compared to the trained first machine learning model. Finally, the workflow system 400 also has an adding unit 410, which is designed to add data relating to the identified anomaly to the training data set when the difference is above a first predefined threshold.
Thus, automatic annotation and expansion of the training data set and insertion of potential new anomaly classes can be gracefully achieved.
In summary, it can thus be said that information content measures known from the field of active learning can be used to create the solution proposed herein. These essentially comprise two classes of methods: exploratory (exploration) methods and exploiting (exploitation) methods.
The exploratory methods ignore the already trained classifier model and instead attempt to cover the entire feature space as well as possible with as few data points to be selected as possible, i.e., to generate a representative, variance-rich data set as quickly as possible. The greatest amount of information is thus assumed to lie in those new data that are most dissimilar to the known data. For the problem at hand, these methods would therefore select for annotation defects whose visual appearance was previously unknown. These methods have the advantage of quickly selecting previously unknown defect classes (anomaly classes) for annotation. A disadvantage is their sensitivity to outliers caused, for example, by interfering processes during data recording. Measures that explicitly address the emergence of new, unknown data can be adopted directly from the field of "novelty detection".
With regard to the exploitation methods, it can be assumed for the problem addressed herein that the classifier has already achieved practical performance (warm start) and that the machine learning model does not have to be trained from scratch (cold start). Information degrees that require an already trained classifier can therefore be used successfully.
Exploitation methods incorporate the previously trained classifier into the evaluation of the information content of a new example, i.e., these methods "exploit" the classifier. The aim is then to achieve a rapid reduction of the classification error. Depending on the amount of training data available and the computation time available, the expected error reduction can be evaluated directly (e.g., estimated error reduction or estimated entropy minimization), approximated (e.g., using EMOC), or replaced by heuristically motivated methods.
Among the heuristic methods, uncertainty-based sampling can be particularly advantageous; this means that an example has a high information content if the current classifier has the greatest uncertainty about it (e.g., SVM margin, k nearest neighbors, 1-vs-2, entropy, GP mean).
Similarly, it is easy to use the selection of those examples that would have a large impact on the previous classifier in a new training run, which is referred to as expected model change (e.g., for SVMs or random forests).
In the context of the present application, the use of uncertainty sampling is particularly relevant because the degree can be calculated quickly, so that interaction between the data recording system and the data annotation expert is possible without delay.
The information contents MC-Dropout and OpenMax are particularly suitable for use with neural networks or deep learning.
FIG. 5 illustrates a block diagram of a computer system that may have at least a portion of the workflow system. In principle, embodiments of the concepts presented herein can be used with virtually any type of computer, regardless of the platform used therein for storing and/or executing program code. Fig. 5 illustrates, by way of example, a computer system 500 suitable for executing program code in accordance with the methods presented herein. The computer system already present in the microscope system may also be used as the computer system for carrying out the concepts presented herein, possibly with a corresponding extension.
The computer system 500 has several general functions. In this case, the computer system may be a tablet computer, a laptop/notebook computer, another portable or mobile electronic device, a microprocessor system, a microprocessor-based system, a smart phone, or a computer system with a specially configured special function. Computer system 500 may be configured to execute computer-system-executable instructions, such as program modules, that are executed to implement the functionality of the concepts presented herein. To this end, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types.
The components of the computer system may include: one or more processors or processing units 502, a storage system 504, and a bus system 506 that couples various system components, including the storage system 504, to the processor 502. The computer system 500 typically has several volatile or non-volatile storage media that the computer system 500 can access. The storage system 504 can hold data and/or instructions (commands) from the storage media in volatile form, for example in RAM (random access memory) 508, for execution by the processor 502; these instructions implement one or more functions or steps of the concepts presented herein. Further components of the storage system 504 may be a persistent memory (ROM) 510 and a long-term memory 512, in which program modules and data may be stored (reference numeral 516).
The computer system has a number of dedicated devices for communication (keyboard 518, mouse/pointing device (not shown), screen 520, etc.). These dedicated devices may also be combined in a touch sensitive display. The separately provided I/O controller 514 ensures frictionless data exchange with external devices. The network adapter 522 may be used to communicate via a local area network or a global network (LAN, WAN, e.g., via the internet). Other components of computer system 500 may access the network adapter via bus system 506. In this case, although not shown, it should be understood that other devices may also be connected to computer system 500.
At least a portion of the workflow system 400 (see fig. 4) may also be connected to the bus system 506. Digital image data from an image sensor (not shown) may also be processed by a separate pre-processing system (not shown). This may enable providing a data set of the object to be studied.
The description of the various exemplary embodiments of the present invention has been given for the purpose of improving understanding, but it is not intended to limit the inventive concept directly to these exemplary embodiments. Further modifications and variations will occur to those skilled in the art. The terminology used herein was chosen to best describe the general principles of the exemplary embodiments and to enable those skilled in the art to readily access them.
The principles presented herein may be embodied as systems, methods, combinations thereof, and/or computer program products. In this case, the computer program product may have one (or more) computer-readable storage media containing computer-readable program instructions to prompt a processor or control system to perform various aspects of the present invention.
Electronic, magnetic, optical, electromagnetic or infrared media or semiconductor systems can be used as forwarding media; for example, an SSD (solid-state device/drive, i.e. solid-state memory), RAM (random access memory) and/or ROM (read-only memory), an EEPROM (electrically erasable ROM), or any combination thereof. Propagating electromagnetic waves, electromagnetic waves in a waveguide or other transmission media (e.g., light pulses in an optical cable), or electrical signals transmitted in a wire are also regarded as forwarding media.
The computer readable storage medium may be an embodied device that retains or stores instructions for use by an instruction execution device. The computer-readable program instructions described herein may also be downloaded from a service provider to a corresponding computer system via a cable-based connection or a mobile radio network, for example as a (smartphone) application.
The computer-readable program instructions for carrying out operations of the present invention described herein may be machine-dependent or machine-independent instructions, microcode, firmware, state-defining data, or any source or object code written, for example, in C++, Java or the like, or in a conventional procedural programming language such as the programming language "C" or similar programming languages. The computer-readable program instructions may be executed entirely by a computer system. In some exemplary embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), may execute the computer-readable program instructions by using state information of the computer-readable program instructions to configure or personalize the electronic circuit in accordance with aspects of the present invention.
The present invention presented herein is further illustrated with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to exemplary embodiments of the invention. It is noted that virtually any block of the flowchart and/or block diagrams may be designed to be computer-readable program instructions.
The computer-readable program instructions may be made available to a general purpose computer, special purpose computer, or data processing system that is capable of being programmed in another manner, to create a machine, such that the instructions, which execute by a processor or computer or other programmable data processing apparatus, create means for implementing the functions or processes illustrated in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may thus also be stored on a computer-readable storage medium.
To this extent, any block in the flowchart or block diagrams presented may represent a module, segment, or portion of instructions, which represents several executable instructions for implementing the specified logical function(s). In some exemplary embodiments, the functions illustrated in the various blocks may be performed in another order, or in parallel.
The structures, materials, sequences and equivalents of all means and/or step-plus-function elements in the claims below are intended to include any structure, material or sequence that performs the function as expressed by the claims.
Reference numerals
100 method
102 method step of method 100
104 method step of method 100
106 method step of method 100
108 method step of method 100
110 method step of method 100
200 method for annotating training data and using classification
202, …, 208 method steps of method 200
300 detailed method compared to method 100
302, …, 312 method steps of method 300
400 workflow system
402 receiving unit
404 detection module
406 classification system, classifier
408 difference determination module
410 add unit
500 computer system
502 processor
504 storage system
506 bus system
508 RAM
510 ROM
512 long term memory
514I/O controller
516 program modules and data
518 keyboard
520 Screen
522 network adapter.

Claims (8)

1. A computer-implemented method (100) for extending a training data set for machine learning, the method comprising:
providing a data set of the object to be investigated, wherein the data set has coordinate values and a measured value for each coordinate, and wherein,
the data set is derived from an image recording method using a computer tomography system,
-identifying anomalies in the region of the dataset corresponding to the sub-region of the object under study,
classifying the anomaly into a plurality of predefined classification classes by a first machine learning model trained with a first training data set,
-determining a difference value of the anomaly compared to the trained first machine learning model, wherein determining the difference value comprises
determining a combination of degrees of novelty by a second machine learning model trained with a second training data set,
-adding data relating to the identified anomaly to the first training data set when the difference is above a first predefined threshold, and
-creating the first training data set and the second training data set by randomly selecting a data set from the overall training data set, respectively.
2. The method of claim 1, wherein the anomaly is one of a plurality of anomalies, and the adding comprises: adding data of the selected identified anomaly to the training data set when the difference value is above a second predefined threshold.
3. The method according to claim 1 or 2, wherein the difference is determined by a degree of novelty method for a selected anomaly selected from the group consisting of: novelty detection, estimation error reduction, estimation entropy minimization, expected model output variation (EMOC), MC-dropout, OpenMAX, SVM margin, neural network, ratio of highest classification probability to second highest classification probability, classification probability entropy, GP variance, variance of individual classification probabilities from an aggregated classifier, variance of classification probabilities in case of interference of input signals.
4. A method according to claim 3, wherein the combination of the degree-of-novelty methods is determined using a method selected from the group consisting of: a multi-armed bandit formulation, success-driven selection among multiple criteria, a linear combination based on reward functions, and reinforcement learning.
5. The method according to one of the preceding claims, further comprising
-retraining the first machine learning model with the extended training data set in order to generate a third machine learning model, wherein the parameters of the first machine learning model are used as starting values.
6. The method according to one of the preceding claims, wherein the classification is performed by a classifier selected from the group consisting of: neural networks, random forests, logistic regression, support vector machines, and Gaussian regression.
7. The method according to one of the preceding claims, further comprising
-receiving a selection signal for an anomaly, wherein the selection signal amplifies the degree of novelty such that the degree of novelty is above the first predefined novelty threshold or above the second predefined novelty threshold.
8. A workflow system (400) for extending a training data set for machine learning, the workflow system (400) having
A receiving unit (402) designed for receiving a data set of an object to be investigated, wherein the data set has coordinate values and a measured value for each coordinate, and wherein the data set is derived from an image recording method using a computed tomography system,
A detection module (404) designed for identifying an anomaly in a region of the data set corresponding to a sub-region of the object under investigation,
a classification system (406) designed for classifying the anomaly into a plurality of predefined classification classes by means of a first machine learning model trained with a first training data set,
-a difference determination module (408) designed for determining a difference value of the anomaly compared to the trained first machine learning model, wherein determining the difference value comprises determining a combination of degrees of novelty by means of a second machine learning model trained with a second training data set,
-an adding unit (410) designed to add data relating to the identified anomaly to the first training data set when the difference value is above a first predefined threshold, and
-a provision determination module designed for creating the first training data set and the second training data set by respectively randomly selecting data sets from an overall training data set.
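The following sketch illustrates, in plain Python, one possible reading of the workflow recited in claims 1 and 8: an overall training data set is split at random into a first and a second training data set, a first model classifies each identified anomaly, a second model yields a degree of novelty, and anomalies whose difference value exceeds the first predefined threshold are added to the first training data set. All libraries, names, and numeric values below are assumptions of this sketch, not part of the claims.

```python
# Minimal sketch of the workflow in claims 1 and 8 (hypothetical names and values;
# scikit-learn classifiers stand in for the first and second machine learning models).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Stand-in for the overall training data set: feature vectors describing anomaly
# regions extracted from CT voxel data, plus their predefined classification classes.
X_all = rng.normal(size=(1000, 16))
y_all = rng.integers(0, 4, size=1000)        # e.g. pore, crack, inclusion, artefact

# Last step of claim 1: first and second training data sets are created by
# randomly selecting data from the overall training data set.
idx = rng.permutation(len(X_all))
X1, y1 = X_all[idx[:600]], y_all[idx[:600]]  # first training data set
X2, y2 = X_all[idx[600:]], y_all[idx[600:]]  # second training data set

# Classification system (406): first machine learning model, trained with the first set.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X1, y1)

# Difference determination module (408): second model, trained with the second set,
# used here to derive a degree of novelty from its class-probability entropy.
novelty_model = RandomForestClassifier(n_estimators=100, random_state=1).fit(X2, y2)

def degree_of_novelty(x: np.ndarray) -> float:
    p = novelty_model.predict_proba(x.reshape(1, -1))[0]
    return float(-np.sum(p * np.log(p + 1e-12)) / np.log(len(p)))  # normalised entropy

FIRST_THRESHOLD = 0.6                         # first predefined threshold (assumed value)

# For each anomaly identified by the detection module (404) in a new CT data set:
new_anomalies = rng.normal(size=(5, 16))      # stand-in for extracted anomaly features
for x in new_anomalies:
    predicted_class = int(clf.predict(x.reshape(1, -1))[0])
    if degree_of_novelty(x) > FIRST_THRESHOLD:
        # Adding unit (410): append the anomaly to the first training data set
        # (in practice after human annotation of its class).
        X1 = np.vstack([X1, x])
        y1 = np.append(y1, predicted_class)
```

Retraining the first model on the extended first training data set would correspond to claim 5; whether the previous model parameters can serve as starting values depends on the classifier chosen.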
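Claims 3 and 4 list several degree-of-novelty criteria and several ways of combining them, among them a multi-armed bandit formulation. The fragment below is one possible reading of that combination, using an epsilon-greedy bandit over two of the listed criteria; the criterion implementations, the reward signal, and all constants are assumptions of this example.

```python
# Illustrative epsilon-greedy bandit over degree-of-novelty criteria (claims 3 and 4).
# Criterion functions, reward definition and constants are assumptions of this sketch.
import numpy as np

rng = np.random.default_rng(0)

def margin_score(p: np.ndarray) -> float:
    """Ratio/margin criterion: a small gap between the highest and the
    second-highest classification probability means high novelty."""
    top2 = np.sort(p)[::-1][:2]
    return float(1.0 - (top2[0] - top2[1]))

def entropy_score(p: np.ndarray) -> float:
    """Classification-probability entropy, normalised to [0, 1]."""
    return float(-np.sum(p * np.log(p + 1e-12)) / np.log(len(p)))

criteria = [margin_score, entropy_score]      # the bandit's "arms"
values = np.zeros(len(criteria))              # running reward estimate per criterion
counts = np.zeros(len(criteria))
EPSILON = 0.1

def select_criterion() -> int:
    """Epsilon-greedy arm selection: explore occasionally, otherwise exploit."""
    if rng.random() < EPSILON:
        return int(rng.integers(len(criteria)))
    return int(np.argmax(values))

def update(arm: int, reward: float) -> None:
    """Success-driven update; the reward could be the validation-accuracy gain
    observed after the selected anomaly was annotated and added."""
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]

# Usage sketch: p is the class-probability vector of a candidate anomaly.
p = np.array([0.45, 0.40, 0.10, 0.05])
arm = select_criterion()
novelty = criteria[arm](p)
# ... add the anomaly if `novelty` exceeds the threshold, measure the resulting
# improvement of the retrained model, then feed it back as the reward:
update(arm, reward=0.02)
```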
CN202010331750.2A 2019-04-25 2020-04-24 Workflow for training classifiers for quality inspection in measurement technology Pending CN111860565A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE102019110721.0A DE102019110721A1 (en) 2019-04-25 2019-04-25 WORKFLOW FOR TRAINING A CLASSIFIER FOR QUALITY INSPECTION IN MEASUREMENT TECHNOLOGY
DE102019110721.0 2019-04-25

Publications (1)

Publication Number Publication Date
CN111860565A true CN111860565A (en) 2020-10-30

Family

ID=72839881

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010331750.2A Pending CN111860565A (en) 2019-04-25 2020-04-24 Workflow for training classifiers for quality inspection in measurement technology

Country Status (2)

Country Link
CN (1) CN111860565A (en)
DE (1) DE102019110721A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113743445B (en) * 2021-07-15 2024-06-04 上海朋熙半导体有限公司 Target object identification method, device, computer equipment and storage medium
DE102021122644A1 (en) 2021-09-01 2023-03-02 Bayerische Motoren Werke Aktiengesellschaft Method for determining at least one potentially error-causing production step, as well as monitoring device and production plant
DE102021210107A1 (en) * 2021-09-14 2023-03-16 Zf Friedrichshafen Ag Computer-implemented methods, modules and systems for anomaly detection in industrial manufacturing processes
EP4160229A1 (en) * 2021-09-30 2023-04-05 Siemens Aktiengesellschaft Method for monitoring a machine, computer program product and arrangement
DE102021211066A1 (en) * 2021-10-01 2023-04-06 Continental Reifen Deutschland Gmbh Inspection system and method for inspecting a strip of material positioned on a tire building drum
WO2023057059A1 (en) * 2021-10-06 2023-04-13 Siemens Ag Österreich Computer-implemented method and system for recognizing anomalies in sensor data
DE102021211888A1 (en) * 2021-10-21 2023-04-27 Continental Reifen Deutschland Gmbh Method for grading the surface texture of rubber products
WO2023073414A1 (en) * 2021-10-29 2023-05-04 Featurespace Limited Storing and searching for data in data stores
DE102022203475A1 (en) 2022-04-07 2023-10-12 Zf Friedrichshafen Ag System for generating a human-perceptible explanation output for an anomaly predicted by an anomaly detection module on high-frequency sensor data or quantities derived therefrom of an industrial manufacturing process, method and computer program for monitoring artificial intelligence-based anomaly detection in high-frequency sensor data or quantities derived therefrom of an industrial manufacturing process and method and computer program for monitoring artificial intelligence-based anomaly detection during an end-of-line acoustic test of a transmission
DE102022208654A1 (en) 2022-08-22 2024-02-22 Robert Bosch Gesellschaft mit beschränkter Haftung Method and device for determining whether there is an anomaly in a technical device using knowledge graphs and machine learning
DE102022209247A1 (en) * 2022-09-06 2024-03-07 Zf Friedrichshafen Ag Examining components for defects using a component analyzer
EP4346086A1 (en) * 2022-09-30 2024-04-03 Siemens Aktiengesellschaft Method for monitoring an electric machine
DE102022211293A1 (en) * 2022-10-25 2024-04-25 Bhs Corrugated Maschinen- Und Anlagenbau Gmbh Method for identifying anomalies and thus also actual processing positions of a web and for classifying these anomalies, system and computer program product

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7424619B1 (en) * 2001-10-11 2008-09-09 The Trustees Of Columbia University In The City Of New York System and methods for anomaly detection and adaptive learning
US20080091630A1 (en) * 2006-05-31 2008-04-17 Bonissone Piero P System and method for defining normal operating regions and identifying anomalous behavior of units within a fleet, operating in a complex, dynamic environment
US20110295892A1 (en) * 2010-05-25 2011-12-01 General Electric Company System and method for web mining and clustering
US20120054184A1 (en) * 2010-08-24 2012-03-01 Board Of Regents, The University Of Texas System Systems and Methods for Detecting a Novel Data Class
US20180247220A1 (en) * 2017-02-28 2018-08-30 International Business Machines Corporation Detecting data anomalies
US20190012526A1 (en) * 2017-07-04 2019-01-10 Microsoft Technology Licensing, Llc Image recognition with promotion of underrepresented classes
US20190034823A1 (en) * 2017-07-27 2019-01-31 Getgo, Inc. Real time learning of text classification models for fast and efficient labeling of training data and customization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHI-HUA ZHOU ET AL.: "Tri-training: exploiting unlabeled data using three classifiers", IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, vol. 17, no. 11, pages 1529-1541, XP055621475, DOI: 10.1109/TKDE.2005.186 *

Also Published As

Publication number Publication date
DE102019110721A1 (en) 2020-10-29

Similar Documents

Publication Publication Date Title
CN111860565A (en) Workflow for training classifiers for quality inspection in measurement technology
Schlegel et al. Towards a rigorous evaluation of XAI methods on time series
US11921566B2 (en) Abnormality detection system, abnormality detection method, abnormality detection program, and method for generating learned model
US12001935B2 (en) Computer-implemented method, computer program product and system for analysis of cell images
JP7289427B2 (en) Program, information processing method and information processing apparatus
US10896351B2 (en) Active machine learning for training an event classification
US20230067026A1 (en) Automated data analytics methods for non-tabular data, and related systems and apparatus
KR101822404B1 (en) diagnostics system for cell using Deep Neural Network learning
JP2020528623A (en) Active learning systems and methods
JP2016085704A (en) Information processing system, information processing device, information processing method, and program
JP2015087903A (en) Apparatus and method for information processing
US11471096B2 (en) Automatic computerized joint segmentation and inflammation quantification in MRI
Antoniades et al. Artificial intelligence in cardiovascular imaging—principles, expectations, and limitations
JP2019215698A (en) Image inspection support apparatus and method
Ali Artificial intelligence and its application in the prediction and diagnosis of animal diseases: A review
CN112184717A (en) Automatic segmentation method for quality inspection
US11688175B2 (en) Methods and systems for the automated quality assurance of annotated images
EP3696771A1 (en) System for processing an input instance, method, and medium
US11741686B2 (en) System and method for processing facility image data
KR20200124887A (en) Method and Apparatus for Creating Labeling Model with Data Programming
Kalyani et al. Arithmetic Optimization with Ensemble Deep Transfer Learning Based Melanoma Classification
US20230034782A1 (en) Learning-based clean data selection
JP7496567B2 (en) Processing system, learning processing system, processing method, and program
US12045153B1 (en) Apparatus and method of hot-swapping a component of a component unit in a cluster
US20220189005A1 (en) Automatic inspection using artificial intelligence models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination