CN117009883B - Object classification model construction method, object classification method, device and equipment

Object classification model construction method, object classification method, device and equipment

Info

Publication number
CN117009883B
Authority
CN
China
Prior art keywords
image
label
sub
class
classification
Legal status
Active
Application number
CN202311269339.7A
Other languages
Chinese (zh)
Other versions
CN117009883A (en)
Inventor
林岳
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202311269339.7A
Publication of CN117009883A
Application granted
Publication of CN117009883B


Classifications

    • G06F18/2415 Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Matching criteria, e.g. proximity measures


Abstract

The present application relates to an object classification model construction method, an object classification method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product, applicable to scenarios such as cloud technology, artificial intelligence, intelligent transportation, and assisted driving. The object classification model construction method comprises the following steps: acquiring a training set comprising a plurality of unlabeled objects and a plurality of labeled objects; for a first-class object whose object count is small, performing semi-supervised learning based on the first-class object and at least a part of the unlabeled objects to obtain a classification model, the classification model being used to determine, among the unlabeled objects of the training set, first-class supplementary objects whose predicted label is the same as that of the first-class object; performing model training with data subsets each comprising at least a part of the sample objects to obtain an object classification sub-model corresponding to each subset; and constructing an object classification model comprising a plurality of object classification sub-models. The method can improve the performance of the object classification model.

Description

Object classification model construction method, object classification method, device and equipment
Technical Field
The present invention relates to the field of computer application technology, and in particular to an object classification model construction method, an object classification method, an object classification apparatus, a computer device, a computer-readable storage medium, and a computer program product.
Background
With the development of computer technology, machine learning is being applied ever more widely, for example in transportation, gaming, and platform management. Taking commodity-trading-platform management as an example, a platform server can train an object classification model through machine learning, use it to classify the merchants offering commodities, and identify abnormal merchants so that they can be handled in a targeted manner.
In conventional approaches, an object classification model is trained on classified objects carrying class labels, so model performance is limited by the number of classified objects carrying each label. An object classification model constructed this way therefore recognizes object classes with few training samples poorly, and overall model performance suffers.
Disclosure of Invention
In view of the foregoing, it is desirable to provide an object classification model construction method, an object classification method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product that can improve model performance.
In a first aspect, the present application provides a method for constructing an object classification model. The method comprises the following steps:
acquiring a training set comprising a plurality of unlabeled objects and a plurality of labeled objects, the labeled objects including a first-class object whose object count satisfies a small-batch condition, the small-batch condition meaning that the object count is smaller than or equal to a first number threshold;
performing semi-supervised learning based on the first-class object and at least a part of the unlabeled objects to obtain a classification model, the classification model being used to determine, among the unlabeled objects, first-class supplementary objects whose predicted label is the same as that of the first-class object;
performing model training using a data subset comprising at least a part of the sample objects to obtain an object classification sub-model corresponding to the data subset, the sample objects including the labeled objects and the first-class supplementary objects;
constructing an object classification model comprising a plurality of object classification sub-models, the classification result of the object classification model being obtained by aggregating the respective sub-classification results of the object classification sub-models.
In a second aspect, the application also provides an object classification model construction device. The device comprises:
a training set acquisition module, configured to acquire a training set comprising a plurality of unlabeled objects and a plurality of labeled objects, the labeled objects including a first-class object whose object count satisfies a small-batch condition, the small-batch condition meaning that the object count is smaller than or equal to a first number threshold;
a semi-supervised learning module, configured to perform semi-supervised learning based on the first-class object and at least a part of the unlabeled objects to obtain a classification model, the classification model being used to determine, among the unlabeled objects, first-class supplementary objects whose predicted label is the same as that of the first-class object;
a sub-model training module, configured to perform model training using a data subset comprising at least a part of the sample objects to obtain an object classification sub-model corresponding to the data subset, the sample objects including the labeled objects and the first-class supplementary objects;
an object classification model construction module, configured to construct an object classification model comprising a plurality of object classification sub-models, the classification result of the object classification model being obtained by aggregating the respective sub-classification results of the object classification sub-models.
In one embodiment, the semi-supervised learning module includes: an initial classification model determining unit, configured to determine the first-class object as a learning object and perform supervised learning on the learning object to obtain an initial classification model; a classifying unit, configured to classify at least a part of the unlabeled objects using the initial classification model to obtain pseudo-label objects carrying predicted labels; an iteration unit, configured to determine a new learning object based on the pseudo-label objects and return to the step of performing supervised learning on the learning object, until a learning stop condition is satisfied; and a classification model determining unit, configured to determine, when the learning stop condition is satisfied, the current initial classification model as the classification model corresponding to the label carried by the first-class object.
In one embodiment, the iteration unit is specifically configured to: acquire the label confidence of the predicted label of each pseudo-label object; and determine the pseudo-label objects whose label confidence satisfies a confidence condition as the new learning object.
In one embodiment, the labeled objects further include a second-class object whose object count satisfies a large-batch condition, the large-batch condition meaning that the object count is larger than or equal to a second number threshold, the second number threshold being greater than the first number threshold;
the semi-supervised learning module further includes a judging unit configured to: classify the unlabeled objects using the initial classification model, and determine the first-class supplementary objects whose predicted label is the same as that of the first-class object and whose label confidence satisfies the confidence condition; and determine that the learning stop condition is currently satisfied when the sum of the counts of the first-class objects and the first-class supplementary objects, together with the object count of the second-class objects, satisfies a number balance condition.
In one embodiment, the object classification model construction apparatus further includes a data subset construction module configured to: determine a data set comprising the sample objects, with the labeled objects and the first-class supplementary objects as the sample objects; and extract at least a part of the labeled objects and at least a part of the first-class supplementary objects from the data set to constitute a data subset.
In one embodiment, the labeled objects include a first-class object carrying a first label and a second-class object carrying a second label, the number of unlabeled objects belonging to the object category represented by the first label being smaller than the number of unlabeled objects belonging to the object category represented by the second label, with the difference between the two counts satisfying a number imbalance condition;
the object classification model construction device further includes: the object feature extraction module is used for extracting features of the non-tag objects to obtain respective object features of the non-tag objects; the local density analysis module is used for carrying out local density analysis on each non-label object based on the mapping position of each non-label object in the feature space to which the object feature belongs, so as to obtain local outliers corresponding to each non-label object; and the first type supplementary object determining module is used for determining the unlabeled object of which the local outlier factor meets the outlier condition as a first type supplementary object carrying a first label.
In one embodiment, the object feature extraction module is specifically configured to: acquire the object information of each unlabeled object, the object information comprising sub-information of at least two information categories; for each piece of sub-information of an unlabeled object, perform feature extraction on the sub-information using a feature extraction algorithm matched to the information category of that sub-information, obtaining a sub-feature of the unlabeled object; and determine the object features of the unlabeled object based on the sub-features corresponding to the pieces of sub-information.
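As a rough illustration of category-matched extraction, the sketch below dispatches each piece of sub-information to an extractor chosen by its information category and concatenates the resulting sub-features; the two extractors shown (plain casting for numeric fields, a hashed bag-of-words for text) are assumptions of this sketch rather than the method's prescribed algorithms.

```python
from typing import Dict, List
import numpy as np
from sklearn.feature_extraction.text import HashingVectorizer

_text_vectorizer = HashingVectorizer(n_features=64)  # fixed-size text features

def extract_sub_feature(category: str, value) -> np.ndarray:
    # Hypothetical dispatch: one extraction algorithm per information category.
    if category == "numeric":   # e.g. transaction frequency or amount
        return np.asarray([float(value)])
    if category == "text":      # e.g. customer feedback
        return _text_vectorizer.transform([value]).toarray().ravel()
    raise ValueError(f"no extractor registered for category {category!r}")

def object_features(object_info: Dict[str, List]) -> np.ndarray:
    """object_info maps each information category to its pieces of sub-information;
    the object feature is the concatenation of all sub-features."""
    parts = [extract_sub_feature(cat, v) for cat, values in object_info.items() for v in values]
    return np.concatenate(parts)
```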
In one embodiment, the object classification model construction apparatus further includes a feedback adjustment module configured to: obtain objection information fed back against a classification result of the object classification model; when the objection information satisfies an objection condition, perform anomaly-cause matching on the objection information and determine the anomaly cause matching the semantics of the objection information; and adjust the object classification model based on the anomaly cause to obtain an updated object classification model.
In a third aspect, the present application also provides an object classification method. The method comprises the following steps:
acquiring object information of an object to be classified;
performing feature extraction on the object information to obtain object features of the object to be classified;
classifying the object features of the object to be classified with each object classification sub-model contained in the object classification model, to obtain a plurality of sub-classification results of the object to be classified, the object classification model being constructed by the object classification model construction method described above;
performing statistical analysis on the sub-classification results to obtain the object category of the object to be classified.
In a fourth aspect, the present application further provides an object classification apparatus. The device comprises:
an object information acquisition module, configured to acquire object information of an object to be classified;
an object feature extraction module, configured to perform feature extraction on the object information to obtain object features of the object to be classified;
a sub-classification result determining module, configured to classify the object features of the object to be classified with each object classification sub-model contained in the object classification model, obtaining a plurality of sub-classification results of the object to be classified, the object classification model being constructed by the object classification model construction method described above;
an object category determining module, configured to perform statistical analysis on the sub-classification results to obtain the object category of the object to be classified.
In one embodiment, each sub-classification result includes a preliminary label of the object to be classified, and the object category determining module is specifically configured to: count the sub-classification results and determine the number of occurrences of each preliminary label among them; and determine the object category represented by the most frequent preliminary label as the object category of the object to be classified.
In one embodiment, each sub-classification result includes a preliminary label of the object to be classified and the confidence of that preliminary label, and the object category determining module is specifically configured to: perform confidence statistics for each preliminary label across the sub-classification results, determining a confidence statistic for each preliminary label; and determine the object category represented by the preliminary label with the largest confidence statistic as the object category of the object to be classified.
In one embodiment, each sub-classification result includes the probabilities that the object to be classified belongs to each candidate label, and the object category determining module is specifically configured to: perform probability statistics for each candidate label across the sub-classification results, obtaining a probability statistic for each candidate label; and determine the object category represented by the candidate label with the largest probability statistic as the object category of the object to be classified.
In a fifth aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor that implements the steps of the above method when executing the computer program.
In a sixth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the above method.
In a seventh aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the above method.
In the above object classification modeling process, a training set comprising a plurality of unlabeled objects and a plurality of labeled objects is obtained, and for a first-class object with a small object count, semi-supervised learning is performed based on the first-class object and at least a part of the unlabeled objects to obtain a classification model, the classification model being used to determine, among the unlabeled objects of the training set, first-class supplementary objects whose predicted label is the same as that of the first-class object. Model training is performed using data subsets each comprising at least a part of the sample objects to obtain the object classification sub-model corresponding to each data subset, and an object classification model comprising a plurality of object classification sub-models is constructed. Because the sample objects include both the labeled objects and the first-class supplementary objects, the count of first-class objects is increased and the impact of the small sample batch on model accuracy is reduced; and because the classification result of the object classification model is obtained by aggregating the sub-classification results of the object classification sub-models, object classification is performed by ensemble learning over a plurality of sub-models, so the constructed object classification model generalizes better. The method therefore helps improve the performance of the object classification model.
Drawings
FIG. 1 is a diagram of an application environment for an object classification model construction method and an object classification method in one embodiment;
FIG. 2 is a flow diagram of a method of object classification model construction in one embodiment;
FIG. 3 is a schematic diagram of a process for building an object classification model in one embodiment;
FIG. 4 is a schematic diagram of a semi-supervised learning process in one embodiment;
FIG. 5 is a schematic diagram of the reachability distance of object P relative to object O in one embodiment;
FIG. 6 is a flow chart of a method for constructing an object classification model according to another embodiment;
FIG. 7 is a schematic diagram of a process for constructing an object classification model in another embodiment;
FIG. 8 is a flow diagram of a method of object classification in one embodiment;
FIG. 9 is a block diagram of an object classification model construction apparatus in one embodiment;
FIG. 10 is a block diagram of an object classification apparatus in one embodiment;
FIG. 11 is an internal block diagram of a computer device in one embodiment;
FIG. 12 is an internal structural diagram of a computer device in another embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The object classification processing method provided by the present application may be based on artificial intelligence; for example, the object classification model herein may be a neural network model. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, giving machines the capabilities of perception, reasoning and decision-making. It is a comprehensive discipline covering a wide range of fields at both the hardware and software levels. Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine learning is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. It studies how computers simulate or implement human learning behavior to acquire new knowledge or skills, and how they reorganize existing knowledge structures to continuously improve their own performance. Whereas data mining looks for shared characteristics among big data, machine learning focuses more on algorithm design, enabling a computer to learn regularities from data automatically and use them to predict unknown data. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent; it is applied across all fields of artificial intelligence. Machine learning typically includes techniques such as supervised learning, semi-supervised learning, and ensemble learning. Supervised learning is a machine learning method that learns from labeled data; semi-supervised learning is a machine learning method that learns from training data of which one part is labeled and another part is unlabeled; ensemble learning is a machine learning method that completes a learning task by constructing and combining multiple learners.
The solution provided by the embodiments of the present application relates to machine learning technology in artificial intelligence, and is specifically described by the following embodiments:
In one embodiment, the object classification model construction method and the object classification method provided by the present application can be applied to the application environment shown in FIG. 1, in which the terminal 102 communicates with the server 104 via a network. The communication network may be wired or wireless; accordingly, the terminal 102 and the server 104 may be directly or indirectly connected through wired or wireless communication. For example, the terminal 102 may be indirectly connected to the server 104 through a wireless access point, or directly connected to the server 104 through the Internet, which is not limited herein.
The terminal 102 includes, but is not limited to, a mobile phone, a computer, an intelligent voice interaction device, a smart home appliance, a vehicle-mounted terminal, an aircraft, and the like. The embodiments of the present application can be applied to object classification modeling and object classification scenarios, including but not limited to cloud technology, artificial intelligence, intelligent transportation, and assisted driving. The terminal 102 may be provided with a client related to information interaction; the client may be software (e.g., a browser or information interaction software), a web page, an applet, or the like. The server 104 is the background server corresponding to the software, web page, or applet, or a server dedicated to object classification model construction or object classification, for example the platform server of a content interaction platform providing content interaction services, or the platform server of a commodity trading platform providing commodity trading services. In some embodiments, object classification model construction and object classification may also be implemented by the same server, which is not specifically limited in this application. Further, the server 104 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), big data, and artificial intelligence platforms. The data storage system may store data that the server 104 needs to process; it may be provided separately, integrated on the server 104, or located on a cloud or another server.
It should be noted that the object classification model construction method and the object classification method in the embodiments of the present application may be executed by the terminal 102 or the server 104 alone, or by the terminal 102 and the server 104 together. Taking execution by the server 104 alone as an example, the server 104 constructs an object classification model by: acquiring a training set comprising a plurality of unlabeled objects and a plurality of labeled objects; for a first-class object with a small object count, performing semi-supervised learning based on the first-class object and at least a part of the unlabeled objects to obtain a classification model, the classification model being used to determine, among the unlabeled objects of the training set, first-class supplementary objects whose predicted label is the same as that of the first-class object; performing model training using data subsets each comprising at least a part of the sample objects to obtain the object classification sub-model corresponding to each data subset; and constructing an object classification model comprising a plurality of object classification sub-models. The sample objects include the labeled objects and the first-class supplementary objects, and the classification result of the object classification model is obtained by aggregating the sub-classification results of the object classification sub-models. The server 104 performs object classification by: acquiring object information of an object to be classified; performing feature extraction on the object information to obtain object features of the object to be classified; classifying the object features with each object classification sub-model contained in the object classification model to obtain a plurality of sub-classification results; and performing statistical analysis on the sub-classification results to obtain the object category of the object to be classified. The object classification model is constructed by the object classification model construction method above. After the object category of an object to be classified is determined, different processing strategies can be matched to objects of different categories, improving service quality.
In one embodiment, as shown in FIG. 2, an object classification model construction method is provided. The method may be performed by a computer device, which may be the terminal or the server shown in FIG. 1; in this embodiment, the method is described as applied to the server in FIG. 1, and includes the following steps:
step S202, a training set comprising a plurality of unlabeled objects and a plurality of labeled objects is obtained.
The training set is a data set used for model training to obtain the object classification model, and may include a plurality of unlabeled objects and a plurality of labeled objects. An object is a thing that can be classified in a business process; it may be any kind of animate or inanimate thing. Animate objects include, but are not limited to, at least one of natural persons, animals, or plants; inanimate objects include, but are not limited to, at least one of images, audio, text, video, and the like. A labeled object is an object that carries a category label and whose object category has been determined; an unlabeled object is an object that does not carry a category label and whose object category has not been determined. That is, the object category can be characterized by a category label. For example, category labels for images may include face image and non-face image; category labels for audio may include human voice, tapping sounds, running water, and so on; category labels for text may include declarative, interrogative, imperative, and exclamatory sentences; category labels for videos may include food video, makeup video, and travel video. It will be appreciated that the object types of labeled and unlabeled objects are consistent: for example, labeled objects are classified images and unlabeled objects are unclassified images; labeled objects are classified audio and unlabeled objects are unclassified audio; and so on.
In some possible implementations, an object may be a business object registered in a business scenario, such as a registered account of a content service platform in a content service scenario, a registered account of a game service platform in a game service scenario, or a registered account of a commodity trading platform in a commodity trading scenario. Taking the commodity trading scenario as an example, object categories may include normal merchant and abnormal merchant. A normal merchant publishes commodity information for the commodities it actually trades on the commodity trading platform, enabling an exchange of resources with purchasers. An abnormal merchant directly reposts the published information of other merchants, or copies, splices, rewrites or lightly edits it before publishing the resulting commodity information, so that the commodities actually traded do not correspond to the published information. It will be appreciated that an abnormal merchant conducts fraudulent activity against purchasers. Here a labeled object is a registered account carrying a category label such as normal account or abnormal account, and an unlabeled object is a registered account that carries no category label and whose object category has not been determined.
Further, the training set may specifically include the object information of each of the plurality of unlabeled objects and the plurality of labeled objects. The object information may include, for example, identification information and interaction information. Taking merchants as objects, the object information may include the merchant's transaction modes, transaction frequency, transaction amounts, customer feedback, and so on. The labeled objects may include class objects corresponding to a plurality of category labels, e.g., class-A objects carrying label A, class-B objects carrying label B, class-C objects carrying label C, and so on. The labeled objects include a first-class object whose object count satisfies a small-batch condition, the small-batch condition meaning that the object count is smaller than or equal to a first number threshold.
In a specific embodiment, the first number threshold may be determined based on the task requirements of supervised learning. For a supervised learning task, the learning effect is positively correlated with the count of labeled objects: other things being equal, the more class-A objects carrying label A there are, the stronger the ability of the supervised-learned object classification model to recognize class-A objects. Based on this, the server may determine a first number threshold matching the task requirement. For example, if the required ability to recognize non-original accounts in a content service scenario is lower than the required ability to recognize abnormal accounts in a commodity trading scenario, the content service scenario may set a relatively smaller first number threshold than the commodity trading scenario.
In a specific embodiment, the first number threshold may be determined based on the object counts of the various class objects. The machine learning process must also consider the effect of class imbalance. Class imbalance refers to a situation in which the numbers of training samples of different classes in a classification task differ greatly, and it can harm model performance. Taking a binary classifier as an example, when positive samples far outnumber negative samples, even with a high positive-sample recognition rate, the low negative-sample recognition rate means unrecognized negative samples may be classified as positive, causing false positives and degrading overall performance. Further, when class imbalance exists, the difference between the counts of at least two class objects satisfies a number imbalance condition. Based on this, the server may determine, from the object counts of the class objects and the number imbalance condition, a first number threshold matching the count difference that characterizes the condition. The count difference may be a difference or a ratio. For example, if the number imbalance condition is that the ratio of the majority class to a minority class exceeds 3:1, and the largest object count among the class objects is 300, the first number threshold is 100.
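A minimal sketch of the ratio-based rule just described; the helper name and the 3:1 default are illustrative assumptions.

```python
def first_number_threshold(class_counts: dict, max_ratio: float = 3.0) -> int:
    """Classes whose object count is at most max(count) / max_ratio are 'small batch'.
    E.g. counts {"A": 300, "B": 80} with max_ratio 3.0 give a threshold of 100,
    so class B (80 <= 100) qualifies for supplementation."""
    return int(max(class_counts.values()) / max_ratio)
```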
Specifically, the server may acquire a training set comprising a plurality of unlabeled objects and a plurality of labeled objects; it may obtain the classified data set and the data set to be classified either actively or by passive reception. Further, the unlabeled objects and the labeled objects may come from the same source or different sources. For example, the server may obtain both the unlabeled and labeled objects from the terminal; or the server may obtain the labeled objects from the terminal and, based on their object type, obtain a plurality of unlabeled objects of the same object type from the data storage system.
Step S204, semi-supervised learning is performed based on the first-class object and at least a part of the unlabeled objects to obtain a classification model.
Semi-supervised learning is a machine learning method that learns from training data of which one part is labeled and another part is unlabeled. The specific form of semi-supervised learning is not unique; it may include, for example, pure semi-supervised learning, transductive learning, and the like, and label propagation algorithms based on graph methods such as GNNs (Graph Neural Networks), GCNs (Graph Convolutional Networks) and knowledge graphs may be applied to predict unknown labels from known labels, thereby realizing semi-supervised learning.
Specifically, the server may perform feature extraction on the first-class object and at least a part of the unlabeled objects to obtain the object features of each object; the feature extraction process may include multiple steps such as data cleaning, data conversion, and feature selection. In some specific embodiments, when the data volume is large enough, the server may also perform feature extraction with deep learning, learning complex features from the data to realize deep feature extraction. The server may then perform semi-supervised learning on the object features using a semi-supervised learning algorithm to obtain a classification model. The classification model is used to determine, among the unlabeled objects, first-class supplementary objects whose predicted label is the same as that of the first-class object. That is, the classification model obtained by semi-supervised learning can convert a part of the unlabeled objects in the training set into labeled first-class supplementary objects that complement the first-class objects, increasing the count of first-class objects and, to some extent, solving or alleviating the problem of an insufficient number of samples.
The training set may include first-class objects of one or more categories. When first-class objects of multiple categories satisfy the small-batch condition, the server may perform semi-supervised learning separately for each first-class object to obtain the classification model corresponding to each. For example, suppose the labeled objects include class-A objects carrying label A, class-B objects carrying label B, and class-C objects carrying label C, and that the object counts of the class-B and class-C objects each satisfy the small-batch condition. The server may perform semi-supervised learning on the class-B objects and at least a part of the unlabeled objects to obtain a classification model for label B, and use it to convert a part of the unlabeled objects in the training set into class-B supplementary objects carrying label B; the server may likewise perform semi-supervised learning on the class-C objects and at least a part of the unlabeled objects to obtain a classification model for label C, and use it to convert a part of the unlabeled objects into class-C supplementary objects carrying label C.
Step S206, performing model training by using the data subset comprising at least a part of sample objects to obtain an object classification sub-model corresponding to the data subset.
The sample objects include the labeled objects and the first-class supplementary objects. The specific structure of the object classification sub-model is not unique; it may be, for example, a decision tree model, a support vector machine model, or a neural network model. Specifically, the server may extract a part of the objects from the labeled objects and the first-class supplementary objects to form a data subset, and then perform model training with the object features and category labels of the objects in the data subset to obtain the object classification sub-model corresponding to that data subset.
In a specific embodiment, the object classification model construction method further includes: determining a data set comprising the sample objects, with the labeled objects and the first-class supplementary objects as the sample objects; and extracting at least a part of the labeled objects and at least a part of the first-class supplementary objects from the data set to constitute a data subset.
In particular, the labeled objects may include class objects of a plurality of category labels, including the first-class object carrying the first label. A data set comprising the sample objects is determined with the labeled objects and the first-class supplementary objects as sample objects, and at least a part of the labeled objects and at least a part of the first-class supplementary objects are extracted from the data set to form a data subset. The data subset thus fuses the two parts of data, increasing the number of objects carrying the first label and making the classes more balanced, which improves the performance of the object classification sub-model trained on the data subset.
In one possible implementation, the object counts of the object classes in the data subset satisfy a number balance condition, meaning that the difference between the object counts of the object classes is smaller than (or smaller than or equal to) a difference threshold. The difference threshold may be a ratio or a difference.
In one possible implementation, the object counts of the object classes in the data subset satisfy a model accuracy requirement, for example that the accuracy is greater than or equal to a set accuracy. Specifically, the server may establish a correspondence between model accuracy and the number of training samples from statistics on test data or from expert experience with previously constructed models, determine the lower bound on the object count matching the accuracy requirement, and then extract from the labeled objects and the first-class supplementary objects a data subset in which the object count of every object class exceeds that lower bound.
In one possible implementation, the object counts of the object classes in the data subset satisfy both the number balance condition and the model accuracy requirement. The server may determine the lower bound on the object count matching the accuracy requirement from the correspondence between model accuracy and training sample count, determine the upper bound on the count difference from the number balance condition, and extract from the labeled objects and the first-class supplementary objects a data subset in which the object count of every object class exceeds the lower bound and the count differences between classes stay below the upper bound.
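A minimal sketch of such a constrained draw, assuming the sample objects are held as per-class index lists; the parameter names and the uniform random sampling are assumptions of this sketch, not requirements of the method.

```python
import random
from typing import Dict, List

def draw_balanced_subset(per_class: Dict[str, List[int]],
                         count_lower_bound: int,
                         diff_upper_bound: int) -> Dict[str, List[int]]:
    """Sample a data subset whose per-class object counts all exceed
    count_lower_bound and whose pairwise differences stay below diff_upper_bound."""
    base = count_lower_bound + 1       # smallest admissible class size
    cap = base + diff_upper_bound - 1  # largest size still within the balance bound
    subset = {}
    for label, indices in per_class.items():
        if len(indices) < base:
            raise ValueError(f"class {label!r} has too few objects for the bound")
        size = random.randint(base, min(cap, len(indices)))
        subset[label] = random.sample(indices, size)
    return subset
```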
Step S208, an object classification model comprising a plurality of object classification sub-models is constructed.
The plurality of object classification sub-models correspond to different data subsets. Two data subsets being different means that the sample objects they contain are not identical; that is, at least one sample object exists in one subset but not in the other. Beyond that, the data subsets may or may not share sample objects.
Specifically, after model training has been performed with each of a plurality of data subsets to obtain the object classification sub-model corresponding to each subset, the server can construct an object classification model comprising the plurality of object classification sub-models. For example, as shown in FIG. 3, the server may determine a data set comprising the labeled objects and the first-class supplementary objects and apply bootstrap sampling to it to obtain a plurality of sub-data sets, such as sub-data set 1, sub-data set 2, sub-data set 3, and sub-data set 4. Model training is then performed with each sub-data set to obtain a plurality of object classification sub-models: object classification sub-model 1 corresponding to sub-data set 1, object classification sub-model 2 corresponding to sub-data set 2, object classification sub-model 3 corresponding to sub-data set 3, object classification sub-model 4 corresponding to sub-data set 4, and so on. Finally, an object classification model comprising these object classification sub-models is constructed.
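The FIG. 3 flow is essentially bagging. The sketch below is one illustrative way to realize it with scikit-learn; the decision-tree base learner, the number of subsets, and the use of resample for bootstrap sampling are assumptions of this sketch, not requirements of the method.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.utils import resample

def build_object_classification_model(X: np.ndarray, y: np.ndarray, n_subsets: int = 4):
    """Bootstrap-sample the (labeled + supplementary) data set n_subsets times and
    train one object classification sub-model per sub-data set."""
    sub_models = []
    for seed in range(n_subsets):
        X_boot, y_boot = resample(X, y, replace=True, random_state=seed)  # bootstrap sampling
        sub_models.append(DecisionTreeClassifier(random_state=seed).fit(X_boot, y_boot))
    return sub_models  # aggregation over these sub-models is sketched further below
```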
Further, the classification result of the object classification model is obtained by aggregating the sub-classification results of the object classification sub-models. That is, the server may connect a statistics layer carrying a statistical algorithm to the output of each object classification sub-model to obtain the object classification model. The specific way of aggregating the sub-classification results is not unique.
In a specific embodiment, each sub-classification result includes a preliminary label of the object to be classified. In this case, obtaining the classification result of the object classification model by aggregating the sub-classification results includes: counting the sub-classification results and determining the number of occurrences of each preliminary label among them; and determining the object category represented by the most frequent preliminary label as the object category of the object to be classified. Taking object classification sub-models 1-4 as an example, if sub-model 1 outputs label A, sub-model 2 outputs label B, sub-model 3 outputs label A, and sub-model 4 outputs label C, then label A occurs most often, so the server determines the object category represented by label A as the object category of the object to be classified.
In one embodiment, each sub-classification result includes a preliminary label of the object to be classified and the confidence of that preliminary label. In this case, the aggregation includes: performing confidence statistics for each preliminary label across the sub-classification results, determining a confidence statistic for each preliminary label; and determining the object category represented by the preliminary label with the largest confidence statistic as the object category of the object to be classified. The confidence statistic may be, for example, a mean, a median, or a sum. Again taking sub-models 1-4 and a mean statistic as an example: sub-model 1 outputs label A with confidence 80%, sub-model 2 outputs label B with confidence 95%, sub-model 3 outputs label A with confidence 90%, and sub-model 4 outputs label C with confidence 90%. The mean confidence of label A is 85%, of label B 95%, and of label C 90%; since label B has the largest mean confidence, the server determines the object category represented by label B as the object category of the object to be classified.
In one embodiment, each sub-classification result includes the probabilities that the object to be classified belongs to each candidate label. In this case, the aggregation includes: performing probability statistics for each candidate label across the sub-classification results, obtaining a probability statistic for each candidate label; and determining the object category represented by the candidate label with the largest probability statistic as the object category of the object to be classified. Likewise, the probability statistic may be a mean, a median, or a sum. Taking sub-models 1-4 and a mean statistic as an example: sub-model 1 assigns candidate label A probability 40% and candidate label B probability 60%; sub-model 2 assigns label A 60% and label B 40%; sub-model 3 assigns label A 20% and label B 80%; and sub-model 4 assigns label A 70% and label B 30%. The mean probability of label A is 47.5% and that of label B is 52.5%; since label B has the largest mean probability, the server determines the object category represented by label B as the object category of the object to be classified.
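The three aggregation statistics above can be sketched in a few lines; the function names and the use of a plain mean are illustrative assumptions.

```python
from collections import Counter
import numpy as np

def majority_vote(preliminary_labels: list) -> str:
    """Most frequent preliminary label across sub-models, e.g. [A, B, A, C] -> A."""
    return Counter(preliminary_labels).most_common(1)[0][0]

def confidence_vote(labels_with_confidence: list) -> str:
    """Label with the largest mean confidence,
    e.g. [(A, .80), (B, .95), (A, .90), (C, .90)] -> B (mean .95)."""
    by_label = {}
    for label, conf in labels_with_confidence:
        by_label.setdefault(label, []).append(conf)
    return max(by_label, key=lambda lb: np.mean(by_label[lb]))

def probability_vote(per_model_probs: np.ndarray, candidate_labels: list) -> str:
    """Label with the largest mean probability; rows are sub-models, columns labels,
    e.g. means of 47.5% for A and 52.5% for B select B."""
    return candidate_labels[int(np.argmax(per_model_probs.mean(axis=0)))]
```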
According to the object classification model construction method above, a training set comprising a plurality of unlabeled objects and a plurality of labeled objects is obtained, and for a first-class object with a small object count, semi-supervised learning is performed based on the first-class object and at least a part of the unlabeled objects to obtain a classification model, the classification model being used to determine, among the unlabeled objects of the training set, first-class supplementary objects whose predicted label is the same as that of the first-class object. Model training is performed using data subsets each comprising at least a part of the sample objects to obtain the corresponding object classification sub-models, and an object classification model comprising a plurality of object classification sub-models is constructed. Because the sample objects include the labeled objects and the first-class supplementary objects, the count of first-class objects is increased and the impact of the small sample batch on model accuracy is reduced; and because the classification result is obtained by aggregating the sub-classification results of the sub-models, object classification is performed by ensemble learning over a plurality of sub-models, giving the constructed object classification model better generalization. The method therefore helps improve the performance of the object classification model.
In one embodiment, step S204 includes: determining the first-class object as the learning object, and performing supervised learning on the learning object to obtain an initial classification model; classifying at least a part of the unlabeled objects with the initial classification model to obtain pseudo-label objects carrying predicted labels; determining a new learning object based on the pseudo-label objects and returning to the step of performing supervised learning on the learning object, until a learning stop condition is satisfied; and when the learning stop condition is satisfied, determining the current initial classification model as the classification model corresponding to the label carried by the first-class object.
The specific structure of the initial classification model is not unique; it may be, for example, a decision tree model, a support vector machine model, or a neural network model, and it is a binary classification machine learning model. The learning stop condition may be that the number of learning iterations reaches a set count, or that the combined count of the first-class objects and the first-class supplementary objects no longer satisfies the small-batch condition. It will be appreciated that the classification model finally determined is the initial classification model obtained in the last learning iteration.
Specifically, the server may determine the first-class object as the learning object and perform supervised learning on the object features and category labels of the learning object to obtain an initial classification model, which has a certain ability to recognize the object category of the first-class object. The server then classifies at least a part of the unlabeled objects with the initial classification model to obtain pseudo-label objects carrying predicted labels. A predicted label may be a category label characterizing the object category of the first-class object, making the pseudo-label object a positive sample; or it may characterize a category different from that of the first-class object, making the pseudo-label object a negative sample. The server then determines a new learning object based on the pseudo-label objects and returns to the supervised learning step for the next learning iteration, until the learning stop condition is satisfied. When the learning stop condition is satisfied, the server determines the current initial classification model as the classification model corresponding to the label carried by the first-class object, and takes the pseudo-label objects determined by this classification model whose predicted label is the same as that of the first-class object as the first-class supplementary objects.
In this embodiment, the classification model obtained in the last round of iterative self-learning is determined as the classification model corresponding to the label carried by the first-class objects, and the first-class supplemental objects are obtained by applying this classification model, so that the accuracy of the first-class supplemental objects can be improved while overcoming the impact of the insufficient number of first-class objects.
Further, the specific manner of determining the new learning object based on the pseudo tag object is not unique. For example, the server may determine the pseudo tag object as a new learning object. In a specific embodiment, determining a new learning object based on the pseudo tag object includes: for each pseudo tag object, obtaining the tag confidence of the predicted tag of the pseudo tag object; and determining the pseudo tag object with the tag confidence degree meeting the confidence condition as a new learning object.
Wherein the label confidence can characterize how credible the predicted label is, i.e., the higher the label confidence, the more credible the predicted label. The label confidence satisfying the confidence condition may mean that the label confidence is greater than a confidence threshold, or greater than or equal to the confidence threshold. Specifically, the server may obtain, for each pseudo-label object, the label confidence of its predicted label, compare the label confidences numerically, and determine the pseudo-label objects whose label confidence satisfies the confidence condition as new learning objects.
In one specific implementation, as shown in FIG. 4, the process of determining the binary classification model includes the following steps:
step S401, determining a first class object as a learning object;
step S402, performing supervised learning on a learning object to obtain an initial classification model;
step S403, performing object classification on at least a part of the unlabeled objects by using the initial classification model to obtain pseudo-label objects carrying predictive labels;
step S404, judging whether the learning stop condition is met currently; if not, executing step S405; if yes, go to step S407;
step S405, for each pseudo tag object, obtaining the tag confidence of the predicted tag of the pseudo tag object;
step S406, determining the pseudo tag object with the tag confidence degree meeting the confidence condition as a new learning object; returning to step S402;
step S407, determining the current initial classification model as a classification model corresponding to the tag carried by the first class object.
In the above embodiment, the pseudo-label objects whose label confidence satisfies the confidence condition are determined as new learning objects, which ensures that supervised learning uses pseudo-label objects with relatively high labeling accuracy and thus improves the accuracy of the initial classification model determined in the next iteration.
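As an illustration only, the self-learning loop of steps S401 to S407 can be sketched in Python as follows, assuming scikit-learn and NumPy are available. The logistic-regression base learner, the confidence threshold of 0.9, and the round cap as the learning stop condition are illustrative assumptions rather than part of the described method.

```python
# Minimal self-training sketch of steps S401-S407 (assumptions: sklearn base
# learner, 0.9 confidence threshold, fixed round cap as the stop condition).
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_labeled, y_labeled, X_unlabeled,
               confidence_threshold=0.9, max_rounds=10):
    X_learn, y_learn = X_labeled.copy(), y_labeled.copy()     # step S401
    pool = X_unlabeled.copy()
    model = None
    for _ in range(max_rounds):                               # learning stop condition
        model = LogisticRegression().fit(X_learn, y_learn)    # step S402
        if len(pool) == 0:
            break
        proba = model.predict_proba(pool)                     # step S403
        pred, conf = proba.argmax(axis=1), proba.max(axis=1)  # step S405
        keep = conf >= confidence_threshold                   # step S406
        if not keep.any():
            break
        X_learn = np.vstack([X_learn, pool[keep]])            # grow the learning set
        y_learn = np.concatenate([y_learn, pred[keep]])
        pool = pool[~keep]
    return model                                              # step S407
```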
In one embodiment, the labeled objects further include second-class objects whose object count satisfies a large-batch condition. In this case, the object classification model construction method further includes: classifying the unlabeled objects using the initial classification model to determine the first-class supplemental objects whose predicted label is the same as that of the first-class objects and whose label confidence satisfies the confidence condition; and, when the sum of the numbers of first-class objects and first-class supplemental objects and the object count of the second-class objects satisfy the number balance condition, determining that the learning stop condition is currently satisfied.
The large-batch condition means that the object count is greater than or equal to a second number threshold, where the second number threshold is greater than the first number threshold. It will be appreciated that there is a sample imbalance between the second-class objects and the first-class objects, i.e., their respective counts do not satisfy the number balance condition. The number balance condition means that the difference between the object counts of the respective object classes is smaller than a difference threshold, or smaller than or equal to the difference threshold; the difference threshold may be a ratio or an absolute difference. Specifically, the server classifies the unlabeled objects using the initial classification model and determines the first-class supplemental objects whose predicted label is the same as that of the first-class objects and whose label confidence satisfies the confidence condition. The server then calculates the sum of the numbers of first-class objects and first-class supplemental objects, and determines that the learning stop condition is currently satisfied when this sum and the object count of the second-class objects satisfy the number balance condition.
In this embodiment, when the sum of the numbers of first-class objects and first-class supplemental objects and the object count of the second-class objects satisfy the number balance condition, the learning stop condition is determined to be currently satisfied, so that sample balance can be ensured and the overall performance of the object classification model improved.
In one embodiment, the labeled objects include first-class objects carrying a first label and second-class objects carrying a second label; among the unlabeled objects, the number of objects belonging to the object category represented by the first label is smaller than the number belonging to the category represented by the second label, and the difference between the two satisfies the number imbalance condition. In this case, the object classification model construction method further includes: extracting features from each unlabeled object to obtain the object features of each unlabeled object; performing local density analysis on the unlabeled objects based on their mapping positions in the feature space to which the object features belong, to obtain the local outlier factor corresponding to each unlabeled object; and determining the unlabeled objects whose local outlier factor satisfies the outlier condition as first-class supplemental objects carrying the first label.
The number imbalance condition corresponds to a number balance condition, and may mean that a number difference of the number of objects of each object class is greater than a difference threshold, or greater than or equal to the difference threshold. An outlier condition may refer to a local outlier factor being greater than an outlier factor threshold, or greater than or equal to an outlier factor threshold.
In particular, where the training set includes a majority class and a minority class that satisfy the number imbalance condition, the data points that differ from the majority data, i.e., the minority class, may be identified based on the distribution or the density of the data. Accordingly, the server may extract features from each unlabeled object to obtain its object features, and then perform local density analysis on the unlabeled objects based on their mapping positions in the feature space to which the object features belong, obtaining the local outlier factor corresponding to each unlabeled object. Finally, the server determines the unlabeled objects whose local outlier factor satisfies the outlier condition as first-class supplemental objects carrying the first label. The specific manner of performing the local density analysis is not limited; it may be based on a clustering algorithm, for example, or performed using the LOF (Local Outlier Factor) algorithm.
In a specific embodiment, the server performs local density analysis on each unlabeled object using the LOF algorithm to obtain the local outlier factor corresponding to each unlabeled object. The basic idea of the LOF algorithm is to compare the local density of a data point with the local densities of its neighbors: if the local density of a data point is much lower than that of its neighbors, the data point can be regarded as a mapped point of the minority class in the feature space. Specifically, for each data point, the server first calculates its distance to all other data points and then selects the k-th smallest distance as its k-distance. This distance can be expressed as:

$$d_k(p) = \max_{o \in N_k(p)} \mathrm{dist}(p, o)$$

wherein $N_k(p)$ is the set of k nearest neighbors of data point $p$, and $\mathrm{dist}(p, o)$ is the distance between data points $p$ and $o$.

The server then calculates the local reachability density of each data point, based on the k-distance of the data point and the k-distances of its neighbors. This density can be expressed as:

$$lrd_k(p) = \left( \frac{\sum_{o \in N_k(p)} \mathrm{reachdist}_k(p, o)}{|N_k(p)|} \right)^{-1}$$

wherein $\mathrm{reachdist}_k(p, o)$ is the reachability distance from data point $p$ to $o$, defined, as shown in FIG. 5, as $\mathrm{reachdist}_k(p, o) = \max\{d_k(o), \mathrm{dist}(p, o)\}$. That is, for a data point $p_1$ whose distance from point $o$ is less than $d_k(o)$, the reachability distance from $o$ is $d_k(o)$; for a data point $p_2$ whose distance from point $o$ is greater than $d_k(o)$, the reachability distance from $o$ is $\mathrm{dist}(p_2, o)$.

Finally, the server calculates the local outlier factor of each data point, based on the local reachability densities of the data point and its neighbors. This factor can be expressed as:

$$LOF_k(p) = \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{lrd_k(o)}{lrd_k(p)}$$
if the LOF value of a data point is much greater than 1, then the data point can be considered as a mapped point of a minority class in the feature space.
In the above embodiment, in the case of data imbalance, identifying first-class supplemental objects as the minority class through unsupervised local density analysis can further reduce the impact of the insufficient amount of first-class object data.
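A hedged sketch of this unsupervised supplementation step, using scikit-learn's LocalOutlierFactor implementation of the LOF algorithm; the neighborhood size k and the outlier-factor threshold are assumptions for illustration.

```python
# LOF-based identification of minority-class (first-label) candidates among
# unlabeled objects; k=20 and lof_threshold=1.5 are illustrative assumptions.
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def find_first_label_supplements(X_unlabeled, k=20, lof_threshold=1.5):
    lof = LocalOutlierFactor(n_neighbors=k)
    lof.fit(X_unlabeled)
    # sklearn exposes the negated LOF; flip the sign to recover LOF_k(p)
    scores = -lof.negative_outlier_factor_
    # points whose local density is much lower than their neighbors' (LOF >> 1)
    return np.where(scores > lof_threshold)[0]  # indices of supplemental objects
```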
In one embodiment, feature extraction is performed on each unlabeled object to obtain respective object features of each unlabeled object, including: acquiring respective object information of each non-tag object; corresponding to each piece of sub-information of the non-tag object, carrying out feature extraction on the sub-information by using a feature extraction algorithm matched with the information category of the sub-information to obtain the sub-features of the non-tag object; and determining object characteristics of the unlabeled object based on the sub-characteristics respectively corresponding to each piece of sub-information.
Wherein the object information comprises sub-information of at least two information categories. The information categories may include, for example, numeric, categorical, text, and time-series categories. Taking the case where the object is a merchant as an example, the object information may comprise numeric information such as transaction frequency and transaction amount, categorical information such as transaction mode, text information such as transaction location and customer feedback, and time-series information such as transaction time.
Specifically, the server acquires the object information of each unlabeled object and then, for each piece of sub-information of an unlabeled object, performs feature extraction on the sub-information using a feature extraction algorithm matched with the information category of that sub-information, obtaining the sub-features of the unlabeled object. Finally, the server fuses the sub-features corresponding to the sub-information of the same unlabeled object and determines the object features of that unlabeled object. The specific manner of fusion may be, for example, concatenation, feature calculation, and the like.
For example, for numeric data, the original values may be used directly as features, or transformations such as logarithmic transformation and normalization may be applied; statistical features such as the mean, median, and standard deviation may also be calculated. For categorical data, features may be extracted through encoding; specific encoding algorithms may be, for example, one-hot encoding and label encoding; statistical features such as category frequency and category count may also be calculated. For text data, features may be extracted by combining text processing with a text representation method; text processing may include, for example, word segmentation, stop-word removal, and stemming, and text representation methods may include, for example, the bag-of-words model, TF-IDF (Term Frequency-Inverse Document Frequency), and word embeddings. For time-series data, temporal characteristics may be extracted, such as seasonality and trend, or statistical characteristics may be extracted, such as the moving average and moving standard deviation.
In the above embodiment, the sub-information is extracted by using the feature extraction algorithm matched with the information category of the sub-information, so as to obtain the sub-feature of the label-free object, so that the validity of the sub-feature can be ensured, and the validity of the object feature determined based on each sub-feature can be further ensured.
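The category-matched extraction and fusion described above can be sketched as follows, assuming pandas and scikit-learn; the merchant column names are hypothetical examples, not fields defined by this application.

```python
# Per-category sub-feature extraction with fusion by concatenation
# (column names such as "transaction_amount" are hypothetical).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.feature_extraction.text import TfidfVectorizer

def build_object_features(df: pd.DataFrame):
    extractor = ColumnTransformer([
        # numeric sub-information: normalization
        ("num", StandardScaler(), ["transaction_amount", "transaction_frequency"]),
        # categorical sub-information: one-hot encoding
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["transaction_mode"]),
        # text sub-information: TF-IDF representation (single text column)
        ("txt", TfidfVectorizer(), "customer_feedback"),
    ])
    return extractor.fit_transform(df)  # fused (concatenated) object features
```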
In one embodiment, the object classification model construction method further includes: obtaining objection information fed back by a classification result of the object classification model; under the condition that the objection information meets an objection condition, carrying out anomaly reason matching on the objection information, and determining anomaly reasons matched with semantics represented by the objection information; and adjusting the object classification model based on the abnormality reasons to obtain an updated object classification model.
The objection information may include the disputed classification result and evidence against that result. The objection condition may be characterized by time or semantics, etc. For example, the objection information satisfying the objection condition may mean that the objection information is valid and that the semantics it represents do not match the classification result. The anomaly cause may be, for example, a model false alarm, a feature anomaly, or noise present in the data.
Specifically, after the object classification model is deployed online, the server may acquire, from the terminal, objection information fed back against the classification results of the object classification model. Taking an object classification model for merchants as an example, the server may receive objection information fed back by a merchant or a business team. For example, if the object classification model predicts that a merchant is fraudulent, and the merchant disputes the prediction, the merchant may feed back to the server that it is not involved in fraudulent activity, together with supporting evidence. The server then matches the objection information against the objection condition and judges whether the condition is satisfied. When the objection condition is satisfied, the server performs anomaly cause matching on the objection information, determines the anomaly cause matching the semantics represented by the objection information, and adjusts the object classification model based on the anomaly cause to obtain an updated object classification model. The model adjustment may include, for example, adjusting features, adjusting model parameters, adjusting the model structure, and the like. After adjustment, the adjusted model is retrained to obtain the updated object classification model.
In this embodiment, the model parameters are automatically adjusted through the feedback loop, which can improve the accuracy and robustness of the model.
In one embodiment, as shown in FIG. 6, an object classification model construction method is provided. The method may be performed by a computer device, which may be the terminal or the server shown in FIG. 1; taking the computer device being a server as an example, in this embodiment the method includes the following steps:
in step S601, a training set comprising a plurality of unlabeled objects and a plurality of labeled objects is obtained.
The labeled objects comprise first-class objects carrying a first label and second-class objects carrying a second label; among the unlabeled objects, the number of objects belonging to the object category represented by the first label is smaller than the number belonging to the category represented by the second label, and the difference between the two satisfies the number imbalance condition. The object count of the first-class objects satisfies the small-batch condition, and the object count of the second-class objects satisfies the large-batch condition; the small-batch condition means that the object count is less than or equal to a first number threshold; the large-batch condition means that the object count is greater than or equal to a second number threshold; the second number threshold is greater than the first number threshold.
In step S602, object information of each tagged object and each untagged object is acquired.
Wherein the object information comprises sub-information of at least two information categories.
Step S603, corresponding to each piece of sub-information, using a feature extraction algorithm matched with the category of the sub-information to perform feature extraction on the sub-information, so as to obtain a sub-feature corresponding to the sub-information.
In step S604, the object features of each tagged object and each untagged object are determined based on the sub-features corresponding to each sub-information.
Step S605, based on the mapping positions of the unlabeled objects in the feature space to which the object features belong, carrying out local density analysis on the unlabeled objects to obtain local outliers corresponding to the unlabeled objects.
In step S606, the unlabeled object whose local outlier factor satisfies the outlier condition is determined as the first-class supplemental object carrying the first label.
In step S607, the first class object is determined as the learning object.
Step S608, performing supervised learning on the learning object based on the object features and the object categories of the learning object to obtain an initial classification model.
Step S609, classify the unlabeled objects using the initial classification model, and determine the first-class supplemental objects whose predicted label is the same as that of the first-class objects and whose label confidence satisfies the confidence condition.
In step S610, when the sum of the numbers of the first class objects and the first class supplemental objects and the number of objects of the second class objects satisfy the number balance condition, it is determined that the learning stop condition is currently satisfied, and the current initial classification model is determined as the classification model corresponding to the tag carried by the first class object.
In step S611, in the case that the sum of the numbers of the first-class objects and the first-class supplemental objects and the number of objects of the second-class objects do not satisfy the number balance condition, it is determined that the learning stop condition is not currently satisfied.
In step S612, at least a part of the unlabeled objects are subject to object classification using the initial classification model, so as to obtain pseudo-labeled objects carrying predictive labels.
Step S613, for each pseudo tag object, obtains the tag confidence of the predicted tag of the pseudo tag object, and determines the pseudo tag object whose tag confidence satisfies the confidence condition as a new learning object. Returning to step S608.
In step S614, the labeled object and the first category supplemental object are taken as sample objects, and a data set including the sample objects is determined.
Step S615 extracts at least a portion of the tagged objects and at least a portion of the first class supplemental objects from the dataset, forming a subset of the data.
In step S616, model training is performed using the data subset including at least a portion of the sample objects, to obtain an object classification sub-model corresponding to the data subset.
Wherein the sample object includes a tag object and a first class supplemental object.
Step S617 builds an object classification model comprising a plurality of object classification sub-models.
The classification result of the object classification model is obtained by counting the respective sub-classification results of the object classification sub-models.
Step S618, obtaining objection information fed back by the classification result of the object classification model.
Step S619, under the condition that the objection information meets the objection condition, carrying out anomaly reason matching on the objection information, and determining anomaly reasons matched with the semantics represented by the objection information.
In step S620, the object classification model is adjusted based on the abnormality cause, and an updated object classification model is obtained.
According to the above object classification model construction method, a training set comprising a plurality of unlabeled objects and a plurality of labeled objects is obtained. For the first-class objects, whose object count is small, semi-supervised learning is performed based on the first-class objects and at least a part of the unlabeled objects to obtain a classification model, which is used to determine, among the unlabeled objects of the training set, the first-class supplemental objects whose predicted label is the same as that of the first-class objects. Model training is then performed using data subsets each comprising at least a part of the sample objects, yielding one object classification sub-model per data subset, and an object classification model comprising a plurality of object classification sub-models is constructed. Because the sample objects comprise both the labeled objects and the first-class supplemental objects, the number of first-class samples is increased and the impact of the small sample batch on model accuracy is reduced. Because the classification result of the object classification model is obtained by aggregating the sub-classification results of the object classification sub-models, object classification is in effect performed by a plurality of sub-models in an ensemble-learning manner, so the finally constructed object classification model has better generalization performance. The method is therefore beneficial to improving the performance of the object classification model.
In one embodiment, the application further provides an application scenario of abnormal merchant identification. In this scenario, the server may obtain a training set that includes a plurality of unlabeled objects and a plurality of labeled objects. As shown in FIG. 7, the labeled objects may include abnormal objects 701 carrying an abnormal label and normal objects 703 carrying a normal label; the unlabeled objects may include potentially abnormal objects 702 and potentially normal objects 704, neither carrying a label. In an actual scenario, the number of abnormal objects 701 is much smaller than that of normal objects 703, and the number of potentially abnormal objects 702 is much smaller than that of potentially normal objects 704; that is, the sample types in the training set satisfy the sample imbalance condition.
Then, the server performs feature engineering processing on each object to obtain respective object features of each object. Feature engineering is the most important step in machine learning, and determines the upper performance limit of a model. In particular, useful features need to be extracted from the raw merchant data, which may include transaction pattern features, transaction frequency features, transaction amount features, customer feedback features, etc. of the merchant. These features may help to better understand the behavior patterns of merchants and identify abnormal merchants.
In a specific embodiment, the process of feature engineering may be represented as a feature extraction function f that accepts raw merchant data D as input and outputs the extracted features X. This function may include multiple steps such as data cleansing, data conversion, and feature selection, and the process can be represented by X = f(D). The server may need to select an appropriate feature extraction method according to the type of the data and the task. For example, for numeric data, the original values may be used directly as features, or transformations such as logarithmic transformation and normalization may be applied; statistical features such as the mean, median, and standard deviation may also be calculated. For categorical data, features may be extracted through encoding, for example one-hot encoding or label encoding; statistical features such as category frequency and category count may also be calculated. For text data, features may be extracted by combining text processing with a text representation method; text processing may include, for example, word segmentation, stop-word removal, and stemming, and text representation methods may include, for example, the bag-of-words model, TF-IDF (Term Frequency-Inverse Document Frequency), and word embeddings. For time-series data, temporal characteristics such as seasonality and trend may be extracted, or statistical characteristics such as the moving average and moving standard deviation.
After obtaining the respective object features of each object in the training set, the server may perform unsupervised learning and semi-supervised learning based on each object feature to obtain the abnormal supplemental object 705, so as to compensate for the influence of insufficient number of abnormal objects.
In one particular embodiment, the process of identifying potential anomaly objects in unlabeled objects through unsupervised learning is referred to as anomaly detection. The process of anomaly detection may be represented as an anomaly detection function g that accepts the extracted feature X as input and outputs a detected anomaly merchant A. This function may include multiple steps of data preprocessing, model training, anomaly detection, and the like. This process can be expressed by the formula a=g (X). Specifically, the abnormality detection may be performed using a density-based abnormality detection method, which may be, for example, an LOF algorithm. The basic idea of the LOF algorithm is to compare the local density of one data point with the local density of its neighbors. A data point may be considered an outlier if its local density is much lower than its neighbors.
Further, semi-supervised learning can also be performed synchronously based on the results of the feature engineering. For a small number of abnormal merchant samples already present, some semi-supervised learning algorithms may be used to exploit unlabeled data. These algorithms can be trained with only a small number of tags and thus can help discover more potential fraudulent merchants.
In a specific embodiment, the semi-supervised learning process may be represented as a semi-supervised learning function h that accepts as inputs the extracted features X and the existing abnormal merchant samples Y, and outputs the potential abnormal merchants S obtained by semi-supervised learning. This function may include multiple steps of data preprocessing, model training, prediction, and the like. This process can be expressed by the formula s=h (X, Y). Specifically, self-learning (Self-learning) may be used as a method of semi-supervised learning. Self-learning is a simple and effective semi-supervised learning approach that can train models with small amounts of labeled data and large amounts of unlabeled data.
The self-learning method comprises the following specific steps:
(1) Training an initial model: an initial model is trained using existing small amounts of tag data. This model may be any supervised learning model, such as decision trees, support vector machines, neural networks, etc.
(2) Predicting unlabeled data: this initial model is used to predict unlabeled data. A predictive label for each unlabeled data point may be derived, along with a confidence level for this predictive label.
(3) Selecting a high confidence prediction: and selecting a part of predictions with highest confidence, and taking the predicted labels as actual labels. This step is the key to self-learning, which enables model improvement with unlabeled data.
(4) Retraining the model: the model is retrained using the original tag data and the new tag data. This new model can then be used to predict more unlabeled data and then the above steps repeated until the learning stop condition is met.
As shown in fig. 7, a certain number of potential abnormal merchants can be identified in each iterative learning process of semi-supervised learning, so that the number of potential abnormal merchants obtained by the next iterative learning is gradually increased. In practical applications, other suitable semi-supervised learning algorithms may also be selected according to business requirements and data characteristics. For example, it may be desirable to process high-dimensional data using some graph-based algorithms, or to process low-dimensional data using some cluster-based algorithms.
After unsupervised learning and semi-supervised learning have each identified a part of the potentially abnormal merchants, the server fuses their identification results with the labeled objects in the training set to obtain a data set for ensemble learning. For example, the results of anomaly detection and semi-supervised learning may be fused using an integration method such as Stacking or Voting.
In a specific embodiment, the process of ensemble learning may be represented as an ensemble learning function i that accepts the anomaly detection result A and the semi-supervised learning result S as inputs and outputs the ensemble learning result E. This function may include multiple steps such as data preprocessing, model training, and prediction, and the process can be expressed by the formula E = i(A, S). Specifically, the server may randomly extract a plurality of subsets from the data set; these subsets may or may not contain repeated data points. This process is also known as bootstrap sampling (self-sampling). Each subset is then used to train a sub-model, which may be any supervised learning model, such as a decision tree, a support vector machine, or a neural network. In this way, a plurality of sub-models is obtained. Finally, the server combines the sub-models to obtain an object classification model for abnormal merchant identification. In practical application, the server can aggregate the prediction results of the sub-models to obtain the classification result for a merchant. Specifically, for classification problems, voting may be used to combine the prediction results; for regression problems, the prediction results may be combined by averaging.
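A minimal sketch of this bootstrap-sampling ensemble, assuming NumPy, scikit-learn, and integer class labels; the decision-tree base learner and the number of sub-models are illustrative assumptions.

```python
# Bootstrap sampling + voting ensemble (assumes integer class labels).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_ensemble(X, y, n_models=5, seed=0):
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample, repeats allowed
        models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return models

def classify(models, X):
    preds = np.stack([m.predict(X) for m in models])  # sub-classification results
    # voting: the label predicted most often becomes the final classification
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, preds)
```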
In a specific embodiment, some deep learning methods can be used for feature learning and anomaly detection, as a complement to the ensemble learning. The method can automatically extract deep features and improve the accuracy of fraud detection. The process of deep learning can be represented as a deep learning function j that accepts the extracted feature X as input and outputs a result L of deep learning. This function may include multiple steps of data preprocessing, model training, prediction, and the like. This process can be expressed by the following formula: l=j (X). In the present embodiment, a deep neural network may be used as a method of deep learning. Deep neural networks are a powerful machine learning model that can automatically learn complex features from data.
The specific steps of the deep neural network are as follows (a minimal runnable sketch follows the list):
(1) Designing a network structure: the structure of the neural network is designed. This includes selecting the number of layers of the network, the number of nodes per layer, the activation function per layer, etc. The network typically requires an input layer, one or more hidden layers, and an output layer.
(2) Initializing parameters: parameters of the network are initialized. These parameters include the weight and bias of each layer. Some random method is typically used to initialize these parameters, such as gaussian initialization, uniform initialization, etc.
(3) Forward propagation: data and parameters are used for forward propagation. In forward propagation, data is passed from the input layer to the hidden layer and then to the output layer, resulting in a predicted result.
(4) Calculating loss: and calculating the loss between the predicted result and the real result. This loss may be indicative of the accuracy of the prediction. This loss is typically calculated using some loss function, such as cross entropy loss, mean square error loss, etc.
(5) Back propagation: the parameters are updated using back propagation. In back propagation, a gradient of the loss versus the parameter is calculated and then used to update the parameter.
(6) Iterative training: repeating the steps, and performing repeated iterative training. In each iteration, the parameters are updated and then used for the next forward propagation.
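A minimal PyTorch sketch of steps (1) to (6); the layer sizes, optimizer, learning rate, and placeholder data are illustrative assumptions only.

```python
# Steps (1)-(6) of the deep neural network, in PyTorch (sizes are assumptions).
import torch
import torch.nn as nn

model = nn.Sequential(             # (1) input layer -> hidden layer -> output layer
    nn.Linear(16, 32), nn.ReLU(),
    nn.Linear(32, 2),
)                                  # (2) weights/biases randomly initialized by default
loss_fn = nn.CrossEntropyLoss()    # (4) cross-entropy loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

X = torch.randn(128, 16)           # placeholder features (16-dim is an assumption)
y = torch.randint(0, 2, (128,))    # placeholder binary labels

for epoch in range(100):           # (6) iterative training
    logits = model(X)              # (3) forward propagation
    loss = loss_fn(logits, y)      # (4) loss between prediction and ground truth
    optimizer.zero_grad()
    loss.backward()                # (5) back propagation: gradients w.r.t. parameters
    optimizer.step()               # (5) parameter update
```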
In a specific embodiment, after the object classification model is deployed online, the server may also establish a feedback loop that feeds prediction results back to merchants and adjusts the model based on their feedback, so as to continually refine the model and accommodate new abnormal behavior (a schematic sketch follows the steps below). The feedback loop may be expressed as a function k that accepts the ensemble learning result E, the deep learning result L, and the merchant feedback R as inputs and outputs the feedback result F. This function may include multiple steps such as data preprocessing, model training, and prediction, and the process can be expressed by the formula F = k(E, L, R). The specific process of feedback learning is as follows:
(1) Collect feedback: feedback on the model's prediction results is collected. This feedback may come from merchants or from the business team. For example, if the model predicts that a merchant is fraudulent, but the merchant feeds back that it did not engage in fraud and provides supporting evidence, that feedback can be collected.
(2) Analyze feedback: the reasons behind the feedback need to be understood, e.g., whether the model produced a false positive or a false negative, whether the features are problematic, or whether the data contains noise.
(3) Adjust the model: the model needs to be adjusted based on the feedback. This may include adjusting features, adjusting model parameters, adjusting the model structure, and the like.
(4) Retrain the model: the adjusted model needs to be retrained. The new model can then be used to predict more data, and the above steps are repeated.
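The feedback loop F = k(E, L, R) can be sketched schematically as follows; the feedback record format and the retraining hook are assumptions, since the application does not fix a concrete interface.

```python
# Schematic feedback loop: collect -> filter objections -> adjust/retrain.
# The dict keys and the collect_fn/retrain_fn callables are hypothetical.
def feedback_loop(model, collect_fn, retrain_fn, max_rounds=3):
    for _ in range(max_rounds):
        feedback = collect_fn()                  # (1) collect feedback
        disputed = [f for f in feedback          # (2) keep valid objections that
                    if f["valid"]                #     contradict the prediction
                    and f["claimed_label"] != f["prediction"]]
        if not disputed:
            break                                # nothing to adjust for
        model = retrain_fn(model, disputed)      # (3)+(4) adjust and retrain
    return model
```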
According to the above object classification model construction method, unbalanced samples can be handled effectively through ensemble learning and semi-supervised learning, improving the recognition rate of the minority class. Through deep learning, feature learning is performed automatically, reducing the workload of manual feature engineering. Model parameters are adjusted automatically through the feedback loop, improving the accuracy and robustness of the model. Therefore, adopting this construction method can improve the overall performance of the object classification model.
In one embodiment, as shown in FIG. 8, an object classification method is provided. The method may be performed by a computer device, which may be the terminal or the server shown in FIG. 1; taking the method being applied to the server in FIG. 1 as an example, in this embodiment the method includes the following steps:
step S802, object information of an object to be classified is acquired.
The object information may include, for example, identification information and interaction information, and may comprise sub-information of a plurality of information categories. The information categories may include, for example, numeric, categorical, text, and time-series categories. Taking the case where the object is a merchant as an example, the object information may comprise numeric information such as transaction frequency and transaction amount, categorical information such as transaction mode, text information such as transaction location and customer feedback, and time-series information such as transaction time.
In step S804, feature extraction is performed on the object information to obtain object features of the object to be classified.
Specifically, for each piece of sub-information, the server may perform feature extraction on the sub-information using a feature extraction algorithm matched with the information category to which the sub-information belongs, obtaining the sub-features of the object to be classified. Finally, the server fuses the sub-features corresponding to the respective pieces of sub-information and determines the object features of the object to be classified. The specific manner of fusion may be, for example, concatenation, feature calculation, and the like.
Step S806, classifying the object features of the object to be classified by using the object classification sub-models included in the object classification model, respectively, to obtain a plurality of sub-classification results of the object to be classified.
The object classification model is constructed based on the object classification model construction method. The sub-classification results may include categories, probabilities, confidence levels, and so forth. Specifically, the server inputs object features of the objects to be classified into each object classification sub-model respectively, and a plurality of sub-classification results are obtained.
Step S808, carrying out statistical analysis on each sub-classification result to obtain the object category of the object to be classified.
According to the above object classification method, in the object classification modeling process, a training set comprising a plurality of unlabeled objects and a plurality of labeled objects is obtained. For the first-class objects, whose object count is small, semi-supervised learning is performed based on the first-class objects and at least a part of the unlabeled objects to obtain a classification model, which is used to determine, among the unlabeled objects of the training set, the first-class supplemental objects whose predicted label is the same as that of the first-class objects. Model training is then performed using data subsets each comprising at least a part of the sample objects, yielding one object classification sub-model per data subset, and an object classification model comprising a plurality of object classification sub-models is constructed. Because the sample objects comprise both the labeled objects and the first-class supplemental objects, the number of first-class samples is increased and the impact of the small sample batch on model accuracy is reduced. Because the classification result of the object classification model is obtained by aggregating the sub-classification results of the object classification sub-models, object classification is in effect performed by a plurality of sub-models in an ensemble-learning manner, so the finally constructed object classification model has better generalization performance. Therefore, the method is beneficial to improving the performance of the object classification model and, in turn, the accuracy of the object classification results.
In one embodiment, the sub-classification result includes a preliminary label of the object to be classified. In this case, step S808 includes: counting all the sub-classification results and determining the number of occurrences of each preliminary label across the sub-classification results; and determining the object category represented by the preliminary label with the most occurrences as the object category of the object to be classified. Taking the case where the object classification model includes object classification sub-models 1-4 as an example: if the preliminary label determined by object classification sub-model 1 is label A, that determined by sub-model 2 is label B, that determined by sub-model 3 is label A, and that determined by sub-model 4 is label C, then, because label A occurs most often, the server can determine the object category represented by label A as the object category of the object to be classified.
In this embodiment, the object category represented by the preliminary label with the most occurrences is determined as the object category of the object to be classified; the algorithm is simple and improves object classification efficiency.
In one embodiment, the sub-classification result includes a preliminary label of the object to be classified and the confidence of that preliminary label. In this case, step S808 includes: performing confidence statistics on each preliminary label across the sub-classification results to determine the confidence statistic of each preliminary label; and determining the object category represented by the preliminary label with the largest confidence statistic as the object category of the object to be classified.
The confidence statistic may be, for example, an average, a median, or a sum. Again taking the case where the object classification model includes object classification sub-models 1-4 and the confidence statistic is the average: suppose the preliminary label determined by sub-model 1 is label A with confidence 80%, that determined by sub-model 2 is label B with confidence 95%, that determined by sub-model 3 is label A with confidence 90%, and that determined by sub-model 4 is label C with confidence 90%. The average confidence of label A is then 85%, that of label B is 95%, and that of label C is 90%; because label B has the largest average confidence, the server can determine the object category represented by label B as the object category of the object to be classified.
In this embodiment, the object category represented by the preliminary label with the largest confidence statistic is determined as the object category of the object to be classified, which can further improve the accuracy of the object classification result.
In one embodiment, the sub-classification result includes probabilities that the objects to be classified respectively belong to each candidate tag. In the case of this embodiment, step S808 includes: respectively carrying out probability statistics on each candidate label in each sub-classification result to obtain respective probability statistics values of each candidate label; and determining the object category represented by the candidate label with the largest probability statistic as the object category of the object to be classified.
The probability statistic may be, for example, an average, a median, or a sum. Taking the case where the object classification model includes object classification sub-models 1-4 and the probability statistic is the average: suppose sub-model 1 determines the probability of candidate label A to be 40% and that of candidate label B to be 60%; sub-model 2 determines the probability of candidate label A to be 60% and that of candidate label B to be 40%; sub-model 3 determines the probability of candidate label A to be 20% and that of candidate label B to be 80%; and sub-model 4 determines the probability of candidate label A to be 70% and that of candidate label B to be 30%. The average probability of label A is then 47.5% and that of label B is 52.5%; because label B has the largest average probability, the server can determine the object category represented by label B as the object category of the object to be classified.
In this embodiment, the object class represented by the candidate tag with the largest probability statistic is determined as the object class of the object to be classified.
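The three aggregation statistics of step S808 (vote counting, confidence averaging, probability averaging) can be sketched as follows; the worked examples reproduce the numbers used above, and the label names are illustrative.

```python
# Aggregating sub-classification results: votes, mean confidence, mean probability.
from collections import Counter
import numpy as np

def by_votes(labels):                    # preliminary label with most occurrences
    return Counter(labels).most_common(1)[0][0]

def by_confidence(labels, confidences):  # label with highest mean confidence
    groups = {}
    for lab, c in zip(labels, confidences):
        groups.setdefault(lab, []).append(c)
    return max(groups, key=lambda lab: np.mean(groups[lab]))

def by_probability(prob_rows, tags):     # label with highest mean probability
    return tags[np.asarray(prob_rows).mean(axis=0).argmax()]

print(by_votes(["A", "B", "A", "C"]))                                # -> A
print(by_confidence(["A", "B", "A", "C"], [0.80, 0.95, 0.90, 0.90])) # -> B
print(by_probability([[0.4, 0.6], [0.6, 0.4], [0.2, 0.8], [0.7, 0.3]],
                     ["A", "B"]))                                    # -> B
```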
It should be understood that, although the steps in the flowcharts related to the embodiments described above are sequentially shown as indicated by arrows, these steps are not necessarily sequentially performed in the order indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of steps or a plurality of stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily performed sequentially, but may be performed alternately or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides an object classification model construction device for realizing the above related object classification model construction method. The implementation of the solution provided by the apparatus is similar to the implementation described in the above method, so the specific limitation in the embodiments of the apparatus for constructing an object classification model provided below may be referred to the limitation of the method for constructing an object classification model hereinabove, and will not be described herein.
In one embodiment, as shown in fig. 9, there is provided an object classification model construction apparatus including: a training set acquisition module 901, a semi-supervised learning module 902, a sub-model training module 903, and an object classification model construction module 904, wherein:
a training set obtaining module 901, configured to obtain a training set including a plurality of unlabeled objects and a plurality of labeled objects; the tagged objects include a first class of objects whose number meets a small batch condition; the small batch condition means that the number of objects is less than or equal to a first number threshold;
the semi-supervised learning module 902 is configured to perform semi-supervised learning based on the first class object and at least a portion of the unlabeled objects to obtain a classification model; the two-classification model is used for determining a first-class supplementary object with the same predictive label as the first-class object in the label-free objects;
The sub-model training module 903 is configured to perform model training by using a data subset including at least a portion of the sample objects, so as to obtain an object classification sub-model corresponding to the data subset; the sample object comprises a label object and a first type supplementary object;
an object classification model construction module 904 for constructing an object classification model comprising a plurality of object classification sub-models; the classification result of the object classification model is obtained by counting the respective sub-classification results of the object classification sub-models.
In one embodiment, the semi-supervised learning module 902 includes: the initial classification model determining unit is used for determining the first class object as a learning object, and performing supervised learning on the learning object to obtain an initial classification model; the classifying unit is used for classifying at least one part of unlabeled objects by using the initial classifying model to obtain pseudo-label objects carrying predictive labels; the iteration unit is used for determining a new learning object based on the pseudo tag object and returning to the step of performing supervised learning on the learning object until the learning stopping condition is met; and the classification model determining unit is used for determining the current initial classification model as the classification model corresponding to the tag carried by the first class object under the condition that the learning stop condition is met.
In one embodiment, the iteration unit is specifically configured to: for each pseudo tag object, acquiring the tag confidence of the predicted tag of the pseudo tag object; and determining the pseudo tag object with the tag confidence degree meeting the confidence condition as a new learning object.
In one embodiment, the labeled objects further include second-class objects whose object count satisfies the large-batch condition; the large-batch condition means that the object count is greater than or equal to a second number threshold, where the second number threshold is greater than the first number threshold. In this case, the semi-supervised learning module 902 further includes a determination unit configured to: classify the unlabeled objects using the initial classification model to determine the first-class supplemental objects whose predicted label is the same as that of the first-class objects and whose label confidence satisfies the confidence condition; and determine that the learning stop condition is currently satisfied when the sum of the numbers of first-class objects and first-class supplemental objects and the object count of the second-class objects satisfy the number balance condition.
In one embodiment, the object classification model construction apparatus further includes a data subset construction module for: determining a data set comprising the sample object with the tagged object and the first class supplemental object as sample objects; at least a portion of the tagged objects and at least a portion of the first class supplemental objects are extracted from the dataset, constituting a subset of the data.
In one embodiment, the labeled objects include first-class objects carrying a first label and second-class objects carrying a second label; among the unlabeled objects, the number of objects belonging to the object category represented by the first label is smaller than the number belonging to the category represented by the second label, and the difference between the two satisfies the number imbalance condition. In this case, the object classification model construction apparatus further includes: an object feature extraction module for extracting features from each unlabeled object to obtain the object features of each unlabeled object; a local density analysis module for performing local density analysis on the unlabeled objects based on their mapping positions in the feature space to which the object features belong, to obtain the local outlier factor corresponding to each unlabeled object; and a first-class supplemental object determination module for determining the unlabeled objects whose local outlier factor satisfies the outlier condition as first-class supplemental objects carrying the first label.
In one embodiment, the object feature extraction module is specifically configured to: acquiring respective object information of each non-tag object; the object information comprises sub-information of at least two information categories; corresponding to each piece of sub-information of the non-tag object, performing feature extraction on the sub-information by using a feature extraction algorithm matched with the information category of the sub-information to obtain the sub-feature of the non-tag object; and determining object characteristics of the unlabeled object based on the sub-characteristics respectively corresponding to each piece of sub-information.
In one embodiment, the object classification model construction device further includes a feedback adjustment module for: obtaining objection information fed back by a classification result of the object classification model; under the condition that the objection information meets an objection condition, carrying out anomaly reason matching on the objection information, and determining an anomaly reason matched with the semantic represented by the objection information; and adjusting the object classification model based on the abnormality reasons to obtain an updated object classification model.
The respective modules in the above object classification model construction apparatus may be implemented in whole or in part by software, hardware, and a combination thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, as shown in fig. 10, there is provided an object classification apparatus including: an object information acquisition module 1001, an object feature extraction module 1002, a sub-classification result determination module 1003, and an object class determination module 1004, wherein:
an object information acquisition module 1001, configured to acquire object information of an object to be classified;
an object feature extraction module 1002, configured to perform feature extraction on the object information to obtain object features of the object to be classified;
a sub-classification result determination module 1003, configured to classify the object features of the object to be classified using each object classification sub-model included in the object classification model, to obtain a plurality of sub-classification results for the object to be classified, the object classification model being constructed by the object classification model construction method described above; and
an object class determination module 1004, configured to perform statistical analysis on the sub-classification results to obtain the object class of the object to be classified; a sketch of this ensemble pipeline follows.
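Purely as an illustrative sketch (the patent does not fix a model family or voting rule), the ensemble pipeline above might look as follows; the choice of decision trees and of majority voting corresponds to the first aggregation embodiment described next, and every name here is an assumption:

import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

def train_sub_models(data_subsets):
    # One object classification sub-model per data subset.
    models = []
    for features, labels in data_subsets:
        model = DecisionTreeClassifier()
        model.fit(features, labels)
        models.append(model)
    return models

def classify(models, object_features: np.ndarray):
    # Each sub-model yields one sub-classification result (a preliminary label);
    # the object class is the label occurring most often across sub-models.
    sub_results = [m.predict(object_features.reshape(1, -1))[0] for m in models]
    return Counter(sub_results).most_common(1)[0][0]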
In one embodiment, the sub-classification result includes a preliminary label of the object to be classified. In this embodiment, the object class determination module 1004 is specifically configured to: count the sub-classification results and determine the number of occurrences of each preliminary label across all sub-classification results; and determine the object category represented by the preliminary label with the largest number of occurrences as the object category of the object to be classified.
In one embodiment, the sub-classification result includes a preliminary label of the object to be classified and a confidence for that preliminary label. In this embodiment, the object class determination module 1004 is specifically configured to: perform confidence statistics on each preliminary label across the sub-classification results and determine a confidence statistic for each preliminary label; and determine the object category represented by the preliminary label with the largest confidence statistic as the object category of the object to be classified.
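A minimal sketch of this confidence-statistic variant, assuming the statistic is a per-label sum of confidences (the patent leaves the exact statistic open):

from collections import defaultdict

def aggregate_by_confidence(sub_results):
    # sub_results: (preliminary_label, confidence) pairs, one per sub-model.
    totals = defaultdict(float)
    for label, confidence in sub_results:
        totals[label] += confidence
    # The object class is the label with the largest confidence statistic.
    return max(totals, key=totals.get)

# For example, [("cat", 0.9), ("dog", 0.6), ("cat", 0.7)] yields "cat"
# (summed confidence 1.6 versus 0.6).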
In one embodiment, the sub-classification result includes the probability that the object to be classified belongs to each candidate label. In this embodiment, the object class determination module 1004 is specifically configured to: perform probability statistics on each candidate label across the sub-classification results to obtain a probability statistic for each candidate label; and determine the object category represented by the candidate label with the largest probability statistic as the object category of the object to be classified.
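And a sketch of the probability-statistic variant, assuming the statistic is the mean probability per candidate label across sub-models:

import numpy as np

def aggregate_by_probability(prob_matrix: np.ndarray, labels):
    # prob_matrix has shape (n_sub_models, n_candidate_labels); row i is the
    # probability distribution output by sub-model i over the candidate labels.
    mean_probs = prob_matrix.mean(axis=0)
    return labels[int(np.argmax(mean_probs))]

# For example, with labels ["cat", "dog"] and rows [0.8, 0.2] and [0.4, 0.6],
# the mean probabilities are [0.6, 0.4], so the object class is "cat".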
Each module in the above object classification apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or be independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 11. The computer device includes a processor, a memory, an input/output (I/O) interface, and a communication interface. The processor, the memory, and the I/O interface are connected through a system bus, and the communication interface is connected to the system bus through the I/O interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer device is used to store data related to the object classification model construction method or the object classification method. The I/O interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements an object classification model construction method or an object classification method.
In one embodiment, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in fig. 12. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory, and the input/output interface are connected through a system bus, and the communication interface, the display unit, and the input device are connected to the system bus through the input/output interface. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless mode can be implemented through Wi-Fi, a mobile cellular network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements an object classification model construction method or an object classification method. The display unit of the computer device forms a visual picture and may be a display screen, a projection device, or a virtual reality imaging device; the display screen may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, trackball, or touchpad provided on the housing of the computer device, or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structures shown in fig. 11 and 12 are block diagrams of only portions of structures that are relevant to the present application and are not intended to limit the computer device on which the present application may be implemented, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, there is also provided a computer device comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the method embodiments described above when the computer program is executed.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when executed by a processor, carries out the steps of the method embodiments described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the method embodiments described above.
In this application, where the embodiments involve the collection and processing of relevant data, the informed consent or separate consent of the personal information subject should be obtained in strict accordance with the laws and regulations of the relevant regions, and subsequent data use and processing should be carried out within the scope authorized by laws and regulations and by the personal information subject.
Those skilled in the art will appreciate that all or part of the processes in the above method embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored on a non-transitory computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided in this application may include at least one of non-volatile and volatile memory. Non-volatile memory may include read-only memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (MRAM), ferroelectric random access memory (FRAM), phase change memory (PCM), graphene memory, and the like. Volatile memory may include random access memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take many forms, such as static random access memory (SRAM) or dynamic random access memory (DRAM). The databases referred to in the embodiments provided in this application may include at least one of relational and non-relational databases; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided in this application may be, but are not limited to, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, and data processing logic units based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments represent only a few implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the present application. It should be noted that those of ordinary skill in the art could make various modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Accordingly, the protection scope of the present application shall be subject to the appended claims.

Claims (26)

1. An object classification model construction method, characterized by being applied to a server, the method comprising:
acquiring a plurality of tagged images from a terminal, acquiring a plurality of unlabeled images from a data storage system, and determining a training set comprising each tagged image and each unlabeled image; the tagged images include a first class of images whose quantity satisfies a small-batch condition; the small-batch condition means that the quantity is less than or equal to a first quantity threshold;
respectively performing feature extraction on the first class image and on at least a part of the unlabeled images to obtain the respective image features of each image;
performing semi-supervised learning based on the respective image features of the first class image and of at least a part of the unlabeled images to obtain a classification model; the classification model is used for determining, among the unlabeled images, first-class supplementary images whose predicted label is the same as that of the first class image;
performing model training using a data subset comprising the respective image features of at least a part of the sample objects to obtain an object classification sub-model corresponding to the data subset; the sample objects include the tagged images and the first-class supplementary images;
constructing an object classification model comprising a plurality of object classification sub-models; the classification result of the object classification model is obtained by counting the respective sub-classification results of the object classification sub-models.
2. The method of claim 1, wherein the semi-supervised learning based on the respective image features of the first class image and at least a portion of the unlabeled image to obtain a classification model comprises:
determining the first class image as a learning object, and performing supervised learning on the learning object to obtain an initial classification model;
performing image classification on at least a part of the label-free images by using the initial classification model to obtain pseudo label images carrying predictive labels;
determining a new learning object based on the pseudo tag image, and returning to the step of performing supervised learning on the learning object until a learning stopping condition is met;
and under the condition that the learning stopping condition is met, determining the current initial classification model as a classification model corresponding to the label carried by the first class image.
3. The method of claim 2, wherein the determining a new learning object based on the pseudo tag image comprises:
acquiring label confidence of a predicted label of each pseudo label image;
and determining the pseudo tag image with the tag confidence degree meeting the confidence condition as a new learning object.
4. The method of claim 2, wherein the tagged images further comprise a second class of images whose quantity satisfies a large-batch condition; the large-batch condition means that the quantity is greater than or equal to a second quantity threshold; the second quantity threshold is greater than the first quantity threshold;
The method further comprises the steps of:
classifying the unlabeled images using the initial classification model, and determining first-class supplementary images whose predicted label is the same as that of the first class image and whose label confidence satisfies a confidence condition;
and determining that the learning stop condition is currently satisfied when the sum of the number of the first class images and the number of the first-class supplementary images, together with the number of the second class images, satisfies a quantity balance condition.
5. The method according to claim 1, wherein the method further comprises:
determining a dataset that contains the tagged images and the first-class supplementary images as sample objects;
and extracting at least a part of the tagged images and at least a part of the first-class supplementary images from the dataset to constitute a data subset.
6. The method of claim 1, wherein the tagged images comprise a first category image carrying a first label, and a second category image carrying a second label; the number of images in the unlabeled images belonging to the image category represented by the first label is smaller than the number belonging to the image category represented by the second label, and the difference between the two numbers satisfies a quantity imbalance condition;
The method further comprises the steps of:
extracting the characteristics of each non-label image to obtain the respective image characteristics of each non-label image;
based on the mapping positions of the unlabeled images in the feature space to which the image features belong, carrying out local density analysis on the unlabeled images to obtain the local outlier factors corresponding to the unlabeled images respectively;
and determining the unlabeled image of which the local outlier factor meets an outlier condition as a first-class supplementary image carrying a first label.
7. The method of claim 6, wherein the performing feature extraction on each of the unlabeled images to obtain respective image features of each of the unlabeled images includes:
acquiring respective image information of each label-free image; the image information comprises sub-information of at least two information categories;
corresponding to each piece of sub-information of the label-free image, performing feature extraction on the sub-information by using a feature extraction algorithm matched with the information category of the sub-information to obtain the sub-features of the label-free image;
and determining the image characteristics of the label-free image based on the sub-characteristics respectively corresponding to each piece of sub-information.
8. The method according to any one of claims 1 to 7, further comprising:
obtaining objection information fed back by a classification result of the object classification model;
under the condition that the objection information meets an objection condition, carrying out anomaly reason matching on the objection information, and determining anomaly reasons matched with semantics represented by the objection information;
and adjusting the object classification model based on the abnormal reasons to obtain an updated object classification model.
9. An object classification method, applied to a server, the method comprising:
acquiring image information of an image to be classified;
extracting features of the image information to obtain image features of the image to be classified;
using each object classification sub-model contained in the object classification model to respectively classify the image characteristics of the image to be classified to obtain a plurality of sub-classification results of the image to be classified; the object classification model being constructed based on the method of any one of claims 1 to 8;
and carrying out statistical analysis on each sub-classification result to obtain the image category of the image to be classified.
10. The method of claim 9, wherein the sub-classification result comprises a preliminary label of the image to be classified;
the step of carrying out statistical analysis on each sub-classification result to obtain the image category of the image to be classified comprises the following steps:
counting the sub-classification results, and determining the number of occurrences of each preliminary label in the sub-classification results;
and determining the image category represented by the preliminary label with the largest number of occurrences as the image category of the image to be classified.
11. The method of claim 9, wherein the sub-classification result comprises a preliminary label of the image to be classified and a confidence level of the preliminary label;
the step of carrying out statistical analysis on each sub-classification result to obtain the image category of the image to be classified comprises the following steps:
carrying out confidence statistics on each preliminary label across the sub-classification results, and determining a confidence statistic for each preliminary label;
and determining the image category represented by the preliminary label with the largest confidence statistic as the image category of the image to be classified.
12. The method of claim 9, wherein the sub-classification result includes a probability that the image to be classified belongs to each candidate tag, respectively;
The step of carrying out statistical analysis on each sub-classification result to obtain the image category of the image to be classified comprises the following steps:
carrying out probability statistics on each candidate label across the sub-classification results to obtain a probability statistic for each candidate label;
and determining the image category represented by the candidate label with the largest probability statistic as the image category of the image to be classified.
13. An object classification model construction apparatus, characterized by being applied to a server, comprising:
the training set acquisition module is used for acquiring a plurality of tagged images from the terminal, acquiring a plurality of unlabeled images from the data storage system, and determining a training set containing each tagged image and each unlabeled image; the tagged images include a first class of images whose quantity satisfies a small-batch condition; the small-batch condition means that the quantity is less than or equal to a first quantity threshold;
the feature extraction module is used for respectively extracting features of the first class image and at least a part of the label-free images to obtain respective image features of each image;
the semi-supervised learning module is used for performing semi-supervised learning based on the respective image features of the first class image and of at least a part of the unlabeled images to obtain a classification model; the classification model is used for determining, among the unlabeled images, first-class supplementary images whose predicted label is the same as that of the first class image;
the sub-model training module is used for performing model training using a data subset comprising the respective image features of at least a part of the sample objects to obtain an object classification sub-model corresponding to the data subset; the sample objects include the tagged images and the first-class supplementary images;
an object classification model construction module for constructing an object classification model comprising a plurality of object classification sub-models; the classification result of the object classification model is obtained by counting the respective sub-classification results of the object classification sub-models.
14. The apparatus of claim 13, wherein the semi-supervised learning module comprises:
the initial classification model determining unit is used for determining the first class image as a learning object and performing supervised learning on the learning object to obtain an initial classification model;
the classifying unit is used for performing image classification on at least one part of the unlabeled images by using the initial classifying model to obtain pseudo-label images carrying predictive labels;
the iteration unit, configured to determine a new learning object based on the pseudo tag image, and to return to the step of performing supervised learning on the learning object until a learning stop condition is satisfied;
And the classification model determining unit is used for determining the current initial classification model as the classification model corresponding to the label carried by the first class image under the condition that the learning stop condition is met.
15. The apparatus according to claim 14, wherein the iteration unit is specifically configured to:
acquiring label confidence of a predicted label of each pseudo label image;
and determining the pseudo tag image with the tag confidence degree meeting the confidence condition as a new learning object.
16. The apparatus of claim 14, wherein the tagged images further comprise a second class of images whose quantity satisfies a large-batch condition; the large-batch condition means that the quantity is greater than or equal to a second quantity threshold; the second quantity threshold is greater than the first quantity threshold;
the semi-supervised learning module further includes a judging unit configured to:
classifying the unlabeled images using the initial classification model, and determining first-class supplementary images whose predicted label is the same as that of the first class image and whose label confidence satisfies a confidence condition;
and determining that the learning stop condition is currently satisfied when the sum of the number of the first class images and the number of the first-class supplementary images, together with the number of the second class images, satisfies a quantity balance condition.
17. The apparatus of claim 13, further comprising a data subset construction module configured to:
determining a dataset that contains the tagged images and the first-class supplementary images as sample objects;
and extracting at least a part of the tagged images and at least a part of the first-class supplementary images from the dataset to constitute a data subset.
18. The apparatus of claim 13, wherein the tagged images comprise a first category image carrying a first label, and a second category image carrying a second label; the number of images in the unlabeled images belonging to the image category represented by the first label is smaller than the number belonging to the image category represented by the second label, and the difference between the two numbers satisfies a quantity imbalance condition;
the feature extraction module is further used for extracting features of the non-label images to obtain respective image features of the non-label images;
the apparatus further comprises:
the local density analysis module is used for carrying out local density analysis on each unlabeled image based on the mapping positions of the unlabeled images in the feature space to which the image features belong, so as to obtain the local outlier factors corresponding to the unlabeled images respectively;
And the first type supplementary image determining module is used for determining the unlabeled image of which the local outlier factor meets the outlier condition as a first type supplementary image carrying a first label.
19. The apparatus of claim 18, wherein the feature extraction module is specifically configured to:
acquiring respective image information of each label-free image; the image information comprises sub-information of at least two information categories;
corresponding to each piece of sub-information of the label-free image, performing feature extraction on the sub-information by using a feature extraction algorithm matched with the information category of the sub-information to obtain the sub-features of the label-free image;
and determining the image characteristics of the label-free image based on the sub-characteristics respectively corresponding to each piece of sub-information.
20. The apparatus according to any one of claims 13 to 19, further comprising a feedback adjustment module for:
obtaining objection information fed back by a classification result of the object classification model;
under the condition that the objection information meets an objection condition, carrying out anomaly reason matching on the objection information, and determining anomaly reasons matched with semantics represented by the objection information;
And adjusting the object classification model based on the abnormal reasons to obtain an updated object classification model.
21. An object classification apparatus for application to a server, the apparatus comprising:
the information acquisition module is used for acquiring image information of the images to be classified;
the feature extraction module is used for extracting features of the image information to obtain image features of the images to be classified;
the sub-classification result determining module is used for classifying the image features of the image to be classified by using each object classification sub-model contained in the object classification model to obtain a plurality of sub-classification results of the image to be classified; the object classification model being constructed based on the method of any one of claims 1 to 8;
and the image category determining module is used for carrying out statistical analysis on each sub-classification result to obtain the image category of the image to be classified.
22. The apparatus of claim 21, wherein the sub-classification result comprises a preliminary label of the image to be classified; the image category determining module is specifically configured to:
counting the sub-classification results, and determining the number of occurrences of each preliminary label in the sub-classification results;
and determining the image category represented by the preliminary label with the largest number of occurrences as the image category of the image to be classified.
23. The apparatus of claim 21, wherein the sub-classification result comprises a preliminary label of the image to be classified and a confidence level of the preliminary label; the image category determining module is specifically configured to:
carrying out confidence statistics on each preliminary label across the sub-classification results, and determining a confidence statistic for each preliminary label;
and determining the image category represented by the preliminary label with the largest confidence statistic as the image category of the image to be classified.
24. The apparatus of claim 21, wherein the sub-classification result comprises a probability that the image to be classified belongs to each candidate tag, respectively; the image category determining module is specifically configured to:
carrying out probability statistics on each candidate label across the sub-classification results to obtain a probability statistic for each candidate label;
and determining the image category represented by the candidate label with the largest probability statistic as the image category of the image to be classified.
25. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 12 when the computer program is executed.
26. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 12.
CN202311269339.7A 2023-09-28 2023-09-28 Object classification model construction method, object classification method, device and equipment Active CN117009883B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311269339.7A CN117009883B (en) 2023-09-28 2023-09-28 Object classification model construction method, object classification method, device and equipment

Publications (2)

Publication Number Publication Date
CN117009883A CN117009883A (en) 2023-11-07
CN117009883B (en) 2024-04-02

Family

ID=88567520

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311269339.7A Active CN117009883B (en) 2023-09-28 2023-09-28 Object classification model construction method, object classification method, device and equipment

Country Status (1)

Country Link
CN (1) CN117009883B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112183577A (en) * 2020-08-31 2021-01-05 华为技术有限公司 Training method of semi-supervised learning model, image processing method and equipment
CN113723492A (en) * 2021-08-25 2021-11-30 哈尔滨理工大学 Hyperspectral image semi-supervised classification method and device for improving active deep learning
CN113822374A (en) * 2021-10-29 2021-12-21 平安科技(深圳)有限公司 Model training method, system, terminal and storage medium based on semi-supervised learning
CN113869464A (en) * 2021-12-02 2021-12-31 深圳佑驾创新科技有限公司 Training method of image classification model and image classification method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210056417A1 (en) * 2019-08-22 2021-02-25 Google Llc Active learning via a sample consistency assessment
KR20210149530A (en) * 2020-06-02 2021-12-09 삼성에스디에스 주식회사 Method for training image classification model and apparatus for executing the same

Also Published As

Publication number Publication date
CN117009883A (en) 2023-11-07

Similar Documents

Publication Publication Date Title
Chen et al. Selecting critical features for data classification based on machine learning methods
CN113011646B (en) Data processing method, device and readable storage medium
CN111582538A (en) Community value prediction method and system based on graph neural network
US11538029B2 (en) Integrated machine learning and blockchain systems and methods for implementing an online platform for accelerating online transacting
CN112819024B (en) Model processing method, user data processing method and device and computer equipment
Liu et al. Application of Decision Tree‐Based Classification Algorithm on Content Marketing
Li et al. Explain graph neural networks to understand weighted graph features in node classification
Demertzis et al. Geo-AI to aid disaster response by memory-augmented deep reservoir computing
Pham et al. Unsupervised training of Bayesian networks for data clustering
Zhang Financial data anomaly detection method based on decision tree and random forest algorithm
Kang et al. A CWGAN-GP-based multi-task learning model for consumer credit scoring
Liu et al. Learning multiple gaussian prototypes for open-set recognition
CN114049204A (en) Suspicious transaction data entry method, device, computer equipment and computer-readable storage medium
Hain et al. The promises of Machine Learning and Big Data in entrepreneurship research
Kamalloo et al. Credit risk prediction using fuzzy immune learning
Qasem et al. Extreme learning machine for credit risk analysis
Xu et al. Sample selection-based hierarchical extreme learning machine
Li et al. An improved genetic-XGBoost classifier for customer consumption behavior prediction
CN116805245A (en) Fraud detection method and system based on graph neural network and decoupling representation learning
CN117009883B (en) Object classification model construction method, object classification method, device and equipment
CN115994331A (en) Message sorting method and device based on decision tree
CN116029760A (en) Message pushing method, device, computer equipment and storage medium
Arya et al. Node classification using deep learning in social networks
Xiao et al. Explainable fraud detection for few labeled time series data
Raman et al. Multigraph attention network for analyzing company relations

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant