CN106874925A - object grouping method, model training method and device - Google Patents


Info

Publication number
CN106874925A
Authority
CN
China
Prior art keywords
kernel
objects
group
groups
kernel object
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510927700.XA
Other languages
Chinese (zh)
Inventor
席炎
王晓光
隋宛辰
漆远
张柯
姜晓燕
王少萌
俞吴杰
施兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201510927700.XA
Publication of CN106874925A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24147: Distances to closest patterns, e.g. nearest neighbour classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose an object grouping method, a model training method, and corresponding devices, which address the problem in the prior art that human intervention in the grouping process reduces the accuracy of the final grouping. The object grouping method includes: determining the Euclidean distance between any two objects in an object set to be grouped, according to preset feature values corresponding to each object in the set; distributing the objects in the object set in a three-dimensional space according to the Euclidean distances; determining, based on that distribution, a first kernel object whose neighborhood of a preset radius contains no fewer objects than a preset value; determining a second kernel object that lies within the preset-radius neighborhood of the first kernel object and whose own preset-radius neighborhood contains no fewer objects than the preset value; and assigning the first kernel object and the second kernel object to the same object group.

Description

Object grouping method, model training method and device
Technical Field
The present application relates to computer technology, and in particular to an object grouping method, a model training method, and a device.
Background
At present, machine learning (ML) is applied in many fields of artificial intelligence. For example, a credit scoring model can be obtained by applying machine learning to users' credit data.
Taking credit scoring as an example, machine learning is usually performed on the credit data of all users to obtain one unified credit scoring model, which is then used to evaluate the credit of every user. Practice has shown that, because the user population contains different group attributes or crowd distributions, such a unified credit scoring model often fails to achieve satisfactory credit evaluation results. Therefore, reasonably dividing the user population into crowds and building a separate credit evaluation model for each crowd has become an important step in the current credit evaluation process.
In the prior art, supervised learning or semi-supervised learning (SSL), for example logistic regression, is typically used to perform this grouping. In supervised or semi-supervised learning, the grouping usually has to be defined manually in advance (for example, by predefining the number of clusters).
It can be seen that, in the prior art, because supervised or semi-supervised learning usually requires the grouping to be defined manually in advance, the grouping process is subject to human intervention, which affects the accuracy of the final grouping.
Summary of the Invention
The purpose of the embodiments of the present application is to provide an object grouping method, a model training method, and a device, to solve the problem in the prior art that human intervention in the grouping process affects the accuracy of the final grouping.
To solve the above technical problem, the object grouping method, model training method, and devices provided by the embodiments of the present application are implemented as follows:
An object grouping method, including:
Determining, according to preset feature values corresponding to each object in an object set to be grouped, the Euclidean distance between any two objects in the object set;
Distributing the objects in the object set in a three-dimensional space according to the Euclidean distances;
Determining, based on the distribution of the objects of the object set in the three-dimensional space, a first kernel object whose neighborhood of a preset radius contains no fewer objects than a preset value;
Determining a second kernel object whose neighborhood of the preset radius contains no fewer objects than the preset value and which lies within the preset-radius neighborhood of the first kernel object;
Assigning the first kernel object and the second kernel object to the same object group.
An object grouping method, including:
Determining, according to preset feature values corresponding to each object in an object set to be grouped, the Euclidean distance between any two objects in the object set;
Distributing the objects in the object set in a three-dimensional space according to the Euclidean distances;
Determining, based on the distribution of the objects of the object set in the three-dimensional space, the kernel objects whose neighborhood of a preset radius contains no fewer objects than a preset value, to obtain a kernel object set made up of the kernel objects;
If a first kernel object in the kernel object set lies within the preset-radius neighborhood of a second kernel object, assigning the first kernel object and the second kernel object to the same object group.
A model training method, including:
Grouping the objects in an object set to be grouped by using the above object grouping method;
Extracting, according to predetermined features to be selected corresponding to each object group obtained by the grouping, the features to be selected corresponding to the objects included in each object group;
Performing model training with the extracted features to be selected of the objects included in each object group, to obtain a model corresponding to each object group.
An object grouping device, including:
A first determining unit, configured to determine, according to preset feature values corresponding to each object in an object set to be grouped, the Euclidean distance between any two objects in the object set;
A distribution unit, configured to distribute the objects in the object set in a three-dimensional space according to the Euclidean distances;
A second determining unit, configured to determine, based on the distribution of the objects of the object set in the three-dimensional space, a first kernel object whose neighborhood of a preset radius contains no fewer objects than a preset value;
A third determining unit, configured to determine a second kernel object whose neighborhood of the preset radius contains no fewer objects than the preset value and which lies within the preset-radius neighborhood of the first kernel object;
A grouping unit, configured to assign the first kernel object and the second kernel object to the same object group.
An object grouping device, including:
A first determining unit, configured to determine, according to preset feature values corresponding to each object in an object set to be grouped, the Euclidean distance between any two objects in the object set;
A distribution unit, configured to distribute the objects in the object set in a three-dimensional space according to the Euclidean distances;
A kernel object determining unit, configured to determine, based on the distribution of the objects of the object set in the three-dimensional space, the kernel objects whose neighborhood of a preset radius contains no fewer objects than a preset value, to obtain a kernel object set made up of the kernel objects;
A grouping unit, configured to assign a first kernel object and a second kernel object to the same object group when the first kernel object in the kernel object set lies within the preset-radius neighborhood of the second kernel object.
A model training apparatus, including:
The above object grouping device;
A training feature extraction unit, configured to extract, according to predetermined features to be selected corresponding to each object group obtained by the grouping, the features to be selected corresponding to the objects included in each object group;
A model training unit, configured to perform model training with the extracted features to be selected of the objects included in each object group, to obtain a model corresponding to each object group.
As can be seen from the technical solutions provided by the above embodiments of the present application, each object in the object set to be grouped is distributed in a three-dimensional space according to the predetermined Euclidean distances. Based on this distribution, when the number of objects within the preset-radius neighborhood of an object (that is, the object density within that neighborhood) is not less than the preset value, the object is determined to be a first kernel object. Then, still based on the distribution, a second kernel object is determined within the preset-radius neighborhood of the first kernel object, namely an object whose own preset-radius neighborhood contains no fewer objects than the preset value. Finally, the first kernel object and the second kernel object are assigned to the same object group. By repeating this process, the objects in the object set to be grouped can be divided into groups. Because the grouping does not need to be defined manually in advance, the grouping process is not affected by excessive human intervention, which improves the accuracy of the grouping.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the object grouping method provided by an embodiment of the present application;
Fig. 2 is a schematic diagram of the distribution in three-dimensional space of the objects in the object set to be grouped;
Fig. 3 is a schematic diagram of the distribution in three-dimensional space of the object groups obtained by the grouping;
Fig. 4 is a flowchart of the object grouping method provided by another embodiment of the present application;
Fig. 5 is a flowchart of the model training method provided by an embodiment of the present application;
Fig. 6 is a module diagram of the object grouping device provided by an embodiment of the present application;
Fig. 7 is a module diagram of the object grouping device provided by another embodiment of the present application;
Fig. 8 is a module diagram of the model training apparatus provided by an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the scope of protection of the present application.
To solve the problem in the grouping process of the prior art, the present application uses a density-based spatial clustering algorithm (Density-Based Spatial Clustering of Applications with Noise, DBSCAN) to perform the grouping. The technical solution is described here using credit scoring as an example scenario.
Fig. 1 shows the flow of the object grouping method provided by an embodiment of the present application, including:
S101: Determine the Euclidean distance between any two objects in the object set Q1 to be grouped, according to the preset feature values corresponding to each object in Q1.
In the credit scoring scenario, the object set Q1 may be a set of users to be grouped. Each user may correspond to one or more items of feature data that reflect personal creditworthiness. This credit data may include, for example, basic information (such as age, sex, and race), business information, user profile, social behavior, transaction records, and consumption habits. The Euclidean distance is the actual distance between two points in an m-dimensional space. If each object is regarded as a point in space, the Euclidean distance between any two objects (users) in Q1 can be determined with the Euclidean distance formula.
Optionally, before the Euclidean distance between any two objects is determined, the method further includes:
Extracting at least one item of preset feature data corresponding to each object in the object set Q1 to be grouped (the feature data can be represented numerically), and normalizing the extracted preset feature data to obtain the preset feature values. Normalizing each item of extracted feature data ensures that the preset feature values that finally take part in the Euclidean distance calculation lie between 0 and 1, which eliminates differences in data units.
For example, for an object a and an object b, suppose that the extracted preset feature data corresponding to object a includes {age s1, race s2, consumption habit s3, social behavior s4}, and that the extracted preset feature data corresponding to object b includes {age t1, race t2, consumption habit t3, social behavior t4}. After normalization, the preset feature values corresponding to object a are {x_a = s1/(s1+s2+s3+s4), y_a = s2/(s1+s2+s3+s4), z_a = s3/(s1+s2+s3+s4), p_a = s4/(s1+s2+s3+s4)}, and the preset feature values corresponding to object b are {x_b = t1/(t1+t2+t3+t4), y_b = t2/(t1+t2+t3+t4), z_b = t3/(t1+t2+t3+t4), p_b = t4/(t1+t2+t3+t4)}.
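The normalization in this example can be sketched in a few lines of Python. This is only an illustration: the function name `normalize` and the numeric feature encodings are hypothetical, and the sketch assumes non-negative numeric features so that every normalized value falls between 0 and 1.

```python
def normalize(raw_features):
    """Divide each feature of one object by the sum of that object's features."""
    total = sum(raw_features)
    if total == 0:
        # Assumption: an all-zero record maps to zeros to avoid division by zero.
        return [0.0 for _ in raw_features]
    return [value / total for value in raw_features]

# Hypothetical encodings of {age, race, consumption habit, social behavior}.
object_a = normalize([30, 1, 5, 4])    # plays the role of {x_a, y_a, z_a, p_a}
object_b = normalize([20, 2, 10, 8])   # plays the role of {x_b, y_b, z_b, p_b}
```

Each normalized vector sums to 1, so features recorded in different units become comparable before distances are computed.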
According to the Euclidean distance formula, the Euclidean distance between object a and object b is: $d(a,b) = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2 + (z_a - z_b)^2 + (p_a - p_b)^2}$
Through the above process, the Euclidean distance between any two objects in the object set Q1 can be determined. If the one or more items of feature data extracted for two objects are similar, the Euclidean distance between the two objects will, to some extent, be small (the objects are close together in space). In the embodiments of the present application, two objects that are close together in space are assigned to the same object group.
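As a minimal sketch of step S101, the distance between every pair of objects can be computed with the standard library function `math.dist` (Python 3.8 or later). The function name `pairwise_distances` and the sample points are assumptions made for illustration.

```python
import math
from itertools import combinations

def pairwise_distances(points):
    """Return {(i, j): Euclidean distance} for every pair i < j of feature vectors."""
    return {
        (i, j): math.dist(points[i], points[j])
        for i, j in combinations(range(len(points)), 2)
    }

# Hypothetical feature vectors standing in for objects of Q1.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 1.0)]
distances = pairwise_distances(points)
```

Storing only pairs with i < j avoids computing each symmetric distance twice.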
S102: Distribute the objects in the object set Q1 in a three-dimensional space according to the Euclidean distances.
Referring to Fig. 2, when all objects in Q1 are distributed as points in a three-dimensional space according to the Euclidean distances between them, the distribution shown in Fig. 2 is obtained. In this three-dimensional space the distribution of the objects is generally dense in some regions and sparse in others, and the objects in a densely populated region can usually be assigned to the same object group.
S103: Based on the distribution of the objects of Q1 in the three-dimensional space, determine a first kernel object whose neighborhood of a preset radius contains no fewer objects than a preset value.
Referring to Fig. 2, in the embodiments of the present application, if the preset radius is defined as E, the neighborhood of preset radius E of an object a is called the E-neighborhood 11 of object a. In three-dimensional space, an E-neighborhood is a sphere of radius E centered on the object. If the number of objects in the E-neighborhood of an object is not less than the preset value MinPts (which can be set to a specific number in advance), the object is determined to be a kernel object.
For example, if detection finds that the number of objects contained in the E-neighborhood 11 of object a is not less than the preset value MinPts, object a is determined to be a first kernel object.
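The kernel object test of step S103 can be illustrated as follows. The helper names are hypothetical, and two details are assumptions rather than statements from the text: the object itself is counted as a member of its own E-neighborhood, and points lying exactly on the sphere boundary are included.

```python
import math

def neighborhood(points, index, radius):
    """Indices of all points within `radius` of points[index], itself included."""
    center = points[index]
    return [j for j, p in enumerate(points) if math.dist(center, p) <= radius]

def is_kernel_object(points, index, radius, min_pts):
    """True if the E-neighborhood of the point holds at least min_pts objects."""
    return len(neighborhood(points, index, radius)) >= min_pts

# Three nearby points and one isolated point (hypothetical data).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (5.0, 5.0, 5.0)]
dense = is_kernel_object(points, 0, radius=0.5, min_pts=3)
sparse = is_kernel_object(points, 3, radius=0.5, min_pts=3)
```

With these values the first point qualifies as a kernel object and the isolated point does not.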
Optionally, after step S103, the method may further include:
Judging whether the first kernel object can be assigned to any existing object group.
If the first kernel object cannot be assigned to any existing object group, a new object group C is created and the first kernel object is assigned to it. If the first kernel object can be assigned to an existing object group, it is assigned to that object group.
S104: Determine a second kernel object whose neighborhood of the preset radius contains no fewer objects than the preset value and which lies within the preset-radius neighborhood of the first kernel object.
In this embodiment, after object a is determined to be a first kernel object, other objects that can be assigned to the same object group as object a are determined from the E-neighborhood of object a. In general, if another object in the E-neighborhood of a kernel object is itself a kernel object, the two kernel objects are determined to belong to the same object group. In Fig. 2, object b lies in the E-neighborhood of object a, and detection finds that the number of objects contained in the E-neighborhood of object b is not less than the preset value MinPts, so object b is determined to be the second kernel object.
Similarly, if object b is determined to be the first kernel object in step S103, and in step S104 detection finds that the number of objects contained in the E-neighborhood 13 of object c is not less than the preset value MinPts, object c is determined to be the second kernel object.
In Fig. 2, detection finds that the number of objects contained in the E-neighborhood 14 of object d is less than the preset value MinPts, so object d is determined to be a non-kernel object.
In the embodiments of the present application, step S104 may specifically include:
Determining a second kernel object whose neighborhood of the preset radius contains no fewer objects than the preset value and which is directly density-reachable or density-reachable from the first kernel object.
Here, if object b lies in the E-neighborhood of object a and object a is a kernel object, object b is defined as directly density-reachable from object a. Given a sequence of objects p1, p2, ..., pn with p = p1 and q = pn, if each object pi is directly density-reachable from p(i-1), where n ≥ 3 and 2 ≤ i ≤ n, then object q is defined as density-reachable from object p. With these definitions, in the example of Fig. 2, object b is directly density-reachable from object a, object c is directly density-reachable from object b, and object c is density-reachable from object a.
Through the above process, once a first kernel object has been determined, the second kernel objects that are directly density-reachable or density-reachable from it can be determined one by one in the three-dimensional space.
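The two reachability definitions can be sketched as a neighborhood test and a breadth-first search. The function names and sample points are hypothetical; as a simplifying assumption, each object counts itself in its own E-neighborhood, and `density_reachable` accepts chains of any length, so it also returns True in the directly density-reachable case.

```python
import math
from collections import deque

def neighbors(points, i, radius):
    """Indices of all points within `radius` of points[i], itself included."""
    return [j for j, p in enumerate(points) if math.dist(points[i], p) <= radius]

def directly_density_reachable(points, q, p, radius, min_pts):
    """True if q lies in the E-neighborhood of p and p is a kernel object."""
    hood = neighbors(points, p, radius)
    return q in hood and len(hood) >= min_pts

def density_reachable(points, q, p, radius, min_pts):
    """True if a chain of directly density-reachable steps leads from p to q."""
    seen, queue = {p}, deque([p])
    while queue:
        current = queue.popleft()
        hood = neighbors(points, current, radius)
        if len(hood) < min_pts:
            continue  # only kernel objects can extend the chain
        for j in hood:
            if j == q:
                return True
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return False

# Objects a, b, c, d on a line plus one far-away object (hypothetical data).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0),
          (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
a, b, c, far = 0, 1, 2, 4
b_direct_from_a = directly_density_reachable(points, b, a, 1.0, 2)
c_direct_from_a = directly_density_reachable(points, c, a, 1.0, 2)
c_from_a = density_reachable(points, c, a, 1.0, 2)
far_from_a = density_reachable(points, far, a, 1.0, 2)
```

As in the Fig. 2 example, c is not directly density-reachable from a but is density-reachable from a through b.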
S105: Assign the first kernel object and the second kernel object to the same object group.
In the examples above, if object a is the first kernel object and object b is the second kernel object, object a and object b are assigned to the same object group; if object b is the first kernel object and object c is the second kernel object, object b and object c are assigned to the same object group.
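Repeating steps S103 to S105 until every kernel object has a group might look like the sketch below. It is an illustration under stated assumptions, not the patented implementation: the function name `group_kernel_objects` is hypothetical, each object counts itself in its own E-neighborhood, and group identifiers are simply consecutive integers.

```python
import math

def group_kernel_objects(points, radius, min_pts):
    """Assign every kernel object to a group; return {point index: group id}."""
    def hood(i):
        return [j for j, p in enumerate(points) if math.dist(points[i], p) <= radius]

    kernels = {i for i in range(len(points)) if len(hood(i)) >= min_pts}
    group_of = {}
    next_group = 0
    for seed in sorted(kernels):
        if seed in group_of:
            continue                       # already attached to an existing group
        group_of[seed] = next_group        # S103: new group for a first kernel object
        stack = [seed]
        while stack:
            first = stack.pop()
            for second in hood(first):     # S104: kernel objects in its neighborhood
                if second in kernels and second not in group_of:
                    group_of[second] = next_group   # S105: same group
                    stack.append(second)
        next_group += 1
    return group_of

# Two dense regions and one isolated point (hypothetical data).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
          (9.0, 9.0, 9.0)]
group_of = group_kernel_objects(points, radius=0.5, min_pts=3)
```

Non-kernel objects, such as the isolated last point, are left out of the result and remain to be handled separately.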
By executing steps S103 to S105 in a loop, all kernel objects in the object set Q1 (objects whose E-neighborhood contains no fewer than MinPts objects) can be assigned to object groups. However, the object set Q1 usually also contains a small number of objects that are not kernel objects and that have not been assigned to any object group. Optionally, these non-kernel objects can be grouped as follows:
Determining the objects in the object set that do not belong to any object group to be outlier objects.
Determining the object group nearest to each outlier object.
Assigning the outlier object to that nearest object group.
Referring to Fig. 3, when grouping is performed with the density-based spatial clustering algorithm DBSCAN described above, the objects distributed in the three-dimensional space can be divided into a first object group 21, a second object group 22, and a third object group 23, each of which contains several kernel objects. In practice, a small number of outlier objects 24 (that is, non-kernel objects) are usually also distributed in the three-dimensional space. For each outlier object 24, the nearest of the first, second, and third object groups is determined, and the outlier object 24 is assigned to it. In this way, all objects in the object set Q1 to be grouped are grouped.
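The handling of outlier objects can be sketched as follows. The text does not define how the distance between an object and a group is measured, so this sketch assumes it is the distance to the closest already-grouped member; the function name `assign_outliers` and the sample data are hypothetical.

```python
import math

def assign_outliers(points, group_of):
    """Return a copy of group_of in which each ungrouped point joins its nearest group."""
    result = dict(group_of)
    if not group_of:
        return result  # no groups exist, so nothing can be assigned
    for i, p in enumerate(points):
        if i in group_of:
            continue
        # Assumption: a group's distance is that of its closest grouped member.
        nearest = min(group_of, key=lambda j: math.dist(p, points[j]))
        result[i] = group_of[nearest]
    return result

# Two existing groups and one outlier object at index 4 (hypothetical data).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0),
          (5.0, 5.0, 5.0), (5.1, 5.0, 5.0),
          (4.0, 4.0, 4.0)]
grouped = assign_outliers(points, {0: 0, 1: 0, 2: 1, 3: 1})
```

Other choices, such as the distance to a group centroid, would fit the description equally well.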
Fig. 4 shows the flow of the object grouping method provided by another embodiment of the present application, including:
S201: Determine the Euclidean distance between any two objects in the object set Q1 to be grouped, according to the preset feature values corresponding to each object in Q1.
For this step, refer to the details of step S101 above, which are not repeated here.
S202: Distribute the objects in the object set Q1 in a three-dimensional space according to the Euclidean distances.
For this step, refer to the details of step S102 above, which are not repeated here.
S203: Based on the distribution of the objects of Q1 in the three-dimensional space, determine the kernel objects whose neighborhood of preset radius E contains no fewer objects than the preset value MinPts, to obtain a kernel object set Q2 made up of the kernel objects.
In this step, each object distributed in the three-dimensional space is checked in turn to see whether the number of objects contained in its E-neighborhood is not less than the preset value MinPts; if so, the object is determined to be a kernel object. In this way, all kernel objects distributed in the three-dimensional space are eventually found and form the kernel object set Q2 (Q2 is a subset of Q1).
S204: If a first kernel object in the kernel object set Q2 lies within the preset-radius neighborhood of a second kernel object, assign the first kernel object and the second kernel object to the same object group.
In this step, based on the kernel object set Q2 obtained, the rule for deciding whether any two kernel objects in Q2 belong to the same object group can be as follows: judge whether one kernel object lies in the E-neighborhood of the other; if so, the two kernel objects can be assigned to the same object group.
In the embodiments of the present application, step S204 may include:
If a first kernel object in the kernel object set Q2 is directly density-reachable or density-reachable from a second kernel object, assigning the first kernel object and the second kernel object to the same object group.
For the definitions of directly density-reachable and density-reachable, refer to the description above.
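Steps S203 and S204 can be sketched by first collecting the kernel object set Q2 and then merging kernel objects that lie in each other's E-neighborhood. The union-find structure used for merging is an implementation choice made for this illustration, not something the text prescribes, and the function name is hypothetical. Merging transitively has the same effect as grouping kernel objects that are density-reachable from one another.

```python
import math
from itertools import combinations

def group_by_kernel_set(points, radius, min_pts):
    """Find the kernel set Q2 first, then merge kernel objects that are neighbors."""
    def count_in_hood(i):
        return sum(1 for p in points if math.dist(points[i], p) <= radius)

    q2 = [i for i in range(len(points)) if count_in_hood(i) >= min_pts]   # S203
    parent = {i: i for i in q2}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving keeps lookups short
            i = parent[i]
        return i

    for i, j in combinations(q2, 2):                                      # S204
        if math.dist(points[i], points[j]) <= radius:
            parent[find(i)] = find(j)
    return {i: find(i) for i in q2}

# Two dense regions and one isolated point (hypothetical data).
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
          (9.0, 9.0, 9.0)]
groups = group_by_kernel_set(points, radius=0.5, min_pts=3)
```

The returned group label is the index of a representative kernel object, so labels are not consecutive integers.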
Optionally, after step S204, the method may further include:
Determining the objects in the object set Q1 that are not kernel objects to be outlier objects.
Determining, based on the object groups obtained by the grouping, the object group nearest to each outlier object, and assigning the outlier object to that nearest object group.
Based on the above, in the object grouping method provided by the embodiments of the present application, each object in the object set Q1 to be grouped is distributed in a three-dimensional space according to the predetermined Euclidean distances. Based on this distribution, when the number of objects within the preset-radius neighborhood of an object (that is, the object density within that neighborhood) is not less than the preset value MinPts, the object is determined to be a first kernel object. Then, still based on the distribution, a second kernel object is determined within the neighborhood of preset radius E of the first kernel object, namely an object whose own neighborhood of preset radius E contains no fewer than MinPts objects. Finally, the first kernel object and the second kernel object are assigned to the same object group. By repeating this process, the objects in the object set to be grouped can be divided into groups. Because the grouping does not need to be defined manually in advance (for example, the number of groups does not need to be preset), the grouping process is not affected by human intervention, which improves the accuracy of the grouping.
Fig. 5 shows the flow of the model training method provided by an embodiment of the present application, including:
S301: Group the objects in the object set Q1 to be grouped by using the object grouping method provided by the above embodiments (that is, the density-based spatial clustering algorithm DBSCAN).
In the credit scoring scenario, the above method can yield, for example, a student group made up of university student users and a working group made up of office worker users.
S302: Extract, according to the predetermined features to be selected corresponding to each object group obtained by the grouping, the features to be selected corresponding to the objects included in each object group.
For example, suppose the user groups obtained in step S301 are:
{crowd 1, crowd 2, crowd 3};
The features to be selected corresponding to crowd 1 may be predetermined as:
{feature M1, feature M2, feature M3};
The features to be selected corresponding to crowd 2 may be predetermined as:
{feature M1, feature M2, feature M3, feature M5};
The features to be selected corresponding to crowd 3 may be predetermined as:
{feature M1, feature M3, feature M4};
Then, based on the credit data of the users in each user group, the predetermined features to be selected are extracted from the credit data of each user in that group.
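The per-group feature extraction of step S302 can be sketched with plain dictionaries. The user identifiers, feature values, and the function name `extract_features` are hypothetical; only the group names and feature lists come from the example above.

```python
def extract_features(records, group_of, features_by_group):
    """Return {group: [feature vector per user]} using each group's own feature list."""
    extracted = {group: [] for group in features_by_group}
    for user, record in records.items():
        group = group_of[user]
        extracted[group].append([record[name] for name in features_by_group[group]])
    return extracted

features_by_group = {
    "crowd 1": ["M1", "M2", "M3"],
    "crowd 2": ["M1", "M2", "M3", "M5"],
    "crowd 3": ["M1", "M3", "M4"],
}
# Hypothetical credit data for two users and their group assignments.
records = {
    "u1": {"M1": 1.0, "M2": 2.0, "M3": 3.0, "M4": 4.0, "M5": 5.0},
    "u2": {"M1": 0.1, "M2": 0.2, "M3": 0.3, "M4": 0.4, "M5": 0.5},
}
group_of = {"u1": "crowd 1", "u2": "crowd 3"}
extracted = extract_features(records, group_of, features_by_group)
```

Each user contributes only the features predetermined for that user's own group.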
S303: Perform model training with the extracted features to be selected of the objects included in each object group, to obtain a model corresponding to each object group.
For example, model training (that is, machine learning) with the extracted features {feature M1, feature M2, feature M3} of the users in crowd 1 yields a model 1 for credit scoring of crowd 1; model training with the extracted features {feature M1, feature M2, feature M3, feature M5} of the users in crowd 2 yields a model 2 for credit scoring of crowd 2; and model training with the extracted features {feature M1, feature M3, feature M4} of the users in crowd 3 yields a model 3 for credit scoring of crowd 3. In the embodiments of the present application, grouping the users with the density-based spatial clustering algorithm DBSCAN makes the resulting credit scoring models better suited to the different user crowds and improves the reliability of the credit scoring system.
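Training one model per group, as in step S303, can be sketched as below. The text does not name a learning algorithm, so the trainer here is a deliberately trivial stand-in that predicts the mean label of its group; `train_mean_model`, `train_models`, and the sample scores are all hypothetical, and any real trainer with the same call shape could be passed in as `train_fn`.

```python
from statistics import mean

def train_mean_model(rows, labels):
    """Stand-in trainer (assumption): predicts the mean label seen in training."""
    average = mean(labels)
    return lambda row: average

def train_models(samples_by_group, train_fn=train_mean_model):
    """samples_by_group: {group: (rows, labels)} -> {group: trained model}."""
    return {
        group: train_fn(rows, labels)
        for group, (rows, labels) in samples_by_group.items()
        if rows  # skip groups that have no training data
    }

# Hypothetical extracted features and credit scores for two groups.
samples_by_group = {
    "crowd 1": ([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]], [600, 700]),
    "crowd 2": ([], []),
    "crowd 3": ([[0.1, 0.3, 0.4]], [550]),
}
models = train_models(samples_by_group)
score = models["crowd 1"]([1.5, 2.5, 3.5])
```

A new user would first be assigned to a group and then scored with that group's model.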
Corresponding to the method flows above, embodiments of the present application also provide an object grouping device and a model training device that uses the object grouping device. These devices may be implemented in software, in hardware, or in a combination of software and hardware. Taking a software implementation as an example, the device in the logical sense is formed by the central processing unit (CPU) of a server reading the corresponding computer program instructions into memory and running them.
Fig. 6 is a module diagram of the object grouping device provided by an embodiment of the present application. The function of each unit in the device is similar to the function of the corresponding step in the method above, so reference may be made to the specific content of the method embodiments. The object grouping device includes:
A first determining unit 101, configured to determine the Euclidean distance between any two objects in an object set to be grouped, according to the preset feature value corresponding to each object in the object set;
A distribution unit 102, configured to distribute the objects in the object set in a three-dimensional space according to the Euclidean distance;
A second determining unit 103, configured to determine, based on the distribution of the objects in the object set in the three-dimensional space, a first core object for which the number of objects within its neighborhood of a preset radius is not less than a preset value;
A third determining unit 104, configured to determine a second core object for which the number of objects within its neighborhood of the preset radius is not less than the preset value and which lies within the neighborhood of the preset radius of the first core object;
A grouping unit 105, configured to assign the first core object and the second core object to the same object group.
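The roles of the units listed above can be illustrated with a small sketch. It assumes that objects are tuples of preset feature values, that a neighborhood includes the object itself, and that `eps` (the preset radius) and `min_pts` (the preset value) are hypothetical parameter names; none of these details are fixed by the text.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature-value tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbors(points, i, eps):
    """Indices of all objects within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if euclidean(points[i], q) <= eps]


def core_objects(points, eps, min_pts):
    """Objects whose eps-neighborhood holds at least min_pts objects."""
    return [i for i in range(len(points))
            if len(neighbors(points, i, eps)) >= min_pts]


def group_with_first_core(points, eps, min_pts):
    """Take the first core object found, then add every other core object
    lying inside its eps-neighborhood to the same object group."""
    cores = core_objects(points, eps, min_pts)
    if not cores:
        return set()
    first = cores[0]
    group = {first}
    for j in cores:
        if j != first and euclidean(points[first], points[j]) <= eps:
            group.add(j)
    return group
```

This covers only the single step performed by units 103 to 105; extending the group through chains of core objects corresponds to the density-reachable variant described for the third determining unit.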
Optionally, the device further includes:
An outlier object determining unit, configured to determine the objects in the object set that do not belong to any object group as outlier objects;
A nearest object group determining unit, configured to determine the nearest object group, that is, the object group closest to the outlier object;
An assigning unit, configured to assign the outlier object to the nearest object group.
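A minimal sketch of the outlier handling performed by these three units is given below. It assumes that the distance from an outlier to an object group is measured to the group's closest member; the text does not specify this, and a centroid distance would be an equally plausible reading. The function name and the `None` label for outliers are hypothetical.

```python
import math


def assign_outliers(points, labels):
    """Return a copy of labels in which every outlier (label None) is
    assigned to the group of its nearest already-grouped object."""
    result = list(labels)
    grouped = [i for i, g in enumerate(labels) if g is not None]
    for i, g in enumerate(labels):
        if g is None and grouped:
            nearest = min(grouped,
                          key=lambda j: math.dist(points[i], points[j]))
            result[i] = labels[nearest]
    return result
```

Because outliers are compared only with objects that were grouped before this step, the result does not depend on the order in which the outliers are processed.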
Optionally, the third determining unit 104 may be specifically configured to:
Determine a second core object for which the number of objects within its neighborhood of the preset radius is not less than the preset value and which is directly density-reachable or density-reachable from the first core object.
Optionally, the device further includes:
A judging unit, configured to judge, after the first core object is determined, whether the first core object can be assigned to any existing object group;
A creating unit, configured to, when the first core object cannot be assigned to any existing object group, create a new object group and assign the first core object to the newly created object group.
Optionally, the device further includes:
An extraction unit, configured to extract, before the Euclidean distance is determined, at least one item of preset feature data corresponding to each object in the object set to be grouped;
A normalization unit, configured to normalize the extracted preset feature data to obtain the preset feature values.
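As an illustration of the extraction, normalization, and distance steps, the sketch below applies min-max scaling to each feature column and then computes all pairwise Euclidean distances. Min-max scaling is an assumption here, since the text only says that the feature data are normalized; the function names are hypothetical.

```python
import math


def min_max_normalize(rows):
    """Scale every feature column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def pairwise_distances(values):
    """Matrix of Euclidean distances between any two objects."""
    n = len(values)
    return [[math.dist(values[i], values[j]) for j in range(n)]
            for i in range(n)]
```

Normalizing first keeps a feature with a large numeric range, such as an amount in currency units, from dominating the distance over a feature with a small range.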
Fig. 7 is a module diagram of the object grouping device provided by another embodiment of the present application. The function of each unit in the device is similar to the function of the corresponding step in the method above, so reference may be made to the specific content of the method embodiments. The object grouping device includes:
A first determining unit 201, configured to determine the Euclidean distance between any two objects in an object set to be grouped, according to the preset feature value corresponding to each object in the object set;
A distribution unit 202, configured to distribute the objects in the object set in a three-dimensional space according to the Euclidean distance;
A core object determining unit 203, configured to determine, based on the distribution of the objects in the object set in the three-dimensional space, the core objects for which the number of objects within the neighborhood of a preset radius is not less than a preset value, to obtain a core object set composed of the core objects;
A grouping unit 204, configured to, when a first core object in the core object set lies within the neighborhood of the preset radius of a second core object, assign the first core object and the second core object to the same object group.
Optionally, the device further includes:
An outlier object determining unit, configured to determine the objects in the object set that are not core objects as outlier objects;
A nearest object group determining unit, configured to determine the nearest object group, that is, the object group closest to the outlier object;
An assigning unit, configured to assign the outlier object to the nearest object group.
Optionally, the grouping unit 204 is specifically configured to:
If a first core object in the core object set is directly density-reachable or density-reachable from a second core object, assign the first core object and the second core object to the same object group.
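Grouping a core object set by density reachability can be sketched as a breadth-first search over core objects, where two core objects are linked when they lie within the preset radius of each other. The function name, the `eps` parameter, and the integer group labels are illustrative assumptions rather than details from the text.

```python
import math
from collections import deque


def group_core_objects(points, core_indices, eps):
    """Map each core object index to a group id; core objects connected
    through a chain of eps-neighborhoods share the same id."""
    labels = {}
    next_group = 0
    for start in core_indices:
        if start in labels:
            continue  # already reached from an earlier core object
        labels[start] = next_group
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in core_indices:
                if j not in labels and math.dist(points[i], points[j]) <= eps:
                    labels[j] = next_group
                    queue.append(j)
        next_group += 1
    return labels
```

A core object that is not reachable from any already-labeled core object starts a new group, which matches the judge-then-create behavior described for the judging and creating units of the first device embodiment.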
Fig. 8 is a module diagram of the model training device provided by an embodiment of the present application. In line with the model training method described above, the device includes:
The object grouping device 301 described above (the units shown in Fig. 6 or Fig. 7);
A training feature extraction unit 302, configured to extract, according to the predetermined features to be selected corresponding to each object group obtained by grouping, the features to be selected corresponding to the objects contained in each object group;
A model training unit 303, configured to perform model training using the extracted features to be selected of the objects contained in each object group, to obtain a model corresponding to each object group.
For the technical effects of the devices provided by the above embodiments, reference may be made to the technical effects of the methods provided by the above embodiments; they are not repeated here.
For convenience of description, the above devices are described with their functions divided into separate units. Of course, when the present application is implemented, the functions of the units may be realized in one or more pieces of software and/or hardware.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The present application may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. The system embodiments in particular are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding description of the method embodiments.
The above are merely embodiments of the present application and are not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims (18)

1. An object grouping method, characterized by comprising:
Determining the Euclidean distance between any two objects in an object set to be grouped, according to the preset feature value corresponding to each object in the object set;
Distributing the objects in the object set in a three-dimensional space according to the Euclidean distance;
Determining, based on the distribution of the objects in the object set in the three-dimensional space, a first core object for which the number of objects within its neighborhood of a preset radius is not less than a preset value;
Determining a second core object for which the number of objects within its neighborhood of the preset radius is not less than the preset value and which lies within the neighborhood of the preset radius of the first core object;
Assigning the first core object and the second core object to the same object group.
2. The method according to claim 1, characterized in that, after the first core object and the second core object are assigned to the same object group, the method further comprises:
Determining the objects in the object set that do not belong to any object group as outlier objects;
Determining the nearest object group, that is, the object group closest to the outlier object;
Assigning the outlier object to the nearest object group.
3. The method according to claim 1, characterized in that determining the second core object comprises:
Determining a second core object for which the number of objects within its neighborhood of the preset radius is not less than the preset value and which is directly density-reachable or density-reachable from the first core object.
4. The method according to claim 1, characterized in that, after the first core object is determined, the method further comprises:
Judging whether the first core object can be assigned to any existing object group;
If not, creating a new object group and assigning the first core object to the newly created object group.
5. The method according to claim 1, characterized in that, before the Euclidean distance is determined according to the preset feature values, the method further comprises:
Extracting at least one item of preset feature data corresponding to each object in the object set to be grouped;
Normalizing the extracted preset feature data to obtain the preset feature values.
6. An object grouping method, characterized by comprising:
Determining the Euclidean distance between any two objects in an object set to be grouped, according to the preset feature value corresponding to each object in the object set;
Distributing the objects in the object set in a three-dimensional space according to the Euclidean distance;
Determining, based on the distribution of the objects in the object set in the three-dimensional space, the core objects for which the number of objects within the neighborhood of a preset radius is not less than a preset value, to obtain a core object set composed of the core objects;
If a first core object in the core object set lies within the neighborhood of the preset radius of a second core object, assigning the first core object and the second core object to the same object group.
7. The method according to claim 6, characterized in that, after the first core object and the second core object are assigned to the same object group, the method further comprises:
Determining the objects in the object set that are not core objects as outlier objects;
Determining the nearest object group, that is, the object group closest to the outlier object;
Assigning the outlier object to the nearest object group.
8. The method according to claim 6, characterized in that, if a first core object in the core object set lies within the neighborhood of the preset radius of a second core object, assigning the first core object and the second core object to the same object group comprises:
If the first core object in the core object set is directly density-reachable or density-reachable from the second core object, assigning the first core object and the second core object to the same object group.
9. A model training method, characterized by comprising:
Grouping the objects in an object set to be grouped using the object grouping method according to any one of claims 1-8;
Extracting, according to the predetermined features to be selected corresponding to each object group obtained by grouping, the features to be selected corresponding to the objects contained in each object group;
Performing model training using the extracted features to be selected of the objects contained in each object group, to obtain a model corresponding to each object group.
10. An object grouping device, characterized by comprising:
A first determining unit, configured to determine the Euclidean distance between any two objects in an object set to be grouped, according to the preset feature value corresponding to each object in the object set;
A distribution unit, configured to distribute the objects in the object set in a three-dimensional space according to the Euclidean distance;
A second determining unit, configured to determine, based on the distribution of the objects in the object set in the three-dimensional space, a first core object for which the number of objects within its neighborhood of a preset radius is not less than a preset value;
A third determining unit, configured to determine a second core object for which the number of objects within its neighborhood of the preset radius is not less than the preset value and which lies within the neighborhood of the preset radius of the first core object;
A grouping unit, configured to assign the first core object and the second core object to the same object group.
11. The device according to claim 10, characterized in that the device further comprises:
An outlier object determining unit, configured to determine the objects in the object set that do not belong to any object group as outlier objects;
A nearest object group determining unit, configured to determine the nearest object group, that is, the object group closest to the outlier object;
An assigning unit, configured to assign the outlier object to the nearest object group.
12. The device according to claim 10, characterized in that the third determining unit is specifically configured to:
Determine a second core object for which the number of objects within its neighborhood of the preset radius is not less than the preset value and which is directly density-reachable or density-reachable from the first core object.
13. The device according to claim 10, characterized in that the device further comprises:
A judging unit, configured to judge, after the first core object is determined, whether the first core object can be assigned to any existing object group;
A creating unit, configured to, when the first core object cannot be assigned to any existing object group, create a new object group and assign the first core object to the newly created object group.
14. The device according to claim 10, characterized in that the device further comprises:
An extraction unit, configured to extract, before the Euclidean distance is determined, at least one item of preset feature data corresponding to each object in the object set to be grouped;
A normalization unit, configured to normalize the extracted preset feature data to obtain the preset feature values.
15. An object grouping device, characterized by comprising:
A first determining unit, configured to determine the Euclidean distance between any two objects in an object set to be grouped, according to the preset feature value corresponding to each object in the object set;
A distribution unit, configured to distribute the objects in the object set in a three-dimensional space according to the Euclidean distance;
A core object determining unit, configured to determine, based on the distribution of the objects in the object set in the three-dimensional space, the core objects for which the number of objects within the neighborhood of a preset radius is not less than a preset value, to obtain a core object set composed of the core objects;
A grouping unit, configured to, when a first core object in the core object set lies within the neighborhood of the preset radius of a second core object, assign the first core object and the second core object to the same object group.
16. The device according to claim 15, characterized in that the device further comprises:
An outlier object determining unit, configured to determine the objects in the object set that are not core objects as outlier objects;
A nearest object group determining unit, configured to determine the nearest object group, that is, the object group closest to the outlier object;
An assigning unit, configured to assign the outlier object to the nearest object group.
17. The device according to claim 15, characterized in that the grouping unit is specifically configured to:
If a first core object in the core object set is directly density-reachable or density-reachable from a second core object, assign the first core object and the second core object to the same object group.
18. A model training device, characterized by comprising:
The object grouping device according to any one of claims 10-17;
A training feature extraction unit, configured to extract, according to the predetermined features to be selected corresponding to each object group obtained by grouping, the features to be selected corresponding to the objects contained in each object group;
A model training unit, configured to perform model training using the extracted features to be selected of the objects contained in each object group, to obtain a model corresponding to each object group.
CN201510927700.XA 2015-12-14 2015-12-14 object grouping method, model training method and device Pending CN106874925A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510927700.XA CN106874925A (en) 2015-12-14 2015-12-14 object grouping method, model training method and device

Publications (1)

Publication Number Publication Date
CN106874925A true CN106874925A (en) 2017-06-20

Family

ID=59178627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510927700.XA Pending CN106874925A (en) 2015-12-14 2015-12-14 object grouping method, model training method and device

Country Status (1)

Country Link
CN (1) CN106874925A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019096176A1 (en) * 2017-11-14 2019-05-23 深圳码隆科技有限公司 Method and system for learning data processing, and electronic device
TWI709927B (en) * 2017-12-06 2020-11-11 開曼群島商創新先進技術有限公司 Method and device for determining target user group

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104239324A (en) * 2013-06-17 2014-12-24 阿里巴巴集团控股有限公司 Methods and systems for user behavior based feature extraction and personalized recommendation
CN103559630A (en) * 2013-10-31 2014-02-05 华南师范大学 Customer segmentation method based on customer attribute and behavior characteristic analysis
CN104200529A (en) * 2014-08-12 2014-12-10 电子科技大学 Three dimensional target body surface reconstruction method based on uncertainty
CN104899899A (en) * 2015-06-12 2015-09-09 天津大学 Color quantification method based on density peak value
CN105096268A (en) * 2015-07-13 2015-11-25 西北农林科技大学 Denoising smoothing method of point cloud
CN105069534A (en) * 2015-08-18 2015-11-18 广州华多网络科技有限公司 Customer loss prediction method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
王海燕: "基于抽样矩阵的汽车客户分群及离群点分析", 《中国优秀硕士学位论文全文数据库 信息科技辑 I138-2143》 *
韩楠: "云环境下基于RIHDBSCAN的微博事件检测及跟踪", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170620