CN104933445A - Mass image classification method based on distributed K-means - Google Patents


Info

Publication number
CN104933445A
CN104933445A CN201510363396.0A CN201510363396A CN104933445A CN 104933445 A CN104933445 A CN 104933445A CN 201510363396 A CN201510363396 A CN 201510363396A CN 104933445 A CN104933445 A CN 104933445A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510363396.0A
Other languages
Chinese (zh)
Other versions
CN104933445B (en)
Inventor
董乐
张宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a mass image classification method based on distributed K-means, belonging to the technical field of machine learning and image processing. The method is applicable to large-scale image classification: it uses a distributed K-means algorithm on the big data processing platform Hadoop to extract image features and then classifies the images. By performing dictionary learning on large-scale image data and constructing a feature mapping function and a classification algorithm, the invention provides a feature extraction algorithm based on distributed K-means on top of Hadoop. The method avoids the tedious work of hand-designing features for large-scale image collections and reduces training time while maintaining classification accuracy. The results are significant for large-scale database management, military applications and medical care.

Description

A mass image classification method based on distributed K-means
Technical field
The invention belongs to the technical field of machine learning and image processing, relates to the processing of massive image collections on a distributed platform, and in particular relates to a mass image classification method based on distributed K-means.
Background
In recent years clustering algorithms have come into wide everyday use. In commerce, they help analysts extract specific consumption information from consumer databases and summarize the consumption patterns that information reveals. Clustering is an important part of data mining: it is a good tool for discovering deep feature representations in a database, it can summarize the characteristics of each category and, most importantly, it can serve as a preprocessing step for other data mining algorithms. As image libraries keep growing in size and complexity, extracting hand-designed features on a single machine falls far short of demand, and parallel processing is a natural solution. The big data processing platform Hadoop, an open-source implementation of the Map-Reduce framework, is mainly used for parallel computation over large data sets; its simple architecture supports data-intensive applications effectively. Building on Hadoop, the present invention parallelizes the single-machine K-means algorithm so that the input data are processed in parallel, and designs and implements an image feature extraction algorithm based on distributed K-means.
Summary of the invention
The present invention addresses feature extraction for large-scale image collections in order to classify images accurately. It proposes a mass image classification method based on distributed K-means: a parallelized image feature extraction algorithm is implemented on the big data processing platform Hadoop and, because image classification is a multi-class problem, a DAG-SVM classifier performs the final classification.
To achieve this, the invention adopts the following technical solution:
A mass image classification method based on distributed K-means, whose flow is shown in Fig. 1, comprising the following steps:
Step 1. Preprocess the training images.
Input the training image data set and divide each training image into multiple image blocks. Apply regularization and then whitening to each image block to remove interfering information and retain key information; the result is passed to the next step as input.
Step 2. On the big data processing platform Hadoop, parallelize the K-means algorithm and extract a dictionary, using the preprocessed image blocks from step 1 as input.
Step 3. After the dictionary is extracted, construct a feature mapping function and map the preprocessed training image blocks to new feature representations.
Step 4. Input the new feature representations of the training image blocks obtained in step 3 into an SVM classifier and train it for image classification.
Step 5. For a target image to be classified, perform image block division, regularization, whitening and feature extraction in turn, then classify it with the trained SVM classifier.
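The block division in step 1 could be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `extract_blocks`, the square block shape, the `size` and `stride` parameters and the nested-list image representation are all assumptions, since the text does not specify how blocks are cut.

```python
def extract_blocks(image, size, stride):
    """Split a 2-D grayscale image (a list of rows) into flattened size x size blocks.

    Assumption: blocks are square and taken on a regular grid with the given stride;
    the source text only says that each image is divided into multiple blocks.
    """
    height, width = len(image), len(image[0])
    blocks = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            # Flatten the block row by row into a single vector.
            block = [image[top + r][left + c] for r in range(size) for c in range(size)]
            blocks.append(block)
    return blocks


# A 4 x 4 toy image with pixel values 0..15, cut into non-overlapping 2 x 2 blocks.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
blocks = extract_blocks(image, size=2, stride=2)
```

Each flattened block is then the vector that regularization and whitening operate on.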
Further, the regularization in step 1 is as follows:
$$\tilde{x}^{(i)} = \frac{x^{(i)} - \operatorname{mean}(x^{(i)})}{\sqrt{\operatorname{var}(x^{(i)}) + \sigma}} \qquad (1)$$
where $x^{(i)}$ is the i-th input image block, and $\operatorname{var}(x^{(i)})$ and $\operatorname{mean}(x^{(i)})$ are the variance and mean of all elements of $x^{(i)}$. σ is a preset constant added before the division; it suppresses noise and prevents division by zero when the variance approaches zero. For pixel values in the range [0, 255], σ = 10 generally works well. In practice σ is chosen experimentally: start from an empirically reasonable value and adjust it according to the observed results.
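A minimal sketch of the regularization in formula (1) is given below, under stated assumptions: the function name `regularize` is hypothetical, a block is a flat list of pixel values, the variance is the population variance, and the denominator is read as the square root of the variance plus σ.

```python
import math


def regularize(block, sigma=10.0):
    """Normalize one flattened image block as in formula (1).

    Assumptions: population variance (divide by n) and a square root over
    (var + sigma); sigma = 10 is the value the text suggests for [0, 255] pixels.
    """
    n = len(block)
    mean = sum(block) / n
    var = sum((v - mean) ** 2 for v in block) / n
    scale = math.sqrt(var + sigma)  # sigma keeps the divisor away from zero
    return [(v - mean) / scale for v in block]


flat = regularize([128.0] * 4)                  # constant block: variance is zero
textured = regularize([0.0, 255.0, 0.0, 255.0])  # high-contrast block
```

The constant block shows the role of σ: its variance is zero, yet the division is still well defined and the block maps to all zeros.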
Further, PCA whitening is applied to each regularized image block $\tilde{x}^{(i)}$ to reduce the correlation between pixels:
$$x_{rot}^{(i)} = (U^{(i)})^{T} \tilde{x}^{(i)} \qquad (2)$$
$$x_{PCAwhite}^{(i)} = \frac{x_{rot}^{(i)}}{\sqrt{\lambda^{(i)} + \epsilon}} \qquad (3)$$
where $\lambda^{(i)}$ and $U^{(i)}$ are the eigenvalues and eigenvectors for image block $\tilde{x}^{(i)}$. Formula (2) reduces the correlation between the pixels of the input image, and formula (3) yields the whitened image block data. ε is a preset constant that smooths the image data and thereby improves performance; it is generally small and is chosen in the same way as σ.
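As a worked check of why formulas (2) and (3) decorrelate the pixels, the following sketch assumes zero-mean regularized blocks whose covariance matrix $\Sigma$ (a symbol not used in the original) has the eigendecomposition $\Sigma = U \Lambda U^{T}$ with $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_d)$:

```latex
\begin{aligned}
\operatorname{Cov}(x_{rot}) &= U^{T} \Sigma U = U^{T} U \Lambda U^{T} U = \Lambda, \\
\operatorname{Cov}(x_{PCAwhite}) &= (\Lambda + \epsilon I)^{-1/2} \, \Lambda \, (\Lambda + \epsilon I)^{-1/2}
  = \operatorname{diag}\!\left( \frac{\lambda_j}{\lambda_j + \epsilon} \right).
\end{aligned}
```

Under that assumption the rotated components are uncorrelated, and after scaling each component has variance close to 1 wherever $\lambda_j \gg \epsilon$. Components with very small $\lambda_j$ are scaled towards 0 instead of being amplified, which is consistent with the smoothing role the text assigns to ε.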
Further, the dictionary extraction in step 2 is as follows:
The image blocks preprocessed in step 1 are the input to the Map nodes. The cluster centres are first initialized. Multiple Map nodes then read the preprocessed image data in parallel and assign each element to its nearest cluster centre. On the Reduce nodes, the elements of each cluster are aggregated and new cluster centres are computed. If the change between the new cluster centres and the previous ones is below a set threshold, iteration ends and the cluster centres are output; otherwise the cluster centres are updated and a new iteration begins.
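The map and reduce roles described above can be sketched in plain Python. This is a single-process illustration under stated assumptions and does not use the Hadoop API: in-memory lists stand in for input splits, and the names `map_assign`, `reduce_update` and `distributed_kmeans`, the squared Euclidean distance, the largest centre shift as the convergence measure and the threshold value are all hypothetical choices.

```python
from collections import defaultdict


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def map_assign(split, centres):
    """Map role: emit (nearest-centre index, (block, 1)) for every block in one split."""
    for block in split:
        nearest = min(range(len(centres)), key=lambda k: squared_distance(block, centres[k]))
        yield nearest, (block, 1)


def reduce_update(pairs, centres):
    """Reduce role: sum the blocks assigned to each centre and return the new centres."""
    sums = {}
    counts = defaultdict(int)
    for k, (block, count) in pairs:
        if k in sums:
            sums[k] = [s + v for s, v in zip(sums[k], block)]
        else:
            sums[k] = list(block)
        counts[k] += count
    # Assumption: a centre that received no blocks keeps its previous position.
    return [
        [s / counts[k] for s in sums[k]] if k in sums else list(centres[k])
        for k in range(len(centres))
    ]


def distributed_kmeans(splits, centres, threshold=1e-6, max_rounds=100):
    """Iterate map and reduce rounds until the centres move less than the threshold."""
    for _ in range(max_rounds):
        pairs = [pair for split in splits for pair in map_assign(split, centres)]
        updated = reduce_update(pairs, centres)
        shift = max(squared_distance(a, b) for a, b in zip(centres, updated))
        centres = updated
        if shift < threshold:
            break
    return centres


# Two splits, as if read by two Map nodes; the initial centres are arbitrary.
splits = [[[0.0, 0.0], [0.0, 2.0]], [[10.0, 10.0], [10.0, 12.0]]]
dictionary = distributed_kmeans(splits, centres=[[1.0, 1.0], [9.0, 9.0]])
```

The returned centres play the role of the dictionary. In a real Hadoop job each round would be one MapReduce job, with the current centres shipped to every Map task.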
Further, step 3 proceeds as follows:
The dictionary obtained in step 2 is distributed in parallel to multiple Map nodes, and a new image data set is input to each Map node. Feature learning is performed on the image data set on the Map nodes, and the input image data are mapped to features according to:
$$f^{(i)}(x) = \left[ f_1^{(i)}(x), \dots, f_k^{(i)}(x), \dots, f_N^{(i)}(x) \right], \quad k = 1, \dots, N$$
$$f_k^{(i)}(x) = \max\{ 0, \ \mu^{(i)}(z) - z_k^{(i)} \}$$
$$z_k^{(i)} = \left\| x_{PCAwhite}^{(i)} - c^{(k)} \right\|_2$$
$$\mu^{(i)}(z) = \left( z_1^{(i)} + \dots + z_k^{(i)} + \dots + z_N^{(i)} \right) / N \qquad (5)$$
where $f^{(i)}(x)$ is the new feature representation of image block $x_{PCAwhite}^{(i)}$, N is the number of cluster centres in the dictionary extracted in step 2, and $c^{(k)}$ is the k-th cluster centre. The mapping outputs 0 for feature $f_k$ whenever the distance to cluster centre $c^{(k)}$ exceeds the mean distance.
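Formula (5) can be illustrated with a short sketch. The function name `feature_map` and the toy centres are assumptions made for the example; the computation itself follows the formula: Euclidean distance to every cluster centre, then the mean distance minus each distance, clipped at zero.

```python
import math


def feature_map(block, centres):
    """Map one whitened block to an N-dimensional feature vector as in formula (5)."""
    distances = [math.dist(block, c) for c in centres]   # z_k
    mean_distance = sum(distances) / len(distances)      # mu(z)
    # Centres farther away than average contribute 0.
    return [max(0.0, mean_distance - z) for z in distances]


# Toy dictionary of three centres; the block coincides with the first centre.
centres = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
features = feature_map([0.0, 0.0], centres)
```

In this toy case only the nearest centre produces a non-zero feature, which shows how the mapping yields sparse activations without solving an optimization problem.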
Further, in the above technical solution, once the image features are obtained, image classification is a multi-class problem, so steps 4 and 5 use a DAG-SVM classifier for the final training and classification.
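The evaluation side of a DAG-style multi-class classifier can be sketched as follows. This is a hedged illustration, not the classifier used in the invention: `dag_classify` and `toy_decide` are hypothetical names, and `toy_decide` is a stand-in for trained pairwise SVMs. The sketch shows the property that each binary decision eliminates one candidate class, so an n-class sample is labelled after n − 1 decisions.

```python
def dag_classify(sample, classes, pairwise_decide):
    """Walk a decision DAG: each binary test eliminates one candidate class."""
    candidates = list(classes)
    evaluations = 0
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        winner = pairwise_decide(sample, first, last)
        evaluations += 1
        # Drop whichever of the two compared classes lost.
        if winner == first:
            candidates.pop()
        else:
            candidates.pop(0)
    return candidates[0], evaluations


def toy_decide(sample, a, b):
    """Hypothetical stand-in for a trained binary SVM: pick the class index nearer the sample."""
    return a if abs(sample - a) <= abs(sample - b) else b


label, evaluations = dag_classify(6.8, classes=range(10), pairwise_decide=toy_decide)
```

With 10 classes the walk makes 9 binary decisions. A full DAG-SVM would still train one binary SVM for every pair of classes; only the evaluation path is this short.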
The beneficial effects of the invention are as follows:
Building on an image feature extraction algorithm, the invention uses the unsupervised learning method K-means to learn features. Because K-means has far fewer training parameters than traditional unsupervised learning methods, the algorithm greatly reduces classification complexity while maintaining classification accuracy. In addition, on the basis of the big data processing platform Hadoop, each layer of the deep hierarchical feature learning is parallelized, which reduces time cost and resource overhead.
Brief description of the drawings
Fig. 1 is the flow framework of the image classification method based on distributed K-means.
Fig. 2 is the flow chart of dictionary extraction in the image classification method based on distributed K-means.
Fig. 3 illustrates the classification process for the multi-class image classification problem.
Fig. 4 shows the effect of whitening on the dictionary.
Fig. 5 is the Hadoop network topology.
Embodiments
To make the objectives, technical solution and beneficial effects of the invention clearer, the invention is described in more detail below with reference to an example and the accompanying drawings.
The invention is applicable to large-scale image classification. The method uses a distributed K-means algorithm on the big data processing platform Hadoop to extract image features and finally classifies large-scale image collections. Drawing on recent research in image processing and machine learning, the invention performs dictionary learning on large-scale image data, constructs a feature mapping function and designs a classification algorithm, and proposes a feature extraction algorithm based on distributed K-means on top of Hadoop. The method avoids the tedious work of hand-designing features for large-scale image collections and reduces training time while maintaining classification accuracy. The results are significant for large-scale database management, military applications, medical care and similar fields.
Example
The hardware and software environment for the test experiments of this embodiment is as follows; the experimental topology is shown in Fig. 5.
Hardware environment:
Computer type: desktop computer
CPU: Pentium(R) Dual-Core CPU E5600 2.93 GHz
Memory: 4.00 GB (3.49 GB usable)
System type: 32-bit operating system
Graphics card: integrated graphics
Software environment:
IDE: Eclipse
Image processing SDK: JavaCV
Development language: Java
As shown in Fig. 1, the feature extraction algorithm of the invention for large-scale image classification comprises the following steps:
Step 1. Preprocess the training images.
Input the training image data set and divide each training image into multiple image blocks. Apply regularization and then whitening to each image block to remove interfering information and retain key information; the result is passed to the next step as input.
Step 2. On the big data processing platform Hadoop, parallelize the K-means algorithm and extract a dictionary, using the preprocessed image block information from step 1 as input.
The process by which the distributed K-means algorithm extracts the dictionary is shown in Fig. 2. The cluster centres are first initialized. Multiple Map nodes then read the preprocessed image data in parallel and assign each element to its nearest cluster centre. On the Reduce nodes, the elements of each cluster are aggregated and new cluster centres are computed. If the change between the new cluster centres and the previous ones is below a set threshold, iteration ends and the cluster centres are output; otherwise the cluster centres are updated and a new iteration begins.
Step 3. After the dictionary is extracted, construct a feature mapping function and map the preprocessed training image blocks to a new feature representation. In computer vision, many feature mapping functions require very large amounts of computation time and storage, because each requires solving a complex optimization problem. The literature shows that sparse coding can perform feature mapping well, but sparse coding shows certain limitations when labeled image data are very scarce. For feature extraction from large-scale image collections, A. Coates et al. [A. Coates, A. Y. Ng, H. Lee. An analysis of single-layer networks in unsupervised feature learning [C]. International Conference on Artificial Intelligence and Statistics, 2011: 215–223.] showed that formula (5) above achieves good results. Therefore, after extracting the dictionary, this embodiment uses that formula for feature extraction.
Step 4. Input the new feature representations of the training image blocks obtained in step 3 into an SVM classifier and train it for image classification.
Cortes and Vapnik first proposed the support vector machine (SVM) in 1995. It handles small-sample, nonlinear and high-dimensional pattern recognition problems well and extends to other machine learning tasks such as function fitting. The SVM is built on statistical learning theory and the structural risk minimization principle: given limited sample information, it seeks the best compromise between model complexity and learning ability in order to obtain the best generalization ability. In the algorithm of the invention, once the features of the labeled image data are obtained, the image features and the corresponding image labels are input into the SVM classifier to train it. Large-scale image classification involves many image categories and is therefore a multi-class problem; this example solves it with the DAG-SVM classification method, whose classification process is shown in Fig. 3. In this example, the ImageNet subset has 50 classes, CIFAR-100 has 10 classes and STL-10 has 10 classes. For a data set with n classes, classifying a sample requires n − 1 classifiers: 49 for ImageNet (n = 50), 9 for CIFAR-100 (n = 10) and 9 for STL-10 (n = 10). This both speeds up classification and avoids overlapping classes and unclassifiable samples.
Step 5. For a target image to be classified, perform image block division, regularization, whitening and feature extraction in turn, then classify it with the trained SVM classifier.
To verify the effect of the invention, this embodiment ran experiments on the large-scale data sets ImageNet, CIFAR-100 and STL-10. From ImageNet, 50 classes were selected, 60,000 images in total, of which 40,000 were used as the training set and the rest as the test set. The whole CIFAR-10 data set was used, with 50,000 images for training and 10,000 images for testing. The whole STL-10 data set was used; it contains 10 classes of 96 × 96 pixel images, with 500 training images and 800 test images per class. The method achieved very good classification results on all three data sets.
Fig. 4 shows the effect of whitening on the learned dictionary. Fig. 4(a) shows the dictionary (cluster centres) learned by K-means from image data that were not whitened. Because the pixels are strongly correlated, the learned dictionary is itself highly correlated, and such a dictionary performs very poorly in image classification. Fig. 4(c) shows the dictionary learned by K-means from whitened image data. Whitening removes the correlation between pixels, so the resulting dictionary is closer to orthogonal and performs well in image classification.
Fig. 4(b) depicts the effect of whitening on the K-means algorithm. On the left, without whitening, the dictionary (cluster centres) is biased by the correlated data. On the right, with whitening, the dictionary obtained by K-means has better orthogonality.
Table 1 shows the effect of whitening on image classification accuracy for the image feature extraction algorithm based on distributed K-means, on the ImageNet and CIFAR-100 data sets. It compares the accuracy obtained with image features extracted directly from the original image data against the accuracy obtained with features extracted from whitened image data. On ImageNet, features extracted from whitened data give a classification accuracy of 70.19%, which is 7.71 percentage points higher than features extracted directly from the original data. On CIFAR-100, features extracted from whitened data reach 55.38%, whereas features extracted from the original data reach only 48.07%. Whitening therefore has a decisive effect on image classification accuracy.
This embodiment also tested the image feature extraction algorithm based on distributed K-means on the STL-10 data set, using final image classification accuracy to verify its effectiveness. As Table 2 shows, the method of the invention reaches a classification accuracy of 56.17% on STL-10. This is 0.17 percentage points higher than SC Features, K-means encoding [2], 1.27 points higher than VQ (1 layer) [1], 2.67 points higher than Sparse Filtering [3] and 3.27 points higher than Reconstruction ICA [2], so the method compares favourably with these approaches. Sparse Coding [1] reaches 59.0% on STL-10, 2.83 percentage points higher than the method described here. The proposed method runs on the big data processing platform Hadoop, whose internal processing distributes data across multiple Map nodes and aggregates it on Reduce nodes, and this can cost some accuracy. Moreover, an image feature extraction algorithm built on Hadoop needs large-scale image data for training, so on the relatively small STL-10 data set the method falls slightly short in accuracy.
Table 1. Effect of whitening on image classification accuracy
Table 2. Comparison of image classification accuracy on data set STL-10
Algorithm                           Accuracy
The method of this embodiment       56.17%
Sparse Coding                       59.0%
SC Features, K-means encoding       56.0%
VQ (1 layer)                        54.9%
Sparse Filtering                    53.5%
Reconstruction ICA                  52.9%
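The margins quoted in the text for Table 2 can be rechecked with a few lines of arithmetic. The accuracies are copied from the table; the variable names are illustrative only.

```python
# STL-10 accuracies in percent, as listed in Table 2.
accuracy = {
    "The method of this embodiment": 56.17,
    "Sparse Coding": 59.0,
    "SC Features, K-means encoding": 56.0,
    "VQ (1 layer)": 54.9,
    "Sparse Filtering": 53.5,
    "Reconstruction ICA": 52.9,
}

ours = accuracy["The method of this embodiment"]
# Positive margin: this embodiment is ahead; negative: it is behind.
margins = {
    name: round(ours - value, 2)
    for name, value in accuracy.items()
    if name != "The method of this embodiment"
}
```

A negative margin appears only for Sparse Coding, which matches the shortfall discussed in the text.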
References for this embodiment:
[1] A. Coates, A. Y. Ng. The importance of encoding versus training with sparse coding and vector quantization [C]. International Conference on Machine Learning, 2011: 921–928.
[2] Q. V. Le, A. Karpenko, J. Ngiam, et al. ICA with reconstruction cost for efficient overcomplete feature learning [C]. Advances in Neural Information Processing Systems, 2011: 1017–1025.
[3] J. Ngiam, Z. Chen, B. S. A., et al. Sparse filtering [C]. Advances in Neural Information Processing Systems, 2011: 1125–1133.

Claims (6)

1. A mass image classification method based on distributed K-means, comprising the following steps:
Step 1. Preprocess the training images.
Input the training image data set and divide each training image into multiple image blocks. Apply regularization and then whitening to each image block to remove interfering information and retain key information; the result is passed to the next step as input.
Step 2. On the big data processing platform Hadoop, parallelize the K-means algorithm and extract a dictionary, using the preprocessed image block information from step 1 as input.
Step 3. After the dictionary is extracted, construct a feature mapping function and map the preprocessed training image blocks to new feature representations.
Step 4. Input the new feature representations of the training image blocks obtained in step 3 into an SVM classifier and train it for image classification.
Step 5. For a target image to be classified, perform image block division, regularization, whitening and feature extraction in turn, then classify it with the trained SVM classifier.
2. The mass image classification method based on distributed K-means according to claim 1, characterized in that the regularization in step 1 is as follows:
$$\tilde{x}^{(i)} = \frac{x^{(i)} - \operatorname{mean}(x^{(i)})}{\sqrt{\operatorname{var}(x^{(i)}) + \sigma}} \qquad (1)$$
where $x^{(i)}$ is the i-th input image block, and $\operatorname{var}(x^{(i)})$ and $\operatorname{mean}(x^{(i)})$ are the variance and mean of all elements of $x^{(i)}$; σ is a preset constant added before the division, which suppresses noise and prevents division by zero when the variance approaches zero.
3. The mass image classification method based on distributed K-means according to claim 1, characterized in that PCA whitening is applied to each regularized image block $\tilde{x}^{(i)}$ to reduce the correlation between pixels:
$$x_{rot}^{(i)} = (U^{(i)})^{T} \tilde{x}^{(i)} \qquad (2)$$
$$x_{PCAwhite}^{(i)} = \frac{x_{rot}^{(i)}}{\sqrt{\lambda^{(i)} + \epsilon}} \qquad (3)$$
where $\lambda^{(i)}$ and $U^{(i)}$ are the eigenvalues and eigenvectors for image block $\tilde{x}^{(i)}$; formula (2) reduces the correlation between the pixels of the input image, formula (3) yields the whitened image block data, and ε is a preset constant.
4. The mass image classification method based on distributed K-means according to claim 1, characterized in that the dictionary extraction in step 2 is as follows:
The image blocks preprocessed in step 1 are the input to the Map nodes. The cluster centres are first initialized. Multiple Map nodes then read the preprocessed image data in parallel and assign each element to its nearest cluster centre. On the Reduce nodes, the elements of each cluster are aggregated and new cluster centres are computed. If the change between the new cluster centres and the previous ones is below a set threshold, iteration ends and the cluster centres are output; otherwise the cluster centres are updated and a new iteration begins.
5. The mass image classification method based on distributed K-means according to claim 1, characterized in that step 3 proceeds as follows:
The dictionary obtained in step 2 is distributed in parallel to multiple Map nodes, and a new unlabeled image data set is input to each Map node. Feature learning is performed on the image data set on the Map nodes, and the input image data are mapped to features according to:
$$f^{(i)}(x) = \left[ f_1^{(i)}(x), \dots, f_k^{(i)}(x), \dots, f_N^{(i)}(x) \right], \quad k = 1, \dots, N$$
$$f_k^{(i)}(x) = \max\{ 0, \ \mu^{(i)}(z) - z_k^{(i)} \}$$
$$z_k^{(i)} = \left\| x_{PCAwhite}^{(i)} - c^{(k)} \right\|_2$$
$$\mu^{(i)}(z) = \left( z_1^{(i)} + \dots + z_k^{(i)} + \dots + z_N^{(i)} \right) / N \qquad (5)$$
where $f^{(i)}(x)$ is the new feature representation of image block $x_{PCAwhite}^{(i)}$, N is the number of cluster centres in the dictionary extracted in step 2, and $c^{(k)}$ is the k-th cluster centre.
6. The mass image classification method based on distributed K-means according to claim 1, characterized in that steps 4 and 5 use a DAG-SVM classifier for the final training and classification.
CN201510363396.0A 2015-06-26 2015-06-26 Mass image classification method based on distributed K-means Expired - Fee Related CN104933445B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510363396.0A CN104933445B (en) 2015-06-26 2015-06-26 Mass image classification method based on distributed K-means

Publications (2)

Publication Number Publication Date
CN104933445A true CN104933445A (en) 2015-09-23
CN104933445B CN104933445B (en) 2019-05-14

Family

ID=54120605

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510363396.0A Expired - Fee Related CN104933445B (en) 2015-06-26 2015-06-26 Mass image classification method based on distributed K-means

Country Status (1)

Country Link
CN (1) CN104933445B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103390165A (en) * 2012-05-10 2013-11-13 北京百度网讯科技有限公司 Picture clustering method and device
US20130322740A1 (en) * 2012-05-31 2013-12-05 Lihui Chen Method of Automatically Training a Classifier Hierarchy by Dynamic Grouping the Training Samples
CN103473121A (en) * 2013-08-20 2013-12-25 西安电子科技大学 Mass image parallel processing method based on cloud computing platform
CN103955707A (en) * 2014-05-04 2014-07-30 电子科技大学 Mass image sorting system based on deep character learning
CN104199899A (en) * 2014-08-26 2014-12-10 浪潮(北京)电子信息产业有限公司 Method and device for storing massive pictures based on Hbase

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718935A (en) * 2016-01-25 2016-06-29 南京信息工程大学 Word frequency histogram calculation method suitable for visual big data
CN107545271B (en) * 2016-06-29 2021-04-09 阿里巴巴集团控股有限公司 Image recognition method, device and system
CN107545271A (en) * 2016-06-29 2018-01-05 阿里巴巴集团控股有限公司 Image-recognizing method, device and system
CN106203508A (en) * 2016-07-11 2016-12-07 天津大学 An image classification method based on the Hadoop platform
WO2018027459A1 (en) * 2016-08-08 2018-02-15 深圳市博信诺达经贸咨询有限公司 Method and system for classifying and comparing application in big data
CN106355202A (en) * 2016-08-31 2017-01-25 广州精点计算机科技有限公司 Image feature extraction method based on K-means clustering
CN106777347A (en) * 2017-01-17 2017-05-31 广东容祺智能科技有限公司 A big data processing system for unmanned aerial vehicle power-line inspection
CN107122653A (en) * 2017-05-11 2017-09-01 湖南星汉数智科技有限公司 An image CAPTCHA processing method and device
CN110175546A (en) * 2019-05-15 2019-08-27 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium
WO2020228163A1 (en) * 2019-05-15 2020-11-19 深圳市商汤科技有限公司 Image processing method and apparatus, and electronic device and storage medium
JP2021528715A (en) * 2019-05-15 2021-10-21 シェンチェン センスタイム テクノロジー カンパニー リミテッド (Shenzhen Sensetime Technology Co., Ltd) Image processing methods and devices, electronic devices and storage media
CN110175546B (en) * 2019-05-15 2022-02-25 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium
JP7128906B2 (en) 2019-05-15 2022-08-31 シェンチェン センスタイム テクノロジー カンパニー リミテッド Image processing method and apparatus, electronic equipment and storage medium
CN110277166A (en) * 2019-06-28 2019-09-24 曾清福 A hysteroscopy-laparoscopy auxiliary diagnosis system and method
CN112507895A (en) * 2020-12-14 2021-03-16 广东电力信息科技有限公司 Method and device for automatically classifying qualification certificate files based on big data analysis

Also Published As

Publication number Publication date
CN104933445B (en) 2019-05-14

Similar Documents

Publication Publication Date Title
CN104933445A (en) Mass image classification method based on distributed K-means
Xu et al. Overfitting remedy by sparsifying regularization on fully-connected layers of CNNs
CN103955707B (en) A mass image classification system based on deep hierarchical feature learning
Masud et al. Facing the reality of data stream classification: coping with scarcity of labeled data
Boutsidis et al. Random projections for $k$-means clustering
CN111898703B (en) Multi-label video classification method, model training method, device and medium
WO2022048363A1 (en) Website classification method and apparatus, computer device, and storage medium
Yang et al. An ensemble classification algorithm for convolutional neural network based on AdaBoost
CN105678261B (en) Transductive data dimensionality reduction method based on a supervised graph
CN111488917A (en) Garbage image fine-grained classification method based on incremental learning
Zhao et al. Bisecting k-means clustering based face recognition using block-based bag of words model
CN114913379B (en) Remote sensing image small sample scene classification method based on multitasking dynamic contrast learning
Jia et al. Adaptive neighborhood propagation by joint L2, 1-norm regularized sparse coding for representation and classification
Xing et al. Oracle bone inscription detection: a survey of oracle bone inscription detection based on deep learning algorithm
Dong et al. Feature extraction through contourlet subband clustering for texture classification
Lu et al. A novel travel-time based similarity measure for hierarchical clustering
CN112668482A (en) Face recognition training method and device, computer equipment and storage medium
Maddumala A Weight Based Feature Extraction Model on Multifaceted Multimedia Bigdata Using Convolutional Neural Network.
CN108388918B (en) Data feature selection method with structure retention characteristics
CN106033546B (en) Behavior classification method based on top-down learning
CN102930258A (en) Face image recognition method
CN102496027B (en) Semi-supervised image classification method based on constrained adaptive transmission
Xu et al. X2-Softmax: Margin adaptive loss function for face recognition
Zong et al. Incomplete multi-view clustering with partially mapped instances and clusters
Liu et al. Stochastic gradient support vector machine with local structural information for pattern recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190514
