WO2006087854A1 - Information classification device, information classification method, information classification program, information classification system - Google Patents
- Publication number
- WO2006087854A1 (application PCT/JP2005/021095)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- information
- population
- classification
- distance
- statistical
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23211—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with adaptive number of clusters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24133—Distances to prototypes
- G06F18/24137—Distances to cluster centroïds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/761—Proximity, similarity or dissimilarity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/762—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
- G06V10/763—Non-hierarchical techniques, e.g. based on statistics of modelling distributions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/80—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu
Definitions
- Information classification device, information classification method, information classification program, information classification system
- The present invention relates to an information classification device, an information classification method, an information classification program, an information classification system, an information providing service using the information classification system, a computer-readable recording medium that records post-classification populations classified by the information classification system, and a database that stores an arbitrary number of populations so that the population to which classification target sample information belongs can be searched for using the information classification system.
- In particular, the invention relates to such a device, method, program, system, service, recording medium, and database suited to statistically classifying information.
- Conventionally, classification methods for information recognition and classification obtain an optimal solution either by orthogonally decomposing a plurality of matrix information groups, or by arithmetically finding the optimal solution with algorithms such as likelihood estimation (for example, the Baum-Welch algorithm) or minimum error classification.
- Patent Document 1 discloses a method for optimizing boundary conditions by making the Mahalanobis distance constant.
- There is also a method called the EM algorithm, which maximizes expected values: based on the appearance frequency distribution and likelihood distribution of the samples in a mixture distribution of populations, the local solution is changed continuously and a locally optimal solution is generated recursively.
- Another known method is the SVM (support vector machine).
- In Non-Patent Document 1, the mean, variance, and standard deviation of a population are estimated by the Bayesian method from the center of gravity of the entire population, and it is evaluated whether a given position lies within a specific range of the standard deviation.
- Non-Patent Document 2 describes the high accuracy of phoneme evaluation using Mahalanobis distance.
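- For reference, the Mahalanobis distance mentioned in Patent Document 1 and Non-Patent Document 2 is conventionally defined as below. The symbols are notation introduced here and do not appear in the source: x is a sample vector, μ is the mean vector of the population, and Σ is its covariance matrix.

```latex
d_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{\top} \, \Sigma^{-1} \, (\mathbf{x} - \boldsymbol{\mu})}
```

- When Σ is the identity matrix, this reduces to the ordinary Euclidean distance between x and μ.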
- Patent Document 1: Japanese Patent Application Laid-Open No. 2003-76976
- Non-Patent Document 1: Gen Furujo, Hiroshi Wakuya, “Estimation of data distribution based on Bayesian reasoning realized by neural network”, Institute of Electrical Engineers of Japan, October 2003, IM-03-55, pp. 13-18
- Non-Patent Document 2: Nakamura Toshinobu, Iwano Koji, Furui Sadaaki, “Analysis of Acoustic Characteristics of Japanese Spoken Speech Using Mahalanobis Distance”, Acoustical Society of Japan 2005 Spring Meeting Presentation, March 2005, Vol. 1, 2-1-14, pp. 231-232
- FIG. 7 is a diagram illustrating an example of a normal distribution.
- FIG. 8 shows an example of a non-normal distribution.
- However, the boundary of a population specified by a person necessarily differs depending on the situation in which that person interprets the information, so a non-normal distribution such as the one in FIG. 8 is formed. For this reason, there was a problem that a population boundary given by an optimal solution based on a normal distribution, as shown in FIG. 7, could not be obtained arithmetically.
- In addition, because the mixture distribution is not necessarily a mixture of normal distributions, many high-likelihood local solutions appear that can each be interpreted as the optimal solution of a normal distribution. For this reason, arithmetic optimal solutions are generated without limit or in greater number than necessary, and they do not serve as practical optimal solutions. In general, there was a problem that stable classification could not be realized.
- In the K-means method, the initial number of populations becomes the number of populations after optimization, so populations are not increased or decreased autonomously. In addition, if the arbitrary centers of gravity specified at the initial stage are not appropriate, there is a problem that stable classification into populations cannot be realized.
- Patent Document 1 only explains that a neural network functions optimally when the Mahalanobis distance is kept constant. For this reason, even if it were used for clustering, samples would only be classified as inside or outside the average distance of the samples that make up the population, and it does not solve the problem of populations increasing without limit or more than necessary.
- The EM algorithm is known to construct local solutions without limit or in greater number than necessary, and there is a problem that populations cannot always be classified stably.
- SVM is a method for determining a boundary condition and a boundary width by nonlinearly mapping a population into a space of another dimension with an arbitrary function.
- With SVM as well, there is a problem that stable classification of the population is not always possible.
- Non-Patent Document 1 evaluates attribution based on the variance and standard deviation with the population centroid as the mean, and outputs the result with a multi-layer neural network.
- It does not evaluate the average and standard deviation of the evaluation distance within the population to which a sample belongs, as the present invention does, and it neither presents the problems related to information classification nor demonstrates a solution to them.
- Non-Patent Document 2 reports analysis results and a discussion showing that speech analysis using the Mahalanobis distance yields high correlation; it does not present the specific problems, their solutions, or a demonstration of them.
- The present invention has been made to solve the above-mentioned problems, and one object of the present invention is to provide an information classification device, an information classification method, an information classification program, and an information classification system capable of autonomously and stably classifying sample information into populations.
- Another object of the present invention is to provide an information classification device, an information classification method, an information classification program, and an information classification system that can mutually evaluate sample information having different component aspects.
- An information classification device includes a distance calculation unit, a statistical information calculation unit, an attribution degree evaluation unit, an attribution determination unit, and a sample information attribution unit.
- The distance calculation unit calculates, for each of an arbitrary number of populations containing sample information, the statistical distance between the centroid of the sample information belonging to that population and the classification target sample information.
- The statistical information calculation unit calculates statistical information for each population regarding the statistical distance calculated by the distance calculation unit.
- The attribution degree evaluation unit evaluates the degree of attribution of the classification target sample information to each population based on the statistical distance calculated by the distance calculation unit and the statistical information calculated by the statistical information calculation unit.
- The attribution determination unit determines to which population the classification target sample information is to be attributed according to the degree of attribution evaluated by the attribution degree evaluation unit.
- The sample information attribution unit attributes the classification target sample information to the population determined by the attribution determination unit.
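- The five units above can be sketched as a single classification step. This is a minimal illustration under stated assumptions, not the patent's implementation: plain Euclidean distance stands in for the statistical distance, the degree of attribution is taken as the signed deviation of a sample's distance from the population's mean distance in units of the standard deviation (smaller is better), and the names `distance_stats`, `attribution_degree`, and `classify` are hypothetical.

```python
import math
from statistics import mean, pstdev

def distance_stats(samples):
    """Centroid of a population, plus mean and std of member-to-centroid distances."""
    centroid = [mean(col) for col in zip(*samples)]
    dists = [math.dist(s, centroid) for s in samples]
    # Floor the std so a population with identical distances does not divide by zero.
    return centroid, mean(dists), max(pstdev(dists), 1e-9)

def attribution_degree(x, samples):
    """Assumed degree of attribution: (distance - mean distance) / std. Lower is better."""
    centroid, mu, sigma = distance_stats(samples)
    return (math.dist(x, centroid) - mu) / sigma

def classify(x, populations):
    """Evaluate x against every population, pick the best one, and attribute x to it."""
    degrees = {name: attribution_degree(x, members)
               for name, members in populations.items()}
    best = min(degrees, key=degrees.get)
    populations[best].append(x)
    return best, degrees

populations = {
    "a": [[0, 0], [2, 0], [0, 2], [2, 2], [1, 1]],
    "b": [[10, 10], [12, 10], [10, 12], [12, 12], [11, 11]],
}
chosen, degrees = classify([1.2, 0.9], populations)
```

- In this toy data the sample `[1.2, 0.9]` lies near the centroid of population `a`, so its deviation there is far smaller than for `b` and it is attributed to `a`.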
- According to the present invention, the information classification device calculates the statistical distance between the centroid of the sample information belonging to each of an arbitrary number of populations containing sample information and the classification target sample information, and calculates statistical information for each population regarding the calculated statistical distance. Based on the calculated statistical distance and statistical information, the degree of attribution of the classification target sample information to each population is evaluated; according to the evaluated degree of attribution, it is determined to which population the classification target sample information is to be attributed; and the classification target sample information is attributed to the determined population.
- The information classification device thus attributes the classification target sample information to whichever population corresponds to its degree of attribution.
- As a result, an information classification device that can autonomously and stably classify sample information into populations can be provided.
- The statistical information is the average value and standard deviation value, for each population, of the statistical distance calculated by the distance calculation unit.
- The distance calculation unit further calculates the statistical distance between the centroid of each updated population, to which classification target sample information has been attributed by the sample information attribution unit, and the classification target sample information belonging to each updated population.
- The information classification device thus further calculates the statistical distance between the centroid of each updated population and the classification target sample information belonging to it and, based on the calculated statistical distance, further attributes the classification target sample information to one of the populations according to its degree of attribution.
- As a result, the information classification device can recursively classify sample information into populations.
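- The recursive re-classification described above can be sketched as a loop that recomputes each population's statistics from its current members and re-attributes every sample until nothing moves. This is a sketch under assumptions: Euclidean distance replaces the statistical distance, the degree of attribution is the signed deviation in units of the standard deviation, and `reclassify` and its `max_iter` cap are hypothetical.

```python
import math
from statistics import mean, pstdev

def stats(samples):
    """Centroid, mean distance, and (floored) std of distances for one population."""
    c = [mean(col) for col in zip(*samples)]
    ds = [math.dist(s, c) for s in samples]
    return c, mean(ds), max(pstdev(ds), 1e-9)

def degree(x, st):
    """Assumed degree of attribution: lower means a better fit."""
    c, mu, sigma = st
    return (math.dist(x, c) - mu) / sigma

def reclassify(populations, max_iter=20):
    """Re-attribute all samples against updated centroids until assignments are stable."""
    for _ in range(max_iter):
        # Statistics are taken from the populations as they stand at the start of the pass.
        st = {name: stats(m) for name, m in populations.items() if m}
        updated = {name: [] for name in st}
        moved = False
        for name, members in populations.items():
            for x in members:
                best = min(st, key=lambda n: degree(x, st[n]))
                updated[best].append(x)
                moved = moved or best != name
        populations = updated
        if not moved:
            break
    return populations

start = {
    "a": [[0, 0], [2, 0], [0, 2], [2, 2], [1, 1], [11, 10]],  # last sample is misplaced
    "b": [[10, 10], [12, 10], [10, 12], [12, 12], [11, 11]],
}
result = reclassify(start)
```

- Here the misplaced sample `[11, 10]` moves from `a` to `b` on the first pass, and the second pass finds no further changes.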
- The attribution determination unit includes a population generation unit that newly generates a population when the degree of attribution to every population is outside a predetermined range, and determines that the classification target sample information is to be attributed to the generated population.
- When the degree of attribution to every population is outside the predetermined range, the information classification device generates a new population and attributes the classification target sample information to it.
- The sample information belonging to a population is therefore sample information whose degree of attribution is within the predetermined range.
- As a result, the information classification device can classify sample information so that it falls within the predetermined range with respect to its population.
- The statistical information is the average value and standard deviation value of the statistical distance calculated by the distance calculation unit for each population; the degree of attribution is the deviation of the statistical distance from the average value for the population; and the predetermined range is the range in which that deviation is within a predetermined multiple of the standard deviation value.
- When the deviation from the average value of the statistical distance is outside the range of the predetermined multiple of the standard deviation value for every population, the information classification device generates a new population and attributes the classification target sample information to the generated population.
- The information classification device can therefore classify sample information so that, for each population, the deviation from the average value of the statistical distance falls within the predetermined multiple of the standard deviation value. As a result, sample information can be classified into populations whose distribution approaches a normal distribution, in which a fixed percentage of the member samples lies within the predetermined multiple of the standard deviation value from the average value.
- the information classification device includes a population deletion unit that deletes a population to which a predetermined number of pieces of sample information are not attributed and causes sample information belonging to the deleted population to belong to another population. Further prepare.
- the information classification device deletes a population to which a predetermined number of sample information is not attributed, and samples information attributed to the deleted population is attributed to another population. For this reason, invalid populations are deceived.
- the attribution determining unit determines that the classification target sample information is attributed to the population having the highest attribution degree evaluated by the attribution degree evaluating unit.
- the information classification device determines that the classification target sample information is to be attributed to the population with the best evaluated degree of attribution, and attributes the classification target sample information to the determined population.
- the classification target sample information is thus attributed to the population for which its degree of attribution is highest.
- the sample information can be optimally classified into the population.
- the distance calculation unit calculates a statistical distance based on a covariance structure analysis.
- the distance calculation unit calculates the statistical distance based on eigenvalues and eigenvectors.
- the distance calculation unit calculates the Mahalanobis distance as the statistical distance.
- the distance calculation unit calculates a distance by a Bayes discriminant function as a statistical distance.
- the distance calculation unit includes a distance normalization unit that normalizes the calculated statistical distance.
- the statistical distance is normalized by the information classification device. As a result, statistical distance can be easily handled by the information classifier.
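- as an illustration of the Mahalanobis distance named above as one choice of statistical distance, the following is a minimal sketch in plain Python. The function names `covariance_matrix`, `solve`, and `mahalanobis` are hypothetical, and the use of the sample covariance with an n - 1 denominator is an assumption; the source does not specify an implementation.

```python
from statistics import mean


def covariance_matrix(samples):
    """Return (mean vector, sample covariance matrix) of equal-length vectors."""
    n, dim = len(samples), len(samples[0])
    mu = [mean(s[j] for s in samples) for j in range(dim)]
    cov = [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    return mu, cov


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def mahalanobis(x, mu, cov):
    """Mahalanobis distance sqrt((x - mu)^T cov^{-1} (x - mu))."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    weighted = solve(cov, diff)  # cov^{-1} (x - mu) without forming the inverse
    return sum(d * w for d, w in zip(diff, weighted)) ** 0.5
```

- in this sketch, `covariance_matrix` would be called once per population on the sample information attributed to it, and `mahalanobis` would then be called with each classification target sample and that population's centroid and covariance.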
- the information classification method is executed by a computer and includes: a step of calculating a statistical distance between the centroid, for each population, of the sample information belonging to each of an arbitrary number of populations including sample information and the classification target sample information; a step of calculating statistical information for each population about the calculated statistical distance; a step of evaluating the degree of attribution of the classification target sample information to each population based on the calculated statistical distance and statistical information; a step of determining to which population the classification target sample information is to be attributed according to the evaluated degree of attribution; and a step of attributing the classification target sample information to the determined population.
- an information classification method capable of autonomously and stably classifying sample information into populations can be provided.
- the information classification program is executed by a computer and causes the computer to execute: a step of calculating a statistical distance between the centroid, for each population, of the sample information belonging to each of an arbitrary number of populations including sample information and the classification target sample information; a step of calculating statistical information for each population about the calculated statistical distance; a step of evaluating the degree of attribution of the classification target sample information to each population based on the calculated statistical distance and statistical information; a step of determining to which population the classification target sample information is to be attributed according to the evaluated degree of attribution; and a step of attributing the classification target sample information to the determined population.
- an information classification system includes an information classification device and an information terminal connected to the information classification device via a communication line.
- the information classification device includes a population reception unit, a distance calculation unit, a statistical information calculation unit, an attribution degree evaluation unit, an attribution determination unit, a sample information attribution unit, and a post-classification population delivery unit.
- the information terminal includes a population delivery unit and a post-classification population reception unit.
- the population delivery unit delivers an arbitrary number of populations including sample information to the information classification device.
- the population receiving unit receives an arbitrary number of populations including sample information from the information terminal.
- the distance calculation unit calculates a statistical distance between the centroid for each population of sample information belonging to each of the populations received by the population reception unit and the sample information to be classified.
- the statistical information calculation unit calculates statistical information for each population about the statistical distance calculated by the distance calculation unit.
- the belonging degree evaluation unit evaluates the degree of belonging to the population of the classification target sample information based on the statistical distance calculated by the distance calculating unit and the statistical information calculated by the statistical information calculating unit.
- the attribution determination unit determines to which population the classification target sample information is to be attributed, according to the attribution degree evaluated by the attribution degree evaluation unit.
- the sample information attribution unit assigns the classification target sample information to the population determined by the attribution determination unit.
- the post-classification population delivery unit delivers the post-classification population to which the classification target sample information is attributed by the sample information attribution unit to the information terminal.
- the post-classification population receiving unit receives the post-classification population from the information classification device.
- an information classification system capable of providing a population in which sample information is classified autonomously and stably can be provided.
- an information classification system includes an information classification device and an information terminal connected to the information classification device via a communication line.
- the information classification device includes a sample information receiving unit, a distance calculation unit, a statistical information calculation unit, an attribution degree evaluation unit, an attribution determination unit, and a population identification information delivery unit.
- the information terminal includes a sample information delivery unit and a population identification information reception unit.
- the sample information delivery unit delivers the classification target sample information to the information classification device.
- the sample information receiving unit receives the classification target sample information from the information terminal.
- the distance calculation unit calculates the statistical distance between the centroid, for each population, of the sample information belonging to each of an arbitrary number of populations including sample information and the classification target sample information received by the sample information receiving unit.
- the statistical information calculation unit calculates statistical information for each population about the statistical distance calculated by the distance calculation unit.
- the belonging degree evaluation unit evaluates the degree of belonging to the population of the classification target sample information based on the statistical distance calculated by the distance calculating unit and the statistical information calculated by the statistical information calculating unit.
- the attribution determination unit determines to which population the classification target sample information is to be attributed, according to the attribution degree evaluated by the attribution degree evaluation unit.
- the population identification information delivery unit delivers the population identification information for identifying the population determined by the attribution determination unit to the information terminal.
- the population identification information receiving unit receives population identification information from the information classification device.
- an information classification system capable of autonomously and stably providing information that identifies the population to which the classification target sample information belongs can be provided.
- an information providing system used for an information providing service includes: an information classification device; and an information terminal connected to the information classification device via a communication line.
- the information classification device includes a sample information receiving unit, a distance calculation unit, a statistical information calculation unit, an attribution degree evaluation unit, an attribution determination unit, and a population identification information delivery unit.
- the information terminal includes a sample information delivery unit and a population identification information reception unit.
- the sample information delivery unit delivers the classification target sample information to the information classification device.
- the sample information receiving unit receives the classification target sample information from the information terminal.
- the distance calculation unit calculates the statistical distance between the centroid, for each population, of the sample information belonging to each of an arbitrary number of populations including sample information and the classification target sample information received by the sample information receiving unit.
- the statistical information calculation unit calculates statistical information for each population regarding the statistical distance calculated by the distance calculation unit.
- the belonging degree evaluation unit evaluates the degree of belonging to the population of the classification target sample information based on the statistical distance calculated by the distance calculating unit and the statistical information calculated by the statistical information calculating unit.
- the attribution determination unit determines to which population the classification target sample information is to be attributed, according to the attribution degree evaluated by the attribution degree evaluation unit.
- the population identification information delivery unit delivers the population identification information for identifying the population determined by the attribution determination unit to the information terminal.
- the population identification information receiving unit receives the population identification information from the information classification device.
- an information classification system for classifying a post-classification population recorded on a computer-readable recording medium includes an information classification device and an information terminal connected to the information classification device via a communication line.
- the information classification device includes a population receiving unit, a distance calculation unit, a statistical information calculation unit, an attribution degree evaluation unit, an attribution determination unit, a sample information attribution unit, and a post-classification population delivery unit.
- the information terminal includes a population delivery unit and a post-classification population reception unit.
- the population delivery unit delivers an arbitrary number of populations including sample information to the information classification device.
- the population receiving unit receives an arbitrary number of populations including sample information from the information terminal.
- the distance calculation unit calculates a statistical distance between the centroid for each population of sample information belonging to each of the populations received by the population reception unit and the sample information to be classified.
- the statistical information calculation unit calculates statistical information for each population regarding the statistical distance calculated by the distance calculation unit.
- the belonging degree evaluation unit evaluates the degree of belonging to the population of the classification target sample information based on the statistical distance calculated by the distance calculating unit and the statistical information calculated by the statistical information calculating unit.
- the attribution determination unit determines to which population the classification target sample information is to be attributed, according to the attribution degree evaluated by the attribution degree evaluation unit.
- the sample information attribution unit assigns the classification target sample information to the population determined by the attribution determination unit.
- the post-classification population delivery unit delivers the post-classification population to which the classification target sample information is attributed by the sample information attribution unit to the information terminal.
- the post-classification population receiving unit receives the post-classification population from the information classification device.
- a computer-readable recording medium that records a post-classification population classified by an information classification system capable of providing a population in which sample information is autonomously and stably classified can be provided.
- the information classification system used for searching the population to which the sample information to be classified belongs includes an information classification device and an information terminal connected to the information classification device via a communication line.
- the information classification device includes a population receiving unit, a distance calculation unit, a statistical information calculation unit, an attribution degree evaluation unit, an attribution determination unit, a sample information attribution unit, and a post-classification population delivery unit.
- the information terminal includes a population delivery unit and a post-classification population reception unit.
- the population delivery unit delivers an arbitrary number of populations including sample information to the information classification device.
- the population receiving unit receives an arbitrary number of populations including sample information from the information terminal.
- the distance calculation unit calculates a statistical distance between the centroid for each population of sample information belonging to each of the populations received by the population reception unit and the sample information to be classified.
- the statistical information calculation unit calculates statistical information for each population regarding the statistical distance calculated by the distance calculation unit.
- the belonging degree evaluation unit evaluates the degree of belonging to the population of the classification target sample information based on the statistical distance calculated by the distance calculating unit and the statistical information calculated by the statistical information calculating unit.
- the attribution determination unit determines to which population the classification target sample information is to be attributed according to the attribution degree evaluated by the attribution degree evaluation unit.
- the sample information attribution unit assigns the classification target sample information to the population determined by the attribution determination unit.
- the post-classification population delivery unit delivers the post-classification population to which the classification target sample information is attributed by the sample information attribution unit to the information terminal.
- the post-classification population receiving unit receives the post-classification population from the information classification device.
- a database that stores the arbitrary number of populations, for searching for the population to which the classification target sample information belongs using an information classification system capable of providing a population in which sample information is autonomously and stably classified, can be provided.
- the classification target sample information is arbitrary vector information, matrix information, or tensor information in which an identifier is given in advance to each element. The predetermined evaluation function is a function that receives vector information, matrix information, or tensor information of a predetermined component aspect in which an identifier is given in advance to each element. The distance calculation unit reconfigures each element of the arbitrary vector information, matrix information, or tensor information so that its identifier matches the identifier of the corresponding element of the predetermined component aspect, and inputs the result to the predetermined evaluation function to calculate the statistical distance.
- each element of the arbitrary vector information, matrix information, or tensor information is reconfigured so that its identifier matches the identifier of the corresponding element of the predetermined component aspect of the vector information, matrix information, or tensor information accepted by the predetermined evaluation function, and the result is input to the predetermined evaluation function. For this reason, it is possible to provide an information classification device and an information classification system that can mutually evaluate sample information with different component aspects, an information providing service using the information classification system, a computer-readable recording medium for recording a post-classification population classified by the information classification system, and a database that stores the arbitrary number of populations for searching for the population to which the classification target sample information belongs using the information classification system.
- an evaluation function or a sample may be configured using the feature amount, name, or identifier in an arbitrary field for these elements, and the attribution state of the sample to the population may be evaluated.
- These evaluation functions may be configured or reconfigured.
- the classification target sample information is arbitrary vector information, matrix information, or tensor information in which an identifier is given in advance to each element. The predetermined evaluation function is a function that receives vector information, matrix information, or tensor information of a predetermined component aspect in which an identifier is given in advance to each element. In the step of calculating the statistical distance, each element of the arbitrary vector information, matrix information, or tensor information is reconfigured so that its identifier matches the identifier of the corresponding element of the predetermined component aspect, and the result is input to the predetermined evaluation function to calculate the statistical distance.
- each element of the arbitrary vector information, matrix information, or tensor information is reconfigured so that its identifier matches the identifier of the corresponding element of the predetermined component aspect of the vector information, matrix information, or tensor information accepted by the predetermined evaluation function, and the result is input to the predetermined evaluation function. For this reason, an information classification method and an information classification program capable of mutually evaluating sample information with different component aspects can be provided.
- identifiers are given to the elements of feature vectors, matrices, and/or tensors; elements with matching identifiers are arranged as evaluation feature quantities and given to the evaluation function, or distance evaluation between vectors, matrices, and/or tensors is performed.
- the distance calculation unit has a function of making the apparent number of elements and the element identifiers the same by reordering the element items of the vector, matrix, and/or tensor, substituting the element average value or 0 for missing elements, or deleting excess elements.
- vectors, matrices, and/or tensors with different elements can thus be evaluated against one another, or against an evaluation function based on the distance from the population centroid, the mean, and the standard deviation, which expands the range of application of evaluation functions for vectors, matrices, and/or tensors.
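- the reordering, filling, and deletion of elements by identifier described above can be sketched as follows for the vector case. The function name `align_to_identifiers` and the dictionary representation of a sample are hypothetical. This sketch also assumes that the "element average value" means the mean of the elements present in the same sample; the average of that element over the population would be another reasonable reading.

```python
def align_to_identifiers(sample, expected_ids, fill="mean"):
    """Reorder a {identifier: value} sample to match expected_ids.

    Missing elements are filled with the mean of the expected elements that
    are present (fill="mean") or with 0 (fill="zero"). Elements whose
    identifier is not expected by the evaluation function are dropped.
    """
    present = [sample[i] for i in expected_ids if i in sample]
    if fill == "mean" and present:
        default = sum(present) / len(present)
    else:
        default = 0.0
    # The output order follows expected_ids, so position j always carries
    # the element whose identifier is expected_ids[j].
    return [sample.get(i, default) for i in expected_ids]
```

- for example, a sample with identifiers `height`, `age`, and `extra` passed with `expected_ids = ["age", "height", "weight"]` comes back in the order age, height, weight, with `extra` removed and `weight` filled in.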
- FIG. 1 is a diagram showing an outline of the configuration of an information classification device according to the present embodiment.
- FIG. 2 is a flowchart showing the flow of information classification processing executed by the information classification device according to the present embodiment.
- FIG. 3 is a diagram showing an example of a population in information classification processing by supervised learning according to the present embodiment.
- FIG. 4 is a graph showing an experimental result of an information classification experiment by supervised learning according to the present embodiment.
- FIG. 5 is a graph showing experimental results of a control experiment of an information classification experiment.
- FIG. 6 is a diagram showing an outline of an information classification system according to a modification of the present embodiment.
- FIG. 7 is a diagram showing an example of a normal distribution.
- FIG. 8 is a diagram showing an example of a non-normal distribution.
- FIG. 9 is a graph showing experimental results of an information classification experiment according to the present embodiment in unsupervised learning with more samples.
- 100 information classification device, 100A, 100B information processing device, 110 processing unit, 120 storage unit, 130 input unit, 140 output unit, 200A to 200C information terminal, 500 network.
- FIG. 1 is a diagram showing an outline of the configuration of the information classification device 100 according to the present embodiment.
- information classification apparatus 100 is configured by a computer such as a PC (Personal Computer), and includes a processing unit 110, a storage unit 120, an input unit 130, and an output unit 140.
- the processing unit 110, the storage unit 120, the input unit 130, and the output unit 140 are connected via a bus and exchange necessary data via the bus.
- the information classification device 100 is not limited to a general-purpose device such as a PC, and may be configured as a dedicated device.
- the processing unit 110 includes an arithmetic circuit such as a CPU (Central Processing Unit), an MPU (Micro Processing Unit), a DSP (Digital Signal Processor), and its peripheral circuits.
- the storage unit 120 includes a storage circuit such as a ROM (Read Only Memory), a RAM (Random Access Memory), and a hard disk drive.
- the storage unit 120 stores a program executed by the information classification device 100 or is used as a work area when the program is executed.
- the input unit 130 is composed of an input device such as a keyboard and a mouse, an imaging device such as a camera, a sound collecting device such as a microphone, and the like.
- the input unit 130 delivers data input from the input device, the imaging device, and the sound collection device to the processing unit 110.
- the output unit 140 includes a display device such as a display, an acoustic device such as a speaker, and the like.
- the output unit 140 outputs the data received from the processing unit 110.
- processing unit 110 executes a predetermined process using storage unit 120 as a work area. Further, the processing unit 110 receives predetermined data from the input unit 130 according to the processing. Further, the processing unit 110 delivers predetermined data to the output unit 140 according to the processing.
- FIG. 2 is a flowchart showing the flow of information classification processing executed by the information classification device 100 according to the present embodiment.
- in step S11, processing unit 110 constructs a distance function from the sample information belonging to each population to be classified, which is stored in storage unit 120.
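- here, the sample information belonging to population A is denoted $a_1, a_2, \ldots, a_n$ and the sample information belonging to population B is denoted $b_1, b_2, \ldots, b_m$; each $a_n$ and $b_m$ can be a multidimensional vector, matrix, or tensor.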
- the processing unit 110 obtains variables for multivariate analysis of the respective populations A and B such as eigenvalues, eigenvectors, average values, and standard deviation values from these sample information groups.
- the processing unit 110 constructs, based on the variables for covariance structure analysis obtained here, distance functions that obtain the Mahalanobis distance between each of populations A and B and each piece of sample information, taking the sample information $a_n$ and $b_m$ as arguments.
- the argument is a vector. The accuracy of the calculation result can be set through evaluation variables, including the number of evaluation dimensions, that are internal variables of the distance evaluation function, and any accuracy may be specified.
- i indicates an identification value of a plurality of populations.
- k represents the identification value of the sample.
- [0111] shows the distance between sample k and the center of gravity of population i.
- μ represents the average vector obtained from the sample information.
- [0113] indicates a sample information vector.
- V in Equations 5 and 9 indicates the covariance matrix of population i.
- I represents the eigenvector of the covariance matrix of the population.
- λ represents the eigenvalue of the covariance matrix of the population.
- a constant $\log |V|$ based on the eigenvalues and the prior probability $\log P(c)$ can be added to the Mahalanobis distance to construct a multidimensional distance calculation function using a Bayes discriminant function.
- a value corresponding to the distance from the population center of gravity can be derived in the form of [0124].
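- for reference, the textbook forms of the two distances named above can be written as follows. All symbols here ($x_k$, $\mu_i$, $V_i$, $\lambda_{ij}$, $\phi_{ij}$, $P(c_i)$, $d$) are assumptions chosen for this note and are not necessarily the notation or the exact form of the original equations.

```latex
\begin{align}
  D_i^2(x_k) &= (x_k - \mu_i)^{\top} V_i^{-1} (x_k - \mu_i)
              = \sum_{j=1}^{d} \frac{\bigl(\phi_{ij}^{\top}(x_k - \mu_i)\bigr)^2}{\lambda_{ij}}, \\
  g_i(x_k)   &= D_i^2(x_k) + \log |V_i| - 2 \log P(c_i),
  \qquad \log |V_i| = \sum_{j=1}^{d} \log \lambda_{ij}.
\end{align}
```

- the first line is the squared Mahalanobis distance of sample $x_k$ from the centroid $\mu_i$ of population $i$, expanded in the eigenvalues $\lambda_{ij}$ and eigenvectors $\phi_{ij}$ of the covariance matrix $V_i$. The second line is the usual Gaussian Bayes discriminant, in which a smaller $g_i$ indicates a closer population.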
- in step S12, the processing unit 110 evaluates the sample information with the distance function constructed in step S11 and calculates the evaluation distance.
- in step S13, the processing unit 110 derives the average value, variance value, and standard deviation value of the evaluation distance calculated in step S12.
- in step S14, processing unit 110 normalizes the evaluation distance for each population calculated in step S12.
- specifically, the evaluation distance group $D_a$, obtained by inputting the sample information group $a_1, a_2, \ldots, a_n, b_1, b_2, \ldots, b_m$ to the distance function of population A, is normalized with its average value $\mu_{D_a}$ and standard deviation value $\sigma_{D_a}$. Likewise, the evaluation distance group $D_b$, obtained by inputting the same sample information group to the distance function of population B, is normalized with $\mu_{D_b}$ and $\sigma_{D_b}$.
- as a result, the distance deviation values $V_{a1}, V_{a2}, \ldots$ and $V_{b1}, V_{b2}, \ldots$ of each sample are obtained, calculated relative to the average distance from the center of gravity of each population under the constructed evaluation function.
- $V_{ak} = \sigma_{D_a}^{-1} (D_{ak} - \mu_{D_a})$, $V_{bk} = \sigma_{D_b}^{-1} (D_{bk} - \mu_{D_b})$
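- the normalization of step S14 amounts to a standard score of each evaluation distance within its population. The following minimal sketch assumes the population standard deviation (`statistics.pstdev`); the source does not say whether the population or the sample standard deviation is intended, and the function name `distance_deviation_values` is hypothetical.

```python
from statistics import mean, pstdev


def distance_deviation_values(distances):
    """Normalize evaluation distances: V_k = (D_k - mean(D)) / std(D)."""
    mu = mean(distances)
    sigma = pstdev(distances)
    if sigma == 0:
        # All distances are identical, so no sample deviates from the mean.
        return [0.0 for _ in distances]
    return [(d - mu) / sigma for d in distances]
```

- the resulting values are expressed in units of the standard deviation, so a criterion such as "less than 3 standard deviations" becomes a comparison of each value with 3.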
- in step S21, the processing unit 110 evaluates the distance deviation value of the sample with respect to the first population calculated in step S14.
- in step S22, the processing unit 110 determines whether or not the distance deviation value is within a specified range.
- if the distance deviation value is within the specified range (YES in step S22), in step S23, processing unit 110 assigns the sample to the population and advances the process to step S24. Otherwise (NO in step S22), the process proceeds to step S24 without assigning the sample.
- in step S24, processing unit 110 determines whether or not there is a next population. When there is a next population (YES in step S24), in step S25, the processing unit 110 evaluates the sample distance deviation value with respect to the next population, and returns the process to step S22. On the other hand, if there is no next population (NO in step S24), the process proceeds to step S26.
- in step S26, the processing unit 110 determines whether or not the sample belongs to any existing population. If it is not attributed to any population (NO in step S26), in step S27, the processing unit 110 generates a new population, assigns the sample to that population, and advances the process to step S28. On the other hand, if it belongs to one of the populations (YES in step S26), the process proceeds to step S28.
- the specified range is, for example, that the distance deviation value V of the distance function F(a) is less than 3σ.
- in step S28, the processing unit 110 determines whether there is a next sample. If there is a next sample (YES in step S28), processing unit 110 returns the process to step S21. On the other hand, when there is no next sample (NO in step S28), processing unit 110 advances the process to step S31.
- steps S21 to S27 are executed for sample information $a_1$ to $a_n$.
- steps S21 to S27 are likewise executed for sample information $b_1$ to $b_m$.
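- the loop of steps S21 to S28 can be sketched as follows. The function name `assign_samples` and the input layout are hypothetical. The sketch assumes that the distance deviation values have already been normalized, that the specified range is the one-sided test "less than `limit`", and that each sample fitting no population starts a new population of its own; the source leaves these details open.

```python
def assign_samples(deviation_values, limit=3.0):
    """Assign samples to populations from their distance deviation values.

    deviation_values[k][i] is the distance deviation value of sample k with
    respect to existing population i. Returns one list of sample indices per
    population; newly generated populations are appended at the end.
    """
    n_existing = len(deviation_values[0]) if deviation_values else 0
    populations = [[] for _ in range(n_existing)]
    for k, row in enumerate(deviation_values):   # S28: loop over samples
        assigned = False
        for i, v in enumerate(row):              # S24/S25: loop over populations
            if v < limit:                        # S22: within the specified range?
                populations[i].append(k)         # S23: assign to this population
                assigned = True
        if not assigned:                         # S26: attributed to none
            populations.append([k])              # S27: generate a new population
    return populations
```

- as in the flowchart, the inner loop visits every population, so a sample within range of several populations is assigned to each of them in this sketch. Choosing a single population among them is a separate policy.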
- samples that belong to both populations A and B may be attributed to the population with the smaller distance standard deviation value.
- the criterion of 3σ used here as an index is a value within which 99.7% of the sample information can be expected to be included, in terms of the attribution probability or appearance probability derived from the statistical probability density function; any multiple may be specified based on the specifications, design concept, and purpose of the device.
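- the 99.7% figure is the coverage of a normal distribution within three standard deviations of its mean. The following short sketch shows how the coverage for any multiple can be computed with the error function; the function name `coverage_within` is hypothetical, and the result holds only to the extent that the evaluation distances are approximately normally distributed.

```python
import math


def coverage_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))
```

- evaluating this function for multiples of 1, 2, and 3 gives roughly 68.3%, 95.4%, and 99.7%, which is one way to pick a multiple that matches a desired inclusion rate.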
- an arbitrary σ value may be used as the evaluation criterion. The population of attribution may also be selected by combining criteria, such as whether the sample ranks within an arbitrary rank among the samples within 3σ, whether the population is the one whose center of gravity is closest to the sample, or, when the evaluation distance is a negative value, regarding the sample as belonging to the population on the grounds that its probability value is higher than that at the average.
- the standard deviation for the distance from the center of gravity of the sample group may be obtained by using the average as in Equation 22 or 23, and may be used as the boundary reference in the above method.
- since the distance from the population centroid is used as the evaluation criterion, the distance that serves as the attribution boundary may be determined using the appearance probability or attribution probability given by the statistical probability density function, according to the distance average $\mu_D$ obtained from the constant based on the eigenvalues and the standard deviation about that average.
- the reclassification condition may be specified by any combination of conditions involving a plurality of populations. For example, when a sample is sufficiently close to the centers of gravity of multiple populations, it may be assigned to the closer population or to the population with the smaller standard deviation value; when both values are small, a new population may be formed or the sample may be assigned to both populations; and the classification method may be changed according to whether the deviation value is positive or negative. The device may be configured so that any of these choices can be made.
- In step S31, the processing unit 110 determines whether there is a population whose number of pieces of sample information is less than a predetermined number, for example, less than 200. If there is such a population (YES in step S31), the processing unit 110 causes the sample information belonging to that population to belong to another population in step S32. In other words, the population is deleted. Thereafter, the processing unit 110 advances the process to step S33. On the other hand, when there is no such population (NO in step S31), the processing unit 110 advances the process directly to step S33.
- The sample information belonging to the population to be deleted is attributed to the population for which its distance standard deviation value is smallest.
- Alternatively, the sample information belonging to the deleted population may be left unattributed to any population and used in step S33 only as sample information for obtaining the distances from the distance functions and a temporary belonging population.
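Steps S31 and S32 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the names `centroid`, `deviation`, and `prune_small_populations` are hypothetical, the Euclidean distance stands in for the evaluation distance, and the "distance standard deviation value" is interpreted as the signed deviation of a sample's centroid distance in units of the receiving population's distance standard deviation.

```python
import math
from statistics import mean, pstdev

def centroid(samples):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(col) for col in zip(*samples)]

def deviation(sample, members):
    """Signed deviation of the sample's centroid distance, in units of the
    standard deviation of the members' own centroid distances (assumption)."""
    c = centroid(members)
    dists = [math.dist(m, c) for m in members]
    sd = pstdev(dists) or 1e-12  # guard against a zero spread
    return (math.dist(sample, c) - mean(dists)) / sd

def prune_small_populations(populations, min_size=200):
    """Delete populations smaller than min_size (step S31) and move their
    samples to the surviving population with the smallest deviation (step S32)."""
    kept = {name: list(m) for name, m in populations.items() if len(m) >= min_size}
    if not kept:  # nothing large enough to receive samples: leave unchanged
        return {name: list(m) for name, m in populations.items()}
    for name, members in populations.items():
        if name in kept:
            continue
        for sample in members:
            # Statistics are taken from the original members, so the result
            # does not depend on the order in which samples are moved.
            target = min(kept, key=lambda k: deviation(sample, populations[k]))
            kept[target].append(sample)
    return kept
```

The default of 200 mirrors the example threshold given for step S31; the input dictionary is not modified.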
- In step S33, the processing unit 110 calculates the distance function for the reclassified populations. The degree of coincidence is then evaluated by recognition using a discriminant function, to check whether the classification has been made accurately. In step S34, the processing unit 110 determines whether or not the degree of coincidence satisfies the termination condition.
- If the end condition is not satisfied (NO in step S34), the processing unit 110 returns the process to step S12 and recursively executes the processes from step S12 to step S32. On the other hand, when the end condition is satisfied (YES in step S34), the processing unit 110 terminates this information classification process.
- FIG. 3 is a diagram showing an example of a population in the information classification process by supervised learning according to the present embodiment.
- FIG. 3 (A) is a diagram showing the classification of the populations before the information classification process. Referring to FIG. 3 (A), sample information a and b is plotted on a scatter plot. Sample information a is indicated by “ ⁇ ”,
- and sample information b is indicated by “ ⁇ ”.
- A person judges the contents of the sample information and classifies each item as a or b.
- The set of sample information a is population A,
- and the set of sample information b is population B.
- The centroids of population A and population B before classification are each indicated by “ ⁇ ”.
- The 3σ boundary of population A before classification is shown by a one-dot chain line.
- The 3σ boundary of population B before classification is indicated by a two-dot chain line.
- FIG. 3 (B) is a diagram showing the classification of the population after the information classification process.
- the new classification boundary of the population after processing is shown by a broken line.
- the centroid of newly generated population C is also indicated by “ ⁇ ”, as is the centroid of population A and population B.
- In this way, the information classification process uses statistical information on the distances obtained from multiple distance functions, so that information distant from a population's center of gravity can form its own population or change its attribution to a population with a closer center of gravity. Even for information that is prone to variation, the distribution can be brought close to a normal distribution, and autonomously stable populations can be formed.
- The number of dimensions per sample is 192,
- the number of representative initial populations specified by a human at the start is eight,
- and the number of data samples is about 250,000, covering 28 types of uttered phonemes.
- The 28 types of phonemes are classified into 8 types of populations, based on specific subjective human speech conditions, by the information classification process described above.
- For the nearest distance obtained after the evaluation, if the nearest population matches the label population and the distance from the center of gravity of the population composed of matched samples is less than 3σ, the sample remains attributed to the population it belonged to before the evaluation.
- If the distance of the sample is within 3σ of the distance average value of another population, the sample is attributed to that matched population; if it is more than 3σ above the mean, a new population is created.
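The attribution rule used in the experiment can be sketched as a small decision function. This is only an illustration under assumptions: the function name `reassign`, the sentinel `NEW_POPULATION`, and the one-sided test (deviation below +3σ) are choices made here, and the per-population mean and standard deviation of the evaluation distance are taken as given.

```python
NEW_POPULATION = "NEW"  # hypothetical sentinel meaning "create a new population"

def reassign(current, distances, stats, k=3.0):
    """Decide the population of one sample.

    current   -- name of the population the sample belongs to now
    distances -- dict: population name -> evaluation distance of the sample
    stats     -- dict: population name -> (mean distance, std of distance)
    k         -- boundary in units of the standard deviation (3 sigma here)
    """
    def dev(name):
        mu, sd = stats[name]
        return (distances[name] - mu) / sd

    # 1. Stay in the current population while inside its k-sigma boundary.
    if dev(current) < k:
        return current
    # 2. Otherwise move to another population whose boundary contains the sample,
    #    preferring the one with the smallest deviation.
    candidates = [n for n in distances if n != current and dev(n) < k]
    if candidates:
        return min(candidates, key=dev)
    # 3. Outside every boundary: a new population should be created.
    return NEW_POPULATION
```

For example, with `stats = {"A": (10.0, 2.0), "B": (8.0, 1.0)}`, a sample at distances `{"A": 20.0, "B": 9.0}` that currently belongs to A lies 5σ from A's mean distance and 1σ from B's, so it is moved to B.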
- FIG. 4 is a graph showing experimental results of an information classification experiment by supervised learning according to the present embodiment.
- FIG. 5 is a graph showing the experimental results of the control experiment of the information classification experiment.
- the vertical axis shows the number of populations and the matching rate.
- the horizontal axis indicates the number of repetitions of the information classification process.
- FIG. 9 is a graph showing an experimental result of an information classification experiment according to the present embodiment in unsupervised learning with a larger number of samples.
- As explained for steps S11 and S12, the information classification device 100 calculates the evaluation distance between the sample information to be classified and the centroid of each of the multiple populations to which sample information belongs.
- The information classification device 100 calculates, for each population, statistical information such as the mean, variance, and standard deviation of the evaluation distances calculated in step S12.
- Based on the evaluation distance calculated in step S12 and the statistical information calculated in step S13, the information classification device 100 evaluates the evaluation distance of the sample information with respect to each population, and thereby evaluates the degree of attribution of the sample information to be classified to that population.
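The distance and statistics calculations summarized above can be sketched in pure Python. The sketch assumes that the evaluation distance is the squared Mahalanobis distance and that the covariance matrix is normalized by the sample count; the helper names `covariance`, `solve`, `mahalanobis_sq`, and `distance_stats` are hypothetical and do not appear in the embodiment.

```python
from statistics import mean, pstdev

def covariance(samples):
    """Return (centroid, covariance matrix) with 1/N normalization."""
    n, dim = len(samples), len(samples[0])
    mu = [mean(col) for col in zip(*samples)]
    cov = [[sum((s[i] - mu[i]) * (s[j] - mu[j]) for s in samples) / n
            for j in range(dim)] for i in range(dim)]
    return mu, cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance (x - mu)^T cov^-1 (x - mu)."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))

def distance_stats(samples):
    """Mean and standard deviation of the members' own distances (step S13)."""
    mu, cov = covariance(samples)
    d2 = [mahalanobis_sq(s, mu, cov) for s in samples]
    return mean(d2), pstdev(d2)
```

The covariance matrix must be non-singular for `solve` to work; a real system would need a fallback for degenerate populations.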
- The information classification device 100 determines which population the sample information to be classified should belong to, according to the degree of attribution evaluated in step S21 or step S25.
- The information classification device 100 assigns the sample information to be classified to the determined population.
- In this way, the information classification device 100 causes the sample information to be classified to belong to one of the populations according to its degree of attribution to each population. As a result, autonomous and stable classification of sample information into populations can be achieved.
- The information classification device 100 further calculates the evaluation distance between the sample information to be classified and the center of gravity of each updated population to which sample information has been attributed.
- Based on the calculated evaluation distance, the sample information to be classified is again attributed to one of the populations according to its degree of attribution.
- The information classification apparatus 100 can thus classify sample information into populations recursively.
- When the degree of attribution to every population is outside the predetermined range, that is, when the deviation from the average value of the evaluation distance is outside the 3σ range for every population, the information classification device 100 creates a new population and assigns the sample information to be classified to the created population.
- As a result, the sample information belonging to a population is sample information whose degree of attribution is within the predetermined range. That is, the information classification device 100 can classify sample information so that its deviation from the average value of the evaluation distance is within 3σ with respect to the population.
- The information classification apparatus 100 can classify the sample information within a predetermined degree with respect to the population.
- Consequently, the sample information can be classified so that each population approaches a normal distribution in which a certain percentage of its sample information lies within 3σ of the average value.
- The information classification device 100 deletes any population to which fewer than a predetermined number of pieces of sample information are attributed, and attributes the sample information belonging to the deleted population to other populations. This eliminates invalid populations.
- the sample information of the classification target may be attributed to the population with the highest degree of attribution evaluated in step S21 or step S25.
- the information classification apparatus 100 assigns the sample information to be classified to the population with the highest degree of attribution evaluated. As a result, it is possible to optimally classify sample information into a population.
- In step S14 of FIG. 2, the information classification device 100 normalizes the evaluation distance calculated in step S12.
- This makes the evaluation distance easier for the information classification device 100 to handle.
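The text does not state which normalization is applied in step S14. As one possible reading, the sketch below standardizes the distances of a population to zero mean and unit standard deviation, which lets the 3σ boundary be applied as a plain threshold of 3. The function name `normalize_distances` is hypothetical.

```python
from statistics import mean, pstdev

def normalize_distances(distances):
    """Standardize distances to zero mean and unit standard deviation
    (an assumed form of the normalization in step S14)."""
    mu = mean(distances)
    sd = pstdev(distances)
    if sd == 0:
        # All distances are identical, so no sample deviates from the mean.
        return [0.0 for _ in distances]
    return [(d - mu) / sd for d in distances]
```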
- The information classification device 100 uses the processing unit 110 to calculate the mean and variance of the sample information classified by population in the storage unit 120, forms a covariance matrix, and stores it in the storage unit 120.
- Eigenvalues and eigenvectors are obtained from the covariance matrix and stored in the storage unit 120 as an evaluation function, together with the classification of the population to which the samples belong.
- The processing unit 110 calculates the distances of all samples and classifies them according to the results. If necessary, a new population is assigned and stored in the storage unit 120.
- Processing for obtaining the mean, variance, and the like is performed again by the processing unit 110 according to the new classification, and is repeated until the number of populations stabilizes.
- The 3σ range used in this experiment is a range that includes about 99.7% of the population; in terms of statistical prediction, classification with good values can also be expected around 2σ, which is the test boundary of 98%.
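The coverage figures for a normal distribution can be checked with the error function. The sketch below is only a numerical aid; the function names are hypothetical. Within ±3σ the coverage is about 99.73%. For 2σ the two-sided coverage is about 95.45%, while the one-sided coverage (everything below +2σ) is about 97.72%, which is the value closest to the 98% quoted above.

```python
import math

def coverage_two_sided(k):
    """Probability that a normal variable lies within +/- k sigma of its mean."""
    return math.erf(k / math.sqrt(2.0))

def coverage_one_sided(k):
    """Probability that a normal variable lies below mean + k sigma."""
    return 0.5 * (1.0 + math.erf(k / math.sqrt(2.0)))

if __name__ == "__main__":
    for k in (2, 3):
        print(k, round(coverage_two_sided(k), 4), round(coverage_one_sided(k), 4))
```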
- The average Mahalanobis distance between the center of gravity and the sample information in a population equals the number of dimensions of the sample information. Given this characteristic, samples whose distance from the center of the population equals the number of evaluation dimensions lie within 0.68σ.
- The distance corresponding to 3σ is about 4.5 times the number of sample dimensions; if the Mahalanobis distance is smaller than this value, the sample can be expected to belong to the original population with a probability of 99.7%.
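The statement that the average distance equals the number of dimensions can be made explicit for the squared Mahalanobis distance. The notation here is introduced only for illustration: x is a sample vector with mean μ and covariance matrix Σ in n dimensions, and D² is its squared Mahalanobis distance from the center of gravity. Assuming Σ is the true (non-singular) covariance of x:

```latex
\mathbb{E}\left[D^2\right]
  = \mathbb{E}\left[(x-\mu)^{\mathsf T}\,\Sigma^{-1}\,(x-\mu)\right]
  = \operatorname{tr}\!\left(\Sigma^{-1}\,\mathbb{E}\left[(x-\mu)(x-\mu)^{\mathsf T}\right]\right)
  = \operatorname{tr}\!\left(\Sigma^{-1}\Sigma\right)
  = \operatorname{tr}(I_n)
  = n
```

The same identity holds exactly for a finite sample group when μ and Σ are the sample mean and the sample covariance normalized by the sample count.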
- The minimum σ value in the negative direction as viewed from the average position (based on the average distance from the population centroid), that is, the value for the sample closest to the centroid, can be used as the upper limit when evaluating the σ value in the positive direction from the average position.
- For example, +4 standard deviations from the distance average may be regarded as the upper limit.
- Sample information on only one side of the average (smaller or larger) within a specified range centered on the average may be used to form a new population, or a new population may be formed by specifying an asymmetric range.
- Divided populations may be fused by assigning sample information to a closer population that lies within an arbitrary boundary, and processing may thus be performed to reduce the number of populations.
- It is well known that the Mahalanobis distance, when used as an exponent, can serve as a probability based not only on a simple n-dimensional space but also on time-series statistics.
- The statistical distribution based on the mean and standard deviation is measured using, as the distance, either the distance in this method or the exponent obtained when the probability value is regarded as a power of the base of the natural logarithm.
- In principle, the appearance probability or attribution probability of each sample with respect to the population to which it belongs, based on the probability density function, should always be 1. In practice this does not necessarily hold, because of variance and changes in the environment arising from human interpretation, and the present method can be used as a countermeasure.
- Distance evaluation may be performed on values that combine arbitrary input/output variables for part or all of the input layer, intermediate layer, and output layer. In the case of a non-hierarchical model, the input values to the nodes and the output values of the firing nodes may be used, or such non-hierarchical models may be combined hierarchically, three-dimensionally, or in higher dimensions, with values based on the output evaluation results used as the distance.
- the present invention is classified into hierarchical Bayes, empirical Bayes, variational Bayes, naive Bayes method, extended Bayes method, integrated Bayes method, large-scale Bayes method, simplified Bayes method, Markov chain Monte Carlo.
- The present invention exploits the sphere-concentration phenomenon conventionally referred to as the "curse of dimensionality": it obtains the average distance of the samples, which lies near a spherical surface, and the standard deviation of that distance. Within that range, attribution to the population is determined according to whether the probability of belonging is high under a statistical probability density function, and set-theoretic attribution is thereby determined. The invention may also be regarded as a self-propagating neural network method that reconstructs the attribution function for the population.
- The present invention can also be regarded as an application of the empirical Bayes method or the hierarchical Bayes method. The mean, variance, and standard deviation of each sample's membership probability, appearance probability, or attribution probability with respect to the population, based on the probability density function, are obtained. When the deviation from the mean is 3 times the standard deviation (3σ) or more on the side closer to the population's center of gravity, the probability would exceed 1, which is impossible in probability theory. Even so, because the present invention performs distance evaluation based on the Mahalanobis distance, the eigenvalues, and the prior probability in a Bayes discriminant function, classification remains possible even when information overlaps extremely close to the center of gravity. It differs from simple probability evaluation in this respect and in that it is easy to convert. In this case, the degree of divergence from the population may be regarded as an evaluation of whether the sample is within the range given by the mean and standard deviation of the population, according to a probability density function based on the number of samples and other conditions.
- The distance may be calculated using only one of the eigenvalues and eigenvectors, or with the statistical characteristics arbitrarily changed by arithmetically modifying either value. The eigenvalues themselves, or the norm or maximum component of the eigenvectors, may also be used for distance calculation.
- The eigenvalues and eigenvectors may be derived by any method, such as the Jacobi method, the Lanczos method, the standard eigenvalue problem, an eigenvalue calculation method, the Householder method, the Arnoldi method, the QR compound method, the single QR method, the double QR method, the Gauss-Seidel method, or the Gauss-Jordan method.
- The multiple pieces of distance information obtained from multiple populations may be regarded as sample vector information, and the eigenvalues, eigenvectors, and eigenvector norms obtained again, so that second- and third-order matrices are obtained.
- Norms, Mahalanobis distances, eigenvalues, eigenvectors, averages, variances, and standard deviation values may be obtained using the eigenvalues and eigenvectors of multiple populations as sample vectors, and a structure like a Bayesian network can be created by implementing these operations recursively and hierarchically.
- The maximum eigenvalue and the corresponding eigenvector may be derived using the power method or the like. Using indices such as the mean, norm, and standard deviation value based on eigenvalues and eigenvectors obtained from past time-series information or other shape information, the Mahalanobis distance can be evaluated for the input information itself in recent time-series information or other shape information.
- The Mahalanobis distance can also be evaluated based on indices such as the mean, norm, and standard deviation value based on eigenvalues and eigenvectors obtained from recent time-series information and different shape information.
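The power method mentioned above can be sketched as follows for a symmetric matrix such as a covariance matrix. This is a minimal illustration, not the procedure of the embodiment: the function name `power_method`, the starting vector, the iteration limit, and the tolerance are all assumptions, and the sketch assumes the largest eigenvalue is strictly dominant and the starting vector is not orthogonal to its eigenvector.

```python
import math

def power_method(matrix, iterations=200, tol=1e-12):
    """Estimate the largest eigenvalue and its unit eigenvector of a
    symmetric matrix by repeated multiplication and normalization."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n  # arbitrary unit starting vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # v was mapped to zero; no dominant direction found
            return 0.0, v
        w = [x / norm for x in w]
        # Rayleigh quotient w^T A w of the normalized vector
        new_eigenvalue = sum(
            w[i] * sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)
        )
        if abs(new_eigenvalue - eigenvalue) < tol:
            return new_eigenvalue, w
        eigenvalue, v = new_eigenvalue, w
    return eigenvalue, v
```

Convergence slows as the two largest eigenvalues approach each other, which is one reason the text leaves the choice of eigenvalue method open.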
- Information may be classified based on the average distance or standard deviation value evaluated in this way, with a new population formed and samples assigned to it.
- This information need not be time-series or shape information. Dynamic or static variable information may be used, such as color information, sound information, character information, character symbol strings, phonetic symbol strings, ideographic symbol strings, phoneme symbol strings, semantic population symbols, names, shapes, spatial positions, spatial arrangements, symbol fragments such as phoneme symbols together with their evaluation variables, feature values, symbol values, and changes in these.
- eigenvalues and eigenvectors can be obtained recursively from the mean and variance of the eigenvalues and eigenvectors of each population, and the Mahalanobis distance between the populations can be derived.
- the distance between the populations near the orthogonal boundary can be calculated by each type of outer partitioning method.
- A plurality of classified populations that lie within an arbitrary specified range of one another may be divided, combined, or changed. For example, when the distance between the averages of multiple populations is within 2σ of each other's standard deviation value, the populations may be integrated.
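The integration condition can be sketched as a pairwise test. The sketch rests on assumptions: the function names are hypothetical, the Euclidean distance between centroids stands in for the distance between averages, and each population's σ is taken to be the root-mean-square distance of its members from its own centroid.

```python
import math
from statistics import mean

def centroid(samples):
    """Component-wise mean of a list of equal-length vectors."""
    return [mean(col) for col in zip(*samples)]

def rms_spread(samples, center):
    """Root-mean-square distance of the samples from the given center."""
    return math.sqrt(mean(math.dist(s, center) ** 2 for s in samples))

def should_merge(pop_a, pop_b, k=2.0):
    """True when the centroid gap is within k sigma of BOTH populations."""
    ca, cb = centroid(pop_a), centroid(pop_b)
    gap = math.dist(ca, cb)
    return gap <= k * rms_spread(pop_a, ca) and gap <= k * rms_spread(pop_b, cb)
```

Requiring the condition to hold for both populations avoids absorbing a tight population into a broad neighbour merely because the neighbour's spread is large.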
- The distance from the center of gravity of a specific population is evaluated by evaluating the distance from the center of gravity of each population, and if the distance is 3σ or more, the sample is attributed to the population to which it previously belonged.
- the method may be used when another population is constructed based on the previously established population.
- The specification of the variance range may be changed, and reclassification may be evaluated using as the reference either the center of gravity of the samples that should match or the center of gravity of only the samples that actually matched in the match evaluation.
- A local solution based on the likelihood distribution, appearance probability distribution, or distance distribution of the samples in a population may be regarded as a temporary center. The distance of each sample from the temporary center is obtained, and the population may be divided, combined, or changed by determining, from the mean, variance, and standard deviation of the obtained distances, whether the range is statistically significant.
- Any linear algebra method, such as Gram-Schmidt decomposition, Cholesky decomposition, singular value decomposition, eigenvalue analysis, determinants, norms, condition number estimation, and the solution of linear equations, may be used for distance calculation and attribution evaluation in this embodiment.
- Any multivariate analysis method, such as the correlation coefficient matrix, multiple regression analysis, principal component analysis, factor analysis, canonical correlation analysis, multidimensional scaling, discriminant analysis, classification trees, log-linear models, cluster analysis, dendrograms, and the shortest distance tree, may be used for distance calculation and attribution evaluation in this embodiment.
- Any method based on analysis of variance, such as one-way/two-way analysis of variance, the Tukey method, the Latin square method, factorial design, one-way/two-way robust analysis of variance, and multi-dimensional multi-way analysis, may be used for distance calculation and attribution evaluation in the present embodiment.
- Test methods include the Goodman-Kruskal and Kruskal-Wallis tests, one-sided test, χ² test, two-sided limit, test for the population mean of a normal distribution (population variance known), test for the population mean of a normal distribution (population variance unknown), t-test, test for the population variance of a normal distribution, test for independence, test for variance, test for the mean, run test, covariance matrix tests, multigroup discrimination effectiveness test, Wilks' lambda test, variable contribution test in multigroup discrimination, partial ⁇ statistic test, Adichie-Koul test, Ansari-Bradley test, Cohen's kappa, weighted kappa, Durbin test, Durbin-Watson test, eigenvalue test (Bartlett), Kolmogorov-Smirnov test, Lepage-type test, Lilliefors test, log-rank test, Fisher exact test, Friedman test, F-test, Hodges-Lehmann estimation,
- the population to which they belong may be evaluated and recognized or identified.
- an evaluation function having an arbitrary network structure may be configured by connecting a plurality of evaluation results in a network and using a normal distribution as a connection weight.
- In K-means, for example, whether a sample belongs to a cluster may be evaluated using 3σ. This method may be used to improve the performance of any clustering process.
- Eigenvectors and eigenvalue spaces may be optimized, or distance evaluation performed, using factor analysis, multivariate analysis, and cluster analysis methods such as the Kaplan-Meier method, varimax method, quartimax method, union-intersection method, quartimin method, biquartimax method, promax method, oblimax method, oblimin method, orthomax method, Ward method, equamax method, Kaiser-Dickman method, Gauss-Doolittle method, covarimin method, oblique rotation method, simultaneous general varimax method, centroid method, Studentized residual method, Beaton method, shortest distance method, longest distance method, group average method, median method, and variable method; the results may be used in the evaluation function.
- the present invention may be used for classification of variables and posture names for configuring a certain motion in motion learning based on association of information in a motion machine such as a robot.
- An information processing system or any drive system that implements a remote robot control service that analytically processes and reuses the robot's remote dance service operations, etc.
- Operation control systems and services based on feature learning of device operation and / or control methods including functions are conceivable, and work robots, organizing robots, transport robots, nursing robots, pet robots, help robots, dialogues using these Robots, housework robots, agricultural robots, etc. may be created.
- The balance between the energy obtained and the energy consumed by the robot's actions may be classified, for example, as "excess, moderate, equilibrium, attenuation, loss".
- Sensor input values for the surroundings, such as images and sounds, temperature, humidity, air components and odors, liquid components, taste, weight, acceleration, impact, and pressure, together with higher-order feature quantities based on multi-dimensional combinations of feature quantities and analysis values, such as secondary features based on the transition state of the feature quantities and tertiary features based on the transition state of the secondary feature quantities, may be collected and classified using the present invention.
- The above five classifications may transition continuously between classes, may be divided into finer classes to form an evaluation function, or may be expressed as positive or negative values of one or any number of variables.
- the procedure information may be configured by recording time-series changes in actions performed by the device itself.
- the behavior of the device may be controlled based on the procedure.
- A device used as a pointing device, such as a capacitance sensor pad, may be used to evaluate whether the user taps or strokes it: being hit is evaluated as bad, and being stroked is evaluated as good.
- Other schemes are conceivable, such as a positive evaluation when the user responds and a bad evaluation when the user does not respond, and the information may be classified using the method of the present invention.
- If, using the classification based on the present invention, slow consumption of energy is confirmed to continue over a long period exceeding an arbitrarily defined period, and there is no instruction from the user, the device may automatically shift to a standby or sleep mode, as on a personal computer, to avoid depleting energy, or may perform an unprocessed task requested in advance.
- objective information such as nouns and actions and actions associated with users used in human subjective evaluation analysis and psychoanalysis, user age and date of birth, and user personality and emotional disposition information
- Psychoanalysis services and fortune-telling services based on the association of the expected results and state designation information, personnel evaluation services that correlate work names and person names, work difficulty, and work achievement levels, and content analysis
- the label of each item and the information based on the feature quantity that is the variable are classified and the tendency is extracted.
- An information processing system that implements personal preference services tailored to popularity and user interests can be considered.
- indices such as natural information organisms, topography, geological name and position and size, color, weight, shape, composition, material, component, state.
- An information processing system that implements an environmental survey service based on an analysis based on association can be considered. For example, if an index is captured as a node when viewed as a network model, the distance from a certain index or a person to the index or information included in the meantime and / or temporal co-occurrence relationship or co-occurrence probability , Use the context and number of indicators as semantic states
- An information processing system that analyzes, constructs, and proposes natural conditions can be considered. You can arbitrarily change the way of grasping the relationship between nodes and links, as is often the case with network models such as HMM.
- Information such as the use of crime prevention devices by statistically classifying human behavior around buildings, and the tracking of frequent offenders using road imaging devices and alarm devices.
- An information processing system that implements a safety management service based on this association can be considered. For example, a building or product and a person are captured as nodes when viewed as a network model, and the distance between a certain building or product and a person is included between the location of the number of objects and people and the location of information.
- An information processing system that analyzes, constructs, and proposes ownership and usage situations using the temporal co-occurrence relationship, co-occurrence probability, and word context as semantic states can be considered. As is often the case with network models such as HMMs, these methods may be used to arbitrarily change the way nodes and links are understood.
- As an adaptive filter for filtering in a communication device, the invention can be used for network services such as implementing a firewall service, implementing a spam mail filter, and identifying and configuring network connection routes according to communication quality, such as radio wave strength and the number of connection retries in wireless communication.
- An information processing system that implements such networking is conceivable. For example, communication may be suppressed or filtered based on the result of evaluating feature quantities that indicate illegal access or spam, such as the sender's name, IP address, domain, a specific domain or IP space, or passage via a specific network route.
- nouns such as disease names, body parts, symptoms, and chemical substances associated with medical treatment are used as labels, and shape analysis and symptom analysis of affected areas in medical equipment, chemical analysis coefficients and variables, analysis values, and
- the processed value may be used to estimate the condition of the affected area using the feature value of the sample vector, or it may be used as a dialogue pattern variable to record information using the sample vector, and communication medical care based on association of information for counseling An information processing system that implements services can be considered.
- node and link when used as a network model for medical medical applications, there is a relationship between medical characteristics and diseases such as human DNA, body characteristics, blood pressure, body temperature pulse, and body fluid component values.
- weights as features that include the distance between a feature and a disease, and the number of illnesses as the number of network hops, a co-occurrence relationship and co-occurrence within a medical field such as a wider concept of information can be used.
- An information processing system that analyzes pathological forms and proposes improvements using the probability of occurrence as the semantic state of medical features can be considered.
- network models such as HMMs, these may be used to arbitrarily change the way the relationship between nodes and links is understood.
- node and link when used as a network model for surgical medical applications, the relationship between physical features and physical obstacle models, such as human body parts and physical features, and human-movable landforms and road shapes
- the distance between a certain feature and the physical space model is the number of network hops.
- a processing system is conceivable. You may arbitrarily change the way of grasping the relationship between nodes and links, as is often the case with network models such as HMM.
- Names related to specialized knowledge are used as labels, and the correlation between these labels is expressed as a distance in order to analyze the layered structure of abstract and concrete concepts.
- The classification according to the present embodiment is then performed using the resulting coefficients and variables as sample vectors.
- This method constructs a network structure by treating labels based on the names of knowledge, such as technical terms, persons, and places, as nodes, and treats the number of hops, that is, the number of nodes lying between pieces of information, as a distance.
- The distance is used as a feature quantity: the distance between pieces of information in the semantic space is obtained using route search techniques from communication protocols, and that distance is evaluated. A weight may also be assigned to each node as an attenuation applied when connecting to other nodes, and the distance may be evaluated by giving a continuous interpretation to the discrete hop count. Any such method may be used.
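The hop-count distance and its weighted variant can be sketched with standard graph searches. The sketch is illustrative only: the function names, the adjacency-dictionary representation, and the convention that a node's weight is the cost of entering it are assumptions, with breadth-first search and Dijkstra's algorithm standing in for the unspecified route search technique.

```python
import heapq
from collections import deque

def hop_distance(graph, start, goal):
    """Number of hops on the shortest path, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt == goal:
                return hops + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

def attenuated_distance(graph, weights, start, goal):
    """Shortest path cost where entering a node costs weights.get(node, 1.0),
    giving a continuous interpretation of the discrete hop count."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt in graph.get(node, ()):
            new_cost = cost + weights.get(nxt, 1.0)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt))
    return None
```

With all weights left at the default of 1.0, `attenuated_distance` returns the same value as `hop_distance`, so the weighted form is a direct generalization. The resulting distances could then be used as feature quantities in the classification described above.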
- By realizing the association of information through hierarchical storage that takes such a network structure into account, an information processing system can be considered that uses an associative expert system based on classification according to the present invention to implement information association services; educational services; information distribution services; simulation services that predict the effect of combining factors such as personnel, materials, chemicals, equipment, and distribution channels; weather forecasts, stock price and market forecasts, earthquake forecasts, economic forecasts, price forecasts, competition forecasts, and horse racing forecasts; and information summarization services for newspaper articles, magazines, and book articles.
- information that spans different regions in multiple languages is used as sample vectors for words that are spoken with shapes and words that are spoken with shapes for people who speak a specific language.
- An information processing system using mobile phones, PDAs, and communication base stations that implements travel guide and translation services, realizing similar services in different languages based on information association, can be considered.
- the interactive user interface uses the feature amount based on the utterance probability of a meaningful word as a sample vector in the present embodiment to reduce the utterance of the speaker.
- an information processing system that provides dialogue services based on the association of information that realizes ambiguous dialogue can be considered.
- the credit information and evaluation value sample vectors are used to classify by calculating the evaluation distance within the organization and between the organizations, and the trust distance between the evaluators is obtained to determine the dividend, performance evaluation, and ability evaluation hierarchy. Therefore, there may be an information processing system that evaluates and determines values such as monetary payment system, credit line setting, discount system, profit return method, voting method, adjustment method, product amount and dividend.
- information is collected based on the map and area.
- information such as voice characteristics, image characteristics, temperature characteristics, weather characteristics, and population density indicating the location and name are used as sample vectors.
- An information processing system that performs location-based information support by assigning these as labels and classifying them can be considered. When used as a network model, it captures densely populated areas such as towns and villages as links and, for weighting, uses the distance from one place to another as a feature, with the number of places in between as the number of network hops. It can be used with regional names, the number of cities, population, output, traffic volume, economic scale, their management numbers, time, and/or a wider range of information based on physical location.
- An information processing system using car navigation systems that analyze, construct, and make proposals for moving forms using the co-occurrence relations, co-occurrence probabilities, and location positional relations as semantic states can be considered.
- index information for distribution status management may be built to provide logistics information, and detour information, congestion information, and other information provision services may be offered to reduce congestion.
- eigenvalues and eigenscales are used to evaluate image information, motion information, and shape information of two-dimensional or three-dimensional objects based on coordinate information groups, and design infringement may be evaluated from the similarity given by the evaluation distance. It is also possible to evaluate the infringement status of intellectual property, such as copyright infringement. In this case, the state of obsolescence and the similarity may be quantified by evaluating the distance between the information of the subject population and the sample information to be evaluated, together with the publication of similar shapes that accompanies time-series change from the time the sample was published. Further, an information processing system for selecting arbitrary information, products, and services is conceivable. In addition, an information processing system that analyzes the relationship between music and words recalled based on music, classifies related information statistically, and selects arbitrary information, products, and services can be considered.
- an information processing system that analyzes the relationship between the tactile sensations and the words recalled based on the tactile sensation and statistically classifies the related information to select arbitrary information, products, and services can be considered.
- An information processing system that analyzes the relationship between taste and words recalled based on taste, statistically classifies related information, and selects arbitrary information, products, and services can be considered.
- An information processing system that analyzes the relationship between odors and words recalled based on odors, statistically classifies related information, and selects arbitrary information, products, and services can be considered.
- an information processing system that analyzes the relationship between the weather and words recalled based on the weather, statistically classifies the related information, and selects arbitrary information, products, and services can be considered.
- an information processing system that analyzes the relationship between videos and words recalled based on the videos, statistically classifies the related information, and selects arbitrary information, products, and services can be considered.
- information obtained from such sensory organs and words is associated with information of different series, such as words and smells related to taste and costumes, and words related to accessories, and an information processing system that provides any information, products, or services on that basis can be considered. When these are used as a network model, the words are captured as nodes and links, and for weighting the distance between certain words is used as a feature, with the number of words contained between them as the number of network hops.
- An information processing system that analyzes, constructs, and proposes semantic relationships using, as semantic states, information co-occurrence relationships and co-occurrence probabilities, the number of characters and words, management numbers, and word context based on temporal positional relationships can be considered. As is often the case with network models such as HMMs, the way nodes and links are captured may be changed arbitrarily.
- an information processing system that provides arbitrary information, products, and services based on sensibility-related words that are recalled in association with each matter can be considered.
- recalled words are classified into symbols and classification codes that are not words, for example, sensory codes that classify adjectives and adverbs such as character codes, sensory codes that classify sensibility, and emotions. Emotion codes, subjective codes that separate subjects, shape code numbers that classify visual shapes, etc., and the code is associated with any other information, or multiple pieces of arbitrary information such as features
- a co-occurrence matrix a method for defining co-occurrence distances based on the number of characters, words, management numbers, and temporal and positional relationships.
- An information processing system that records concept dictionaries and concept indexes built using them on a storage medium can be considered.
- the system can be configured.
- a sample information group of information necessary for the above-described information processing system example is generated using an arbitrary feature amount.
- These samples can be information such as voice, music, paintings, photographs, videos, chemical components that stimulate the senses of taste and smell, tactile sensations, length, weight, speed, and position. For text, they can be the appearance frequency and co-occurrence probability of single words or the character appearance frequency that characterizes the text. They may also be feature quantities configured by combining or processing any desired information, or the component ratios of these feature quantities.
- these pieces of information and feature quantities may be manually specified at the initial stage for any ID (Identification Data), label, or code for the classified population. They may also be classified in advance from the average and variance
- the sample information shows the relationship between the obtained label, ID, code, classification number, reference number, control number and the name used by humans, co-occurrence matrix, unigram, bigram, N-gram, composite type N
- arbitrary features such as path search and matching results based on applications such as CDP matching, DP matching, Viterbi search, N-best method, trellis method, etc. It efficiently constructs concept dictionaries and concept indexes classified and recorded by the present invention after being linked by an index processing method such as a branch tree or hash buffer.
- the information entered by a person is given an appropriate label, ID, code, classification number, reference number, or control number; information related to that label, ID, code, classification number, reference number, or control number is searched; and the target information, service, product, means, procedure, route, schedule, etc. are sent to the user.
- a database composed of recording media using information generated and classified according to the present invention as an index or evaluation meter
- the information input by the user is associated with any other information according to the criteria classified according to the present embodiment, and the relation is evaluated.
- these applications can realize services that take into account meaning, taste, background, and situation.
- for information that numerically expresses the coexistence state and change of information, such as the co-occurrence matrix, co-occurrence probability, and probability transition matrix described in this embodiment, items below a certain threshold may be deleted from the evaluation target; information at a certain distance from the average may be deleted from the evaluation target based on the standard deviation obtained from the variance of all probabilities; the evaluation dimensions may be reduced by a method such as Gaussian elimination; or evaluation items may be added under similar conditions.
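The first two pruning options above (a fixed threshold, and a cutoff based on the standard deviation) can be sketched as follows. This is a minimal sketch under assumptions: co-occurrence probabilities are held in a plain dictionary keyed by label pairs, and the function names and default cutoffs are hypothetical, not values taken from the embodiment.

```python
from statistics import mean, pstdev

def drop_below_threshold(probs, threshold=0.01):
    """Remove items whose co-occurrence probability is under the threshold."""
    return {pair: p for pair, p in probs.items() if p >= threshold}

def drop_far_from_mean(values, max_sigma=3.0):
    """Remove items farther than max_sigma standard deviations from the mean."""
    mu = mean(values.values())
    sigma = pstdev(values.values())
    if sigma == 0:
        return dict(values)  # all values identical; nothing to remove
    return {k: v for k, v in values.items() if abs(v - mu) <= max_sigma * sigma}

# Hypothetical co-occurrence probabilities between labels.
probs = {("a", "b"): 0.5, ("a", "c"): 0.004, ("b", "c"): 0.3}
print(drop_below_threshold(probs))

# Hypothetical counts with one item far from the average.
counts = {f"w{i}": 1 for i in range(9)}
counts["outlier"] = 10
print(sorted(drop_far_from_mean(counts, max_sigma=2.0)))
```

Dimension reduction by Gaussian elimination, the third option in the text, is not covered by this sketch.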
- the information classification device 100 may include an external storage device that records data on a recording medium.
- the storage medium is a recording medium, such as a flash memory, CD-ROM (Compact Disk Read Only Memory), hard disk, or floppy (registered trademark) disk, on which a program, script, or source code for executing this procedure on the information processing apparatus is recorded as information.
- the information classification device 100 is a transmission / reception device that serves as a communication means or a bus connection means regardless of wired / wireless, such as Ethernet (registered trademark), a modem for mobile phones, and a wireless LAN (Local Area Network).
- It has optical terminals and / or electrical and electromagnetic terminals for inputting / outputting arbitrary signals to / from these apparatuses which may have arbitrary output devices.
- information terminals and information processing devices such as personal computers and car navigation systems, backbone servers and communication base stations including the information classification device 100, mobile phones and watches, jewelry-shaped terminals, remote controllers, PDAs, IC cards, intelligent RFID,
- a portable terminal such as a body-embedded terminal may be used. Since the present invention is an algorithm implementation application, the present invention can be implemented on an arbitrary apparatus as long as it has an arithmetic circuit.
- the control device that controls the information may include the information classification device 100.
- the information classification apparatus 100 may be reduced to a portable size and used as an information terminal.
- the information classification device 100 is provided with a function for improving the convenience of society by mutually connecting and exchanging communication of a plurality of different users and, in some cases, charging with the communication. It may be a processing device.
- the information classification apparatus 100 has been described.
- the present invention is not limited to this, and the invention can be understood as an information classification method for causing a computer to execute the processing described in FIG. 2 or an information classification program for causing a computer to execute the processing described in FIG.
- FIG. 6 is a diagram showing an outline of an information classification system according to a modification of the present embodiment.
- the information classification system includes information processing apparatuses 100A and 100B and information terminals 200A to 200C.
- Information processing apparatuses 100A and 100B and information terminals 200A to 200C are connected to each other via a network 500 such as the Internet or a telephone line network.
- Information processing apparatuses 100A and 100B each have the same function as information classification apparatus 100 described above. In response to a request from any of the information terminals 200A to 200C, one of the information processing apparatuses 100A and 100B classifies the sample information to be classified into a plurality of populations and sends the classification result to the requesting information terminal.
- one of the information processing apparatuses 100A and 100B receives a plurality of populations from any of the information terminals 200A to 200C, reclassifies the sample information belonging to these populations, and transmits the reclassified populations to the requesting information terminal. As a result, it is possible to provide a population in which sample information is autonomously and stably classified.
- the information processing apparatuses 100A and 100B and the information terminals 200A to 200C as described above may be applied as an information providing system that provides an ASP (Application Service Provider) type service or as a database apparatus. They can also be used as a recording medium that stores the classification information according to the present invention and is incorporated in the database apparatus for providing a service, or as an information distribution device that uses classification based on the present invention over a communication line.
- any of the information processing apparatuses 100A and 100B may receive the sample information to be classified from any of the information terminals 200A to 200C, determine which of the populations stored in the storage unit of the information processing apparatus the sample information belongs to, and transmit information identifying the determined population to the requesting information terminal. As a result, information for identifying the population to which the sample information to be classified belongs can be given autonomously and stably. In addition, the requesting information terminal may be charged.
- information configured based on the present embodiment may be recorded on a recording medium and distributed as it is, distributed as a book attached, or distributed using a communication environment.
- recording media such as CD-ROM and DVD-ROM (Digital Versatile Disk Read Only Memory), printing media such as 2D barcodes, electronic media such as flash memory, telephone lines and ADSL (Asymmetric Digital Subscriber Line), or a recording medium stored remotely via a transmission medium such as an optical fiber.
- any of the information processing devices 100, 100A, and 100B of the present embodiment may further include a database that stores the classified populations.
- the present invention can then be understood as a database search system that searches for which population the sample information to be classified, received from the user or any of the terminal devices 200A to 200C, belongs to, and delivers the search result to the user or to any of the terminal devices 200A to 200C. Further, the present invention can be understood as a database construction apparatus for constructing such a database.
- any ID 'label that is generally used in the past is identical.
- the evaluation distance by the arbitrary distance evaluation method is similar to the classification method according to the present invention as an index for the arbitrary feature quantities associated with the arbitrary ID 'labels, 'By judging that it belongs to the category, it can be selected as a search result and presented to the user.
- classification evaluation that can be applied to a state in which arbitrary features and information are stochastically related
- a classification method may be realized that realizes functions and switches the combination of effective functions according to the situation to enable flexible response.
- the information classification device 100 in the present embodiment can be viewed as follows.
- the information classification device 100 has a distance calculation unit that calculates the distance D between each of the k samples a belonging to a certain population A and the population A.
- based on the distance D determined by the distance calculation unit, a degree-of-affiliation evaluation unit evaluates the degree to which each sample a belongs to the population by the appearance probability under a statistical normal distribution.
- if the difference between the distance of a sample a and the distance mean value exceeds, for example, 3σ, a value predicted to fall outside the range with a probability of 99.7% or higher under a normal distribution centered on the mean, the degree-of-affiliation evaluation unit attributes the sample to a closer population, such as another population B or population C, as its new population; otherwise it attributes the sample to population A as before. Classification is performed recursively so that the sample group belonging to population A forms a normal distribution.
- the distribution is symmetric, and the distance from the population is approximately 0.68σ when the distance from the center of gravity is found.
- the range includes more than 99% of the population.
- the boundaries are ambiguous in human-made populations. Therefore, it often happens that the distribution is asymmetric as described in FIG.
- the average position is indefinite depending on the sample condition, and there is no guarantee that more than 99% of the samples lie within 3σ of the distance average value from the center of gravity of the population.
- if there is a population whose 3σ range includes the sample, the sample is made to belong to that population; if the sample does not fall within 3σ of any population, a new population C is created. In this case, if a statistical problem arises because the number of elements in population C is smaller than the required number of evaluation dimensions, the new population need not necessarily be used for evaluation.
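The 3σ assignment rule above can be sketched as follows. This is an illustrative sketch under stated assumptions: it uses the Euclidean distance to the center of gravity rather than the evaluation distance of the embodiment, it treats the 3σ test as one-sided (only samples farther than the mean distance plus 3σ are rejected), and the names `distance_stats`, `closest_population`, and `assign_or_create` are hypothetical.

```python
import math
from statistics import mean, pstdev

def distance_stats(members):
    """Center of gravity plus mean and standard deviation of member distances."""
    center = tuple(mean(col) for col in zip(*members))
    dists = [math.dist(tuple(m), center) for m in members]
    return center, mean(dists), pstdev(dists)

def closest_population(sample, populations, k=3.0):
    """Name of the population whose k-sigma range includes the sample, or None."""
    best_name, best_z = None, float("inf")
    for name, members in populations.items():
        center, mu, sigma = distance_stats(members)
        if sigma == 0:
            continue  # too few or degenerate samples; skipped for evaluation
        z = (math.dist(tuple(sample), center) - mu) / sigma
        if z <= k and z < best_z:
            best_name, best_z = name, z
    return best_name

def assign_or_create(sample, populations, new_name, k=3.0):
    """Attach the sample to a fitting population, else start a new one."""
    name = closest_population(sample, populations, k) or new_name
    populations.setdefault(name, []).append(sample)
    return name

# Hypothetical two-dimensional populations.
populations = {
    "A": [(0, 0), (2, 0), (0, 2), (2, 2), (1, 1)],
    "B": [(10, 10), (12, 10), (10, 12), (12, 12), (11, 11)],
}
print(assign_or_create((1.5, 1.0), populations, "C"))
print(assign_or_create((50, 50), populations, "C"))
```

A newly created population with a single member has zero spread, so the sketch skips it during evaluation, loosely mirroring the remark that a population with too few elements need not be used.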
- the distance between each element and each population is normalized, and the vector normalization distance is used to divide, combine, and change the population based on the normalized distance.
- the information classification device that performs information classification as close to a normal distribution as possible can be configured.
- when the centers of gravity of populations are extremely close, for example when populations lie within a distance of 1σ of each other, the populations may be unified to prevent an inadvertent increase in the number of populations.
- when sufficient samples cannot be collected for statistical purposes, the sample or population to be evaluated may be deleted under conditions that should be considered statistically, such as when it exceeds 4σ in terms of the average and standard deviation obtained from a sample group that combines multiple populations lying around 5σ from the specific population.
- the present invention may be used as an index for evaluating information: as a classification indicator for meta-expression formats such as XML (Extensible Markup Language), SOA (Service Oriented Architecture), SML (Simple (or Stupid or Software) Markup Language), MCF (Meta Content Framework), DTD (Document Type Definition), GML (Geography Markup Language), SMIL (Synchronized Multimedia Integration Language), SGML (Standard Generalized Markup Language), and RDF (Resource Description Framework), or for services using SOAP (Simple Object Access Protocol), UDDI (Universal Description, Discovery, and Integration), WSDL (Web Services Description Language), SVG (Scalable Vector Graphics), HTML (HyperText Markup Language), and the like.
- is composed of {x1, x2, x3, x4, x5, x6, x7, x8} and is the input vector of the evaluation function
- the name or ID of the component as an identifier for this component is, for example, from the viewpoint of speech recognition, even if it is a label that itself has one meaning, such as a phoneme.
- a combination of a higher level concept and a lower level concept of an abstract label, such as a phoneme segment, may be used to efficiently represent any efficient representation as an identifier representing a phoneme transition state.
- This superordinate concept and subordinate concept can be used in any information space, such as video elements, products, academics, culture, movies, and music, so a structure that enables an application suited to each is possible.
- suppose there is a case where the label names or component IDs given as identifiers to the sample and to the evaluation function are not identical, or appear in a different order, as follows.
- the order of the sample vectors is matched to the label of the input vector of the evaluation function so that the order of the variables is the same, and the label relation of the data is the same. Assign the appropriate variables to the missing labels in the vector.
- the value to be assigned may be 0, or may be an average value of elements according to the sample group used when constructing the evaluation function.
- the label co-occurrence of data and the effect at the time of co-occurrence are separated on the basis of the evaluation result, and those that are positively correlated, those that are not correlated, and those that should not be correlated are separated based on the evaluation results. They may be combined in consideration of the relationship between labels, or may be constructed by evaluating the correlation between labels using the present invention.
- the evaluation order of the vector on the evaluation function side is sorted in order of the largest eigenvector, the labels and element values are sorted, the sample vector is sorted accordingly, and distance and similarity may be evaluated under the same criteria. Also, if the input vector has many 0s, extremely small values, or many values close to the average, the covariance matrix based on the mean and variance of the evaluation function construction sample is extremely small or a value close to the average.
- the components may be reconfigured. In this case, 0 is assigned to the vector element in the following example, but the value of this element may be the sample average of each element value in the population to which the vector on the side containing that element belongs.
- the change of the component due to the matching of the label or ID as an identifier may be used for multidimensional evaluation information such as matrix analysis or tensor analysis, not just vector analysis.
- eigenvalues and eigenvectors are obtained based on the vector structure with changed elements, and an arbitrary matrix, such as a covariance matrix, probability transition matrix, stationary transition matrix, state transition matrix, co-occurrence matrix, or co-occurrence transition probability matrix, can be created, or an arbitrary evaluation function can be reconstructed. [Table 3]
- the evaluation items are aligned and any dummy data is used for items that are blank because they have no elements, and the evaluation side and the evaluated side are added or deleted as appropriate.
- the evaluation distance in the present invention may be used as an element, and the sample may be re-evaluated with the distance evaluated by the function by associating it with the element label of the evaluation function or the element label of the sample, or the function may be re-evaluated; hierarchization of this kind is easy to conceive. Also, instead of reconstructing the evaluation function input vector as in this embodiment, similar effects can be obtained by reconstructing the order and items of the covariance matrix used in the evaluation function.
- for the distance evaluation, if there is an evaluation function X belonging to sample A and an evaluation function Y belonging to sample B, the distance of A is evaluated using evaluation function Y and the distance of B is evaluated using evaluation function X. In this case, when sample A and function Y are close, and sample B and function X are far, a method of re-learning by changing the information processing means and the sample assignment destination can be considered.
- the reconstruction of these vectors can be implemented by building a program that combines conventional sorting algorithms; adding, deleting, and changing indexes in queues and buffers; various algorithms used for replacement and label processing; and label matching processing using DP, HMMs, regular expressions, and the like.
- a label is specified as an identifier for each variable input to the function. Label each input sample variable. Evaluate whether the labels match. If they do not match, insert dummy data on the sample side if the label is in the function and not in the sample. As this dummy data, an average value of the item, a value such as 0, or an arbitrary multiple of the standard deviation may be used.
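The label matching step above can be sketched as a small alignment routine. This is a minimal sketch, assuming the sample is a mapping from labels to values and the evaluation function declares its input order as a list of labels; the function name and the `dummy` parameter are hypothetical.

```python
def align_to_function_labels(function_labels, sample, dummy=None):
    """Reorder a labeled sample to match the evaluation function's input order.

    function_labels: list of labels in the order the function expects.
    sample: dict mapping label -> value; labels unknown to the function are dropped.
    dummy: optional dict mapping label -> fill value (for example the item's
    average from the construction samples); missing labels default to 0.0.
    """
    dummy = dummy or {}
    return [sample.get(label, dummy.get(label, 0.0)) for label in function_labels]

# Hypothetical labels: the sample lacks "x2" and carries an extra "x9".
labels = ["x1", "x2", "x3"]
sample = {"x3": 3.0, "x1": 1.0, "x9": 9.0}
print(align_to_function_labels(labels, sample))
print(align_to_function_labels(labels, sample, dummy={"x2": 2.5}))
```

Filling with an arbitrary multiple of the standard deviation, the third option in the text, would only change the values passed in `dummy`.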
- the distance is evaluated by the evaluation function configured as described above, and the degree of attribution is output based on the average, variance, and standard deviation. The procedure is executed.
- evaluation dimensions of these evaluation functions are dynamically controlled, and the samples are processed using the evaluation functions with a small number of evaluation dimensions.
- the results are roughly predicted in advance, and the degree of agreement between the predicted results and those after detailed classification is determined.
- re-evaluating it may be possible to deal with flexible classification.
- these re-evaluation results may be used as feature quantities in the feature vector of the present invention.
- any number of eigenvalues and/or any number of eigenvectors obtained based on these operations can be used as feature quantities, or these eigenvalues and eigenvectors can be used for evaluation functions in any number of layers.
- the number of evaluation dimensions of each evaluation function may be used as the feature amount. In this case, for example, after normalizing the distance, if the average is regarded as half the maximum number of dimensions and the total number of dimensions is 100, an appearance probability of 98% is regarded as 98 dimensions, an appearance probability of 50% as 50 dimensions, and an appearance probability of 5% as 5 dimensions.
- the distance and the probability of appearance may be used as variables in the evaluation function.
- a function that evaluates true and a function that evaluates false are configured. When true is close and false is far, the result is true; when false is close and true is far, the result is false; when both are close, no judgment can be made but the relevance is high; and when both are far, no judgment can be made and the relevance is low.
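The four-way judgment above can be sketched as a small decision function. This is only a sketch: the text does not say how "close" and "far" are decided, so the single cutoff `near` (in normalized distance units) and the returned strings are assumptions for illustration.

```python
def judge(dist_true, dist_false, near=3.0):
    """Combine distances from a 'true' and a 'false' evaluation function.

    dist_true / dist_false: normalized distances of the sample from each function.
    near: assumed cutoff below which a distance counts as close.
    """
    near_true = dist_true <= near
    near_false = dist_false <= near
    if near_true and not near_false:
        return "true"
    if near_false and not near_true:
        return "false"
    if near_true and near_false:
        return "undecided, high relevance"
    return "undecided, low relevance"

for pair in [(1.0, 8.0), (7.5, 0.5), (1.0, 2.0), (9.0, 12.0)]:
    print(pair, judge(*pair))
```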
- the covariance matrix V based on the eigenvector is divided by the square root of the eigenvalue and the distance D is calculated based on the polynomial structure, and the difference from each element mean of the sample Is multiplied by the covariance matrix IJV based on the eigenvector, the constants and prior probabilities based on the eigenvalues that become correction terms when n> 4 in the formulas and Bayesian discriminants used in multidimensional distance calculations For example, the calculation result cannot be expressed in finite digits. In consideration of recursive or hierarchical evaluation, one of the element variables is predicted not to be a finite digit.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- General Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/791,705 US7693683B2 (en) | 2004-11-25 | 2005-11-17 | Information classifying device, information classifying method, information classifying program, information classifying system |
JP2007503580A JP4550882B2 (ja) | 2004-11-25 | 2005-11-17 | Information classifying device, information classifying method, information classifying program, information classifying system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004-340723 | 2004-11-25 | ||
JP2004340723 | 2004-11-25 | ||
JP2005147048 | 2005-05-19 | ||
JP2005-147048 | 2005-05-19 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006087854A1 (ja) | 2006-08-24 |
Family
ID=36916267
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/021095 WO2006087854A1 (ja) | 2004-11-25 | 2005-11-17 | Information classifying device, information classifying method, information classifying program, information classifying system |
Country Status (3)
Country | Link |
---|---|
US (1) | US7693683B2 (ja) |
JP (1) | JP4550882B2 (ja) |
WO (1) | WO2006087854A1 (ja) |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
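JP2008088961A (ja) * | 2006-10-05 | 2008-04-17 | Hitachi Ltd | Gas turbine performance diagnostic system, diagnostic method, and display screen |
JP2008203935A (ja) * | 2007-02-16 | 2008-09-04 | Nagoya Institute Of Technology | Spam mail discrimination method |
JP2009053430A (ja) * | 2007-08-27 | 2009-03-12 | Yamaha Corp | Speech processing device and program |
JP2010118064A (ja) * | 2008-11-14 | 2010-05-27 | Palo Alto Research Center Inc | Computer-implemented method |
JP2011175587A (ja) * | 2010-02-25 | 2011-09-08 | Nippon Telegr & Teleph Corp <Ntt> | User determination device, method, program, and content distribution system |
CN103309448A (zh) * | 2013-05-31 | 2013-09-18 | 华东师范大学 | Gesture recognition method based on three-dimensional acceleration incorporating symbol sequence matching |
JP2013225207A (ja) * | 2012-04-20 | 2013-10-31 | Docomo Technology Inc | Patent search support device, patent search support method, and program |
JP2013228933A (ja) * | 2012-04-26 | 2013-11-07 | Docomo Technology Inc | Patent search result evaluation device, patent search result evaluation method, and program |
ES2655544A1 (es) * | 2017-03-29 | 2018-02-20 | Ignacio GOMEZ MAQUEDA | Method and system for monitoring living beings |
JP6457058B1 (ja) * | 2017-12-06 | 2019-01-23 | 株式会社ゴールドアイピー | Intellectual property system, intellectual property support method, and intellectual property support program |
CN109325294A (zh) * | 2018-09-25 | 2019-02-12 | 云南电网有限责任公司电力科学研究院 | Method for constructing an evidence representation of the performance state of a thermal power unit air preheater |
JP2019102099A (ja) * | 2018-12-19 | 2019-06-24 | 株式会社AI Samurai | Intellectual property system, intellectual property support method, and intellectual property support program |
CN110085026A (zh) * | 2019-03-28 | 2019-08-02 | 中国公路工程咨询集团有限公司 | Traffic state prediction method based on cluster analysis and a Markov model |
CN110110133A (zh) * | 2019-04-18 | 2019-08-09 | 贝壳技术有限公司 | Intelligent voice data generation method and device |
CN111552260A (zh) * | 2020-07-10 | 2020-08-18 | 炬星科技(深圳)有限公司 | Worker position estimation method, device, and storage medium |
CN111950987A (zh) * | 2020-08-18 | 2020-11-17 | 广州驰兴通用技术研究有限公司 | Internet-based remote education and training method and *** |
WO2022044625A1 (ja) * | 2020-08-26 | 2022-03-03 | パナソニックIpマネジメント株式会社 | Anomaly detection device, anomaly detection method, and program |
WO2022079904A1 (ja) * | 2020-10-16 | 2022-04-21 | 日本電信電話株式会社 | Parameter estimation device, parameter estimation system, parameter estimation method, and program |
CN114443849A (zh) * | 2022-02-09 | 2022-05-06 | 北京百度网讯科技有限公司 | Annotated sample selection method, apparatus, electronic device, and storage medium |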
Families Citing this family (157)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8060112B2 (en) | 2003-11-20 | 2011-11-15 | Intelligent Spatial Technologies, Inc. | Mobile device and geographic information system background and summary of the related art |
US7245923B2 (en) * | 2003-11-20 | 2007-07-17 | Intelligent Spatial Technologies | Mobile device and geographic information system background and summary of the related art |
DE102004008225B4 (de) * | 2004-02-19 | 2006-02-16 | Infineon Technologies Ag | Verfahren und Einrichtung zum Ermitteln von Merkmalsvektoren aus einem Signal zur Mustererkennung, Verfahren und Einrichtung zur Mustererkennung sowie computerlesbare Speichermedien |
US7880154B2 (en) | 2005-07-25 | 2011-02-01 | Karl Otto | Methods and apparatus for the planning and delivery of radiation treatments |
US7906770B2 (en) * | 2005-07-25 | 2011-03-15 | Karl Otto | Methods and apparatus for the planning and delivery of radiation treatments |
US7418341B2 (en) * | 2005-09-12 | 2008-08-26 | Intelligent Spatial Technologies | System and method for the selection of a unique geographic feature |
US20070179970A1 (en) * | 2006-01-31 | 2007-08-02 | Carli Connally | Methods and apparatus for storing and formatting data |
US7603351B2 (en) * | 2006-04-19 | 2009-10-13 | Apple Inc. | Semantic reconstruction |
US8379990B2 (en) * | 2006-05-10 | 2013-02-19 | Nikon Corporation | Object recognition apparatus, computer readable medium storing object recognition program, and image retrieval service providing method |
US8694302B1 (en) * | 2006-05-31 | 2014-04-08 | Worldwide Pro Ltd. | Solving a hierarchical circuit network using a Barycenter compact model |
US8538676B2 (en) * | 2006-06-30 | 2013-09-17 | IPointer, Inc. | Mobile geographic information system and method |
US7707533B2 (en) * | 2006-07-21 | 2010-04-27 | Solido Design Automation Inc. | Data-mining-based knowledge extraction and visualization of analog/mixed-signal/custom digital circuit design flow |
US10957217B2 (en) | 2006-08-25 | 2021-03-23 | Ronald A. Weitzman | Population-sample regression in the estimation of population proportions |
US11151895B2 (en) * | 2006-08-25 | 2021-10-19 | Ronald Weitzman | Population-sample regression in the estimation of population proportions |
US8744883B2 (en) * | 2006-12-19 | 2014-06-03 | Yahoo! Inc. | System and method for labeling a content item based on a posterior probability distribution |
US20080154811A1 (en) * | 2006-12-21 | 2008-06-26 | Caterpillar Inc. | Method and system for verifying virtual sensors |
US7880621B2 (en) * | 2006-12-22 | 2011-02-01 | Toyota Motor Engineering & Manufacturing North America, Inc. | Distraction estimator |
USRE46953E1 (en) | 2007-04-20 | 2018-07-17 | University Of Maryland, Baltimore | Single-arc dose painting for precision radiation therapy |
JP5024668B2 (ja) * | 2007-07-10 | 2012-09-12 | 富士ゼロックス株式会社 | 画像形成装置および情報処理装置 |
US8036764B2 (en) * | 2007-11-02 | 2011-10-11 | Caterpillar Inc. | Virtual sensor network (VSN) system and method |
US8224468B2 (en) * | 2007-11-02 | 2012-07-17 | Caterpillar Inc. | Calibration certificate for virtual sensor network (VSN) |
JP2009151540A (ja) * | 2007-12-20 | 2009-07-09 | Fuji Xerox Co Ltd | 関連要素検索装置、及び関連要素検索プログラム |
JP5500070B2 (ja) | 2008-07-30 | 2014-05-21 | 日本電気株式会社 | データ分類システム、データ分類方法、及びデータ分類プログラム |
US9361367B2 (en) * | 2008-07-30 | 2016-06-07 | Nec Corporation | Data classifier system, data classifier method and data classifier program |
TW201009627A (en) * | 2008-08-20 | 2010-03-01 | Inotera Memories Inc | Method for diagnosing tool capability |
US7917333B2 (en) | 2008-08-20 | 2011-03-29 | Caterpillar Inc. | Virtual sensor network (VSN) based control system and method |
US20130079907A1 (en) * | 2008-09-12 | 2013-03-28 | Kristopher L Homsi | Golf athleticism rating system |
US20100129780A1 (en) * | 2008-09-12 | 2010-05-27 | Nike, Inc. | Athletic performance rating system |
US7809195B1 (en) * | 2008-09-18 | 2010-10-05 | Ernest Greene | Encoding system providing discrimination, classification, and recognition of shapes and patterns |
DE112009002603T5 (de) * | 2008-10-30 | 2012-08-02 | Ford Global Technologies, Llc | Fahrzeug und Verfahren zum Angeben von Empfehlungen für einen darin befindlichen Fahrer |
US20100145990A1 (en) * | 2008-12-09 | 2010-06-10 | Washington University In St. Louis | Selection and performance of hosted and distributed imaging analysis services |
US8745090B2 (en) | 2008-12-22 | 2014-06-03 | IPointer, Inc. | System and method for exploring 3D scenes by pointing at a reference object |
US8483519B2 (en) | 2008-12-22 | 2013-07-09 | Ipointer Inc. | Mobile image search and indexing system and method |
JP5436574B2 (ja) | 2008-12-22 | 2014-03-05 | インテリジェント スペイシャル テクノロジーズ,インク. | ポインティングによって現実世界のオブジェクトとオブジェクト表現とをリンクさせるシステム及び方法 |
US8412493B2 (en) * | 2008-12-22 | 2013-04-02 | International Business Machines Corporation | Multi-dimensional model generation for determining service performance |
JP5647141B2 (ja) * | 2008-12-22 | 2014-12-24 | インテリジェント スペイシャル テクノロジーズ,インク. | 関心のあるオブジェクトを指定することにより動作を開始しフィードバックを提供するシステム及び方法 |
US8443278B2 (en) | 2009-01-02 | 2013-05-14 | Apple Inc. | Identification of tables in an unstructured document |
US9672293B2 (en) * | 2009-01-12 | 2017-06-06 | Namesforlife, Llc | Systems and methods for automatically identifying and linking names in digital resources |
CA2750094A1 (en) * | 2009-01-29 | 2010-08-05 | Nike International Ltd. | Athletic performance rating system |
US20100205034A1 (en) * | 2009-02-09 | 2010-08-12 | William Kelly Zimmerman | Methods and apparatus to model consumer awareness for changing products in a consumer purchase model |
US8972899B2 (en) | 2009-02-10 | 2015-03-03 | Ayasdi, Inc. | Systems and methods for visualization of data analysis |
US20100211894A1 (en) * | 2009-02-18 | 2010-08-19 | Google Inc. | Identifying Object Using Generative Model |
US8285414B2 (en) | 2009-03-31 | 2012-10-09 | International Business Machines Corporation | Method and system for evaluating a machine tool operating characteristics |
EP2417544A4 (en) * | 2009-04-08 | 2013-10-02 | Google Inc | SIMILARITY BASED ADJUSTMENT TO CLASSIFY |
WO2010121166A1 (en) * | 2009-04-16 | 2010-10-21 | Nike International Ltd. | Athletic performance rating system |
CA2760616A1 (en) * | 2009-05-01 | 2010-11-04 | Nike International Ltd. | Athletic performance rating system |
US20100306028A1 (en) * | 2009-06-02 | 2010-12-02 | Wagner John G | Methods and apparatus to model with ghost groups |
CN101950377A (zh) * | 2009-07-10 | 2011-01-19 | 索尼公司 | 新型马尔可夫序列生成器和生成马尔可夫序列的新方法 |
US9092668B2 (en) * | 2009-07-18 | 2015-07-28 | ABBYY Development | Identifying picture areas based on gradient image analysis |
DE102009057583A1 (de) * | 2009-09-04 | 2011-03-10 | Siemens Aktiengesellschaft | Vorrichtung und Verfahren zur Erzeugung einer zielgerichteten realitätsnahen Bewegung von Teilchen entlang kürzester Wege bezüglich beliebiger Abstandsgewichtungen für Personen- und Objektstromsimulationen |
WO2011035298A2 (en) * | 2009-09-21 | 2011-03-24 | The Nielsen Company (Us) Llc | Methods and apparatus to perform choice modeling with substitutability data |
US8738228B2 (en) * | 2009-10-30 | 2014-05-27 | Ford Global Technologies, Llc | Vehicle and method of tuning performance of same |
US8258934B2 (en) * | 2009-10-30 | 2012-09-04 | Ford Global Technologies, Llc | Vehicle and method of advising a driver therein |
US8886365B2 (en) * | 2009-10-30 | 2014-11-11 | Ford Global Technologies, Llc | Vehicle and method for advising driver of same |
US9707974B2 (en) | 2009-10-30 | 2017-07-18 | Ford Global Technologies, Llc | Vehicle with identification system |
JP2011138194A (ja) * | 2009-12-25 | 2011-07-14 | Sony Corp | 情報処理装置、情報処理方法およびプログラム |
US8543598B2 (en) * | 2010-03-01 | 2013-09-24 | Microsoft Corporation | Semantic object characterization and search |
US8903837B2 (en) * | 2010-04-13 | 2014-12-02 | Yahoo!, Inc. | Incorporating geographical locations in a search process |
US8548255B2 (en) * | 2010-04-15 | 2013-10-01 | Nokia Corporation | Method and apparatus for visual search stability |
US8490056B2 (en) * | 2010-04-28 | 2013-07-16 | International Business Machines Corporation | Automatic identification of subroutines from test scripts |
US9289627B2 (en) | 2010-06-22 | 2016-03-22 | Varian Medical Systems International Ag | System and method for estimating and manipulating estimated radiation dose |
TWI537845B (zh) * | 2010-10-20 | 2016-06-11 | 華亞科技股份有限公司 | 半導體製程管制規格之制定方法 |
US8676623B2 (en) * | 2010-11-18 | 2014-03-18 | Navteq B.V. | Building directory aided navigation |
US9159128B2 (en) | 2011-01-13 | 2015-10-13 | Rutgers, The State University Of New Jersey | Enhanced multi-protocol analysis via intelligent supervised embedding (empravise) for multimodal data fusion |
WO2012104786A2 (en) * | 2011-02-04 | 2012-08-09 | Koninklijke Philips Electronics N.V. | Imaging protocol update and/or recommender |
WO2012104780A1 (en) * | 2011-02-04 | 2012-08-09 | Koninklijke Philips Electronics N.V. | Identification of medical concepts for imaging protocol selection |
US8484024B2 (en) | 2011-02-24 | 2013-07-09 | Nuance Communications, Inc. | Phonetic features for speech recognition |
US20120223227A1 (en) * | 2011-03-04 | 2012-09-06 | Chien-Huei Chen | Apparatus and methods for real-time three-dimensional sem imaging and viewing of semiconductor wafers |
US20120259676A1 (en) | 2011-04-07 | 2012-10-11 | Wagner John G | Methods and apparatus to model consumer choice sourcing |
WO2012162405A1 (en) | 2011-05-24 | 2012-11-29 | Namesforlife, Llc | Semiotic indexing of digital resources |
US8793004B2 (en) | 2011-06-15 | 2014-07-29 | Caterpillar Inc. | Virtual sensor system and method for generating output parameters |
EP2766836A4 (en) * | 2011-10-10 | 2015-07-15 | Ayasdi Inc | SYSTEM AND METHOD FOR ALLOCATING NEW PATIENT INFORMATION TO PREVIOUS RESULTS IN SUPPORT OF TREATMENT |
US8805008B1 (en) * | 2011-11-02 | 2014-08-12 | The Boeing Company | Tracking closely spaced objects in images |
CN102521602B (zh) * | 2011-11-17 | 2013-09-25 | 西安电子科技大学 | 基于条件随机场和最小距离法的超光谱图像分类方法 |
US9311383B1 (en) | 2012-01-13 | 2016-04-12 | The Nielsen Company (Us), Llc | Optimal solution identification system and method |
US9336302B1 (en) | 2012-07-20 | 2016-05-10 | Zuci Realty Llc | Insight and algorithmic clustering for automated synthesis |
US9183600B2 (en) | 2013-01-10 | 2015-11-10 | International Business Machines Corporation | Technology prediction |
WO2014115254A1 (ja) * | 2013-01-23 | 2014-07-31 | 株式会社日立製作所 | シミュレーションシステム、およびシミュレーション方法 |
US9704136B2 (en) | 2013-01-31 | 2017-07-11 | Hewlett Packard Enterprise Development Lp | Identifying subsets of signifiers to analyze |
US8914416B2 (en) | 2013-01-31 | 2014-12-16 | Hewlett-Packard Development Company, L.P. | Semantics graphs for enterprise communication networks |
US9355166B2 (en) | 2013-01-31 | 2016-05-31 | Hewlett Packard Enterprise Development Lp | Clustering signifiers in a semantics graph |
WO2014126650A1 (en) * | 2013-02-14 | 2014-08-21 | Exxonmobil Upstream Research Company | Detecting subsurface structures |
WO2014143729A1 (en) | 2013-03-15 | 2014-09-18 | Affinnova, Inc. | Method and apparatus for interactive evolutionary optimization of concepts |
WO2014152010A1 (en) | 2013-03-15 | 2014-09-25 | Affinnova, Inc. | Method and apparatus for interactive evolutionary algorithms with respondent directed breeding |
CN104346354B (zh) * | 2013-07-29 | 2017-12-01 | 阿里巴巴集团控股有限公司 | 一种提供推荐词的方法及装置 |
US9841463B2 (en) * | 2014-02-27 | 2017-12-12 | Invently Automotive Inc. | Method and system for predicting energy consumption of a vehicle using a statistical model |
US10599705B2 (en) * | 2014-03-20 | 2020-03-24 | Gracenote Digital Ventures, Llc | Retrieving and playing out media content for a personalized playlist including a content placeholder |
US10213149B2 (en) | 2014-05-08 | 2019-02-26 | Medical Care Corporation | Systems and methods for assessing human cognition, including a quantitative approach to assessing executive function |
US20150331930A1 (en) * | 2014-05-16 | 2015-11-19 | Here Global B.V. | Method and apparatus for classification of media based on metadata |
TWI595416B (zh) * | 2014-06-12 | 2017-08-11 | 國立交通大學 | 多維資料空間的貝氏循序切割系統及其計數引擎 |
US20160004794A1 (en) * | 2014-07-02 | 2016-01-07 | General Electric Company | System and method using generative model to supplement incomplete industrial plant information |
JP6459345B2 (ja) * | 2014-09-26 | 2019-01-30 | 大日本印刷株式会社 | 変動データ管理システム及びその特異性検出方法 |
US10062033B2 (en) * | 2014-09-26 | 2018-08-28 | Disney Enterprises, Inc. | Analysis of team behaviors using role and formation information |
US11093845B2 (en) * | 2015-05-22 | 2021-08-17 | Fair Isaac Corporation | Tree pathway analysis for signature inference |
US9665735B2 (en) * | 2015-02-05 | 2017-05-30 | Bank Of America Corporation | Privacy fractal mirroring of transaction data |
US10270609B2 (en) * | 2015-02-24 | 2019-04-23 | BrainofT Inc. | Automatically learning and controlling connected devices |
JP2018508090A (ja) * | 2015-03-13 | 2018-03-22 | プロジェクト レイ リミテッド | ユーザインタフェースをユーザ注意力及び運転条件に適合化するシステム及び方法 |
US10147108B2 (en) | 2015-04-02 | 2018-12-04 | The Nielsen Company (Us), Llc | Methods and apparatus to identify affinity between segment attributes and product characteristics |
US10542961B2 (en) | 2015-06-15 | 2020-01-28 | The Research Foundation For The State University Of New York | System and method for infrasonic cardiac monitoring |
CN106295351B (zh) * | 2015-06-24 | 2019-03-19 | 阿里巴巴集团控股有限公司 | 一种风险识别方法及装置 |
US20170083920A1 (en) * | 2015-09-21 | 2017-03-23 | Fair Isaac Corporation | Hybrid method of decision tree and clustering technology |
US9882807B2 (en) * | 2015-11-11 | 2018-01-30 | International Business Machines Corporation | Network traffic classification |
EP3373089B1 (en) * | 2016-01-13 | 2021-03-10 | Mitsubishi Electric Corporation | Operating state classification device |
US10605470B1 (en) | 2016-03-08 | 2020-03-31 | BrainofT Inc. | Controlling connected devices using an optimization function |
EP3450910B1 (en) * | 2016-04-27 | 2023-11-22 | FUJIFILM Corporation | Index generation method, measurement method, and index generation device |
KR101830522B1 (ko) * | 2016-08-22 | 2018-02-21 | 가톨릭대학교 산학협력단 | 빅 데이터를 이용한 예측 대상 지역의 범죄 발생 예측 방법 |
US9946958B1 (en) * | 2016-10-14 | 2018-04-17 | Cloudera, Inc. | Image processing system and method |
US10216899B2 (en) * | 2016-10-20 | 2019-02-26 | Hewlett Packard Enterprise Development Lp | Sentence construction for DNA classification |
US10157613B2 (en) | 2016-11-17 | 2018-12-18 | BrainofT Inc. | Controlling connected devices using a relationship graph |
US10931758B2 (en) | 2016-11-17 | 2021-02-23 | BrainofT Inc. | Utilizing context information of environment component regions for event/activity prediction |
US11205103B2 (en) | 2016-12-09 | 2021-12-21 | The Research Foundation for the State University | Semisupervised autoencoder for sentiment analysis |
US10739733B1 (en) | 2017-02-01 | 2020-08-11 | BrainofT Inc. | Interactive environmental controller |
CN106874599B (zh) * | 2017-02-17 | 2019-07-09 | 武汉大学 | 快速生成卵石碎石夹杂的混凝土三维随机骨料模型的方法 |
US10067746B1 (en) * | 2017-03-02 | 2018-09-04 | Futurewei Technologies, Inc. | Approximate random number generator by empirical cumulative distribution function |
US10365893B2 (en) | 2017-03-30 | 2019-07-30 | Futurewei Technologies, Inc. | Sample-based multidimensional data cloning |
CN107515842B (zh) * | 2017-07-19 | 2018-06-19 | 中南大学 | 一种城市人口密度动态预测方法及*** |
US10922334B2 (en) * | 2017-08-11 | 2021-02-16 | Conduent Business Services, Llc | Mixture model based time-series clustering of crime data across spatial entities |
WO2019060199A1 (en) * | 2017-09-19 | 2019-03-28 | Dharma Platform, Inc. | AUTOMATIC DATA SWITCHING |
CN108304853B (zh) * | 2017-10-10 | 2022-11-08 | 腾讯科技(深圳)有限公司 | 游戏相关度的获取方法、装置、存储介质和电子装置 |
EP3471107A1 (en) * | 2017-10-12 | 2019-04-17 | Fresenius Medical Care Deutschland GmbH | Medical device and computer-implemented method of predicting risk, occurrence or progression of adverse health conditions in test subjects in subpopulations arbitrarily selected from a total population |
US11062216B2 (en) * | 2017-11-21 | 2021-07-13 | International Business Machines Corporation | Prediction of olfactory and taste perception through semantic encoding |
RU2699573C2 (ru) | 2017-12-15 | 2019-09-06 | Общество С Ограниченной Ответственностью "Яндекс" | Способы и системы для создания значений общего критерия оценки |
CN108243191B (zh) * | 2018-01-10 | 2019-08-23 | 武汉斗鱼网络科技有限公司 | 风险行为识别方法、存储介质、设备及*** |
CN108304875A (zh) * | 2018-01-31 | 2018-07-20 | 中国科学院武汉岩土力学研究所 | 一种基于统计判别分类的***块度预测方法 |
GB201802440D0 (en) * | 2018-02-14 | 2018-03-28 | Jukedeck Ltd | A method of generating music data |
US20190355477A1 (en) * | 2018-05-18 | 2019-11-21 | Beckman Coulter, Inc. | Test panel analysis |
CN110599336B (zh) * | 2018-06-13 | 2020-12-15 | 北京九章云极科技有限公司 | 一种金融产品购买预测方法及*** |
US20210089952A1 (en) * | 2018-06-19 | 2021-03-25 | Shimadzu Corporation | Parameter-searching method, parameter-searching device, and program for parameter search |
US11035943B2 (en) * | 2018-07-19 | 2021-06-15 | Aptiv Technologies Limited | Radar based tracking of slow moving objects |
GB2576501B (en) * | 2018-08-16 | 2021-03-10 | Centrica Plc | Sensing fluid flow |
CN109034269A (zh) * | 2018-08-22 | 2018-12-18 | 华北水利水电大学 | 一种基于计算机视觉技术的棉铃虫雌雄成虫判别方法 |
CN108845302B (zh) * | 2018-08-23 | 2022-06-03 | 电子科技大学 | 一种k近邻变换真假目标特征提取方法 |
JP7005463B2 (ja) * | 2018-09-27 | 2022-01-21 | 株式会社東芝 | 学習装置、学習方法及びプログラム |
CN109446467B (zh) * | 2018-09-28 | 2023-10-24 | 安徽皖仪科技股份有限公司 | 数字滤波方法及装置 |
US10878292B2 (en) * | 2018-12-07 | 2020-12-29 | Goodrich Corporation | Automatic generation of a new class in a classification system |
CN109697466B (zh) * | 2018-12-20 | 2022-10-25 | 烟台大学 | 一种自适应区间型空间模糊c均值的地物分类方法 |
EP3935581A4 (en) | 2019-03-04 | 2022-11-30 | Iocurrents, Inc. | DATA COMPRESSION AND COMMUNICATION USING MACHINE LEARNING |
US11245729B2 (en) * | 2019-07-09 | 2022-02-08 | Salesforce.Com, Inc. | Group optimization for network communications |
CN110675959B (zh) * | 2019-08-19 | 2023-07-07 | 平安科技(深圳)有限公司 | 数据智能分析方法、装置、计算机设备及存储介质 |
CN110851321B (zh) * | 2019-10-10 | 2022-06-28 | 平安科技(深圳)有限公司 | 一种业务告警方法、设备及存储介质 |
US20210173855A1 (en) * | 2019-12-10 | 2021-06-10 | Here Global B.V. | Method, apparatus, and computer program product for dynamic population estimation |
CN111078589B (zh) * | 2019-12-27 | 2023-04-11 | 深圳鲲云信息科技有限公司 | 一种应用于深度学习计算的数据读取***、方法及芯片 |
CN111191723B (zh) * | 2019-12-30 | 2023-06-20 | 创新奇智(北京)科技有限公司 | 基于级联分类器的少样本商品分类***及分类方法 |
CN111291326B (zh) * | 2020-02-06 | 2022-05-17 | 武汉大学 | 一种结合类内相似度和类间差异度的聚类有效性指标建立方法 |
CN111427984B (zh) * | 2020-03-24 | 2022-04-01 | 成都理工大学 | 一种区域地震概率空间分布生成方法 |
US11551666B1 (en) * | 2020-05-28 | 2023-01-10 | Amazon Technologies, Inc. | Natural language processing |
CN111693658A (zh) * | 2020-06-11 | 2020-09-22 | 上海交通大学 | 基于多种智能感官数据融合的食品品质鉴定方法 |
US11222232B1 (en) | 2020-06-19 | 2022-01-11 | Nvidia Corporation | Using temporal filters for automated real-time classification |
CN111912799B (zh) * | 2020-07-17 | 2021-07-27 | 中国科学院西安光学精密机械研究所 | 一种基于高光谱水体库的自适应波段选择方法 |
CN112116159B (zh) * | 2020-09-21 | 2021-08-27 | 贝壳找房(北京)科技有限公司 | 信息交互方法、装置、计算机可读存储介质及电子设备 |
US11978266B2 (en) | 2020-10-21 | 2024-05-07 | Nvidia Corporation | Occupant attentiveness and cognitive load monitoring for autonomous and semi-autonomous driving applications |
US20220138260A1 (en) * | 2020-10-30 | 2022-05-05 | Here Global B.V. | Method, apparatus, and system for estimating continuous population density change in urban areas |
US20220262455A1 (en) * | 2021-02-18 | 2022-08-18 | Recursion Pharmaceuticals, Inc. | Determining the goodness of a biological vector space |
CN113327220B (zh) * | 2021-06-24 | 2023-06-02 | 浙江成功软件开发有限公司 | 一种基于复杂网络的海洋多时间序列关联性发现方法 |
JP7504236B2 (ja) * | 2021-06-25 | 2024-06-21 | エルアンドティー テクノロジー サービシズ リミテッド | データサンプルをクラスタ化する方法およびシステム |
CN115700838A (zh) * | 2021-07-29 | 2023-02-07 | 脸萌有限公司 | 用于图像识别模型的训练方法及其装置、图像识别方法 |
CN115218893B (zh) * | 2022-06-19 | 2024-05-28 | 中国人民解放军空军工程大学 | 一种基于特征提取的地磁导航方法 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001167124A (ja) * | 1999-12-13 | 2001-06-22 | Sharp Corp | 文書分類装置及び文書分類プログラムを記録した記録媒体 |
JP2002183171A (ja) * | 2000-12-12 | 2002-06-28 | Matsushita Electric Ind Co Ltd | 文書データ・クラスタリングシステム |
JP2003030224A (ja) * | 2001-07-17 | 2003-01-31 | Fujitsu Ltd | 文書クラスタ作成装置、文書検索システムおよびfaq作成システム |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3165247B2 (ja) * | 1992-06-19 | 2001-05-14 | シスメックス株式会社 | 粒子分析方法及び装置 |
JPH09161062A (ja) | 1995-12-13 | 1997-06-20 | Nissan Motor Co Ltd | パターン認識方法 |
US6442555B1 (en) * | 1999-10-26 | 2002-08-27 | Hewlett-Packard Company | Automatic categorization of documents using document signatures |
JP3457617B2 (ja) * | 2000-03-23 | 2003-10-20 | 株式会社東芝 | 画像検索システムおよび画像検索方法 |
JP3701197B2 (ja) | 2000-12-28 | 2005-09-28 | 松下電器産業株式会社 | 分類への帰属度計算基準作成方法及び装置 |
US6728658B1 (en) * | 2001-05-24 | 2004-04-27 | Simmonds Precision Products, Inc. | Method and apparatus for determining the health of a component using condition indicators |
JP2003076976A (ja) | 2001-08-31 | 2003-03-14 | Mitsui Eng & Shipbuild Co Ltd | パターンマッチング方法 |
JP3708042B2 (ja) * | 2001-11-22 | 2005-10-19 | 株式会社東芝 | 画像処理方法及びプログラム |
JP4080276B2 (ja) * | 2002-08-27 | 2008-04-23 | 富士フイルム株式会社 | オブジェクト抽出方法および装置ならびにプログラム |
US7117108B2 (en) * | 2003-05-28 | 2006-10-03 | Paul Ernest Rapp | System and method for categorical analysis of time dependent dynamic processes |
US7548651B2 (en) * | 2003-10-03 | 2009-06-16 | Asahi Kasei Kabushiki Kaisha | Data process unit and data process unit control program |
- 2005
- 2005-11-17 WO PCT/JP2005/021095 patent/WO2006087854A1/ja not_active Application Discontinuation
- 2005-11-17 JP JP2007503580A patent/JP4550882B2/ja not_active Expired - Fee Related
- 2005-11-17 US US11/791,705 patent/US7693683B2/en not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
HAMAMOTO Y.: "Some Remarks on Statistical Pattern Recognition: Past, Present and Future", TECHNICAL REPORT OF IEICE, vol. 100, no. 507, 7 December 2000 (2000-12-07), pages 69 - 76, XP002996546 * |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2008088961A (ja) * | 2006-10-05 | 2008-04-17 | Hitachi Ltd | ガスタービンの性能診断システムと診断方法及び表示画面 |
JP2008203935A (ja) * | 2007-02-16 | 2008-09-04 | Nagoya Institute Of Technology | 迷惑メール判別方法 |
JP2009053430A (ja) * | 2007-08-27 | 2009-03-12 | Yamaha Corp | 音声処理装置およびプログラム |
JP2010118064A (ja) * | 2008-11-14 | 2010-05-27 | Palo Alto Research Center Inc | コンピュータ実施方法 |
JP2011175587A (ja) * | 2010-02-25 | 2011-09-08 | Nippon Telegr & Teleph Corp <Ntt> | ユーザ判定装置、方法、プログラム及びコンテンツ配信システム |
JP2013225207A (ja) * | 2012-04-20 | 2013-10-31 | Docomo Technology Inc | 特許調査支援装置、特許調査支援方法、およびプログラム |
JP2013228933A (ja) * | 2012-04-26 | 2013-11-07 | Docomo Technology Inc | 特許調査結果評価装置、特許調査結果評価方法、およびプログラム |
CN103309448A (zh) * | 2013-05-31 | 2013-09-18 | 华东师范大学 | 一种加入符号序列匹配的基于三维加速度的手势识别方法 |
ES2655544A1 (es) * | 2017-03-29 | 2018-02-20 | Ignacio GOMEZ MAQUEDA | Método y sistema para la monitorización de seres vivos |
WO2018178461A1 (es) * | 2017-03-29 | 2018-10-04 | Ignacio Gomez Maqueda | Método y sistema para la monitorización de seres vivos |
WO2019111545A1 (ja) * | 2017-12-06 | 2019-06-13 | 株式会社 AI Samurai | 知的財産システム、知的財産支援方法および知的財産支援プログラム |
JP6457058B1 (ja) * | 2017-12-06 | 2019-01-23 | 株式会社ゴールドアイピー | 知的財産システム、知的財産支援方法および知的財産支援プログラム |
JP2019101944A (ja) * | 2017-12-06 | 2019-06-24 | 株式会社AI Samurai | 知的財産システム、知的財産支援方法および知的財産支援プログラム |
CN109325294B (zh) * | 2018-09-25 | 2023-08-11 | 云南电网有限责任公司电力科学研究院 | 一种火电机组空气预热器性能状态的证据表征构建方法 |
CN109325294A (zh) * | 2018-09-25 | 2019-02-12 | 云南电网有限责任公司电力科学研究院 | 一种火电机组空气预热器性能状态的证据表征构建方法 |
JP2019102099A (ja) * | 2018-12-19 | 2019-06-24 | 株式会社AI Samurai | 知的財産システム、知的財産支援方法および知的財産支援プログラム |
CN110085026A (zh) * | 2019-03-28 | 2019-08-02 | 中国公路工程咨询集团有限公司 | 一种基于聚类分析和马尔科夫模型的交通状态预测方法 |
CN110110133A (zh) * | 2019-04-18 | 2019-08-09 | 贝壳技术有限公司 | 一种智能语音数据生成方法及装置 |
CN111552260A (zh) * | 2020-07-10 | 2020-08-18 | 炬星科技(深圳)有限公司 | 工人位置估算方法、设备及存储介质 |
CN111950987A (zh) * | 2020-08-18 | 2020-11-17 | 广州驰兴通用技术研究有限公司 | 一种基于互联网的远程教育培训方法及*** |
WO2022044625A1 (ja) * | 2020-08-26 | 2022-03-03 | パナソニックIpマネジメント株式会社 | 異常検出装置、異常検出方法及びプログラム |
EP4206699A4 (en) * | 2020-08-26 | 2024-03-13 | Panasonic Intellectual Property Management Co., Ltd. | ANOMALY DETECTION DEVICE, ANOMALY DETECTION METHOD AND PROGRAM |
WO2022079904A1 (ja) * | 2020-10-16 | 2022-04-21 | 日本電信電話株式会社 | パラメータ推定装置、パラメータ推定システム、パラメータ推定方法、及びプログラム |
AU2020472128B2 (en) * | 2020-10-16 | 2023-11-30 | Nippon Telegraph And Telephone Corporation | Parameter estimation device, parameter estimation system, parameter estimation method, and program |
JP7456514B2 (ja) | 2020-10-16 | 2024-03-27 | 日本電信電話株式会社 | パラメータ推定装置、パラメータ推定システム、パラメータ推定方法、及びプログラム |
CN114443849A (zh) * | 2022-02-09 | 2022-05-06 | 北京百度网讯科技有限公司 | 一种标注样本选取方法、装置、电子设备和存储介质 |
CN114443849B (zh) * | 2022-02-09 | 2023-10-27 | 北京百度网讯科技有限公司 | 一种标注样本选取方法、装置、电子设备和存储介质 |
Also Published As
Publication number | Publication date |
---|---|
JPWO2006087854A1 (ja) | 2008-08-07 |
JP4550882B2 (ja) | 2010-09-22 |
US7693683B2 (en) | 2010-04-06 |
US20080114564A1 (en) | 2008-05-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4550882B2 (ja) | Information classifying device, information classifying method, information classifying program, information classifying system | |
CN110929164B (zh) | 一种基于用户动态偏好与注意力机制的兴趣点推荐方法 | |
CN112818861B (zh) | 一种基于多模态上下文语义特征的情感分类方法及*** | |
Chen et al. | Efficient ant colony optimization for image feature selection | |
CN112085565A (zh) | 基于深度学习的信息推荐方法、装置、设备及存储介质 | |
CN109829154B (zh) | 基于语义的人格预测方法、用户设备、存储介质及装置 | |
CN113553510B (zh) | 一种文本信息推荐方法、装置及可读介质 | |
Habib et al. | Altibbivec: a word embedding model for medical and health applications in the Arabic language | |
Sharma et al. | Supervised machine learning method for ontology-based financial decisions in the stock market | |
CN114298783A (zh) | 基于矩阵分解融合用户社交信息的商品推荐方法及*** | |
Chanda | Efficacy of BERT embeddings on predicting disaster from twitter data | |
Sadiq et al. | High dimensional latent space variational autoencoders for fake news detection | |
Chaudhuri | Visual and text sentiment analysis through hierarchical deep learning networks | |
Chemchem et al. | Deep learning and data mining classification through the intelligent agent reasoning | |
CN114417172A (zh) | 一种深度兴趣进化推荐方法、装置、设备和存储介质 | |
CN116756347B (zh) | 一种基于大数据的语义信息检索方法 | |
Liao et al. | Embedding compression with isotropic iterative quantization | |
CN113761192A (zh) | 文本处理方法、文本处理装置及文本处理设备 | |
Kumnunt et al. | Detection of Depression in Thai Social Media Messages using Deep Learning. | |
Sridhar et al. | Sentiment Analysis Using Ensemble-Hybrid Model with Hypernym Based Feature Engineering | |
Viji et al. | A hybrid approach of Poisson distribution LDA with deep Siamese Bi-LSTM and GRU model for semantic similarity prediction for text data | |
Ling | Coronavirus public sentiment analysis with BERT deep learning | |
Venkataraman et al. | FBO‐RNN: Fuzzy butterfly optimization‐based RNN‐LSTM for extracting sentiments from Twitter Emoji database | |
Tizhoosh et al. | On poem recognition | |
Aruna et al. | Feature Selection Based Naïve Bayes Algorithm for Twitter Sentiment Analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | ||
DPE1 | Request for preliminary examination filed after expiration of 19th month from priority date (PCT application filed from 20040101) | ||
WWE | WIPO information: entry into national phase | Ref document number: 2007503580; Country of ref document: JP |
WWE | WIPO information: entry into national phase | Ref document number: 11791705; Country of ref document: US |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: PCT application non-entry in European phase | Ref document number: 05806849; Country of ref document: EP; Kind code of ref document: A1 |
WWW | WIPO information: withdrawn in national office | Ref document number: 5806849; Country of ref document: EP |
WWP | WIPO information: published in national office | Ref document number: 11791705; Country of ref document: US |