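WO2004053735A1 - Information processing apparatus, information processing method, and information processing program - Google Patents
Information processing apparatus, information processing method, and information processing program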
- Publication number
- WO2004053735A1 (PCT/JP2003/015865, JP0315865W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- thesaurus
- correlation coefficient
- text data
- storing
- appearance frequency
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/374—Thesaurus
Definitions
- The present invention relates to an information processing apparatus, an information processing method, and an information processing program.
- Background art
- Conventionally, words whose appearance frequency is greater than or equal to a specified value are extracted from the words in a text, the relevance between the extracted words is evaluated, and classes of co-occurring words are generated. If a category dictionary is created in advance according to the text to be analyzed, the analysis result of the text can be presented (see, for example, FIG. 1 of Japanese Patent Application Laid-Open No. 2001-101194).
- The present invention has been made in view of such a situation, and aims to detect the characteristics of text data based on the correlation between keywords extracted from the text data.
- Disclosure of the invention
- The information processing apparatus comprises the following means:
- input means for inputting text data;
- text data storage means for storing the text data;
- word cutting means for executing a word cutting process on the text data;
- parsing means for performing a parsing process on the text data subjected to the word cutting process;
- thesaurus creating means for creating a thesaurus from the text data subjected to the parsing process;
- thesaurus storage means for storing the thesaurus created by the thesaurus creating means;
- thesaurus sorting means for performing a sorting process on the text data subjected to the word cutting and the parsing;
- sorting result storage means for storing the sorting result produced by the thesaurus sorting means;
- appearance frequency calculating means for calculating an appearance frequency for each thesaurus;
- appearance frequency storing means for storing the result calculated by the appearance frequency calculating means;
- correlation coefficient calculating means for calculating the correlation coefficients between the thesauruses;
- correlation coefficient storage means for storing the correlation coefficients between the thesauruses;
- correlation coefficient total calculation means for each thesaurus for calculating the sum of correlation coefficients for each thesaurus;
- correlation coefficient total storage means for each thesaurus for storing the total correlation coefficient calculated for each thesaurus; and
- graph creation and display means for creating and displaying a graph based on the appearance frequency stored by the appearance frequency storing means and the total correlation coefficient for each thesaurus stored by the correlation coefficient total storage means for each thesaurus.
- The word cutting means and the parsing means are characterized in that the word cutting process and the parsing process are performed again based on the thesaurus created by the thesaurus creating means.
- The information processing method includes the following steps:
- an input step of inputting text data;
- a text data storage step of storing the text data;
- a word cutting step of executing a word cutting process on the text data;
- a parsing step of parsing the text data that has been subjected to the word cutting process;
- a thesaurus creation step of creating a thesaurus from the text data that has been subjected to the parsing process;
- a thesaurus storage step of storing the thesaurus created in the thesaurus creation step;
- a word cutting and parsing step of performing the word cutting process and the parsing process again based on the thesaurus stored in the thesaurus storage step;
- a thesaurus sorting step of sorting the text data that has been subjected to the word cutting and parsing;
- a sorting result storage step of storing the sorting results of the thesaurus sorting step;
- an appearance frequency calculating step of calculating an appearance frequency for each thesaurus based on the sorting results stored in the sorting result storage step;
- an appearance frequency storing step of storing the result calculated in the appearance frequency calculating step;
- a correlation coefficient calculating step of calculating the correlation coefficients between the thesauruses;
- a correlation coefficient storage step of storing the correlation coefficients between the thesauruses calculated in the correlation coefficient calculating step; and
- a step of calculating the sum of correlation coefficients for each thesaurus.
- The information processing program includes the following steps:
- an input step of inputting text data;
- a text data storing step of storing the text data;
- a word cutting step of executing a word cutting process on the text data;
- a syntax analysis step of performing a syntax analysis process on the text data subjected to the word cutting process;
- a thesaurus creation step of creating a thesaurus from the text data subjected to the syntax analysis process;
- a thesaurus storage step of storing the thesaurus created in the thesaurus creation step; and
- a step of performing the word cutting process and the syntax analysis process again based on the thesaurus stored in the thesaurus storage step.
- FIG. 1 is a functional block diagram of an information processing apparatus according to an embodiment of the present invention.
- FIG. 2 is a flowchart for explaining the processing procedure of the present embodiment.
- FIG. 3 is a diagram showing an example of a thesaurus in which synonyms are aggregated.
- FIG. 4 is a diagram showing the results of sorting for each thesaurus.
- FIG. 5 is a diagram showing a correlation coefficient for each thesaurus.
- FIG. 6 is a diagram showing the appearance frequency of each thesaurus.
- FIG. 7 is a graph showing the relationship between the appearance frequency of each thesaurus and the correlation coefficient.
- FIG. 1 is a functional block diagram of an information processing apparatus according to an embodiment of the present invention.
- This embodiment is configured by a personal computer or the like. As shown in the figure, this embodiment is functionally composed of the following blocks. The processing of each block is actually executed by a predetermined application program, and each storage unit is realized by a hard disk (not shown).
- The input unit 1 inputs text data and stores it in the text storage unit 2.
- The word cutting unit 3 executes a word cutting process on the text data stored in the text storage unit 2.
- The syntax analysis unit 4 performs a syntax analysis (parsing) process on the text data that has been subjected to the word cutting process.
- The thesaurus creation unit 5 creates a thesaurus from the text data stored in the text storage unit 2.
- The thesaurus storage unit 6 stores the created thesaurus.
- The thesaurus sorting unit 7 sorts all samples for each thesaurus.
- The sorting result storage unit 8 stores the sorting result.
- The appearance frequency calculation unit 9 calculates the appearance frequency for each thesaurus based on the data stored in the sorting result storage unit 8.
- The appearance frequency storage unit 10 stores the result calculated by the appearance frequency calculation unit 9.
- The correlation coefficient calculation unit 11 calculates the correlation coefficients between the thesauruses.
- The correlation coefficient storage unit 12 stores the correlation coefficients calculated by the correlation coefficient calculation unit 11.
- The correlation coefficient total calculation unit 13 for each thesaurus sums the obtained correlation coefficients for each thesaurus.
- The correlation coefficient total storage unit 14 for each thesaurus stores the sum of correlation coefficients for each thesaurus calculated by the correlation coefficient total calculation unit 13 for each thesaurus.
- The graph creation display unit 15 creates and displays a graph based on the appearance frequency stored in the appearance frequency storage unit 10 and the correlation coefficient total for each thesaurus stored in the correlation coefficient total storage unit 14 for each thesaurus.
- In step S1, text data is input from the input unit 1 for each customer. For example, suppose that a customer inputs, "I ordered a part last week, but it has not been delivered yet." The input text data is stored in the text storage unit 2.
- In step S2, the word cutting unit 3 performs a word cutting process using a predetermined text mining tool (application software), dividing the text above into its individual words.
- In step S3, the syntax analysis unit 4 performs a syntax analysis process using the text mining tool.
- After the syntax analysis, the text above reads: "I ordered parts last week, but they have not been delivered yet."
- In step S4, the thesaurus creation unit 5 creates a thesaurus in which synonyms (keywords) are aggregated.
- For example, synonymous keywords are aggregated as follows.
- Keywords such as "one week" are aggregated in the thesaurus "last week".
- Keywords such as "order placed but" are aggregated in the thesaurus "order placed".
- Keywords such as "to be carried in" are aggregated in the thesaurus "delivery".
- Keywords such as "parts" are aggregated in the thesaurus "parts".
- Keywords such as "information" are aggregated in the thesaurus "communication".
- The created thesaurus is stored in the thesaurus storage unit 6.
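The aggregation of keywords into thesauruses can be pictured as a simple lookup table. The following is a minimal sketch, not the text mining tool referred to in the description: the names `THESAURUS`, `KEYWORD_TO_THESAURUS` and `aggregate` are hypothetical, and listing each thesaurus label among its own keywords is an assumption made for illustration.

```python
# Hypothetical keyword table built from the examples in the description.
# Listing each label among its own keywords is an assumption of this sketch.
THESAURUS = {
    "last week": ["last week", "one week"],
    "order placed": ["order placed", "order placed but"],
    "delivery": ["delivery", "to be carried in"],
    "parts": ["parts"],
    "communication": ["communication", "information"],
}

# Invert the table so that each keyword points at its thesaurus label.
KEYWORD_TO_THESAURUS = {
    keyword: label
    for label, keywords in THESAURUS.items()
    for keyword in keywords
}


def aggregate(keywords):
    """Map keywords to thesaurus labels, skipping unknown keywords and repeats."""
    labels = []
    for keyword in keywords:
        label = KEYWORD_TO_THESAURUS.get(keyword)
        if label is not None and label not in labels:
            labels.append(label)
    return labels
```

Calling `aggregate` on the keywords cut from one customer's text yields the list of thesauruses that text touches, which is the input the later sorting step needs.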
- In step S5, the word cutting unit 3 performs the word cutting process again based on the thesaurus that has just been created and stored in the thesaurus storage unit 6, and the syntax analysis unit 4 performs the syntax analysis process again.
- In step S6, the thesaurus sorting unit 7 sorts the text data from all the customers for each thesaurus. For example, for each customer, "1" is set for a thesaurus included in the text data of the customer's complaint or the like, and "0" is set for a thesaurus that is not included.
- The sorting result is stored in the sorting result storage unit 8.
- FIG. 4 shows the sorting results stored in the sorting result storage unit 8.
- "K-1", "K-2", "K-3", ... are identification numbers for identifying customers.
- It can be seen that customer K-1 has input text data containing keywords included in the thesauruses "order" and "parts".
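The sorting result of FIG. 4 is essentially a table of 0/1 flags with one row per customer and one column per thesaurus. A minimal sketch of building such a table follows; the function name `sort_by_thesaurus` and the sample data for customers other than K-1 are assumptions made for illustration.

```python
def sort_by_thesaurus(customer_labels, thesauruses):
    """Return {customer_id: [0 or 1 per thesaurus]}.

    A flag is 1 when the customer's text contains a keyword belonging to
    that thesaurus, and 0 otherwise.
    """
    return {
        customer: [1 if name in labels else 0 for name in thesauruses]
        for customer, labels in customer_labels.items()
    }


# Column order of the table (illustrative subset of thesauruses).
thesauruses = ["last week", "order", "delivery", "parts"]

# K-1 follows the description; K-2 and K-3 are made-up sample rows.
customer_labels = {
    "K-1": {"order", "parts"},
    "K-2": {"last week", "order"},
    "K-3": {"last week", "delivery"},
}

sorting_result = sort_by_thesaurus(customer_labels, thesauruses)
```

Keeping the column order in a separate list makes it easy to read one thesaurus column at a time in the later correlation and frequency steps.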
- In step S7, the correlation coefficient calculation unit 11 obtains the correlation coefficients between the thesauruses.
- The correlation coefficient between "order" and "delivery" is expressed by the following equation.
- FIG. 5 shows the correlation coefficients between the thesauruses.
- For example, the correlation coefficient between the thesaurus "last week" and the thesaurus "order" is 0.025.
- The correlation coefficient of a thesaurus with itself is 1.
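The equation referred to above does not survive in this text. One common choice for 0/1 columns such as those in FIG. 4 is the Pearson correlation coefficient, which also gives 1 for a column compared with itself. The sketch below assumes that choice; it is not confirmed to be the formula used in the patent, and the function name `pearson` is hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric columns.

    Assumption: this stands in for the equation missing from the text.
    Returns 0.0 when either column is constant, because the coefficient
    is undefined in that case (a convention chosen for this sketch).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance_x = sum((x - mean_x) ** 2 for x in xs)
    variance_y = sum((y - mean_y) ** 2 for y in ys)
    if variance_x == 0 or variance_y == 0:
        return 0.0
    return covariance / sqrt(variance_x * variance_y)
```

Each argument would be one thesaurus column of the sorting result, that is, the 0/1 flags of all customers for that thesaurus.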
- In step S8, the correlation coefficient total calculation unit 13 for each thesaurus adds up, for each thesaurus, the correlation coefficients obtained in step S7 and stored in the correlation coefficient storage unit 12.
- The correlation coefficient of 1 between a thesaurus and itself is excluded.
- In this way, the sum of correlation coefficients is calculated for each thesaurus.
- The obtained sum of the correlation coefficients for each thesaurus is stored in the correlation coefficient total storage unit 14 for each thesaurus.
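Summing the correlation coefficients for each thesaurus amounts to adding up one row of the FIG. 5 matrix while skipping the diagonal entry. A minimal sketch follows; the function name `correlation_totals` is hypothetical, and apart from the 0.025 quoted in the description the matrix values are made up for illustration.

```python
def correlation_totals(matrix, labels):
    """Sum each row of a square correlation matrix, excluding the diagonal.

    The diagonal holds the coefficient of a thesaurus with itself (1),
    which the description says is excluded from the sum.
    """
    return {
        label: sum(value for j, value in enumerate(matrix[i]) if j != i)
        for i, label in enumerate(labels)
    }


labels = ["last week", "order", "delivery"]
# Only 0.025 (last week vs. order) comes from the text; the rest is sample data.
matrix = [
    [1.0, 0.025, 0.1],
    [0.025, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
totals = correlation_totals(matrix, labels)
```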
- In step S9, the appearance frequency calculation unit 9 obtains the appearance frequency of each thesaurus. That is, as shown in FIG. 6, the appearance frequency of each thesaurus is obtained based on the sorting result of each thesaurus (FIG. 4).
- For example, it can be seen that the thesaurus "last week" is included in the text data of the complaints of customers K-2, K-3, ....
- The number of appearances A of the thesaurus "last week" is calculated.
- Similarly, the number of occurrences of the thesaurus "order" is B.
- The number of occurrences of the thesaurus "delivery" is C.
- The number of occurrences of the thesaurus "parts" is D.
- The sum of the numbers of appearances of all thesauruses, Σ(A + B + C + D + ...), is calculated, and the appearance frequency of each thesaurus is expressed as a percentage.
- For example, the appearance frequency of the thesaurus "last week" is (A / Σ(A + B + C + D + ...)) × 100 (%).
- The calculated appearance frequency for each thesaurus is stored in the appearance frequency storage unit 10.
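The percentage calculation can be sketched directly from the 0/1 sorting result: count the 1s in each thesaurus column, then divide by the sum of all counts. The function name `appearance_frequencies` and the sample rows are assumptions made for illustration.

```python
def appearance_frequencies(sorting_result, thesauruses):
    """Return {thesaurus: percentage of all thesaurus appearances}.

    Each count (A, B, C, D, ... in the description) is the number of
    customers flagged 1 for that thesaurus; the denominator is the sum
    of all counts.
    """
    counts = [
        sum(row[i] for row in sorting_result.values())
        for i in range(len(thesauruses))
    ]
    total = sum(counts)
    if total == 0:
        # No thesaurus appears anywhere; report 0 % rather than divide by zero.
        return {name: 0.0 for name in thesauruses}
    return {name: count / total * 100 for name, count in zip(thesauruses, counts)}


# Made-up sample rows in the format of the sorting result.
sample_thesauruses = ["last week", "order", "delivery", "parts"]
sample_result = {
    "K-1": [0, 1, 0, 1],
    "K-2": [1, 1, 0, 0],
    "K-3": [1, 0, 1, 0],
}
frequencies = appearance_frequencies(sample_result, sample_thesauruses)
```

Because the denominator is the total over all thesauruses rather than the number of customers, the percentages always add up to 100.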
- In step S10, the graph creation display unit 15 creates a graph in which the appearance frequency (%) of each thesaurus is plotted on the x-axis and the sum of correlation coefficients for each thesaurus is plotted on the y-axis. FIG. 7 shows the created graph.
- In the third group, the connection with other thesauruses is not so strong, but the thesaurus appears frequently. That is, thesauruses that are mentioned frequently and cannot be overlooked appear in the third group.
- A standard is set at a certain level; a thesaurus exceeding the standard level is judged to be strongly linked, and one falling below it is judged to be weakly linked.
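Reading the graph against a standard level can be sketched as a simple threshold test on each axis. In the sketch below the function name `classify`, both threshold parameters, and the use of a separate threshold for appearance frequency are assumptions; the description only states that a standard level separates strong from weak connection, and it does not say how a value exactly at the level is treated.

```python
def classify(frequency, correlation_total, frequency_level, correlation_level):
    """Label one thesaurus by its position on the graph.

    Values above a level count as "strong" or "frequent"; values at or
    below it count as "weak" or "infrequent" (a convention chosen for
    this sketch, since the text leaves the boundary case open).
    """
    linkage = "strong" if correlation_total > correlation_level else "weak"
    appearance = "frequent" if frequency > frequency_level else "infrequent"
    return linkage, appearance


# Hypothetical values: a thesaurus that appears often but is weakly linked,
# matching the description of the third group.
third_group_example = classify(
    frequency=30.0,
    correlation_total=0.1,
    frequency_level=10.0,
    correlation_level=0.5,
)
```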
- An input step of inputting text data and a text data storing step of storing the text data are provided.
- A word cutting process and a parsing process are performed again on the basis of the thesaurus created in the thesaurus creation step and stored in the thesaurus storage step.
- The correlation coefficient total storage step for each thesaurus stores the total correlation coefficient for each thesaurus, and a graph is created and displayed based on the appearance frequency stored in the appearance frequency storage step and the correlation coefficient total for each thesaurus stored in the correlation coefficient total storage step for each thesaurus. Therefore, the features of the text data can be detected based on the correlation between the thesauruses created from keywords extracted from the text data and on their appearance frequency, and a latent meaning hidden in the text data can be inferred.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
BRPI0317260-0A BR0317260A (pt) | 2002-12-12 | 2003-12-11 | Information processing apparatus, information processing method, and information processing program |
EP03778809A EP1574968A4 (en) | 2002-12-12 | 2003-12-11 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2002-360352 | 2002-12-12 | ||
JP2002360352A JP3600611B2 (ja) | 2002-12-12 | 2002-12-12 | Information processing apparatus, information processing method, and information processing program |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004053735A1 true WO2004053735A1 (ja) | 2004-06-24 |
Family
ID=32500983
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2003/015865 WO2004053735A1 (ja) | 2002-12-12 | 2003-12-11 | Information processing apparatus, information processing method, and information processing program |
Country Status (6)
Country | Link |
---|---|
US (1) | US7398202B2 (ja) |
EP (1) | EP1574968A4 (ja) |
JP (1) | JP3600611B2 (ja) |
CN (1) | CN1723457A (ja) |
BR (1) | BR0317260A (ja) |
WO (1) | WO2004053735A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100399334C (zh) * | 2004-09-24 | 2008-07-02 | 株式会社东芝 | Apparatus and method for searching structured documents |
CN113204620A (zh) * | 2021-05-12 | 2021-08-03 | 首都师范大学 | Method, ***, device and computer storage medium for automatic thesaurus construction |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9311676B2 (en) * | 2003-09-04 | 2016-04-12 | Hartford Fire Insurance Company | Systems and methods for analyzing sensor data |
US7711584B2 (en) | 2003-09-04 | 2010-05-04 | Hartford Fire Insurance Company | System for reducing the risk associated with an insured building structure through the incorporation of selected technologies |
US20070219987A1 (en) * | 2005-10-14 | 2007-09-20 | Leviathan Entertainment, Llc | Self Teaching Thesaurus |
US20080077451A1 (en) * | 2006-09-22 | 2008-03-27 | Hartford Fire Insurance Company | System for synergistic data processing |
US8359209B2 (en) | 2006-12-19 | 2013-01-22 | Hartford Fire Insurance Company | System and method for predicting and responding to likelihood of volatility |
WO2008079325A1 (en) * | 2006-12-22 | 2008-07-03 | Hartford Fire Insurance Company | System and method for utilizing interrelated computerized predictive models |
US20090043615A1 (en) * | 2007-08-07 | 2009-02-12 | Hartford Fire Insurance Company | Systems and methods for predictive data analysis |
JP5309537B2 (ja) * | 2007-11-19 | 2013-10-09 | 富士ゼロックス株式会社 | Graph display device and program |
US9665910B2 (en) | 2008-02-20 | 2017-05-30 | Hartford Fire Insurance Company | System and method for providing customized safety feedback |
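JP5526396B2 (ja) * | 2008-03-11 | 2014-06-18 | クラリオン株式会社 | Information retrieval device, information retrieval system and information retrieval method |
JP2009277183A (ja) * | 2008-05-19 | 2009-11-26 | Hitachi Ltd | Information identification device and information identification system |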
US8612202B2 (en) * | 2008-09-25 | 2013-12-17 | Nec Corporation | Correlation of linguistic expressions in electronic documents with time information |
WO2011072125A2 (en) * | 2009-12-09 | 2011-06-16 | Zemoga, Inc. | Method and apparatus for real time semantic filtering of posts to an internet social network |
US8355934B2 (en) * | 2010-01-25 | 2013-01-15 | Hartford Fire Insurance Company | Systems and methods for prospecting business insurance customers |
US9460471B2 (en) | 2010-07-16 | 2016-10-04 | Hartford Fire Insurance Company | System and method for an automated validation system |
US9275015B2 (en) * | 2011-12-05 | 2016-03-01 | Nexalogy Environics, Inc. | System and method for performing analysis on information, such as social media |
US10394871B2 (en) | 2016-10-18 | 2019-08-27 | Hartford Fire Insurance Company | System to predict future performance characteristic for an electronic record |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
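JP2000172691A (ja) * | 1998-12-03 | 2000-06-23 | Mitsubishi Electric Corp | Information mining method, information mining device, and computer-readable recording medium recording an information mining program |
JP2000242662A (ja) * | 1999-02-23 | 2000-09-08 | Mitsubishi Electric Corp | Database creation device and database search device |
JP2001101194A (ja) * | 1999-09-27 | 2001-04-13 | Mitsubishi Electric Corp | Text mining method, text mining device, and recording medium on which a text mining program is recorded |
JP2002117035A (ja) * | 2000-10-10 | 2002-04-19 | Citation Japan:Kk | Analysis device, analysis method and storage medium using free words |
JP2002183175A (ja) * | 2000-12-08 | 2002-06-28 | Hitachi Ltd | Text mining method |
JP2002230006A (ja) * | 2000-11-28 | 2002-08-16 | Sadanobu Takane | Method for analyzing free-description answers, method for extracting keywords from free-description documents, and method for supporting analysis of free-description documents |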
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5099426A (en) * | 1989-01-19 | 1992-03-24 | International Business Machines Corporation | Method for use of morphological information to cross reference keywords used for information retrieval |
US5056021A (en) * | 1989-06-08 | 1991-10-08 | Carolyn Ausborn | Method and apparatus for abstracting concepts from natural language |
JP2527817B2 (ja) * | 1989-07-14 | 1996-08-28 | シャープ株式会社 | 主題連想装置および単語連想装置 |
US5384703A (en) * | 1993-07-02 | 1995-01-24 | Xerox Corporation | Method and apparatus for summarizing documents according to theme |
US5675819A (en) * | 1994-06-16 | 1997-10-07 | Xerox Corporation | Document information retrieval using global word co-occurrence patterns |
WO1997008604A2 (en) * | 1995-08-16 | 1997-03-06 | Syracuse University | Multilingual document retrieval system and method using semantic vector matching |
US6845354B1 (en) * | 1999-09-09 | 2005-01-18 | Institute For Information Industry | Information retrieval system with a neuro-fuzzy structure |
US20020026435A1 (en) * | 2000-08-26 | 2002-02-28 | Wyss Felix Immanuel | Knowledge-base system and method |
US7813915B2 (en) * | 2000-09-25 | 2010-10-12 | Fujitsu Limited | Apparatus for reading a plurality of documents and a method thereof |
US7346491B2 (en) * | 2001-01-04 | 2008-03-18 | Agency For Science, Technology And Research | Method of text similarity measurement |
EP1239459A1 (en) * | 2001-03-07 | 2002-09-11 | Sony International (Europe) GmbH | Adaptation of a speech recognizer to a non native speaker pronunciation |
US7031910B2 (en) * | 2001-10-16 | 2006-04-18 | Xerox Corporation | Method and system for encoding and accessing linguistic frequency data |
EP1473639A1 (en) * | 2002-02-04 | 2004-11-03 | Celestar Lexico-Sciences, Inc. | Document knowledge management apparatus and method |
-
2002
- 2002-12-12 JP JP2002360352A patent/JP3600611B2/ja not_active Expired - Lifetime
-
2003
- 2003-12-09 US US10/730,287 patent/US7398202B2/en not_active Expired - Fee Related
- 2003-12-11 BR BRPI0317260-0A patent/BR0317260A/pt unknown
- 2003-12-11 CN CNA2003801054367A patent/CN1723457A/zh active Pending
- 2003-12-11 EP EP03778809A patent/EP1574968A4/en not_active Withdrawn
- 2003-12-11 WO PCT/JP2003/015865 patent/WO2004053735A1/ja active Application Filing
Non-Patent Citations (3)
Title |
---|
MIMURO K.: "Kigyo kachi o sozo suru VBIT", CHITEKI SISAN SOZO, NOMURA RESEARCH INSTITUTE, LTD., vol. 10, no. 8, 20 August 2002 (2002-08-20), pages 44 - 53, XP002980154 * |
MIMURO K.: "'Kokyaku no koe' o shisanka suru text mining", CHITEKI SHISAN SOZO, NOMURA RESEARCH INSTITUTE, LTD., vol. 9, no. 6, 1 June 2001 (2001-06-01), pages 74 - 77, XP002980155 * |
See also references of EP1574968A4 * |
Also Published As
Publication number | Publication date |
---|---|
US20050060141A1 (en) | 2005-03-17 |
US7398202B2 (en) | 2008-07-08 |
JP2004192398A (ja) | 2004-07-08 |
EP1574968A1 (en) | 2005-09-14 |
JP3600611B2 (ja) | 2004-12-15 |
EP1574968A4 (en) | 2010-03-17 |
CN1723457A (zh) | 2006-01-18 |
BR0317260A (pt) | 2006-04-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2004053735A1 (ja) | Information processing apparatus, information processing method, and information processing program | |
CN108647205B (zh) | Fine-grained sentiment analysis model construction method, device and readable storage medium | |
US8176050B2 (en) | Method and apparatus of supporting creation of classification rules | |
CN110472027B (zh) | Intent recognition method, device and computer-readable storage medium | |
EP1580667B1 (en) | Representation of a deleted interpolation N-gram language model in ARPA standard format | |
EP2378475A1 (en) | Method for calculating semantic similarities between messages and conversations based on enhanced entity extraction | |
US7430552B2 (en) | Computer based system and method of determining a satisfaction index of a text | |
US10586174B2 (en) | Methods and systems for finding and ranking entities in a domain specific system | |
CN106383836B (zh) | Attributing actionable attributes to data describing personal identity | |
EP2378476A1 (en) | Method for calculating entity similarities | |
CN112199588A (zh) | Public opinion text screening method and device | |
US20090106023A1 (en) | Speech recognition word dictionary/language model making system, method, and program, and speech recognition system | |
CN109299235B (zh) | Knowledge base search method, device and computer-readable storage medium | |
JP5429377B2 (ja) | Method for displaying candidates in character input | |
CN110263121B (zh) | Table data processing method, device, electronic device and computer-readable storage medium | |
JP5772599B2 (ja) | Text mining system, text mining method and recording medium | |
CN113177061B (zh) | Search method, device and electronic equipment | |
Wardani et al. | Sentiment Analysis on Beauty Product Review Using Modified Balanced Random Forest Method and Chi-Square | |
CN115438662A (zh) | Big-data-based weight adaptation method and big data *** | |
JP4828716B2 (ja) | Data-addition-type analysis device and program | |
CN110837843A (zh) | Information classification method, device, computer equipment and storage medium | |
CN111126033A (zh) | Article response prediction device and method | |
US20080172376A1 (en) | Indexing and ranking processes for directory assistance services | |
JP2023073641A (ja) | Case management device, case management method and program | |
JP2000200197A (ja) | Knowledge accumulation/selection method, knowledge accumulation/selection device, and recording medium recording a knowledge accumulation/selection program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): BR CN |
AL | Designated countries for regional patents | Kind code of ref document: A1 |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
WWE | Wipo information: entry into national phase | Ref document number: 2003778809 Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 20038A54367 Country of ref document: CN |
WWP | Wipo information: published in national office | Ref document number: 2003778809 Country of ref document: EP |
ENP | Entry into the national phase | Ref document number: PI0317260 Country of ref document: BR |