BR112012011091A2 - Method and apparatus for word quality extraction and evaluation - Google Patents

Method and apparatus for word quality extraction and evaluation

Info

Publication number
BR112012011091A2
Authority
BR
Brazil
Prior art keywords
word
quality assessment
extraction
importance
word quality
Prior art date
Application number
BR112012011091A
Other languages
English (en)
Other versions
BR112012011091B1 (pt)
Inventor
Gaolin Fang
Huaijun Liu
Zhongbo Jiang
Original Assignee
Tencent Tech Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Tech Shenzhen Co Ltd
Publication of BR112012011091A2
Publication of BR112012011091B1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/313 Selection or weighting of terms for indexing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Machine Translation (AREA)

Abstract

Method and apparatus for word quality extraction and evaluation. The present invention relates to a method and an apparatus for word quality extraction and evaluation. The method includes: calculating a document frequency (DF) of a word in mass categorized data; performing multiple single-aspect evaluations of the word according to the DF of the word; and performing a multi-aspect evaluation of the word according to the multiple single-aspect evaluations to obtain an importance weight of the word. According to the solution of the present invention, the importance of a word in mass categorized data can be evaluated, and high-quality words can be obtained through an integrated evaluation.
BR112012011091-8A 2009-11-10 2010-06-28 Method and apparatus for word quality extraction and evaluation BR112012011091B1 (pt)
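
The three steps named in the abstract (per-category DF, several single-aspect evaluations derived from the DF, and a combined importance weight) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not name the single-aspect metrics or the combination rule, so the two aspects used here (a normalized inverse document frequency and a category-concentration score based on entropy), the linear combination, the parameter `alpha`, and all function names are hypothetical choices.

```python
import math
from collections import Counter, defaultdict


def document_frequency(docs):
    """Count, per word and category, how many documents contain the word.

    docs is an iterable of (category, tokens) pairs.
    """
    df = defaultdict(Counter)  # word -> Counter(category -> document count)
    sizes = Counter()          # category -> number of documents
    for category, tokens in docs:
        sizes[category] += 1
        for word in set(tokens):  # count each word once per document
            df[word][category] += 1
    return df, sizes


def idf_aspect(word_df, sizes):
    """Assumed aspect 1: inverse document frequency scaled to [0, 1]."""
    total_docs = sum(sizes.values())
    total_df = sum(word_df.values())
    if total_df == 0 or total_docs < 2:
        return 0.0
    return math.log(total_docs / total_df) / math.log(total_docs)


def concentration_aspect(word_df, sizes):
    """Assumed aspect 2: 1 minus the normalized entropy of the word's DF
    distribution over categories (1.0 means confined to one category)."""
    total_df = sum(word_df.values())
    if total_df == 0 or len(sizes) < 2:
        return 0.0
    entropy = -sum(
        (n / total_df) * math.log(n / total_df)
        for n in word_df.values()
        if n > 0
    )
    return 1.0 - entropy / math.log(len(sizes))


def importance_weight(word_df, sizes, alpha=0.5):
    """Assumed multi-aspect step: a weighted average of the two aspects."""
    return (alpha * idf_aspect(word_df, sizes)
            + (1 - alpha) * concentration_aspect(word_df, sizes))


if __name__ == "__main__":
    # Toy categorized corpus, invented for illustration only.
    docs = [
        ("sports", ["goal", "the", "match"]),
        ("sports", ["the", "goal"]),
        ("finance", ["the", "stock"]),
        ("finance", ["stock", "the", "market"]),
    ]
    df, sizes = document_frequency(docs)
    for word in sorted(df, key=lambda w: -importance_weight(df[w], sizes)):
        print(word, round(importance_weight(df[word], sizes), 3))
```

In this toy corpus, a word that appears in every document of every category ("the") receives a lower weight than a word confined to a single category ("goal"), which is the kind of separation an integrated evaluation is meant to produce.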

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN200910237185.7A CN102054006B (zh) 2009-11-10 2009-11-10 Method and apparatus for extracting effective information from massive data
CN200910237185.7 2009-11-10
PCT/CN2010/074597 WO2011057497A1 (zh) 2009-11-10 2010-06-28 Method and apparatus for word quality mining and evaluation

Publications (2)

Publication Number Publication Date
BR112012011091A2 BR112012011091A2 (pt) 2016-07-05
BR112012011091B1 BR112012011091B1 (pt) 2020-10-13

Family

ID=43958340

Family Applications (1)

Application Number Title Priority Date Filing Date
BR112012011091-8A BR112012011091B1 (pt) 2009-11-10 2010-06-28 Method and apparatus for word quality extraction and evaluation

Country Status (5)

Country Link
US (1) US8645418B2 (pt)
CN (1) CN102054006B (pt)
BR (1) BR112012011091B1 (pt)
RU (1) RU2517368C2 (pt)
WO (1) WO2011057497A1 (pt)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
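CN103186612B (zh) * 2011-12-30 2016-04-27 ***通信集团公司 Method for word classification, *** and implementation method
CN103885976B (zh) * 2012-12-21 2017-08-04 腾讯科技(深圳)有限公司 Method for configuring recommendation information in a web page, and index server
CN103309984B (zh) * 2013-06-17 2016-12-28 腾讯科技(深圳)有限公司 Data processing method and apparatus
US9959364B2 (en) * 2014-05-22 2018-05-01 Oath Inc. Content recommendations
CN105183784B (zh) * 2015-08-14 2020-04-28 天津大学 Content-based spam web page detection method and detection apparatus thereof
CN105975518B (zh) * 2016-04-28 2019-01-29 吴国华 Text classification *** and method with expected cross entropy feature selection based on information entropy
US11347777B2 (en) * 2016-05-12 2022-05-31 International Business Machines Corporation Identifying key words within a plurality of documents
CN107463548B (zh) * 2016-06-02 2021-04-27 阿里巴巴集团控股有限公司 Phrase mining method and apparatus
CN108073568B (zh) 2016-11-10 2020-09-11 腾讯科技(深圳)有限公司 Keyword extraction method and apparatus
CN107066441A (zh) * 2016-12-09 2017-08-18 北京锐安科技有限公司 Method and apparatus for calculating part-of-speech relevance
CN107169523B (zh) * 2017-05-27 2020-07-21 鹏元征信有限公司 Method for automatically determining the industry category of an institution, storage device, and terminal
CN107562938B (zh) * 2017-09-21 2020-09-08 重庆工商大学 Intelligent court trial method
CN108269125B (zh) * 2018-01-15 2020-08-21 口碑(上海)信息技术有限公司 Review information quality assessment method and ***, review information processing method and ***
CN108664470B (zh) * 2018-05-04 2022-06-17 武汉斗鱼网络科技有限公司 Method for measuring the information content of a video title, readable storage medium, and electronic device
CN109062912B (zh) * 2018-08-08 2023-07-28 科大讯飞股份有限公司 Translation quality evaluation method and apparatus
CN109255028B (zh) * 2018-08-28 2021-08-13 西安交通大学 Comprehensive teaching quality evaluation method based on the credibility of teaching evaluation data
CN109062905B (zh) * 2018-09-04 2022-06-24 武汉斗鱼网络科技有限公司 Method, apparatus, device, and medium for evaluating the value of bullet-screen text
CN110377709B (zh) * 2019-06-03 2021-10-08 广东幽澜机器人科技有限公司 Method and apparatus for reducing the operation and maintenance complexity of robot customer service
CN111079426B (zh) * 2019-12-20 2021-06-15 中南大学 Method and apparatus for obtaining hierarchical weights of terms in domain documents
CN111090997B (zh) * 2019-12-20 2021-07-20 中南大学 Method and apparatus for ranking feature terms of geological documents based on hierarchical terms
CN112561500B (zh) * 2021-02-25 2021-05-25 深圳平安智汇企业信息管理有限公司 Method, apparatus, device, and medium for generating compensation data based on user data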

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6473753B1 (en) * 1998-10-09 2002-10-29 Microsoft Corporation Method and system for calculating term-document importance
US7024408B2 (en) * 2002-07-03 2006-04-04 Word Data Corp. Text-classification code, system and method
JP4233836B2 (ja) 2002-10-16 2009-03-04 インターナショナル・ビジネス・マシーンズ・コーポレーション Automatic document classification system, unnecessary word determination method, automatic document classification method, and program
CN1438592A (zh) * 2003-03-21 2003-08-27 清华大学 Automatic text classification method
RU2254610C2 (ru) * 2003-09-04 2005-06-20 Государственное научное учреждение научно-исследовательский институт "СПЕЦВУЗАВТОМАТИКА" Method for automatic classification of documents
US20090119281A1 (en) * 2007-11-03 2009-05-07 Andrew Chien-Chung Wang Granular knowledge based search engine
US8577884B2 (en) * 2008-05-13 2013-11-05 The Boeing Company Automated analysis and summarization of comments in survey response data
CN100583101C (zh) * 2008-06-12 2010-01-20 昆明理工大学 Text classification feature selection and weight calculation method based on domain knowledge

Also Published As

Publication number Publication date
US8645418B2 (en) 2014-02-04
US20120221602A1 (en) 2012-08-30
RU2012123216A (ru) 2013-12-20
RU2517368C2 (ru) 2014-05-27
BR112012011091B1 (pt) 2020-10-13
CN102054006A (zh) 2011-05-11
WO2011057497A1 (zh) 2011-05-19
CN102054006B (zh) 2015-01-14

Similar Documents

Publication Publication Date Title
BR112012011091A2 (pt) Method and apparatus for word quality extraction and evaluation
MX340339B (es) Calibration transfer methods for a test instrument.
EA201270020A1 (ru) Determining the risk of developing atherosclerotic heart disease
BRPI0923582B8 (pt) Method and device for analyzing a cause of springback in a formed product
TW200730825A (en) Method and apparatus for correlating levels of biomarker products with disease
BR112012002436A2 (pt) Process for evaluating collision performance of a vehicle member and member collision test device used for the same
BR112012019347B8 (pt) Method, user equipment, and system for measuring an aggregated carrier cell
BR112013011083A2 (pt) Process and system for automated personal training
WO2013156746A3 (en) Testing system
GB2474385A (en) Monte carlo method for laplace inversion of NMR data
WO2015056091A3 (en) Assessment system
BR112013021590A2 (pt) Capacitance detection in electrochemical assay with optimized response
BR112012022239A2 (pt) Apparatuses, methods, and systems for econometric investment strategy analysis
GB201207297D0 (en) Analytical methods and arrays for use in the same
PE20110653A1 (es) Apparatus and method for the objective detection of hearing disorders
WO2012056236A3 (en) Analytical methods and arrays for use in the identification of agents inducing sensitization in human skin
Kamin et al. The Subjective Technology Adaptivity Inventory (STAI): A motivational measure of technology usage in old age.
BR112015011951A2 (pt) Methods and apparatuses for acquiring compensated signals for determining formation parameters
IN2015DN01353A (pt)
BR112015014232A2 (pt) Acute kidney injury
MX2016001719A (es) Methods and kits for predicting the risk of having a cardiovascular disease or event.
WO2012148188A3 (ko) Use of image analysis techniques on cervical elasticity images to evaluate the cervical condition of a pregnant woman
EP2685402A3 (en) Cell analyzing apparatus and cell analyzing method
IN2014CN03455A (pt)
EP2529181A4 (en) METHOD AND DEVICE FOR PROPERTYING AN OBJECT, MEDIUM OR OPTICAL PATH USING RAISE LIGHT

Legal Events

Date Code Title Description
B06F Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]
B06U Preliminary requirement: requests with searches performed by other patent offices: procedure suspended [chapter 6.21 patent gazette]
B09A Decision: intention to grant [chapter 9.1 patent gazette]
B15K Others concerning applications: alteration of classification

Free format text: THE PREVIOUS CLASSIFICATION WAS: G06F 17/30

Ipc: G06F 16/35 (2019.01), G06F 16/31 (2019.01)

B16A Patent or certificate of addition of invention granted

Free format text: TERM OF VALIDITY: 10 (TEN) YEARS COUNTED FROM 13/10/2020, SUBJECT TO THE LEGAL CONDITIONS.