CN101308512B - Web-page-based method and device for extracting mutual-translation pairs - Google Patents
- Publication number
- CN101308512B
- Authority
- CN
- China
- Prior art keywords
- text
- bilingual
- tuples
- unit
- character
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Description
Merged bilingual tuple | Frequency |
---|---|
(“木马”,“trojan horse”) | 4 |
(“特洛伊木马”,“trojan horse”) | 4 |
(“叫做特洛伊木马”,“trojan horse”) | 1 |
(“全称叫做特洛伊木马”,“trojan horse”) | 1 |
(“的全称叫做特洛伊木马”,“trojan horse”) | 1 |
(“木马的全称叫做特洛伊木马”,“trojan horse”) | 1 |
(“全称特洛伊木马”,“trojan horse”) | 1 |
(“的特洛伊木马”,“trojan horse”) | 1 |
(“点的特洛伊木马”,“trojan horse”) | 1 |
(“好点的特洛伊木马”,“trojan horse”) | 1 |
(“比较好点的特洛伊木马”,“trojan horse”) | 1 |
(“个比较好点的特洛伊木马”,“trojan horse”) | 1 |
(“介绍个比较好点的特洛伊木马”,“trojan horse”) | 1 |
(“能介绍个比较好点的特洛伊木马”,“trojan horse”) | 1 |
(“谁能介绍个比较好点的特洛伊木马”,“trojan horse”) | 1 |
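The frequency table can be read as the result of merging identical candidate tuples and counting how often each occurs. The sketch below illustrates that merge-and-count step only. The candidate generation (pairing every suffix of a word-segmented Chinese span with the English term), the function names `suffix_candidates` and `merge_tuples`, and the four input snippets are all assumptions made for illustration; the page text does not state how the candidates were produced or which snippets they came from.

```python
from collections import Counter

def suffix_candidates(segments, english):
    """Pair every suffix of the segmented Chinese span with the English term.

    Assumption: candidates are word-level suffixes of the text that
    immediately precedes the English term.
    """
    return [("".join(segments[i:]), english) for i in range(len(segments))]

def merge_tuples(snippets):
    """Merge identical (Chinese, English) tuples and count their occurrences."""
    counts = Counter()
    for segments, english in snippets:
        counts.update(suffix_candidates(segments, english))
    return counts

# Hypothetical pre-segmented snippets, chosen only so that the merged
# counts line up with the table; they are not taken from the patent.
snippets = [
    (["木马", "的", "全称", "叫做", "特洛伊", "木马"], "trojan horse"),
    (["全称", "特洛伊", "木马"], "trojan horse"),
    (["谁", "能", "介绍", "个", "比较", "好", "点", "的", "特洛伊", "木马"], "trojan horse"),
    (["特洛伊", "木马"], "trojan horse"),
]

counts = merge_tuples(snippets)
for (chinese, english), freq in counts.most_common():
    print(f"({chinese}, {english}) | {freq}")
```

Under these assumptions, short candidates such as “木马” and “特洛伊木马” accumulate counts across snippets, while longer candidates that carry surrounding context occur only once, which is the pattern the table shows.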
Bilingual tuple | Score |
---|---|
(“木马”,“trojan horse”) | 4.39 |
(“特洛伊木马”,“trojan horse”) | 7.17 |
Claims (15)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810126468XA CN101308512B (zh) | 2008-06-25 | 2008-07-03 | Web-page-based method and device for extracting mutual-translation pairs |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200810125774.1 | 2008-06-25 | ||
CN200810125774 | 2008-06-25 | ||
CN200810126468XA CN101308512B (zh) | 2008-06-25 | 2008-07-03 | Web-page-based method and device for extracting mutual-translation pairs |
Publications (2)
Publication Number | Publication Date |
---|---|
CN101308512A (zh) | 2008-11-19 |
CN101308512B (zh) | 2011-09-14 |
Family
ID=40124967
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200810126468XA Active CN101308512B (zh) | Web-page-based method and device for extracting mutual-translation pairs | 2008-06-25 | 2008-07-03 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101308512B (zh) |
Families Citing this family (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2011035455A1 (en) * | 2009-09-25 | 2011-03-31 | Yahoo! Inc. | Acquisition of out-of-vocabulary translations by dynamically learning extraction rules |
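CN102043808B (zh) * | 2009-10-14 | 2014-06-18 | 腾讯科技(深圳)有限公司 | Method and device for extracting bilingual entries using web page structure |
CN103186645A (zh) * | 2011-12-31 | 2013-07-03 | 北京金山软件有限公司 | Network-based method and device for acquiring specific resources |
CN102902667A (zh) * | 2012-10-12 | 2013-01-30 | 曾立人 | Method for displaying translation memory matching results |
CN103970732B (zh) * | 2014-05-22 | 2017-05-10 | 北京百度网讯科技有限公司 | Method and device for mining translations of new words |
CN105653516B (zh) * | 2015-12-30 | 2018-08-10 | 语联网(武汉)信息技术有限公司 | Method and device for aligning parallel corpora |
CN106055543B (zh) * | 2016-05-23 | 2019-04-09 | 南京大学 | Spark-based training method for large-scale phrase translation models |
CN109977424B (zh) * | 2017-12-27 | 2023-08-08 | 北京搜狗科技发展有限公司 | Method and device for training a machine translation model |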
- 2008-07-03: CN application CN200810126468XA, patent CN101308512B (zh), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN101308512A (zh) | 2008-11-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
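TWI636452B (zh) | Speech recognition method and system | |
CN107797991B (zh) | Knowledge graph expansion method and *** based on dependency syntax trees | |
CN101308512B (zh) | Web-page-based method and device for extracting mutual-translation pairs | |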
Tran et al. | JAIST: Combining multiple features for answer selection in community question answering | |
US8612206B2 (en) | Transliterating semitic languages including diacritics | |
CN109635297B (zh) | Entity disambiguation method and device, computer device, and computer storage medium | |
Kothari et al. | SMS based interface for FAQ retrieval | |
JP5379138B2 (ja) | Creation of a domain dictionary | |
JP5710581B2 (ja) | Question answering device, method, and program | |
Bellare et al. | Learning extractors from unlabeled text using relevant databases | |
Jabbar et al. | An improved Urdu stemming algorithm for text mining based on multi-step hybrid approach | |
Alshalabi et al. | Arabic light-based stemmer using new rules | |
Jayan et al. | A hybrid statistical approach for named entity recognition for malayalam language | |
Gadri et al. | Information retrieval: A new multilingual stemmer based on a statistical approach | |
Sahu et al. | Twitter sentiment analysis--a more enhanced way of classification and scoring | |
Kilgarriff et al. | Longest–commonest Match | |
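CN112597768B (zh) | Text review method and device, electronic device, storage medium, and program product | |
CN111259661B (zh) | Method for extracting new sentiment words based on product reviews | |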
Sharma et al. | Word prediction system for text entry in Hindi | |
Naemi et al. | Informal-to-formal word conversion for persian language using natural language processing techniques | |
Alzand et al. | Diacritics of Arabic Natural Language Processing (ANLP) and its quality assessment | |
JP4088171B2 (ja) | Text analysis device, method, program, and recording medium storing the program | |
Plu et al. | Revealing entities from textual documents using a hybrid approach | |
Baishya et al. | Present state and future scope of Assamese text processing | |
Lu et al. | Language model for Mongolian polyphone proofreading |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
ASS | Succession or assignment of patent right | Owner name: BEIJING KINGSOFT OFFICE SOFTWARE CO., LTD.; former owners: BEIJING JINSHAN SOFTWARE CO., LTD. and BEIJING JINSHAN DIGITAL ENTERTAINMENT SCIENCE AND TECHNOLOGY CO., LTD.; effective date: 20140312 |
C41 | Transfer of patent application or patent right or utility model | ||
COR | Change of bibliographic data | Correct: address; from: 100083 Haidian, Beijing; to: 100085 Haidian, Beijing |
TR01 | Transfer of patent right | Effective date of registration: 20140312; address after: Kingsoft No. 33 building, 100085 Beijing city Haidian District Xiaoying Road; patentee after: Beijing Kingsoft WPS Office Co., Ltd.; address before: 100083, Beijing, Haidian District No. 238 North Fourth Ring Road, No. 20, Bai Yan building; patentees before: Beijing Jinshan Software Co., Ltd. and Beijing Jinshan Digital Entertainment Science and Technology Co., Ltd. |
C56 | Change in the name or address of the patentee | ||
CP01 | Change in the name or title of a patent holder | Address after: Kingsoft No. 33 building, 100085 Beijing city Haidian District Xiaoying Road; patentee after: Beijing Kingsoft office software Limited by Share Ltd; address before: Kingsoft No. 33 building, 100085 Beijing city Haidian District Xiaoying Road; patentee before: Beijing Kingsoft WPS Office Co., Ltd. |