CN109661664A - Information processing method and related apparatus - Google Patents
Information processing method and related apparatus
- Publication number
- CN109661664A (application CN201780054183.7A)
- Authority
- CN
- China
- Prior art keywords
- sentence
- information
- vector
- coding
- sentences
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3059—Digital compression and data reduction techniques where the original information is represented by a subset or similar information, e.g. lossy compression
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/3082—Vector coding
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6005—Decoder aspects
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/60—General implementation details not specific to a particular type of compression
- H03M7/6011—Encoder aspects
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03M—CODING; DECODING; CODE CONVERSION IN GENERAL
- H03M7/00—Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
- H03M7/30—Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
- H03M7/70—Type of the data to be coded, other than image and sound
- H03M7/707—Structured documents, e.g. XML
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Probability & Statistics with Applications (AREA)
- Mathematical Optimization (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Pure & Applied Mathematics (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Operations Research (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Algebra (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
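An embodiment of the present invention discloses an information processing method, comprising: obtaining text information to be processed and a sentence set; encoding the sentences in the sentence set with a first encoder to obtain a first encoding vector, and encoding the sentences in the sentence set with a second encoder to obtain a second encoding vector, where the first encoding vector is determined from the sentences and the second encoding vector is determined from features of the sentences; determining a sentence encoding vector from the first encoding vector and the second encoding vector; encoding the sentence encoding vector with a third encoder to obtain global information; and decoding the global information with a decoder to determine a probability value for each sentence in the text information to be processed. The present invention also provides an information processing apparatus. In addition to using deep learning, the invention incorporates manually extracted sentence features into training, which effectively improves the learning ability of the model and thereby improves the capability and effect of information processing.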
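The data flow in the abstract (two sentence-level encoders, a combined sentence encoding vector, a third encoder producing global information, and a decoder emitting one probability per sentence) can be sketched roughly as follows. This is only an illustration of the shape of the pipeline, not the patented model: the abstract does not name the encoder architectures or the features, so the hashed bag-of-words vector, the three hand-crafted features, the mean-pooling "third encoder", the sigmoid "decoder", and every function name and the `DIM` constant are assumptions made for this sketch. Nothing here is trained.

```python
import math
import zlib

DIM = 8  # hypothetical vector size for the first encoder (assumption)


def first_encoder(sentence):
    """Stand-in for a learned encoder: normalized hashed bag-of-words vector."""
    vec = [0.0] * DIM
    for word in sentence.lower().split():
        # crc32 gives a stable bucket index across runs (unlike built-in hash()).
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def second_encoder(sentence, position, total):
    """Hand-crafted sentence features (the choice of features is an assumption)."""
    words = sentence.split()
    return [
        min(len(words) / 20.0, 1.0),                            # relative length, capped at 1
        1.0 - position / max(total - 1, 1),                     # earlier sentences score higher
        1.0 if any(ch.isdigit() for ch in sentence) else 0.0,   # sentence contains a digit
    ]


def sentence_vectors(sentences):
    """Combine both encodings into one sentence encoding vector (here: concatenation)."""
    total = len(sentences)
    return [
        first_encoder(s) + second_encoder(s, i, total)
        for i, s in enumerate(sentences)
    ]


def third_encoder(vectors):
    """Stand-in for the document-level encoder: element-wise mean as 'global information'."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def decoder(vectors, global_info):
    """Map each sentence to a probability via a sigmoid over a dot product."""
    probs = []
    for vec in vectors:
        score = sum(a * b for a, b in zip(vec, global_info))
        probs.append(1.0 / (1.0 + math.exp(-score)))
    return probs


def score_sentences(sentences):
    """Run the full sketch: encode, pool, decode; returns one probability per sentence."""
    vectors = sentence_vectors(sentences)
    return decoder(vectors, third_encoder(vectors))


if __name__ == "__main__":
    text = [
        "The model encodes 3 sentences.",
        "Each sentence gets a probability.",
        "Short one.",
    ]
    for sentence, prob in zip(text, score_sentences(text)):
        print(f"{prob:.3f}  {sentence}")
```

In an extractive-summarization setting, the sentences with the highest probabilities would be the candidates to keep; a real system would replace each stand-in with a trained network.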
Description
PCT national-phase application; the description has been published.
Claims (15)
- PCT national-phase application; the claims have been published.
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2017/089586 WO2018232699A1 (zh) | 2017-06-22 | 2017-06-22 | Information processing method and related apparatus |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109661664A (zh) | 2019-04-19 |
CN109661664B (zh) | 2021-04-27 |
Family
ID=64735906
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201780054183.7A (Active, CN109661664B (zh)) | Information processing method and related apparatus | 2017-06-22 | 2017-06-22 |
Country Status (3)
Country | Link |
---|---|
US (1) | US10789415B2 (zh) |
CN (1) | CN109661664B (zh) |
WO (1) | WO2018232699A1 (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
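CN111597814A (zh) * | 2020-05-22 | 2020-08-28 | 北京慧闻科技(集团)有限公司 | Human-computer interaction named entity recognition method, apparatus, device, and storage medium |
CN112269872A (zh) * | 2020-10-19 | 2021-01-26 | 北京希瑞亚斯科技有限公司 | Resume parsing method and apparatus, electronic device, and computer storage medium |
CN112560398A (zh) * | 2019-09-26 | 2021-03-26 | 百度在线网络技术(北京)有限公司 | Text generation method and apparatus |
CN114095033A (zh) * | 2021-11-16 | 2022-02-25 | 上海交通大学 | Context-based graph convolution system and method for semantic lossless compression of target interaction relationships |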
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
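WO2018232699A1 (zh) | 2017-06-22 | 2018-12-27 | 腾讯科技(深圳)有限公司 | Information processing method and related apparatus |
CN109740158B (zh) * | 2018-12-29 | 2023-04-07 | 安徽省泰岳祥升软件有限公司 | Text semantic parsing method and apparatus |
CN110781674B (zh) * | 2019-09-19 | 2023-10-27 | 北京小米智能科技有限公司 | Information processing method and apparatus, computer device, and storage medium |
CN113112993B (zh) * | 2020-01-10 | 2024-04-02 | 阿里巴巴集团控股有限公司 | Audio information processing method and apparatus, electronic device, and storage medium |
CN111428024A (zh) * | 2020-03-18 | 2020-07-17 | 北京明略软件***有限公司 | Method and apparatus for text summary extraction, computer storage medium, and terminal |
CN111507726B (zh) * | 2020-04-07 | 2022-06-24 | 支付宝(杭州)信息技术有限公司 | Message generation method, apparatus, and device |
CN112069813B (zh) * | 2020-09-10 | 2023-10-13 | 腾讯科技(深圳)有限公司 | Text processing method, apparatus, device, and computer-readable storage medium |
CN113642756B (zh) * | 2021-05-27 | 2023-11-24 | 复旦大学 | Deep-learning-based method for predicting sentence commutation terms |
CN113254684B (zh) * | 2021-06-18 | 2021-10-29 | 腾讯科技(深圳)有限公司 | Method for determining content timeliness, related apparatus, device, and storage medium |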
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040117340A1 (en) * | 2002-12-16 | 2004-06-17 | Palo Alto Research Center, Incorporated | Method and apparatus for generating summary information for hierarchically related information |
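CN105930314A (zh) * | 2016-04-14 | 2016-09-07 | 清华大学 | Text summary generation system and method based on an encoder-decoder deep neural network |
CN106855853A (zh) * | 2016-12-28 | 2017-06-16 | 成都数联铭品科技有限公司 | Entity relation extraction system based on a deep neural network |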
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
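JP6591801B2 (ja) * | 2015-06-29 | 2019-10-16 | 任天堂株式会社 | Information processing program, information processing system, information processing apparatus, and information processing method |
JP6646991B2 (ja) * | 2015-10-01 | 2020-02-14 | 任天堂株式会社 | Information processing system, information processing method, information processing apparatus, and information processing program |
CN105512687A (zh) * | 2015-12-15 | 2016-04-20 | 北京锐安科技有限公司 | Method and system for training a sentiment classification model and analyzing text sentiment polarity |
CN105740226A (zh) * | 2016-01-15 | 2016-07-06 | 南京大学 | Chinese word segmentation using a tree-structured neural network and a bidirectional neural network |
CN106407178B (zh) * | 2016-08-25 | 2019-08-13 | 中国科学院计算技术研究所 | Conversation summary generation method and apparatus, server device, and terminal device |
JP6734761B2 (ja) * | 2016-11-07 | 2020-08-05 | 任天堂株式会社 | Information processing system, information processing apparatus, control method for information processing apparatus, and information processing program |
WO2018232699A1 (zh) | 2017-06-22 | 2018-12-27 | 腾讯科技(深圳)有限公司 | Information processing method and related apparatus |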
-
2017
- 2017-06-22 WO PCT/CN2017/089586 patent/WO2018232699A1/zh active Application Filing
- 2017-06-22 CN CN201780054183.7A patent/CN109661664B/zh active Active
-
2019
- 2019-08-15 US US16/542,054 patent/US10789415B2/en active Active
Non-Patent Citations (1)
Title |
---|
我偏笑_NSNIRVANA: "A Brief Discussion of Intelligent Search and Conversational OS", Jianshu (简书), HTTPS://WWW.JIANSHU.COM/P/3A9F49834C4A * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
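CN112560398A (zh) * | 2019-09-26 | 2021-03-26 | 百度在线网络技术(北京)有限公司 | Text generation method and apparatus |
CN111597814A (zh) * | 2020-05-22 | 2020-08-28 | 北京慧闻科技(集团)有限公司 | Human-computer interaction named entity recognition method, apparatus, device, and storage medium |
CN111597814B (zh) * | 2020-05-22 | 2023-05-26 | 北京慧闻科技(集团)有限公司 | Human-computer interaction named entity recognition method, apparatus, device, and storage medium |
CN112269872A (zh) * | 2020-10-19 | 2021-01-26 | 北京希瑞亚斯科技有限公司 | Resume parsing method and apparatus, electronic device, and computer storage medium |
CN112269872B (zh) * | 2020-10-19 | 2023-12-19 | 北京希瑞亚斯科技有限公司 | Resume parsing method and apparatus, electronic device, and computer storage medium |
CN114095033A (zh) * | 2021-11-16 | 2022-02-25 | 上海交通大学 | Context-based graph convolution system and method for semantic lossless compression of target interaction relationships |
CN114095033B (zh) * | 2021-11-16 | 2024-05-14 | 上海交通大学 | Context-based graph convolution system and method for semantic lossless compression of target interaction relationships |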
Also Published As
Publication number | Publication date |
---|---|
WO2018232699A1 (zh) | 2018-12-27 |
CN109661664B (zh) | 2021-04-27 |
US10789415B2 (en) | 2020-09-29 |
US20190370316A1 (en) | 2019-12-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
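CN109661664B (zh) | Information processing method and related apparatus | |
KR102382499B1 (ko) | Translation method, target information determination method, related apparatus, and storage medium | |
JP6870076B2 (ja) | Neural machine translation system | |
Bahdanau et al. | Learning to compute word embeddings on the fly | |
EP4206994A1 (en) | Model compression method and apparatus | |
CN111597830A (zh) | Translation method, apparatus, device, and storage medium based on multimodal machine learning | |
CN112395385B (zh) | Artificial-intelligence-based text generation method and apparatus, computer device, and medium | |
CN109740158B (zh) | Text semantic parsing method and apparatus | |
Khan et al. | RNN-LSTM-GRU based language transformation | |
CN108536735B (zh) | Multimodal lexical representation method and system based on a multi-channel autoencoder | |
CN111401081A (zh) | Neural network machine translation method, model, and model forming method | |
CN110569505A (zh) | Text input method and apparatus | |
CN112016275A (zh) | Intelligent error correction method and system for speech recognition text, and electronic device | |
CN110263304B (zh) | Sentence encoding method, sentence decoding method, apparatus, storage medium, and device | |
CN113569562A (zh) | Method and system for reducing cross-modal and cross-lingual barriers in end-to-end speech translation | |
CN110348007A (zh) | Text similarity determination method and apparatus | |
CN110874535A (zh) | Dependency alignment component, dependency alignment training method, device, and medium | |
Kasai et al. | End-to-end graph-based TAG parsing with neural networks | |
CN113935312A (zh) | Long text matching method and apparatus, electronic device, and computer-readable storage medium | |
CN116611436B (zh) | Threat-intelligence-based named entity recognition method for network security | |
CN116776287A (zh) | Multimodal sentiment analysis method and system fusing multi-granularity visual and text features | |
WO2023165111A1 (zh) | Method and system for recognizing user intent trajectories in customer service hotlines | |
Paul et al. | English to Bengali neural machine translation system for the aviation domain | |
Hujon et al. | Neural machine translation systems for English to Khasi: A case study of an Austroasiatic language | |
CN111783435A (zh) | Shared vocabulary selection method, apparatus, and storage medium |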
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||