WO2015080559A2 - A method and system for automated word sense disambiguation - Google Patents

Publication number
WO2015080559A2
Authority
WO
WIPO (PCT)
Prior art keywords
word
sense
words
verb
sentence
Prior art date
Application number
PCT/MY2014/000154
Other languages
French (fr)
Inventor
Chu Min Xian Benjamin
Qiang Liu
Lukose Dickson
Original Assignee
Mimos Berhad
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mimos Berhad
Publication of WO2015080559A2

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Character Discrimination (AREA)
  • Machine Translation (AREA)

Description

A Method and System for Automated Word Sense Disambiguation
Field of the Invention
[0001] The present invention relates to information processing. More specifically, the present invention relates to a system and method for automated word sense disambiguation.
Background
[0002] Word Sense Disambiguation (WSD) is known to be a challenging subject in the field of Natural Language Processing (NLP). The challenge arises from the lack of means to address the properties of context that characterize the use of words in a given sense. Further, there is also a lack of a standard and exhaustive inventory of word senses. It is also noted that the accuracy of current means of processing disambiguation results is often questionable.
[0003] Given any text, the main problem considered here is how to identify which sense of a word is used in a given sentence when the word has a number of distinct senses. Handling this typically requires substantial amounts of training examples or tagged datasets, i.e. supervised machine learning. For example, the word "plant" may have various meanings in different contexts when it is looked up in WordNet. It is however noted that these training examples and tagged datasets are still not sufficient for extracting the word sense. Further, the accuracy for fine-grained sense distinctions is still rather lacking in existing systems. Thus far, for example, the highest accuracies of state-of-the-art approaches range from about 59.1% to 69.0%. The challenge lies in determining the context length and the context content. The context length refers to the size of the window of text that should be taken to determine context. It can be difficult, if not impossible, to determine whether the context should contain only a few words or a larger portion of the string. Similarly, it is a challenge to decide whether all context words, or only selected words, such as words of a certain part of speech or in a certain grammatical relation to the target word, are to be considered for context content. There is also the question of whether the selected words should be weighted based on their distance from the target word, or be treated as a "bag of words".
Summary
[0004] In accordance with one aspect of the present invention, there is a system for disambiguating word sense from a text-containing document having sentences. The system comprises an entity recognition module adapted for extracting possible entities from the sentence using Linked Data; a text preprocessor adapted for tokenizing the sentence into lemmatized words, the text preprocessor including a word recognizer adapted to identify verbs and nouns from the sentence, a lemmatizer for lemmatizing the words of the sentence, and a polysemy checker for counting the number of possible senses of the words to determine if the words are ambiguous; an index builder (134) adapted for creating an index of schema graphs for each identified verb and for extracting all possible sense descriptions for nouns; and a disambiguator adapted for disambiguating word senses, wherein the disambiguator extracts all the schemas for the identified verb and places all the identified nouns into the schemas to determine the most suitable word sense, and disambiguation rules are utilized for disambiguating word sense.
[0005] In one embodiment, the disambiguator is operable to determine if the entity is a verb by referring to a linguistic resource and subsequently retrieve all possible schemas related to the verb.
[0006] In another embodiment, the text preprocessor is operable to determine if a word is a verb or a noun through a linguistic resource. The index builder may be adapted for extracting schemas through the use of the linguistic resources and building an index reference for each word entry with all the related sense descriptions.
[0007] In yet another embodiment, the disambiguator may create a context word vector from the nouns extracted from the sense description, wherein the words in the context word vector are checked for semantic constraints with reference to a concept hierarchy from the linguistic resources, and the disambiguation rules are utilised when the ambiguous word cannot be resolved.
[0008] In another aspect, the present invention provides a method of disambiguating word sense from a text-containing document having sentences. The method comprises extracting possible entities from the sentence using Linked Data; tokenizing the sentence into lemmatized words; identifying the verb and nouns from the sentence; lemmatizing the words of the sentence; counting the number of possible senses of the words through a polysemy checker to determine if the words are ambiguous; creating an index of schema graphs for each identified verb and extracting all possible sense descriptions for nouns; disambiguating word senses by extracting all the schemas for the identified verb and placing all the identified nouns into the schemas to determine the most suitable word sense; and utilizing disambiguation rules for disambiguating word sense.
[0009] In one embodiment, disambiguating word sense includes determining if the entity is a verb by referring to a linguistic resource and retrieving all possible schemas related to the verb.
[0010] In another embodiment, identifying the verb and nouns includes matching the verb and nouns with a linguistic resource.
[0011] Further, the index builder may be adapted for extracting schemas through the use of the linguistic resources and building an index reference for each word entry with all the related sense descriptions.
[0012] In yet another embodiment, disambiguating the word sense includes creating a context word vector from the nouns extracted from the sense description, wherein the words in the context word vector are checked for semantic constraints with reference to a concept hierarchy from the linguistic resources, and the disambiguation rules are utilised when the ambiguous word cannot be resolved.
Brief Description of the Drawings
[0013] Preferred embodiments according to the present invention will now be described with reference to the accompanying figures, in which like reference numerals denote like elements:
[0014] FIG. 1 illustrates a block diagram of a word sense disambiguation system in accordance with one embodiment of the present invention;
[0015] FIG. 2 illustrates a process carried out by the disambiguation module of FIG. 1 in accordance with one embodiment of the present invention; and
[0016] FIGs. 3A-3D exemplify a sentence that is being processed to resolve ambiguity.
Detailed Description
[0017] Embodiments of the present invention shall now be described in detail, with reference to the attached drawings. It is to be understood that no limitation of the scope of the invention is thereby intended, such alterations and further modifications in the illustrated device, and such further applications of the principles of the invention as illustrated therein being contemplated as would normally occur to one skilled in the art to which the invention relates.
[0018] FIG. 1 illustrates a block diagram of a word sense disambiguation system 100 in accordance with one embodiment of the present invention. The system 100 is adapted for automatically identifying which sense (i.e. meaning) of a word is used in the sentence context. It is particularly useful for words that are polysemous, i.e. have multiple meanings. Briefly, the system comprises an entity recognition module 101, a text preprocessor 102, and a disambiguation module 103.
[0019] The entity recognition module 101 preprocesses the sentence or target string to identify the possible entities based on Linked Data 112. Any entity recognition engine known in the market is suitable for identifying the relevant entities.
[0020] The entity recognition module 101 is configured to recognize entities from the content of a document. Many systems and methods for recognizing entities are well known in the art and can be adapted for the present invention. In another embodiment, the entity recognition engine or module disclosed in the Malaysia patent application entitled "SYSTEM AND METHOD FOR AUTOMATED ENTITY RECOGNITION", filed on the same day as the present application, can also be adapted herein.
[0021] The text preprocessor 102 comprises a word recognizer 122, a lemmatizer 124, and a polysemy checker 126. The word recognizer 122 is adapted to work with the entity recognition module 101 to determine whether a word in the sentence is a noun or a verb. The word recognizer 122 also refers to the linguistic resources 138 to perform its recognition. The lemmatizer 124 is adapted to tokenize the sentence into lemmatized word forms so that ambiguous words can be identified. The polysemy checker 126 is utilized to identify which ambiguous word is to be disambiguated.
[0022] An index builder 134 is used to create an index of schema graphs/maps for each verb. For a word that is a noun, it extracts all possible sense descriptions from the linguistic resources 138; for a targeted verb, it extracts the possible related schemas. The index is referenced by the disambiguator for disambiguating the word sense.
[0023] It can be seen that using the polysemy checker 126 to identify ambiguous words, the word recognizer 122 to distinguish verbs and nouns in the sentence, and the index builder 134 to extract sense descriptions (for nouns) and schemas (for verbs) can address the need for substantial amounts of training examples/tagged datasets.
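As an illustration only, the roles of the polysemy checker 126 and the index builder 134 for nouns can be sketched as follows. This is a minimal sketch under stated assumptions: the `SENSE_INVENTORY` dictionary is a hypothetical stand-in for the linguistic resources 138, its glosses are invented for the example, and the function names are not taken from the specification.

```python
# Hypothetical sense inventory standing in for the linguistic resources 138.
# The glosses are illustrative and not quoted from any real dictionary.
SENSE_INVENTORY = {
    "boy": ["a young male person"],
    "bass": [
        "the lowest part of the musical range",
        "the lean flesh of a freshwater fish",
    ],
    "river": ["a large natural stream of water"],
}


def build_index(words, inventory):
    """Index builder sketch: map each word to all of its sense descriptions."""
    return {word: list(inventory.get(word, [])) for word in words}


def polysemy_count(word, index):
    """Polysemy checker sketch: count the possible senses of a word."""
    return len(index.get(word, []))


def is_ambiguous(word, index):
    """Treat a word as ambiguous when it has more than one possible sense."""
    return polysemy_count(word, index) > 1


index = build_index(["boy", "bass", "river"], SENSE_INVENTORY)
ambiguous = [word for word in index if is_ambiguous(word, index)]
print(ambiguous)  # ['bass']
```

Only words flagged as ambiguous would need their sense descriptions passed on to the disambiguation stage; the rest already have a single sense.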
[0024] The disambiguation module 103 receives the word and disambiguates its word sense based on disambiguation rules 132. With the use of semantic information (i.e. schemas) and the context vectors created from the sense descriptions to check the selectional constraints of the semantic role, a high accuracy for fine-grained sense distinctions can be achieved. The disambiguation rules 132 are well known in the art.
[0025] FIG. 2 illustrates a process carried out by the disambiguation module 103 of FIG. 1 in accordance with one embodiment of the present invention. The process starts at step 202 with selecting a sentence, or target sentence, whose words are to be assigned word senses. At step 204, each word of the target sentence is checked, through the use of the linguistic resources 138, to determine whether it is a verb. This is done through the text preprocessor 102. When a word is determined to be a verb at step 206, all possible schemas related to the target verb are retrieved at step 208. A word that is not identified as a verb would, in general, be a noun, nouns being salient and meaningful; such a word proceeds to step 212. At step 212, it is further determined whether that word is ambiguous by using the polysemy checker 126. The polysemy checker 126 calculates a total count of the different possible senses of the word. A word with more than one possible sense is potentially ambiguous, and the higher the polysemy count, the higher the likelihood that the word is ambiguous. If the word is determined to be ambiguous at step 214, all related sense descriptions for the word (the potentially ambiguous word) are retrieved at step 216 through the index rendered by the index builder 134, and subsequently all nouns are extracted from the sense descriptions to create context vectors of the word at step 218.
[0026] Returning to step 214, if the word is determined not to be ambiguous, the word is matched at step 222 against the schemas retrieved in step 208. A best schema, being the one with the maximum number of concepts matched, is selected at step 224.
[0027] At step 226, each of the context vectors of the word is checked to determine whether it satisfies the selectional constraints of the semantic role of the best schema identified earlier. The selectional constraints check is done with reference to a concept hierarchy from the linguistic resources 138 at step 228.
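The context-vector creation of step 218 and the selectional constraints check of steps 226 and 228 can be illustrated with a small sketch. Everything named here is an assumption made for the example: the `HIERARCHY` mapping is a hypothetical, acyclic concept hierarchy, the sense identifiers and glosses are invented, nouns are recognised by simple membership in a toy noun set, and a vector is taken to satisfy a constraint when at least one of its nouns falls under the required concept.

```python
# Hypothetical concept hierarchy (child -> parent); assumed to be acyclic.
HIERARCHY = {
    "fish": "animal",
    "animal": "animate",
    "flesh": "substance",
    "part": "abstraction",
    "range": "abstraction",
}

# Toy noun list used to pick nouns out of a sense description.
KNOWN_NOUNS = set(HIERARCHY)

# Hypothetical sense descriptions for the ambiguous word "bass".
SENSES = {
    "bass.music": "the lowest part of the musical range",
    "bass.fish": "the lean flesh of a freshwater fish",
}


def context_vector(description, known_nouns):
    """Step 218 sketch: keep only the nouns of a sense description."""
    return [w for w in description.lower().split() if w in known_nouns]


def is_a(concept, ancestor, hierarchy):
    """Walk up the hierarchy to test whether `concept` falls under `ancestor`."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = hierarchy.get(concept)
    return False


def satisfies(vector, required, hierarchy):
    """Steps 226-228 sketch: some noun in the vector must be a `required` concept."""
    return any(is_a(noun, required, hierarchy) for noun in vector)


# Assume the role filled by "bass" in the best schema requires an animate concept.
passing = [
    sense
    for sense, gloss in SENSES.items()
    if satisfies(context_vector(gloss, KNOWN_NOUNS), "animate", HIERARCHY)
]
print(passing)  # ['bass.fish']
```

In this toy setting only the fish sense survives, because "fish" reaches "animate" through "animal", while the nouns of the music sense lead to "abstraction".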
[0028] When the selectional constraints above are satisfied, a best sense for the ambiguous word is selected and assigned to that word at step 232. If the sense can be resolved at step 234, the sense of that word is identified. If the sense of that word cannot be resolved at step 234, i.e. the selectional constraints check is not satisfied, the disambiguation module 103 applies the disambiguation rules 132 to give the word a word sense.
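The decision at steps 232 and 234 can be sketched as a small function. This is a hedged illustration only: the specification does not state which disambiguation rules 132 are applied, so the `first_listed_sense` rule below is purely an assumed placeholder, and the sense identifiers are hypothetical.

```python
def first_listed_sense(candidates, passing):
    """Assumed placeholder rule: prefer the first sense that passed the
    constraints check, otherwise the first sense listed for the word."""
    pool = passing or candidates
    return pool[0] if pool else None


def resolve_sense(candidates, satisfied, fallback_rule):
    """Return (sense, how) where `how` records which path resolved the word.

    candidates: sense identifiers in inventory order
    satisfied:  set of sense identifiers that passed the constraints check
    """
    passing = [sense for sense in candidates if sense in satisfied]
    if len(passing) == 1:
        # Exactly one sense satisfies the constraints: resolved directly.
        return passing[0], "constraints"
    # Zero or several senses remain: fall back to the disambiguation rules.
    return fallback_rule(candidates, passing), "rules"


print(resolve_sense(["bass.music", "bass.fish"], {"bass.fish"}, first_listed_sense))
print(resolve_sense(["bass.music", "bass.fish"], set(), first_listed_sense))
```

Passing the rule in as a function keeps the constraint-based path separate from whichever rule set an implementation actually chooses.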
[0029] FIGs. 3A-3D exemplify a sentence that is being processed to disambiguate the word sense thereof. The exemplified sentence is "The boy fishes the bass from the river." The exemplified sentence is also referred to herein as the target sentence. As shown in FIG. 3A, the target sentence is scanned by the present system 100 to identify entities/noun phrases through the entity recognition module 101 with reference to the Linked Data 112. In this case, the words "boy", "bass" and "river" are identified. Subsequently, the verb(s) are identified from the target sentence using the linguistic resources 138. In this case, "fishes" is identified and is reduced to its lemmatized form, "fish".
[0030] Accordingly, as shown in FIG. 3B, all the schemas relating to "fish" are extracted, shown as S1, S2, ..., Sn. The ambiguous words are identified through the polysemy checker 126. In this case, "fish" and "bass" are polysemous, and since "fish" is identified as the verb, it can be removed from consideration for disambiguation. Subsequently, the words from the sentence are placed into all the schemas extracted before. In this case, S2 can be identified as the closest match for the sentence.
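The schema selection described for FIG. 3B can be sketched as counting how many role concepts of each schema are matched by the unambiguous nouns of the sentence. The two schemas below, their role names, and the concepts assigned to "boy" and "river" are all hypothetical and chosen only so that S2 comes out as the closest match, as in the example above.

```python
# Hypothetical schemas for the verb "fish": role -> required concept.
SCHEMAS = {
    "S1": {"agent": "person", "theme": "information"},
    "S2": {"agent": "person", "theme": "animal", "source": "body_of_water"},
}

# Assumed concepts of the unambiguous nouns "boy" and "river".
concepts = ["person", "body_of_water"]


def match_count(schema, available_concepts):
    """Count the schema roles that can be filled by the available concepts,
    using each available concept at most once."""
    remaining = list(available_concepts)
    matched = 0
    for required in schema.values():
        if required in remaining:
            remaining.remove(required)
            matched += 1
    return matched


def best_schema(schemas, available_concepts):
    """Pick the schema with the maximum number of matched concepts."""
    return max(schemas, key=lambda name: match_count(schemas[name], available_concepts))


print(best_schema(SCHEMAS, concepts))  # S2
```

The unfilled "theme" role of S2 is then the slot whose required concept constrains the sense chosen for the ambiguous word "bass".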
[0031] Referring now to FIG. 3C, all the possible sense descriptions are retrieved through the index builder 134. Following that, all the nouns from the sense descriptions are extracted to create a context vector.
[0032] As shown in FIG. 3D, each of the context words is checked to determine whether the semantic constraints are satisfied. In the illustrated sentence, it can be identified from the concept hierarchy that the context word "fish" is animate. Therefore, the word "bass" is assigned the sense whose description contains the word "fish", as shown in FIG. 3C.
[0033] While specific embodiments have been described and illustrated, it is understood that many changes, modifications, variations, and combinations thereof could be made to the present invention without departing from the scope of the invention.

Claims

1. A system for disambiguating word sense from a text-containing document having sentences, the system comprising:
an entity recognition module (101) adapted for extracting possible entities from the sentence using a Linked Data (112);
a text preprocessor (102) adapted for tokenizing the sentence into lemmatized words, the text preprocessor (102) including a word recognizer (122) adapted to identify a verb and nouns from the sentence, a lemmatizer (124) for lemmatizing the words of the sentence, and a polysemy checker (126) for counting a number of possible senses of the words to determine if the words are ambiguous;
an index builder (134) adapted for creating an index of schema graphs for each identified verb and for extracting all possible sense descriptions for nouns;
a disambiguator (103) adapted for disambiguating word senses, wherein the disambiguator (103) extracts all the schemas for the identified verb and places all the identified nouns into the schemas to determine the most suitable word sense, and disambiguation rules (132) are utilized for disambiguating word sense.
2. The system according to claim 1, wherein the disambiguator (103) is operable to determine if the entity is a verb by referring to a linguistic resource (138) and to retrieve all possible schemas related to the verb.
3. The system according to claim 1, wherein the text preprocessor (102) is operable to determine if a word is a verb or a noun through a linguistic resource (138).
4. The system according to claim 2, wherein the index builder (134) is adapted for extracting schemas through the use of the linguistic resources (138) and building an index reference for each word entry with all the related sense descriptions.
5. The system according to claim 1, wherein the disambiguator (103) creates a context word vector from the nouns extracted from the sense description, wherein the words in the context word vector are based on a concept hierarchy from the linguistic resources (138), and the disambiguation rules (132) are utilised when the ambiguous word cannot be resolved.
6. A method of disambiguating word sense from a text-containing document having sentences, the method comprising:
extracting possible entities from the sentence using a Linked Data (112);
tokenizing the sentence into lemmatized words;
lemmatizing the words of the sentence;
counting a number of possible senses of the words to determine if the words are ambiguous through a polysemy checker;
identifying a verb and nouns from the sentence;
lemmatizing the words of the sentence, and counting a number of possible senses of the words to determine if the words are ambiguous through a polysemy checker (126);
creating an index of schema graphs for each identified verb and extracting all possible sense descriptions for nouns;
disambiguating word senses through extracting all the schemas for the identified verb and placing all the identified nouns into the schemas to determine the most suitable word sense; and
utilizing disambiguation rules (132) for disambiguating word sense.
7. The method according to claim 6, wherein disambiguating word sense includes determining if the entity is a verb by referring to a linguistic resource (138) and retrieving all possible schemas related to the verb.
8. The method according to claim 6, wherein identifying the verb and nouns includes matching the verb and nouns with a linguistic resource (138).
9. The method according to claim 6, wherein the index builder (134) is adapted for extracting schemas through the use of the linguistic resources and building an index reference for each word entry with all the related sense descriptions.
10. The method according to claim 6, wherein disambiguating the word sense includes creating a context word vector from the nouns extracted from the sense description, wherein the words in the context word vector are based on a concept hierarchy from the linguistic resources (138), and the disambiguation rules (132) are utilised when the ambiguous word cannot be resolved.
PCT/MY2014/000154 2013-11-27 2014-05-29 A method and system for automated word sense disambiguation WO2015080559A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
MYPI2013004280 2013-11-27
MYPI2013004280A MY182881A (en) 2013-11-27 2013-11-27 A method and system for automated entity recognition

Publications (1)

Publication Number Publication Date
WO2015080559A2 true WO2015080559A2 (en) 2015-06-04

Family

ID=51690418

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/MY2014/000154 WO2015080559A2 (en) 2013-11-27 2014-05-29 A method and system for automated word sense disambiguation

Country Status (2)

Country Link
MY (1) MY182881A (en)
WO (1) WO2015080559A2 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108509449A (en) * 2017-02-24 2018-09-07 腾讯科技(深圳)有限公司 A kind of method and server of information processing
CN108509449B (en) * 2017-02-24 2022-07-08 腾讯科技(深圳)有限公司 Information processing method and server
CN109492214A (en) * 2017-09-11 2019-03-19 苏州大学 The identification of attribute word and its level construction method, device, equipment and storage medium
CN109492214B (en) * 2017-09-11 2023-09-19 苏州大学 Attribute word recognition and hierarchy construction method, device, equipment and storage medium
CN111199149A (en) * 2019-12-17 2020-05-26 航天信息股份有限公司 Intelligent statement clarifying method and system for dialog system
CN111199149B (en) * 2019-12-17 2023-10-20 航天信息股份有限公司 Sentence intelligent clarification method and system for dialogue system

Also Published As

Publication number Publication date
MY182881A (en) 2021-02-05

Similar Documents

Publication Publication Date Title
CN106445998B (en) Text content auditing method and system based on sensitive words
WO2015196909A1 (en) Word segmentation method and device
CN107193796B (en) Public opinion event detection method and device
Salehi et al. Using distributional similarity of multi-way translations to predict multiword expression compositionality
US9600469B2 (en) Method for detecting grammatical errors, error detection device for same and computer-readable recording medium having method recorded thereon
Jhamtani et al. Word-level language identification in bi-lingual code-switched texts
Jahan et al. A new approach to animacy detection
Jayan et al. A hybrid statistical approach for named entity recognition for malayalam language
Gupta et al. Preprocessing phase of Punjabi language text summarization
Shajalal et al. Semantic textual similarity in bengali text
Sevgili et al. N-hance at semeval-2017 task 7: A computational approach using word association for puns
WO2015080559A2 (en) A method and system for automated word sense disambiguation
Arikan et al. Detecting clitics related orthographic errors in Turkish
Sarmah et al. Word sense disambiguation for Assamese
Utt et al. Crosslingual and multilingual construction of syntax-based vector space models
Ahmed et al. Question analysis for Arabic question answering systems
CN110162615B (en) Intelligent question and answer method and device, electronic equipment and storage medium
Gautam et al. Hindi word sense disambiguation using lesk approach on bigram and trigram words
Mahafdah et al. Arabic Part of speech Tagging using k-Nearest Neighbour and Naive Bayes Classifiers Combination.
Cheng et al. Single document summarization based on triangle analysis of dependency graphs
Singh et al. Word sense disambiguation: enhanced lesk approach in Punjabi language
Lai et al. An unsupervised approach to discover media frames
CN111814025A (en) Viewpoint extraction method and device
Karisani et al. Multi-view active learning for short text classification in user-generated data
Farahmand et al. Modeling the statistical idiosyncrasy of multiword expressions

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14783674

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14783674

Country of ref document: EP

Kind code of ref document: A2