WO2018227995A1 - Method, terminal, device, and storage medium for extracting a central word based on a syntax dependency relationship - Google Patents

Method, terminal, device, and storage medium for extracting a central word based on a syntax dependency relationship

Info

Publication number
WO2018227995A1
WO2018227995A1 PCT/CN2018/077142 CN2018077142W WO2018227995A1 WO 2018227995 A1 WO2018227995 A1 WO 2018227995A1 CN 2018077142 W CN2018077142 W CN 2018077142W WO 2018227995 A1 WO2018227995 A1 WO 2018227995A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
subtree
node
syntax
preliminary score
Prior art date
Application number
PCT/CN2018/077142
Other languages
English (en)
French (fr)
Inventor
吕梓燊
韦邕
赵清源
徐亮
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2018227995A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/253 Grammatical analysis; Style critique
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Definitions

  • the present application relates to the field of computer technology, and in particular, to a method, a terminal, a device, and a storage medium for extracting a central word based on a syntax dependency relationship.
  • central word extraction means extracting, from a passage of natural language text, the more important words that summarize its content.
  • commonly used central word extraction methods include the TF-IDF method and the TextRank method. These methods are fairly general, but they share a drawback: extracting central words solely from the features of the text input by the user is very limiting. For applications in specific domains, applying the existing methods directly yields unsatisfactory extraction results that cannot meet application requirements.
  • the embodiment of the present application provides a method, a terminal, a device, and a storage medium for extracting a central word based on a syntax dependency relationship, which can fully interpret the text information input by the user when extracting central words, thereby improving the extraction results, while being easy to operate and flexible to configure.
  • the embodiment of the present application provides a method for extracting a central word based on a syntax dependency relationship, the method comprising: acquiring text information input by the user; determining a syntax structure tree of the text information according to a preset syntax dependency rule; pruning the syntax structure tree; constructing a subtree according to the pruned syntax structure tree; calculating a preliminary score of each word node in the subtree; and extracting a central word in the text information according to the preliminary score.
  • the embodiment of the present application further provides a terminal for extracting a central word based on a syntax dependency relationship, and the terminal includes:
  • a first obtaining unit configured to acquire text information input by the user;
  • a determining unit configured to determine a syntax structure tree of the text information according to a preset syntax dependency rule;
  • a pruning unit configured to prune the syntax structure tree;
  • a building unit configured to construct a subtree according to the pruned syntax structure tree;
  • a calculating unit configured to calculate a preliminary score of each word node in the subtree; and
  • an extracting unit configured to extract a central word in the text information according to the preliminary score.
  • the embodiment of the present application further provides a device for extracting a central word based on a syntax dependency relationship, including:
  • a memory for storing a program that implements central word extraction; and
  • a processor configured to run the program stored in the memory to perform the operations of the method above.
  • the embodiment of the present application further provides a computer readable storage medium storing one or more computer programs, the one or more computer programs being executable by one or more processors to implement the steps of the method above.
  • the present application has the following beneficial effects: the embodiment of the present application acquires text information input by the user, determines the syntax structure tree of the text information according to a preset syntax dependency rule, prunes the syntax structure tree, constructs a subtree according to the pruned syntax structure tree, calculates a preliminary score of each word node in the subtree, and extracts a central word in the text information according to the preliminary score. It can therefore fully interpret the text information input by the user when extracting central words. In addition, using syntax dependency rules to construct subtrees in a targeted manner further improves the extraction results, and the method is easy to operate and flexible to configure.
  • FIG. 1 is a schematic flow chart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 2 is a schematic diagram showing a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 3 is another schematic diagram of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 4 is another schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 5 is another schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 6 is another schematic diagram of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 7 is another schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 8 is a schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to another embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 10 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 11 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 12 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 13 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • the method can run on terminals such as smartphones (for example, Android phones and iOS phones), tablets, laptops, and smart devices.
  • the method mainly extracts the central word from the text information input by the user. It can fully interpret that text when extracting central words, improves the extraction results, and is easy to operate and flexible to configure.
  • as shown in FIG. 1, the method includes steps S101 to S106.
  • the text information input by the user may be query text entered in a web browser or search engine on the terminal, for example, the query "What causes stomach pain and a thick, whitish tongue coating" (胃痛,舌苔厚、发白是什么原因).
  • the web browser or search engine of the terminal obtains the query text input by the user in real time.
  • the preset syntax dependency rule refers to revealing the syntactic structure of a language unit by analyzing the dependencies between its components, and identifying the syntactic relationships between words. The syntax structure tree of the text information is determined according to the preset syntax dependency rule; for example, analyzing the text "What causes stomach pain and a thick, whitish tongue coating" with the preset syntax dependency rule yields the syntax structure tree shown in FIG. 2.
  • the syntax structure tree contains a plurality of word nodes. Pruning the tree, for example the syntax structure tree in FIG. 3, removes stop words and the word nodes of relationship components that are not needed. The removed word nodes can therefore be configured for the specific domain application, and removing them does not affect the structure of the tree. Specifically, as shown in FIG. 4, pruning the syntax structure tree in step S103 includes the following steps S201 to S202.
  • in the syntax structure tree of FIG. 3, the word nodes of some relationship components include punctuation word nodes, word nodes of adverbial structures, and word nodes of other relationship components that are not needed. The punctuation word nodes are "?" and ","; the adverbial word nodes are "or" (还是), "can" (能), and "completely" (彻底); and the other unneeded word nodes include "treatment" (治疗) and "cure" (治愈). Which other relationship components are not needed can be decided according to actual conditions, and the specific screening method is not limited herein.
  • the word nodes of those relationship components are deleted directly from the syntax structure tree. After the deletion, the remaining word nodes keep their original relative hierarchical relationship in the syntax structure tree.
  • step S104 includes steps S301 to S302.
  • for example, in the syntax structure tree shown in FIG. 3, the core word node is "insufficient" (不足), and the word node in a parallel relationship with the core word node is "traction" (牵引).
  • the subtree is constructed according to the relative hierarchical relationship of the nodes in the pruned syntax structure tree, together with the core word node and the other word nodes in a parallel relationship with it. For example, the pruned syntax structure tree is built into the subtree shown in FIG. 6, and in the subsequent central word extraction the constructed subtree is processed as a unit.
  • the preliminary score of each word node in the subtree is calculated according to factors such as part of speech, syntactic relationship role, word length, node depth and the like.
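  • it should be noted that central words are usually nouns, verbs, adjectives, and the like, so these parts of speech score higher on the part-of-speech factor. Generally, the longer a word is, the more information it carries and the more likely it is to be a central word, so it scores higher. On the syntactic-relationship-role factor, the core relationship, the subject-predicate relationship, the verb-object relationship, and the like are more likely to mark a central word, so they score higher. In online medical consultation text, for example, word nodes at greater depth in the pruned syntax structure tree are usually important words, such as the body part where a symptom occurs, so the deeper a word node is, the higher its score.
  • specifically, the scores of these feature factors can be combined by weighting according to a preset weighting rule to obtain a comprehensive preliminary score for each word node. The preset weighting rule can be set by the user, and the specific rule is not limited herein.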
  • step S106 includes steps S401 to S402.
  • for example, the word node with the highest preliminary score is extracted and used as the central word of the text information.
  • as can be seen from the above, the embodiment of the present application acquires the text information input by the user, determines the syntax structure tree of the text information according to the preset syntax dependency rule, prunes the syntax structure tree, constructs a subtree according to the pruned syntax structure tree, calculates a preliminary score of each word node in the subtree, and extracts a central word in the text information according to the preliminary score. It can therefore fully interpret the text information input by the user when extracting central words. In addition, using syntax dependency rules to construct subtrees in a targeted manner further improves the extraction results, and the method is easy to operate and flexible to configure.
  • FIG. 8 is a schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
  • the method can run on terminals such as smartphones (for example, Android phones and iOS phones), tablets, laptops, and smart devices.
  • the method mainly extracts the central word from the text information input by the user. It can fully interpret that text when extracting central words, improves the extraction results, and is easy to operate and flexible to configure.
  • as shown in FIG. 8, the method includes steps S501 to S507.
  • the text information input by the user may be query text entered in a web browser or search engine on the terminal, for example, the query "What causes stomach pain and a thick, whitish tongue coating" (胃痛,舌苔厚、发白是什么原因).
  • the web browser or search engine of the terminal obtains the query text input by the user in real time.
  • the preset syntax dependency rule refers to revealing the syntactic structure of a language unit by analyzing the dependencies between its components, and identifying the syntactic relationships between words. The syntax structure tree of the text information is determined according to the preset syntax dependency rule; for example, analyzing the text "What causes stomach pain and a thick, whitish tongue coating" with the preset syntax dependency rule yields the syntax structure tree shown in FIG. 2.
  • the syntax structure tree contains a plurality of word nodes. Pruning the tree, for example the syntax structure tree in FIG. 3, removes stop words and the word nodes of relationship components that are not needed. The removed word nodes can therefore be configured for the specific domain application, and removing them does not affect the structure of the tree.
  • S504 Construct a subtree according to the pruned syntax structure tree.
  • because a parallel relationship means that the words play similar semantic roles, after the preliminary scores are obtained, the scores of the words in each group of parallel word nodes are adjusted according to word length.
  • if word nodes in a parallel relationship exist in the subtree, the preliminary score of each such word node is calculated according to a preset allocation rule. Specifically, for each group of word nodes in a parallel relationship, the preliminary scores of the word nodes in the group are summed, and the total is then allocated to each word in proportion to its word length relative to the sum of the word lengths of all words in the group.
  • as can be seen from the above, the embodiment of the present application determines whether word nodes in a parallel relationship exist in the subtree and, if so, recalculates the preliminary score of each such word node according to the preset allocation rule, then extracts the central word in the text information according to the preliminary score, which helps extract the required central word more accurately.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
  • referring to FIG. 9, corresponding to the above method for extracting a central word based on a syntax dependency relationship, the embodiment of the present application further provides a terminal for extracting a central word based on a syntax dependency relationship, where the terminal 100 includes: a first obtaining unit 101, a determining unit 102, a pruning unit 103, a building unit 104, a calculating unit 105, and an extracting unit 106.
  • the first obtaining unit 101 is configured to acquire text information input by a user.
  • the determining unit 102 is configured to determine a syntax structure tree of the text information according to a preset syntax dependency rule.
  • the pruning unit 103 is configured to prune the syntax structure tree.
  • the building unit 104 is configured to construct a subtree according to the pruned syntax structure tree.
  • the calculating unit 105 is configured to calculate a preliminary score of each word node in the subtree.
  • the extracting unit 106 is configured to extract a central word in the text information according to the preliminary score.
  • as shown in FIG. 10, the pruning unit 103 includes: a second obtaining unit 1031, configured to acquire the word nodes of some relationship components in the syntax structure tree; and a deleting unit 1032, configured to delete those word nodes.
  • as shown in FIG. 11, the building unit 104 includes: a third obtaining unit 1041, configured to acquire a core word node in the pruned syntax structure tree and other word nodes in a parallel relationship with the core word node; and a building subunit 1042, configured to construct a subtree according to the core word node and the other word nodes in a parallel relationship with the core word node.
  • as shown in FIG. 12, the extracting unit 106 includes: a sorting unit 1061, configured to sort the word nodes in the subtree according to the preliminary score; and an extracting subunit 1062, configured to extract the central word in the text information according to the sorted result.
  • referring to FIG. 13, corresponding to the above method for extracting a central word based on a syntax dependency relationship, the embodiment of the present application further provides a terminal for extracting a central word based on a syntax dependency relationship, where the terminal 200 includes: a first obtaining unit 201, a determining unit 202, a pruning unit 203, a building unit 204, a judging unit 205, a calculating subunit 206, and an extracting unit 207.
  • the first obtaining unit 201 is configured to acquire text information input by the user; the determining unit 202 is configured to determine a syntax structure tree of the text information according to a preset syntax dependency rule; the pruning unit 203 is configured to prune the syntax structure tree; the building unit 204 is configured to construct a subtree according to the pruned syntax structure tree; the judging unit 205 is configured to determine whether word nodes in a parallel relationship exist in the subtree; the calculating subunit 206 is configured to, if word nodes in a parallel relationship exist in the subtree, calculate a preliminary score of each such word node according to a preset allocation rule; and the extracting unit 207 is configured to extract the central word in the text information according to the preliminary score.
  • the foregoing first obtaining unit 101, determining unit 102, pruning unit 103, building unit 104, calculating unit 105, extracting unit 106, and so on may be embedded in hardware form in the data processing device or be independent of it, or may be stored in software form in the memory of the data processing device so that the processor can invoke the operations corresponding to each of the above units.
  • the processor can be a central processing unit (CPU), a microprocessor, a microcontroller, or the like.
  • FIG. 14 is a schematic structural diagram of a device for extracting a central word based on a syntax dependency relationship according to the present application.
  • the device 300 can include an input device 301, an output device 302, a transceiver 303, a memory 304, and a processor 305, where:
  • the input device 301 is configured to receive input data of an external access control device.
  • the input device 301 in the embodiment of the present application may include a keyboard, a mouse, a photoelectric input device, a sound input device, a touch input device, a scanner, and the like.
  • the output device 302 is configured to output output data of the access control device to the outside.
  • the output device 302 described in this embodiment of the present application may include a display, a speaker, a printer, and the like.
  • the transceiver device 303 is configured to send data to or receive data from other devices through a communication link.
  • the transceiver 303 of the embodiment of the present application may include a transceiver device such as a radio frequency antenna.
  • the memory 304 is configured to store a program that implements extracting a central word.
  • the memory 304 of an embodiment of the present application may be a system memory such as volatile (such as RAM), non-volatile (such as ROM, flash memory, etc.), or a combination of both.
  • the memory 304 of the embodiment of the present application may also be an external memory outside the system, such as a magnetic disk, an optical disk, a magnetic tape, or the like.
  • the processor 305 is configured to run the program for extracting a central word stored in the memory 304 to perform the following operations: acquiring text information input by a user; determining a syntax structure tree of the text information according to a preset syntax dependency rule; pruning the syntax structure tree; constructing a subtree according to the pruned syntax structure tree; calculating a preliminary score of each word node in the subtree; and extracting a central word in the text information according to the preliminary score.
  • pruning the syntax structure tree includes: acquiring the word nodes of some relationship components in the syntax structure tree; and deleting those word nodes.
  • constructing the subtree according to the pruned syntax structure tree includes: acquiring a core word node in the pruned syntax structure tree and other word nodes in a parallel relationship with the core word node; and constructing a subtree according to the core word node and the other word nodes in a parallel relationship with the core word node.
  • calculating the preliminary score of each word node in the subtree includes: calculating the preliminary score according to the part of speech, the syntactic relationship role, the word length, and the node depth of each word node in the subtree.
  • extracting the central word in the text information according to the preliminary score includes: sorting the word nodes in the subtree according to the preliminary score; and extracting the central word in the text information according to the sorted result.
  • calculating a preliminary score of each word node in the subtree includes: determining whether word nodes in a parallel relationship exist in the subtree; and, if word nodes in a parallel relationship exist in the subtree, calculating a preliminary score of each such word node according to the preset allocation rule.
  • the embodiment of the device for extracting a central word based on the syntax dependency relationship shown in FIG. 14 does not limit the specific configuration of the device.
  • in other embodiments, the device for extracting a central word based on a syntax dependency relationship may include more or fewer components than illustrated, may combine certain components, or may arrange the components differently.
  • for example, the device for extracting the central word based on the syntax dependency relationship may include only the memory and the processor. In such an embodiment, the structure and function of the memory and the processor are consistent with the illustrated embodiment and are not repeated here.
  • the application provides a computer readable storage medium storing one or more computer programs, the one or more computer programs being executable by one or more processors to implement the above-described method for extracting a central word based on a syntax dependency relationship.
  • the foregoing storage medium of the present application includes: a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM), and the like, which can store a program code.
  • the units in all the embodiments of the present application may be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
  • the steps in the method for extracting the central word based on the syntax dependency relationship in the embodiment of the present application may be sequentially adjusted, merged, and deleted according to actual needs.
  • the unit in the terminal that extracts the central word based on the syntax dependency relationship may be merged, divided, and deleted according to actual needs.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

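Embodiments of the present application disclose a method, a terminal, a device, and a storage medium for extracting a central word based on a syntax dependency relationship. The method comprises: acquiring text information input by a user; determining a syntax structure tree of the text information according to a preset syntax dependency rule; pruning the syntax structure tree; constructing a subtree according to the pruned syntax structure tree; calculating a preliminary score of each word node in the subtree; and extracting a central word in the text information according to the preliminary score. The application fully interprets the text information input by the user when extracting central words. In addition, using syntax dependency rules to construct subtrees in a targeted manner further improves the extraction results, and the method is easy to operate and flexible to configure.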

Description

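Method, terminal, device, and storage medium for extracting a central word based on a syntax dependency relationship
This application claims priority to Chinese patent application No. CN 201710458259.4, filed with the Chinese Patent Office on June 16, 2017 and entitled "Method, terminal, and device for extracting a central word based on a syntax dependency relationship", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technology, and in particular to a method, a terminal, a device, and a storage medium for extracting a central word based on a syntax dependency relationship.
Background
Central word extraction, as the name suggests, means extracting from a passage of natural language text the more important words that summarize its content. Commonly used central word extraction methods include TF-IDF and TextRank. These methods are fairly general, but they share a drawback: extracting central words solely from the features of the text input by the user is very limiting. For applications in specific domains, applying the existing methods directly yields unsatisfactory extraction results that cannot meet application requirements.
Summary
Embodiments of the present application provide a method, a terminal, a device, and a storage medium for extracting a central word based on a syntax dependency relationship, which can fully interpret the text information input by the user when extracting central words, thereby improving the extraction results, while being easy to operate and flexible to configure.
In one aspect, an embodiment of the present application provides a method for extracting a central word based on a syntax dependency relationship, the method comprising:
acquiring text information input by a user;
determining a syntax structure tree of the text information according to a preset syntax dependency rule;
pruning the syntax structure tree;
constructing a subtree according to the pruned syntax structure tree;
calculating a preliminary score of each word node in the subtree;
extracting a central word in the text information according to the preliminary score.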
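The six steps above compose into a simple pipeline. The following Python sketch only illustrates that composition. The function name `extract_central_word` and the four callables passed in are hypothetical placeholders rather than names from the application, and each stage is left to the caller to supply.

```python
from typing import Any, Callable, Dict


def extract_central_word(
    text: str,
    parse: Callable[[str], Any],
    prune: Callable[[Any], Any],
    build_subtree: Callable[[Any], Any],
    score: Callable[[Any], Dict[str, float]],
) -> str:
    """Run the steps in order and return the highest-scoring word.

    All four stage functions are assumptions supplied by the caller;
    the application does not prescribe their implementation.
    """
    tree = parse(text)                  # determine the syntax structure tree
    pruned = prune(tree)                # prune the tree
    subtree = build_subtree(pruned)     # construct the subtree
    scores = score(subtree)             # preliminary score per word node
    return max(scores, key=scores.get)  # extract the central word
```

Passing the stages in as parameters is one way to reflect the configurability the application emphasizes: a domain-specific pruning or scoring rule can be swapped in without touching the overall flow. Note that `max` raises `ValueError` if the scoring stage returns no words.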
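In another aspect, an embodiment of the present application further provides a terminal for extracting a central word based on a syntax dependency relationship, the terminal comprising:
a first obtaining unit, configured to acquire text information input by a user;
a determining unit, configured to determine a syntax structure tree of the text information according to a preset syntax dependency rule;
a pruning unit, configured to prune the syntax structure tree;
a building unit, configured to construct a subtree according to the pruned syntax structure tree;
a calculating unit, configured to calculate a preliminary score of each word node in the subtree;
an extracting unit, configured to extract a central word in the text information according to the preliminary score.
In yet another aspect, an embodiment of the present application further provides a device for extracting a central word based on a syntax dependency relationship, comprising:
a memory, configured to store a program that implements central word extraction; and
a processor, configured to run the program stored in the memory to perform the following operations:
acquiring text information input by a user;
determining a syntax structure tree of the text information according to a preset syntax dependency rule;
pruning the syntax structure tree;
constructing a subtree according to the pruned syntax structure tree;
calculating a preliminary score of each word node in the subtree;
extracting a central word in the text information according to the preliminary score.
In still another aspect, an embodiment of the present application further provides a computer-readable storage medium storing one or more computer programs, the one or more computer programs being executable by one or more processors to implement the following steps:
acquiring text information input by a user;
determining a syntax structure tree of the text information according to a preset syntax dependency rule;
pruning the syntax structure tree;
constructing a subtree according to the pruned syntax structure tree;
calculating a preliminary score of each word node in the subtree;
extracting a central word in the text information according to the preliminary score.
In summary, the present application has the following beneficial effects: the embodiment acquires text information input by the user, determines the syntax structure tree of the text information according to a preset syntax dependency rule, prunes the syntax structure tree, constructs a subtree according to the pruned syntax structure tree, calculates a preliminary score of each word node in the subtree, and extracts a central word in the text information according to the preliminary score. It can therefore fully interpret the text information input by the user when extracting central words. In addition, using syntax dependency rules to construct subtrees in a targeted manner further improves the extraction results, and the method is easy to operate and flexible to configure.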
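Brief Description of the Drawings
To describe the technical solutions of the embodiments of the present application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 2 is an illustrative diagram of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 3 is another illustrative diagram of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 4 is another schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 5 is another schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 6 is another illustrative diagram of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 7 is another schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 8 is a schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to another embodiment of the present application.
FIG. 9 is a schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 10 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 11 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 12 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 13 is another schematic block diagram of a terminal for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
FIG. 14 is a schematic diagram of the structural composition for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "comprising" and "including" indicate the presence of the described features, integers, steps, operations, elements, and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or collections thereof.
It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms unless the context clearly indicates otherwise.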
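Referring to FIG. 1, FIG. 1 is a schematic flowchart of a method for extracting a central word based on a syntax dependency relationship according to an embodiment of the present application. The method can run on terminals such as smartphones (for example, Android phones and iOS phones), tablets, laptops, and smart devices. The method mainly extracts the central word from the text information input by the user. It can fully interpret that text when extracting central words, improves the extraction results, and is easy to operate and flexible to configure. As shown in FIG. 1, the method includes steps S101 to S106.
S101: acquire text information input by the user.
In this embodiment, the text information input by the user may be query text entered in a web browser or search engine on the terminal. For example, if the user enters the query "胃痛,舌苔厚、发白是什么原因" ("What causes stomach pain and a thick, whitish tongue coating"), the web browser or search engine of the terminal obtains that query text in real time.
S102: determine a syntax structure tree of the text information according to a preset syntax dependency rule.
In this embodiment, the preset syntax dependency rule refers to revealing the syntactic structure of a language unit by analyzing the dependencies between its components, and identifying the syntactic relationships between words. The syntax structure tree of the text information is determined according to the preset syntax dependency rule. For example, for the text "胃痛,舌苔厚、发白是什么原因", analysis with the preset syntax dependency rule yields the syntax structure tree shown in FIG. 2, in which "是" ("is") is the core relationship of the text; "胃痛" ("stomach pain"), "舌苔厚" ("thick tongue coating"), and "舌苔发白" ("whitish tongue coating") are all subjects of the text and form subject-predicate relationships with the core relationship "是"; "原因" ("cause") is the object of the text; and "什么" ("what") forms an attributive relationship with the object "原因". Likewise, for the text "颈椎动脉供血不足,牵引治疗还是手术治疗能彻底治愈?" ("For insufficient blood supply in the cervical artery, can traction therapy or surgery cure it completely?"), analysis with the preset syntax dependency rule yields the syntax structure tree shown in FIG. 3.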
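The tree described for FIG. 2 can be written down as a small data structure. The sketch below is an illustration only: the `WordNode` class, the `walk` helper, and the relation labels (`HED`, `SBV`, `VOB`, `ATT`, in the style of common Chinese dependency parsers) are assumptions, since the application does not name a parser or a label set.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class WordNode:
    word: str
    relation: str  # hypothetical label for the relation to the parent node
    children: List["WordNode"] = field(default_factory=list)


def walk(node: WordNode, depth: int = 0) -> Iterator[Tuple[WordNode, int]]:
    """Yield every node together with its depth below the root."""
    yield node, depth
    for child in node.children:
        yield from walk(child, depth + 1)


# The FIG. 2 example as described in the text: "是" is the core, three
# subjects attach to it, and "什么" modifies the object "原因".
root = WordNode("是", "HED", [
    WordNode("胃痛", "SBV"),      # stomach pain
    WordNode("舌苔厚", "SBV"),    # thick tongue coating
    WordNode("舌苔发白", "SBV"),  # whitish tongue coating
    WordNode("原因", "VOB", [WordNode("什么", "ATT")]),  # cause / what
])

depths: Dict[str, int] = {node.word: depth for node, depth in walk(root)}
```

The `depths` mapping gives each word node its depth in the tree, which is one of the factors the scoring step relies on later.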
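S103: prune the syntax structure tree.
In this embodiment, the syntax structure tree contains a plurality of word nodes. Pruning the tree, for example the syntax structure tree in FIG. 3, removes stop words and the word nodes of relationship components that are not needed. The removed word nodes can therefore be configured for the specific domain application, and removing them does not affect the structure of the tree. Specifically, as shown in FIG. 4, pruning the syntax structure tree in step S103 includes the following steps S201 to S202.
S201: acquire the word nodes of some relationship components in the syntax structure tree.
In this embodiment, taking the syntax structure tree of FIG. 3 as an example, the word nodes of some relationship components include punctuation word nodes, word nodes of adverbial structures, and word nodes of other relationship components that are not needed. The punctuation word nodes are "?" and ","; the adverbial word nodes are "还是" ("or"), "能" ("can"), and "彻底" ("completely"); and the other unneeded word nodes in the tree include "治疗" ("treatment") and "治愈" ("cure"). Which other relationship components are not needed can be decided according to actual conditions, and the specific screening method is not limited herein.
S202: delete the word nodes of those relationship components.
In this embodiment, the word nodes of those relationship components are deleted directly from the syntax structure tree. After the deletion, the remaining word nodes keep their original relative hierarchical relationship in the syntax structure tree.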
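One way to delete nodes while keeping the relative hierarchy of the survivors is to reattach the children of a deleted node to its nearest surviving ancestor. The sketch below assumes that reading. The `WordNode` class, the relation labels, the drop sets, and the small example tree are all hypothetical; the tree reuses words from the FIG. 3 sentence but is not the tree shown in that figure.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WordNode:
    word: str
    relation: str  # hypothetical relation label
    children: List["WordNode"] = field(default_factory=list)


def prune(node: WordNode, drop: Callable[[WordNode], bool]) -> List[WordNode]:
    """Return the surviving subtrees that replace `node`.

    A dropped node is replaced by its surviving descendants, so every
    survivor stays below the same surviving ancestor as before.
    """
    survivors: List[WordNode] = []
    for child in node.children:
        survivors.extend(prune(child, drop))
    if drop(node):
        return survivors
    return [WordNode(node.word, node.relation, survivors)]


# Assumed, domain-configurable drop rules: punctuation (WP), adverbials (ADV),
# and a word list of unneeded components.
DROP_RELATIONS = {"WP", "ADV"}
DROP_WORDS = {"治疗", "治愈"}


def should_drop(node: WordNode) -> bool:
    return node.relation in DROP_RELATIONS or node.word in DROP_WORDS


tree = WordNode("不足", "HED", [
    WordNode("供血", "SBV"),
    WordNode(",", "WP"),
    WordNode("治愈", "COO", [
        WordNode("治疗", "SBV", [WordNode("牵引", "ATT")]),
        WordNode("彻底", "ADV"),
        WordNode("?", "WP"),
    ]),
])

pruned = prune(tree, should_drop)[0]
```

Because `prune` returns a list, it also covers the case where the root itself is dropped and several subtrees remain. A promoted node keeps its original relation label in this sketch.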
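S104: construct a subtree according to the pruned syntax structure tree.
Further, as shown in FIG. 5, step S104 includes steps S301 to S302.
S301: acquire the core word node in the pruned syntax structure tree and the other word nodes in a parallel relationship with the core word node.
In this embodiment, for example, in the syntax structure tree shown in FIG. 3, the core word node is "不足" ("insufficient"), and the word node in a parallel relationship with the core word node is "牵引" ("traction").
S302: construct a subtree according to the core word node and the other word nodes in a parallel relationship with the core word node.
In this embodiment, the subtree is constructed according to the relative hierarchical relationship of the nodes in the pruned syntax structure tree, together with the core word node and the other word nodes in a parallel relationship with it. For example, as shown in FIG. 6, the pruned syntax structure tree is built into the subtree shown in the figure, and in the subsequent central word extraction the constructed subtree is processed as a unit.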
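The text does not spell out how the subtree is assembled, so the following Python sketch shows only one possible reading: the core word node and each word node in a parallel relationship with it become the head of its own subtree, and each head keeps its non-parallel dependents. The `WordNode` class, the `COO` label for the parallel relationship, and the example tree are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordNode:
    word: str
    relation: str  # hypothetical relation label
    children: List["WordNode"] = field(default_factory=list)


def build_subtrees(core: WordNode, parallel: str = "COO") -> List[WordNode]:
    """Split a pruned tree into one subtree per head.

    Heads are the core node plus its direct children carrying the
    (assumed) parallel-relation label. Nested parallel nodes are not
    followed in this simplified sketch.
    """
    heads = [core] + [c for c in core.children if c.relation == parallel]
    subtrees: List[WordNode] = []
    for head in heads:
        dependents = [c for c in head.children if c.relation != parallel]
        subtrees.append(WordNode(head.word, head.relation, dependents))
    return subtrees


pruned = WordNode("不足", "HED", [
    WordNode("供血", "SBV", [WordNode("动脉", "ATT")]),
    WordNode("牵引", "COO"),
])

subtrees = build_subtrees(pruned)
```

Each returned subtree can then be scored as a unit, which matches the statement that later processing works per subtree.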
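S105: calculate a preliminary score of each word node in the subtree.
In this embodiment, the preliminary score of each word node in the subtree is calculated according to factors such as its part of speech, syntactic relationship role, word length, and node depth.
It should be noted that central words are usually nouns, verbs, adjectives, and the like, so these parts of speech score higher on the part-of-speech factor. Generally, the longer a word is, the more information it carries and the more likely it is to be a central word, so it scores higher. On the syntactic-relationship-role factor, the core relationship, the subject-predicate relationship, the verb-object relationship, and the like are more likely to mark a central word, so they score higher. In online medical consultation text, for example, word nodes at greater depth in the pruned syntax structure tree are usually important words, such as the body part where a symptom occurs, so the deeper a word node is, the higher its score. Specifically, the scores of these feature factors can be combined by weighting according to a preset weighting rule to obtain a comprehensive preliminary score for each word node. The preset weighting rule can be set by the user, and the specific rule is not limited herein.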
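The weighted combination of the four factors can be sketched as follows. Every number here is an assumption chosen for illustration: the application leaves the weighting rule to the user, and the per-factor tables (`POS_SCORE`, `ROLE_SCORE`), the `WEIGHTS`, the normalization by maximum length and depth, and the tag and label names are all hypothetical.

```python
from typing import Dict

# Hypothetical per-factor scores: nouns, verbs, and adjectives rank high,
# as do the core, subject-predicate, and verb-object roles.
POS_SCORE: Dict[str, float] = {"n": 1.0, "v": 0.8, "a": 0.7}
ROLE_SCORE: Dict[str, float] = {"HED": 1.0, "SBV": 0.9, "VOB": 0.9}
DEFAULT_SCORE = 0.1

# Hypothetical preset weighting rule; the weights sum to 1.
WEIGHTS: Dict[str, float] = {"pos": 0.4, "role": 0.3, "length": 0.2, "depth": 0.1}


def preliminary_score(word: str, pos: str, role: str, depth: int,
                      max_len: int, max_depth: int) -> float:
    """Weighted sum of the part-of-speech, role, length, and depth factors."""
    factors = {
        "pos": POS_SCORE.get(pos, DEFAULT_SCORE),
        "role": ROLE_SCORE.get(role, DEFAULT_SCORE),
        # Longer words and deeper nodes score higher, scaled to [0, 1].
        "length": len(word) / max_len if max_len else 0.0,
        "depth": depth / max_depth if max_depth else 0.0,
    }
    return sum(WEIGHTS[name] * value for name, value in factors.items())


subject = preliminary_score("舌苔发白", "n", "SBV", depth=1, max_len=4, max_depth=2)
modifier = preliminary_score("什么", "r", "ATT", depth=2, max_len=4, max_depth=2)
```

With these assumed weights, the long noun subject outranks the short question word even though the latter sits deeper in the tree, because part of speech and role carry more weight than depth.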
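S106: extract the central word in the text information according to the preliminary score.
Further, as shown in FIG. 7, step S106 includes steps S401 to S402.
S401: sort the word nodes in the subtree according to the preliminary score.
S402: extract the central word in the text information according to the sorted result.
In this embodiment, for example, the word node with the highest preliminary score is extracted and used as the central word of the text information.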
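Steps S401 and S402 amount to a sort followed by a selection. The sketch below assumes the scores are already available as a word-to-score mapping. The function name `extract_central_words`, the `top_k` parameter, and the example scores are illustrative assumptions, and the text itself only describes taking the single highest-scoring word.

```python
from typing import Dict, List


def extract_central_words(scores: Dict[str, float], top_k: int = 1) -> List[str]:
    """Sort word nodes by preliminary score (S401) and keep the top ones (S402)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:top_k]]


# Hypothetical preliminary scores for three word nodes.
scores = {"胃痛": 0.9, "原因": 0.5, "什么": 0.2}
top = extract_central_words(scores)
top_two = extract_central_words(scores, top_k=2)
```

Python's `sorted` is stable, so words with equal scores keep their original order in the mapping.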
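As can be seen from the above, this embodiment obtains the text information input by the user, determines the syntactic structure tree of the text information according to the preset syntactic dependency rules, prunes the tree, constructs subtrees from the pruned tree, calculates a preliminary score for each word node in the subtrees, and extracts the central word of the text information according to the preliminary scores. In this way the text information input by the user can be fully understood and its central word extracted. In addition, constructing subtrees in a targeted manner using the syntactic dependency rules further improves the quality of central-word extraction, and the method is easy to operate and flexible to configure.
Referring to FIG. 8, FIG. 8 is a schematic flowchart of a method for extracting a central word based on syntactic dependency relations according to an embodiment of the present application. The method can run on terminals such as smartphones (for example Android phones and iOS phones), tablet computers, notebook computers and smart devices. The method mainly extracts the central word from text information input by a user. It can fully understand the text information input by the user and extract the central word, improve the quality of central-word extraction, and is easy to operate and flexible to configure. As shown in FIG. 8, the method includes steps S501 to S507.
S501: obtain the text information input by the user.
In this embodiment, the text information input by the user may be a query entered in a web browser or search engine on the terminal. For example, if the user enters the query “胃痛,舌苔厚、发白是什么原因” ("Stomach pain, thick and whitish tongue coating: what is the cause?"), the web browser or search engine on the terminal obtains that query text in real time.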
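S502: determine the syntactic structure tree of the text information according to the preset syntactic dependency rules.
In this embodiment, the preset syntactic dependency rules reveal the syntactic structure of a language unit by analyzing the dependency relations between its components, and indicate how words combine syntactically. The syntactic structure tree of the text information is determined according to these rules. For example, for the text information “胃痛,舌苔厚、发白是什么原因” ("Stomach pain, thick and whitish tongue coating: what is the cause?"), analysis with the preset syntactic dependency rules yields the syntactic structure tree shown in FIG. 2. In that tree, “是” ("is") is the core relation of the text; “胃痛” ("stomach pain"), “舌苔厚” ("thick tongue coating") and “舌苔发白” ("whitish tongue coating") are all subjects of the text and form subject-verb relations with the core relation “是”; “原因” ("cause") is the object of the text; and “什么” ("what") forms an attributive (modifier-head) relation with the object “原因”. Similarly, for the text information “颈椎动脉供血不足,牵引治疗还是手术治疗能彻底治愈?” ("Insufficient blood supply in the cervical vertebral artery: can traction therapy or surgery cure it completely?"), analysis with the preset syntactic dependency rules yields the syntactic structure tree shown in FIG. 3.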
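S503: prune the syntactic structure tree.
In this embodiment, the syntactic structure tree contains a plurality of word nodes. Pruning the tree, for example the tree in FIG. 3, removes stop words and the word nodes of relation components that are not needed. Which word nodes are removed can be configured for the specific application domain, and removing them does not affect the rest of the structure tree.
S504: construct subtrees from the pruned syntactic structure tree.
S505: determine whether the subtree contains word nodes in a coordinate relation.
In this embodiment, a coordinate relation implies that the words involved play similar semantic roles. Therefore, after the preliminary scores are obtained, the scores of the words within each group of coordinate word nodes are adjusted according to word length.
S506: if the subtree contains word nodes in a coordinate relation, calculate the preliminary scores of those coordinate word nodes according to a preset allocation rule.
In this embodiment, if the subtree contains word nodes in a coordinate relation, the preliminary scores of those word nodes are calculated according to the preset allocation rule. Specifically, for each group of coordinate word nodes, the preliminary scores of the word nodes in the group are summed, and the total is then distributed to each word in proportion to its word length relative to the sum of the word lengths of all words in the group.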
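The allocation rule in step S506 is simple enough to state directly in code: sum the preliminary scores of a coordinate group, then hand the total back in proportion to word length. The function name and the scores in the example are hypothetical; the group itself reuses the three coordinate subjects of the first example sentence.

```python
def redistribute(group_scores):
    """Reallocate the summed scores of a coordinate group by word-length share."""
    total_score = sum(group_scores.values())
    total_length = sum(len(word) for word in group_scores)
    if total_length == 0:
        return {}
    return {
        word: total_score * len(word) / total_length
        for word in group_scores
    }


# Hypothetical preliminary scores for one group of coordinate word nodes.
group = {"胃痛": 0.5, "舌苔厚": 0.6, "舌苔发白": 0.7}

adjusted = redistribute(group)
print(adjusted)
```

The group total is unchanged by the reallocation; only the split between members moves toward the longer words, which is consistent with the earlier remark that longer words tend to carry more information.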
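S507: extract the central word of the text information according to the preliminary scores.
As can be seen from the above, this embodiment determines whether the subtree contains word nodes in a coordinate relation. If it does, the preliminary scores of those coordinate word nodes are recalculated according to the preset allocation rule, and the central word of the text information is extracted according to the preliminary scores, which helps extract the required central word more accurately.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.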
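Referring to FIG. 9, corresponding to the above method for extracting a central word based on syntactic dependency relations, an embodiment of the present application further provides a terminal for extracting a central word based on syntactic dependency relations. The terminal 100 includes a first acquisition unit 101, a determining unit 102, a pruning unit 103, a construction unit 104, a calculation unit 105 and an extraction unit 106.
The first acquisition unit 101 is configured to obtain the text information input by the user. The determining unit 102 is configured to determine the syntactic structure tree of the text information according to the preset syntactic dependency rules. The pruning unit 103 is configured to prune the syntactic structure tree. The construction unit 104 is configured to construct subtrees from the pruned syntactic structure tree. The calculation unit 105 is configured to calculate a preliminary score for each word node in the subtree. The extraction unit 106 is configured to extract the central word of the text information according to the preliminary scores.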
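The way the units of terminal 100 hand data to one another can be sketched as a pipeline of injected callables. This is only a structural illustration: the class name `CentralWordTerminal`, the field names and the toy stand-in functions in the usage example are all hypothetical, and the stand-ins do no real parsing.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CentralWordTerminal:
    determine: Callable[[str], Any]    # determining unit 102: text -> tree
    prune: Callable[[Any], Any]        # pruning unit 103: tree -> pruned tree
    build: Callable[[Any], list]       # construction unit 104: tree -> subtrees
    score: Callable[[Any], dict]       # calculation unit 105: subtree -> {word: score}
    extract: Callable[[dict], list]    # extraction unit 106: scores -> central words

    def run(self, text: str) -> list:
        """`text` plays the role of the output of the first acquisition unit 101."""
        tree = self.prune(self.determine(text))
        scores = {}
        for subtree in self.build(tree):
            scores.update(self.score(subtree))
        return self.extract(scores)


# Toy stand-ins so the sketch runs end to end; they are not real NLP components.
terminal = CentralWordTerminal(
    determine=str.split,
    prune=lambda tokens: [t for t in tokens if t != "the"],
    build=lambda tokens: [tokens],
    score=lambda tokens: {t: len(t) for t in tokens},
    extract=lambda scores: [max(scores, key=scores.get)],
)

result = terminal.run("the stomach hurts")
print(result)
```

Passing the stages in as callables keeps each unit replaceable, which fits the statement elsewhere in the description that units may be merged, divided or removed as needed.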
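As shown in FIG. 10, the pruning unit 103 includes: a second acquisition unit 1031, configured to obtain the word nodes of some relation components in the syntactic structure tree; and a deletion unit 1032, configured to delete the word nodes of these relation components.
As shown in FIG. 11, the construction unit 104 includes: a third acquisition unit 1041, configured to obtain the core word node in the pruned syntactic structure tree and the other word nodes in a coordinate relation with the core word node; and a construction subunit 1042, configured to construct subtrees from the core word node and the other word nodes in a coordinate relation with it.
As shown in FIG. 12, the extraction unit 106 includes: a sorting unit 1061, configured to sort the word nodes in the subtree according to the preliminary scores; and an extraction subunit 1062, configured to extract the central word of the text information according to the sorting result.
Referring to FIG. 13, corresponding to the above method for extracting a central word based on syntactic dependency relations, an embodiment of the present application further provides a terminal for extracting a central word based on syntactic dependency relations. The terminal 200 includes a first acquisition unit 201, a determining unit 202, a pruning unit 203, a construction unit 204, a judgment unit 205, a calculation subunit 206 and an extraction unit 207.
The first acquisition unit 201 is configured to obtain the text information input by the user; the determining unit 202 is configured to determine the syntactic structure tree of the text information according to the preset syntactic dependency rules; the pruning unit 203 is configured to prune the syntactic structure tree; the construction unit 204 is configured to construct subtrees from the pruned syntactic structure tree; the judgment unit 205 is configured to determine whether the subtree contains word nodes in a coordinate relation; the calculation subunit 206 is configured to calculate, if the subtree contains word nodes in a coordinate relation, the preliminary scores of those word nodes according to the preset allocation rule; and the extraction unit 207 is configured to extract the central word of the text information according to the preliminary scores.
In terms of hardware implementation, the first acquisition unit 101, determining unit 102, pruning unit 103, construction unit 104, calculation unit 105, extraction unit 106 and the like may be embedded in, or independent of, the data processing apparatus in hardware form, or may be stored in software form in the memory of the data processing apparatus so that the processor can call and execute the operations corresponding to each unit. The processor may be a central processing unit (CPU), a microprocessor, a single-chip microcomputer, or the like.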
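FIG. 14 is a schematic structural diagram of a device for extracting a central word based on syntactic dependency relations according to the present application. As shown in FIG. 14, the device 300 may include an input apparatus 301, an output apparatus 302, a transceiver apparatus 303, a memory 304 and a processor 305, where:
The input apparatus 301 is configured to receive input data from an external access control device. In a specific implementation, the input apparatus 301 in this embodiment may include a keyboard, a mouse, a photoelectric input apparatus, a sound input apparatus, a touch input apparatus, a scanner and the like.
The output apparatus 302 is configured to output the output data of the access control device externally. In a specific implementation, the output apparatus 302 in this embodiment may include a display, a speaker, a printer and the like.
The transceiver apparatus 303 is configured to send data to, or receive data from, other devices over a communication link. In a specific implementation, the transceiver apparatus 303 in this embodiment may include transceiver components such as a radio-frequency antenna.
The memory 304 is configured to store a program that implements central-word extraction. The memory 304 in this embodiment may be system memory, for example volatile memory (such as RAM), non-volatile memory (such as ROM or flash memory), or a combination of the two. In a specific implementation, the memory 304 may also be external storage outside the system, such as a magnetic disk, an optical disc or a magnetic tape.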
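The processor 305 is configured to run the program stored in the memory 304 that implements central-word extraction, so as to perform the following operations: obtaining text information input by a user; determining a syntactic structure tree of the text information according to preset syntactic dependency rules; pruning the syntactic structure tree; constructing a subtree from the pruned syntactic structure tree; calculating a preliminary score for each word node in the subtree; and extracting the central word of the text information according to the preliminary scores.
Further, pruning the syntactic structure tree includes: obtaining the word nodes of some relation components in the syntactic structure tree; and deleting the word nodes of these relation components.
Further, constructing a subtree from the pruned syntactic structure tree includes: obtaining the core word node in the pruned syntactic structure tree and the other word nodes in a coordinate relation with the core word node; and constructing the subtree from the core word node and the other word nodes in a coordinate relation with it.
Further, calculating a preliminary score for each word node in the subtree includes: calculating the preliminary score according to the part of speech, syntactic relation role, word length and node depth of each word node in the subtree.
Further, extracting the central word of the text information according to the preliminary scores includes: sorting the word nodes in the subtree according to the preliminary scores; and extracting the central word of the text information according to the sorting result.
Further, calculating a preliminary score for each word node in the subtree includes: determining whether the subtree contains word nodes in a coordinate relation; and, if it does, calculating the preliminary scores of those coordinate word nodes according to a preset allocation rule.
A person skilled in the art can understand that the embodiment of the device shown in FIG. 14 does not limit the specific structure of the device for extracting a central word based on syntactic dependency relations. In other embodiments, the device may include more or fewer components than shown, combine certain components, or arrange the components differently. For example, in some embodiments the device may include only a memory and a processor; in such embodiments the structure and function of the memory and the processor are the same as in the embodiment shown in FIG. 14 and are not described again here.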
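The present application provides a computer-readable storage medium storing one or more computer programs, which can be executed by one or more processors to implement the above method for extracting a central word based on syntactic dependency relations.
The storage medium referred to in the present application includes various media that can store program code, such as a magnetic disk, an optical disc, a read-only memory (ROM) or a random access memory (RAM).
The units in all embodiments of the present application can be implemented by a general-purpose integrated circuit, such as a CPU (Central Processing Unit), or by an ASIC (Application Specific Integrated Circuit).
The steps of the method for extracting a central word based on syntactic dependency relations in the embodiments of the present application can be reordered, merged and deleted according to actual needs.
The units of the terminal for extracting a central word based on syntactic dependency relations in the embodiments of the present application can be merged, divided and deleted according to actual needs.
The above are only specific implementations of the present application, but the scope of protection of the present application is not limited to them. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in the present application, and these modifications or replacements shall fall within the scope of protection of the present application. Therefore, the scope of protection of the present application shall be subject to the scope of protection of the claims.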

Claims (20)

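  1. A method for extracting a central word based on syntactic dependency relations, characterized in that the method comprises:
    obtaining text information input by a user;
    determining a syntactic structure tree of the text information according to preset syntactic dependency rules;
    pruning the syntactic structure tree;
    constructing a subtree from the pruned syntactic structure tree;
    calculating a preliminary score for each word node in the subtree;
    extracting the central word of the text information according to the preliminary scores.
  2. The method according to claim 1, characterized in that pruning the syntactic structure tree comprises:
    obtaining the word nodes of some relation components in the syntactic structure tree;
    deleting the word nodes of these relation components.
  3. The method according to claim 1, characterized in that constructing a subtree from the pruned syntactic structure tree comprises:
    obtaining the core word node in the pruned syntactic structure tree and the other word nodes in a coordinate relation with the core word node;
    constructing the subtree from the core word node and the other word nodes in a coordinate relation with the core word node.
  4. The method according to claim 1, characterized in that calculating a preliminary score for each word node in the subtree comprises:
    calculating the preliminary score according to the part of speech, syntactic relation role, word length and node depth of each word node in the subtree;
    and extracting the central word of the text information according to the preliminary scores comprises:
    sorting the word nodes in the subtree according to the preliminary scores;
    extracting the central word of the text information according to the sorting result.
  5. The method according to claim 1, characterized in that calculating a preliminary score for each word node in the subtree comprises:
    determining whether the subtree contains word nodes in a coordinate relation;
    if the subtree contains word nodes in a coordinate relation, calculating the preliminary scores of those coordinate word nodes in the subtree according to a preset allocation rule.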
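  6. A terminal for extracting a central word based on syntactic dependency relations, characterized in that the terminal comprises:
    a first acquisition unit, configured to obtain text information input by a user;
    a determining unit, configured to determine a syntactic structure tree of the text information according to preset syntactic dependency rules;
    a pruning unit, configured to prune the syntactic structure tree;
    a construction unit, configured to construct a subtree from the pruned syntactic structure tree;
    a calculation unit, configured to calculate a preliminary score for each word node in the subtree;
    an extraction unit, configured to extract the central word of the text information according to the preliminary scores.
  7. The terminal according to claim 6, characterized in that the pruning unit comprises:
    a second acquisition unit, configured to obtain the word nodes of some relation components in the syntactic structure tree;
    a deletion unit, configured to delete the word nodes of these relation components.
  8. The terminal according to claim 6, characterized in that the construction unit comprises:
    a third acquisition unit, configured to obtain the core word node in the pruned syntactic structure tree and the other word nodes in a coordinate relation with the core word node;
    a construction subunit, configured to construct the subtree from the core word node and the other word nodes in a coordinate relation with the core word node.
  9. The terminal according to claim 6, characterized in that
    the calculation unit is specifically configured to:
    calculate the preliminary score according to the part of speech, syntactic relation role, word length and node depth of each word node in the subtree;
    and the extraction unit comprises:
    a sorting unit, configured to sort the word nodes in the subtree according to the preliminary scores;
    an extraction subunit, configured to extract the central word of the text information according to the sorting result.
  10. The terminal according to claim 6, characterized in that the calculation unit comprises:
    a judgment unit, configured to determine whether the subtree contains word nodes in a coordinate relation;
    a calculation subunit, configured to calculate, if the subtree contains word nodes in a coordinate relation, the preliminary scores of those coordinate word nodes in the subtree according to a preset allocation rule.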
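  11. A device for extracting a central word based on syntactic dependency relations, characterized by comprising:
    a memory, configured to store a program that implements central-word extraction; and
    a processor, configured to run the program stored in the memory that implements central-word extraction, so as to perform the following operations:
    obtaining text information input by a user;
    determining a syntactic structure tree of the text information according to preset syntactic dependency rules;
    pruning the syntactic structure tree;
    constructing a subtree from the pruned syntactic structure tree;
    calculating a preliminary score for each word node in the subtree;
    extracting the central word of the text information according to the preliminary scores.
  12. The device according to claim 11, characterized in that pruning the syntactic structure tree comprises:
    obtaining the word nodes of some relation components in the syntactic structure tree;
    deleting the word nodes of these relation components.
  13. The device according to claim 11, characterized in that constructing a subtree from the pruned syntactic structure tree comprises:
    obtaining the core word node in the pruned syntactic structure tree and the other word nodes in a coordinate relation with the core word node;
    constructing the subtree from the core word node and the other word nodes in a coordinate relation with the core word node.
  14. The device according to claim 11, characterized in that calculating a preliminary score for each word node in the subtree comprises:
    calculating the preliminary score according to the part of speech, syntactic relation role, word length and node depth of each word node in the subtree;
    and extracting the central word of the text information according to the preliminary scores comprises:
    sorting the word nodes in the subtree according to the preliminary scores;
    extracting the central word of the text information according to the sorting result.
  15. The device according to claim 11, characterized in that calculating a preliminary score for each word node in the subtree comprises:
    determining whether the subtree contains word nodes in a coordinate relation;
    if the subtree contains word nodes in a coordinate relation, calculating the preliminary scores of those coordinate word nodes in the subtree according to a preset allocation rule.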
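  16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores one or more computer programs, which can be executed by one or more processors to implement the following steps:
    obtaining text information input by a user;
    determining a syntactic structure tree of the text information according to preset syntactic dependency rules;
    pruning the syntactic structure tree;
    constructing a subtree from the pruned syntactic structure tree;
    calculating a preliminary score for each word node in the subtree;
    extracting the central word of the text information according to the preliminary scores.
  17. The computer-readable storage medium according to claim 16, characterized in that pruning the syntactic structure tree comprises:
    obtaining the word nodes of some relation components in the syntactic structure tree;
    deleting the word nodes of these relation components.
  18. The computer-readable storage medium according to claim 16, characterized in that constructing a subtree from the pruned syntactic structure tree comprises:
    obtaining the core word node in the pruned syntactic structure tree and the other word nodes in a coordinate relation with the core word node;
    constructing the subtree from the core word node and the other word nodes in a coordinate relation with the core word node.
  19. The computer-readable storage medium according to claim 16, characterized in that calculating a preliminary score for each word node in the subtree comprises:
    calculating the preliminary score according to the part of speech, syntactic relation role, word length and node depth of each word node in the subtree;
    and extracting the central word of the text information according to the preliminary scores comprises:
    sorting the word nodes in the subtree according to the preliminary scores;
    extracting the central word of the text information according to the sorting result.
  20. The computer-readable storage medium according to claim 16, characterized in that calculating a preliminary score for each word node in the subtree comprises:
    determining whether the subtree contains word nodes in a coordinate relation;
    if the subtree contains word nodes in a coordinate relation, calculating the preliminary scores of those coordinate word nodes in the subtree according to a preset allocation rule.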
PCT/CN2018/077142 2017-06-16 2018-02-24 Method, terminal, device and storage medium for extracting a central word based on syntactic dependency relations WO2018227995A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710458259.4 2017-06-16
CN201710458259.4A CN107748742A (zh) 2017-06-16 2017-06-16 Method, terminal and device for extracting a central word based on syntactic dependency relations

Publications (1)

Publication Number Publication Date
WO2018227995A1 true WO2018227995A1 (zh) 2018-12-20

Family

ID=61255414

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/077142 WO2018227995A1 (zh) 2017-06-16 2018-02-24 Method, terminal, device and storage medium for extracting a central word based on syntactic dependency relations

Country Status (2)

Country Link
CN (1) CN107748742A (zh)
WO (1) WO2018227995A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111985232A (zh) * 2020-08-10 2020-11-24 南京航空航天大学 NLP-based domain model extraction method for airborne display and control system requirements

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
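CN110569494B (zh) * 2018-06-05 2023-04-07 北京百度网讯科技有限公司 Method, apparatus, electronic device and readable medium for generating information
CN109033073B (zh) * 2018-06-28 2020-07-28 中国科学院自动化研究所 Textual entailment recognition method and device based on lexical dependency triples
CN109190115B (zh) * 2018-08-14 2023-05-26 重庆邂智科技有限公司 Text matching method, device, server and storage medium
CN110069624B (zh) 2019-04-28 2021-05-04 北京小米智能科技有限公司 Text processing method and device
CN112487801A (zh) * 2020-10-23 2021-03-12 南京航空航天大学 Term recommendation method and system for safety-critical software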

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6374209B1 (en) * 1998-03-19 2002-04-16 Sharp Kabushiki Kaisha Text structure analyzing apparatus, abstracting apparatus, and program recording medium
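CN101246492A (zh) * 2008-02-26 2008-08-20 华中科技大学 Full-text retrieval system based on natural language
CN101510221A (zh) * 2009-02-17 2009-08-19 北京大学 Query statement analysis method and system for information retrieval
CN103020148A (zh) * 2012-11-23 2013-04-03 复旦大学 System and method for converting a Chinese phrase-structure treebank into a dependency-structure treebank
CN106528531A (zh) * 2016-10-31 2017-03-22 北京百度网讯科技有限公司 Intention analysis method and device based on artificial intelligence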

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
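CN111985232A (zh) * 2020-08-10 2020-11-24 南京航空航天大学 NLP-based domain model extraction method for airborne display and control system requirements
CN111985232B (zh) * 2020-08-10 2024-04-19 南京航空航天大学 NLP-based domain model extraction method for airborne display and control system requirements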

Also Published As

Publication number Publication date
CN107748742A (zh) 2018-03-02

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18818695

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18818695

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205N DATED 06/02/2020)
