CN115204190A - Device and method for converting financial field terms into English - Google Patents

Device and method for converting financial field terms into English

Info

Publication number
CN115204190A
Authority
CN
China
Prior art keywords
word
english
term
splitting
terms
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211107345.8A
Other languages
Chinese (zh)
Other versions
CN115204190B (en)
Inventor
韩双江
姜长江
苏雯斐
王辉
黄荣辉
战启铭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sino Credit Information Technology Beijing Co ltd
Original Assignee
Sino Credit Information Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-09-13
Filing date
2022-09-13
Publication date
2022-10-18
Application filed by Sino Credit Information Technology Beijing Co ltd
Priority to CN202211107345.8A
Publication of CN115204190A
Application granted
Publication of CN115204190B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a device for converting financial field terms into English, comprising: a term bank preprocessing module for loading a term bank in which financial field terms are preset; a Chinese semantic analysis word segmentation module for acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to the financial field terms; a Chinese-to-English conversion module for converting the split words and/or phrases into English terms; and a result output module for outputting the sentence to be processed and the corresponding English terms. The device has the beneficial effects of accurately analyzing financial field semantics and accurately converting them into English terms. The invention also provides a method for converting financial field terms into English, which likewise offers accurate semantic analysis of the financial field and accurate conversion into English terms.

Description

Device and method for converting financial field terms into English
Technical Field
The invention relates to the technical field of semantic analysis of terms in the financial field. More particularly, the present invention relates to a device and method for converting financial domain terminology into English.
Background
Existing tools in the industry include jieba word segmentation, the HanLP Chinese language processing package, the jcseg lightweight Java Chinese word segmenter, sego Chinese word segmentation, FoolNLTK Chinese word segmentation, ansj Chinese word segmentation, the word segmenter, and the like. Most of these tools split text according to general Chinese semantics, so the resulting word groups are mostly compound Chinese semantic phrases that do not meet the needs of splitting and using financial terms; moreover, these tools only perform word segmentation and do not convert Chinese terms into English terms. The industry also has a tool developed by Teradata staff that is based on Microsoft Excel. That tool is convenient and can meet the needs of financial term splitting and Chinese-English conversion, but it is implemented in VB on top of Microsoft Excel; Excel raises copyright issues in most scenarios and cannot be installed on many systems.
Disclosure of Invention
An object of the present invention is to solve at least the above problems and to provide at least the advantages described later.
To achieve these objects and other advantages in accordance with the purpose of the invention, there is provided a device for converting financial field terms into English, comprising:
a term bank preprocessing module for loading a term bank in which financial field terms are preset;
a Chinese semantic analysis word segmentation module for acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to the financial field terms;
a Chinese-to-English conversion module for converting the split words and/or phrases into the corresponding English terms;
wherein the method by which the Chinese semantic analysis word segmentation module splits the sentence comprises the following steps:
step one, splitting the sentence character by character, counting the number n of Chinese characters, forming the split characters into an ordered set An = (A1, A2, ..., Am, ..., An) in sentence order, and setting n as the number of iteration cycles;
step two, semantic analysis: taking the first element A1 of the ordered set An and comparing it with the term bank; if it hits, recording the element and mapping the corresponding English term; then cumulatively combining the element with the remaining elements of the set An, one at a time, to form new candidate words, comparing each new candidate word with the term bank, and collecting the hit element and hit words into an ordered set Bn; new candidate words are created iteratively until A1 has been cumulatively combined with the remaining n-1 elements;
step three, reading the longest minimum-unit semantic word in the set Bn as the first split term word, whose length is m; starting the second round of iteration from element A(m+1), reading the longest minimum-unit semantic word from the second-round set Bn as the second hit split term word, and mapping the corresponding English term; repeating the iteration until m = n, whereupon semantic splitting is complete and the split vocabulary is formed;
and a result output module for outputting the sentence to be processed and the corresponding English terms.
Preferably, the term bank comprises a built-in term bank and a custom term bank; the built-in term bank contains a plurality of financial field terms, the custom term bank is used for adding new terms, and the custom term bank has a higher priority than the built-in term bank.
Preferably, if no word hits when comparing with the term bank in step two and step three, the longest word accumulated by combining element A1 through An is taken as the split word, and the mapped English term is represented by a placeholder.
Preferably, the device further comprises a target semantic integration module for splicing adjacent split words and/or phrases with a preset symbol, and for splicing the corresponding adjacent English terms with a preset symbol.
Preferably, the target semantic integration module provides the English terms corresponding to the sentence in camel case, all uppercase and all lowercase.
Preferably, the result output module outputs the result either to the console or as an Excel file.
A method for converting financial field terms into English is also provided, comprising the following steps:
step one, acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to financial field terms, wherein a term bank containing a plurality of financial field terms is preset;
the method for splitting the sentence comprises the following steps:
step a, splitting the sentence character by character, counting the number n of Chinese characters, forming the split characters into an ordered set An = (A1, A2, ..., Am, ..., An) in sentence order, and setting n as the number of iteration cycles;
step b, semantic analysis: taking the first element A1 of the ordered set An and comparing it with the term bank; if it hits, recording the element and mapping the corresponding English term; then cumulatively combining the element with the remaining elements of the set An, one at a time, to form new candidate words, comparing each new candidate word with the term bank, and collecting the hit element and hit words into an ordered set Bn; new candidate words are created iteratively until A1 has been cumulatively combined with the remaining n-1 elements;
step c, reading the longest minimum-unit semantic word in the set Bn as the first split term word, whose length is m; starting the second round of iteration from element A(m+1), reading the longest minimum-unit semantic word from the second-round set Bn as the second hit split term word, and mapping the corresponding English term; repeating the iteration until m = n, whereupon the loop ends, semantic splitting is complete and the split vocabulary is formed;
step two, converting the split words and/or phrases into English terms;
and step three, outputting the sentence to be processed and the corresponding English terms.
The invention has at least the following beneficial effects: the word segmentation algorithm of the device was designed entirely in-house by the company and can independently meet the need for term splitting in the financial field. The device splits sentences into words through Chinese semantic analysis and also converts Chinese terms into English terms based on the built-in term bank and the custom term bank. From the conversion result it outputs the English term vocabulary in three forms (camel case, all uppercase and all lowercase) in one step, so it is convenient to use and requires no secondary processing.
The device can run on the Windows operating system as well as on Linux or Unix operating systems, so its environment requirements are low.
Additional advantages, objects, and features of the invention will be set forth in part in the description which follows and in part will become apparent to those having ordinary skill in the art upon examination of the following or may be learned from practice of the invention.
Drawings
FIG. 1 is a block diagram of the apparatus according to one embodiment of the present invention;
FIG. 2 is a flow chart of semantic splitting according to one embodiment of the present invention;
FIG. 3 is a diagram of an example analysis by the Chinese semantic analysis word segmentation module of the present invention;
FIG. 4 is a screenshot of a user interface of one of the embodiments of the present invention;
FIG. 5 is an example of EXCEL output according to one embodiment of the present invention.
Detailed Description
The present invention is further described in detail below with reference to the attached drawings so that those skilled in the art can implement the invention by referring to the description text.
It is to be understood that the terminology indicating the positions or positional relationships is based on the positions or positional relationships shown in the drawings, and is for the purpose of convenience in describing the invention and simplifying the description, and does not indicate or imply that the device or element being referred to must have a particular orientation, configuration and operation in a particular orientation, and therefore, should not be taken as limiting the invention.
As shown in FIG. 1 to FIG. 5, the present invention provides a device for converting financial field terms into English, comprising:
The term bank preprocessing module is used for loading a term bank in which financial field terms are preset. The module has two built-in term processing mechanisms: the built-in term bank contains common financial field terms summarized in the industry in recent years, and the custom term bank can modify terms of the built-in term bank or add new terms as needed. After the device starts, the term bank preprocessing module loads the built-in term bank first and then the custom term bank. If the custom term bank contains a term that also exists in the built-in term bank, the custom entry overrides the built-in entry. Custom terms are effective only during the current run of the device and do not permanently overwrite the built-in term bank.
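The loading order described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `TermBank`, its methods and the override value "Acc" are hypothetical, and the Chinese terms are written as Unicode escapes so that the source compiles regardless of file encoding.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical in-memory term bank: built-in terms plus session-only custom overrides. */
class TermBank {
    private final Map<String, String> terms = new LinkedHashMap<>();

    /** Loads the built-in terms first, then lets custom terms override or extend them. */
    TermBank(Map<String, String> builtIn, Map<String, String> custom) {
        terms.putAll(builtIn);   // first: built-in term bank
        terms.putAll(custom);    // second: custom entries win on duplicate keys
    }

    boolean contains(String chinese) {
        return terms.containsKey(chinese);
    }

    /** Returns the English term, or null when the Chinese term is unknown. */
    String english(String chinese) {
        return terms.get(chinese);
    }

    public static void main(String[] args) {
        Map<String, String> builtIn = new LinkedHashMap<>();
        builtIn.put("\u8d26\u6237", "Acct");   // "account", built-in entry
        Map<String, String> custom = new LinkedHashMap<>();
        custom.put("\u8d26\u6237", "Acc");     // assumed custom override of the same term
        custom.put("\u6570", "Cnt");           // new custom term, "number"
        TermBank bank = new TermBank(builtIn, custom);
        System.out.println(bank.english("\u8d26\u6237"));   // prints Acc
        System.out.println(builtIn.get("\u8d26\u6237"));    // prints Acct
    }
}
```

Because the merged map lives only in memory, the override disappears when the process exits and the built-in source map is never modified, which mirrors the statement that custom terms do not permanently overwrite the built-in term bank.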
The Chinese semantic analysis word segmentation module is used for acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to the financial field terms. After the term bank preprocessing module finishes loading, the Chinese semantic analysis word segmentation module starts and receives the sentence to be processed. Once the sentence is received, the module compares it, in sentence order, against the term bank, takes the longest financial field terms that hit, and splits the sentence at those words; words that do not hit are uniformly represented by placeholders. This completes the splitting of the sentence according to the financial field terms.
The Chinese-to-English conversion module is used for converting the split words and/or phrases into English terms, that is, for converting the split vocabulary of the sentence into the corresponding English terms.
The result output module is used for outputting the sentence to be processed and the corresponding English terms. A single sentence is output in console standard mode, and batches of sentences are output as an Excel file.
In the above technical solution, the device mainly handles two functions: first, splitting Chinese sentences into words by Chinese semantic analysis; second, converting the split Chinese words into English terms. To realize these two functions, the device mainly comprises four processing modules: the term bank preprocessing module, the Chinese semantic analysis word segmentation module, the Chinese-to-English conversion module and the result output module. The device is designed and developed in Java. Its deployment environment, the JDK, is open source and involves no copyright issues. In addition, Java is compiled once and runs anywhere, so the device supports the mainstream operating environments on the market: it can run on Windows as well as on Linux or Unix operating systems, and its environment requirements are low. The word segmentation algorithm of the device was designed entirely in-house by the company and can independently meet the need for term splitting in the financial field. The device splits sentences into words through Chinese semantic analysis and also converts Chinese terms into English terms based on the built-in term bank and the custom term bank. From the conversion result it outputs the English term vocabulary in three forms (camel case, all uppercase and all lowercase) in one step, so it is convenient to use and requires no secondary processing.
In another technical scheme, the method by which the Chinese semantic analysis word segmentation module splits the sentence comprises the following steps:
Step one: split the sentence character by character, count the number n of Chinese characters, form the split characters into an ordered set An = (A1, A2, ..., Am, ..., An) in sentence order, and set n as the number of iteration cycles. After the iteration counter is set, semantic analysis begins. Each round of semantic analysis creates a set Bn of hit minimum-unit semantic words for that round; Bn records all hit split words of the round and their mapped English terms.
Step two, semantic analysis: take the first element A1 of the ordered set An and compare it with the term bank. If it hits, record the element in the set Bn and map the corresponding English term. Then cumulatively combine the element with the remaining elements Am of the set An, one at a time, to form new candidate words, compare each new candidate word with the term bank, and collect the hit element and hit words into the ordered set Bn. New candidate words are created iteratively until A1 has been cumulatively combined with the remaining n-1 elements.
Step three: after the iteration for element A1 is complete, read the longest minimum-unit semantic word in the set Bn as the first split term word; its length is m. Start the second round of iteration from element A(m+1), read the longest minimum-unit semantic word from the second-round set Bn as the second hit split term word, and map the corresponding English term. Repeat the iteration until m = n, at which point the loop ends, semantic splitting is complete, and the split vocabulary is formed.
As shown in FIG. 3:
Suppose the sentence to be processed is "对公存款账户数" (the number of corporate deposit accounts). Splitting it character by character gives n = 7, and the characters with their positions are "对 (1)", "公 (2)", "存 (3)", "款 (4)", "账 (5)", "户 (6)" and "数 (7)".
The first iteration starts from A1 = 对 (1). The cumulatively combined candidate words are "对", "对公", "对公存", "对公存款", "对公存款账", "对公存款账户" and "对公存款账户数", 7 new words in total. When they are matched against the term bank, only "对" and "对公" hit, so Bn = [对, 对公] and the longest semantic word is "对公", that is, m = 2. The second iteration starts from A3 = 存 (3) and yields the longest semantic word "存款" (deposit); the third iteration starts from A5 = 账 (5) and yields "账户" (account); the fourth iteration starts from A7 = 数 (7) and yields "数" (number). After splitting, the final semantic word segments are "对公", "存款", "账户" and "数".
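The steps and the worked example above can be sketched in Java as follows. This is a hedged illustration, not the applicant's code: the class name `TermSegmenter` and its method are hypothetical, the term bank is reduced to a `Set` of Chinese terms, and a round with no hit takes the whole accumulated remainder as the split word, which is one reading of the fallback described in the text. Chinese strings are written as Unicode escapes so that the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the round-by-round longest-match splitting described above. */
class TermSegmenter {
    /** Returns the split words of the sentence, in sentence order. */
    static List<String> segment(String sentence, Set<String> termBank) {
        // Step one: one element per character (code point); n bounds the number of rounds.
        int[] a = sentence.codePoints().toArray();
        int n = a.length;
        List<String> result = new ArrayList<>();
        int start = 0;   // index of the first element not yet covered by a split word
        while (start < n) {
            // Step two: accumulate a[start], a[start..start+1], ... and record hits in Bn.
            List<String> bn = new ArrayList<>();
            StringBuilder candidate = new StringBuilder();
            for (int i = start; i < n; i++) {
                candidate.appendCodePoint(a[i]);
                if (termBank.contains(candidate.toString())) {
                    bn.add(candidate.toString());
                }
            }
            // Step three: hits were added in increasing length, so the last hit is the longest.
            // With no hit, the whole accumulated remainder becomes the split word (assumption).
            String word = bn.isEmpty() ? candidate.toString() : bn.get(bn.size() - 1);
            result.add(word);
            start += word.codePointCount(0, word.length());   // next round starts at A(m+1)
        }
        return result;
    }

    public static void main(String[] args) {
        // Term bank of the worked example: dui, dui-gong, cun-kuan, zhang-hu, shu.
        Set<String> bank = new HashSet<>(Arrays.asList(
                "\u5bf9", "\u5bf9\u516c", "\u5b58\u6b3e", "\u8d26\u6237", "\u6570"));
        List<String> words = segment("\u5bf9\u516c\u5b58\u6b3e\u8d26\u6237\u6570", bank);
        System.out.println(words.size());   // prints 4
    }
}
```

Keeping every hit in `bn` rather than only the longest one follows the text, which says Bn records all hit split words of a round; an implementation that only needs the final split could track the longest hit alone.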
After splitting, the amount of computation is approximately the sum of an arithmetic series, $S_n = \frac{n(a_1 + a_n)}{2}$. Assuming that each iteration takes a similar amount of processing time, the time complexity is determined by the number of iterations, that is, by the number of Chinese characters in the sentence: $O(n)$.
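As a worked reading of the formula (an interpretation, not stated in the source), take $a_i$ to be the number of candidate words compared against the term bank in round $i$, assuming each round accumulates candidates up to the end of the sentence. In the FIG. 3 example the rounds start at A1, A3, A5 and A7, so they compare 7, 5, 3 and 1 candidates respectively:

```latex
% Worked example: n = 7 characters, 4 rounds with 7, 5, 3 and 1 candidates
S_4 = \frac{4\,(a_1 + a_4)}{2} = \frac{4\,(7 + 1)}{2} = 16

% Upper bound: every split word is a single character, so there are n rounds
S_n = \frac{n\,(a_1 + a_n)}{2} = \frac{n\,(n + 1)}{2}, \qquad S_7 = \frac{7 \cdot 8}{2} = 28
```

Under this reading the number of rounds is at most n, which is the quantity the O(n) statement refers to, while the total number of candidate comparisons can reach n(n + 1)/2.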
After the Chinese semantic analysis, the split vocabulary is formed and is then processed by the Chinese-to-English conversion module of the device. For example, "对公" is converted into "Corp", "存款" into "Dpst", "账户" into "Acct" and "数" into "Cnt".
In another technical scheme, if no word hits when comparing with the term bank in step two and step three, the longest word accumulated by combining element A1 through An is taken as the split word, and the mapped English term is represented by a placeholder. That is, after the iteration for A1 is complete, the longest minimum-unit semantic word is read from Bn as the split term word; if nothing in the term bank was hit during processing, the longest word accumulated from A1 to An is used as the split term, and its English mapping is represented by the "+" placeholder. Assuming that "数" does not exist in the term bank, the integrated Chinese is "对公_存款_账户_数+" and the integrated English is "Corp_Dpst_Acct_+".
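The conversion with the "+" placeholder can be sketched as below. This is an illustrative sketch under assumptions: the class `TermConverter` and its method names are hypothetical, the term bank is modelled as a plain `Map` from Chinese term to English term, and the split words are assumed to have been produced already. Chinese strings are written as Unicode escapes so that the source compiles regardless of file encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of Chinese-to-English mapping with a "+" placeholder for unmatched words. */
class TermConverter {
    static final String PLACEHOLDER = "+";

    /** Maps each split word to its English term, or to "+" when the term bank has no entry. */
    static List<String> toEnglish(List<String> words, Map<String, String> termBank) {
        List<String> english = new ArrayList<>();
        for (String word : words) {
            english.add(termBank.getOrDefault(word, PLACEHOLDER));
        }
        return english;
    }

    /** Appends "+" to every Chinese word that has no entry, so misses are easy to spot. */
    static List<String> markMisses(List<String> words, Map<String, String> termBank) {
        List<String> marked = new ArrayList<>();
        for (String word : words) {
            marked.add(termBank.containsKey(word) ? word : word + PLACEHOLDER);
        }
        return marked;
    }

    public static void main(String[] args) {
        Map<String, String> bank = new HashMap<>();
        bank.put("\u5bf9\u516c", "Corp");   // dui-gong, "corporate"
        bank.put("\u5b58\u6b3e", "Dpst");   // cun-kuan, "deposit"
        bank.put("\u8d26\u6237", "Acct");   // zhang-hu, "account"
        // shu ("number") is deliberately absent to trigger the placeholder.
        List<String> words = Arrays.asList(
                "\u5bf9\u516c", "\u5b58\u6b3e", "\u8d26\u6237", "\u6570");
        System.out.println(String.join("_", toEnglish(words, bank)));   // prints Corp_Dpst_Acct_+
    }
}
```

Marking the Chinese side as well as the English side matches the example in the text, where the unmatched Chinese word carries a trailing "+" and its English slot is just "+".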
In another technical scheme, the device further comprises a target semantic integration module for splicing adjacent split words and/or phrases with a preset symbol, and for splicing the corresponding adjacent English terms with a preset symbol.
In the above technical solution, the processed Chinese word segments and English terms are handled by the target semantic integration module, which splices the words with the "_" underscore. For example, the word segments after Chinese splitting are integrated into "对公_存款_账户_数", and the English terms are integrated into "Corp_Dpst_Acct_Cnt". To meet various requirements, the English terms are also provided in all uppercase (e.g., "CORP_DPST_ACCT_CNT") and all lowercase ("corp_dpst_acct_cnt"). The Chinese integration mainly serves to display the split vocabulary after output, which makes it easier to adjust the term bank and improve splitting accuracy; the English integration makes the result conform to the financial term specification. If a term is encountered that does not exist in the term bank, the module starts the "+" placeholder mechanism: assuming that "数" does not exist, the integrated Chinese is "对公_存款_账户_数+" and the integrated English is "Corp_Dpst_Acct_+".
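A minimal sketch of the integration step is given below. The class `TermIntegrator` and its methods are hypothetical names, and the sketch assumes that the term bank already stores each English term with an initial capital, so that joining the terms yields the camel-style form directly.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Sketch of the integration step: underscore splicing plus three case variants. */
class TermIntegrator {
    static final String SEPARATOR = "_";

    /** Splices adjacent terms with the preset symbol; this is the camel-style form as stored. */
    static String join(List<String> terms) {
        return String.join(SEPARATOR, terms);
    }

    /** All-uppercase variant; Locale.ROOT avoids locale-specific case rules. */
    static String upper(String joined) {
        return joined.toUpperCase(Locale.ROOT);
    }

    /** All-lowercase variant. */
    static String lower(String joined) {
        return joined.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String joined = join(Arrays.asList("Corp", "Dpst", "Acct", "Cnt"));
        System.out.println(joined);          // prints Corp_Dpst_Acct_Cnt
        System.out.println(upper(joined));   // prints CORP_DPST_ACCT_CNT
        System.out.println(lower(joined));   // prints corp_dpst_acct_cnt
    }
}
```

The "+" placeholder passes through all three variants unchanged, so "Corp_Dpst_Acct_+" becomes "CORP_DPST_ACCT_+" and "corp_dpst_acct_+".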
In another technical scheme, the target semantic integration module provides the English terms corresponding to the sentence in camel case, all uppercase and all lowercase. The Chinese terms and English terms processed by the target semantic integration module are passed to the result output module, which outputs a single sentence in console standard mode and batches of sentences as an Excel file, such as the Excel batch output shown in FIG. 5.
A method for converting financial field terms into English is also provided, comprising the following steps:
step one, acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to financial field terms, wherein a term bank containing a plurality of financial field terms is preset;
step two, converting the split words and/or phrases into English terms;
and step three, outputting the sentence to be processed and the corresponding English terms.
In this technical scheme, sentences are split according to financial field terms, financial field semantics are analyzed accurately, and the split words are mapped to English terms, thereby achieving accurate translation.
Compared with the prior art, the invention segments words more accurately, is suited to the financial industry, offers more output modes and has lighter environment dependence: it can be used wherever a JDK can be installed, and it avoids the copyright issues of Excel.
While embodiments of the invention have been disclosed above, the invention is not limited to the applications listed in the description and the embodiments. It can be applied in all fields suited to it, and further modifications will readily occur to those skilled in the art. Accordingly, without departing from the general concept defined by the appended claims and their equivalents, the invention is not limited to the specific details and illustrations shown and described herein.

Claims (7)

1. A device for converting financial field terms into English, comprising:
a term bank preprocessing module for loading a term bank in which financial field terms are preset;
a Chinese semantic analysis word segmentation module for acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to the financial field terms;
a Chinese-to-English conversion module for converting the split words and/or phrases into the corresponding English terms;
wherein the method by which the Chinese semantic analysis word segmentation module splits the sentence comprises the following steps:
step one, splitting the sentence character by character, counting the number n of Chinese characters, forming the split characters into an ordered set An = (A1, A2, ..., Am, ..., An) in sentence order, and setting n as the number of iteration cycles;
step two, semantic analysis: taking the first element A1 of the ordered set An and comparing it with the term bank; if it hits, recording the element and mapping the corresponding English term; then cumulatively combining the element with the remaining elements of the set An, one at a time, to form new candidate words, comparing each new candidate word with the term bank, and collecting the hit element and hit words into an ordered set Bn; new candidate words are created iteratively until A1 has been cumulatively combined with the remaining n-1 elements;
step three, reading the longest minimum-unit semantic word in the set Bn as the first split term word, whose length is m; starting the second round of iteration from element A(m+1), reading the longest minimum-unit semantic word from the second-round set Bn as the second hit split term word, and mapping the corresponding English term; repeating the iteration until m = n, whereupon semantic splitting is complete and the split vocabulary is formed;
and a result output module for outputting the sentence to be processed and the corresponding English terms.
2. The device for converting financial field terms into English according to claim 1, wherein the term bank comprises a built-in term bank and a custom term bank, the built-in term bank contains a plurality of financial field terms, the custom term bank is used for adding new terms, and the custom term bank has a higher priority than the built-in term bank.
3. The device for converting financial field terms into English according to claim 1, wherein if no word hits the term bank in step two and step three, the longest word accumulated by combining element A1 through An is taken as the split word, and the mapped English term is represented by a placeholder.
4. The device for converting financial field terms into English according to claim 1, further comprising a target semantic integration module for splicing adjacent split words and/or phrases with a preset symbol and for splicing the corresponding adjacent English terms with a preset symbol.
5. The device for converting financial field terms into English according to claim 4, wherein the target semantic integration module provides the English terms corresponding to the sentence in camel case, all uppercase and all lowercase.
6. The device for converting financial field terms into English according to claim 1, wherein the result output module outputs the result either to the console or as an Excel file.
7. A method for converting financial field terms into English, characterized by comprising the following steps:
step one, acquiring a sentence to be processed and splitting it, in sentence order, into a plurality of words and/or phrases according to financial field terms, wherein a term bank containing a plurality of financial field terms is preset;
the method for splitting the sentence comprises the following steps:
step a, splitting the sentence character by character, counting the number n of Chinese characters, forming the split characters into an ordered set An = (A1, A2, ..., Am, ..., An) in sentence order, and setting n as the number of iteration cycles;
step b, semantic analysis: taking the first element A1 of the ordered set An and comparing it with the term bank; if it hits, recording the element and mapping the corresponding English term; then cumulatively combining the element with the remaining elements of the set An, one at a time, to form new candidate words, comparing each new candidate word with the term bank, and collecting the hit element and hit words into an ordered set Bn; new candidate words are created iteratively until A1 has been cumulatively combined with the remaining n-1 elements;
step c, reading the longest minimum-unit semantic word in the set Bn as the first split term word, whose length is m; starting the second round of iteration from element A(m+1), reading the longest minimum-unit semantic word from the second-round set Bn as the second hit split term word, and mapping the corresponding English term; repeating the iteration until m = n, whereupon semantic splitting is complete and the split vocabulary is formed;
step two, converting the split words and/or phrases into English terms;
and step three, outputting the sentence to be processed and the corresponding English terms.
CN202211107345.8A 2022-09-13 2022-09-13 Device and method for converting financial field terms into English Active CN115204190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211107345.8A CN115204190B (en) 2022-09-13 2022-09-13 Device and method for converting financial field terms into English

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211107345.8A CN115204190B (en) 2022-09-13 2022-09-13 Device and method for converting financial field terms into English

Publications (2)

Publication Number Publication Date
CN115204190A (en) 2022-10-18
CN115204190B (en) 2022-11-22

Family

ID=83573552

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211107345.8A Active CN115204190B (en) 2022-09-13 2022-09-13 Device and method for converting financial field terms into English

Country Status (1)

Country Link
CN (1) CN115204190B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708147A (en) * 2012-03-26 2012-10-03 北京新发智信科技有限责任公司 Recognition method for new words of scientific and technical terminology
US20140278359A1 (en) * 2013-03-15 2014-09-18 Luminoso Technologies, Inc. Method and system for converting document sets to term-association vector spaces on demand
CN106095753A (en) * 2016-06-07 2016-11-09 大连理工大学 Financial field term recognition method based on information entropy and term credibility
CN107908712A (en) * 2017-11-10 2018-04-13 哈尔滨工程大学 Cross-language information matching process based on term extraction
CN108287825A (en) * 2018-01-05 2018-07-17 中译语通科技股份有限公司 Term identification and extraction method and system
CN112966508A (en) * 2021-04-05 2021-06-15 集智学园(北京)科技有限公司 General automatic term extraction method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
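Huang Zhenghao et al.: "Design and Implementation of a Scientific Literature Translation Assistance *** Based on Automatic Term Extraction", Wanfang Data Knowledge Service Platform *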

Also Published As

Publication number Publication date
CN115204190B (en) 2022-11-22

Similar Documents

Publication Publication Date Title
CN108170749B (en) Dialog method, device and computer readable medium based on artificial intelligence
Kasewa et al. Wronging a right: Generating better errors to improve grammatical error detection
US5610812A (en) Contextual tagger utilizing deterministic finite state transducer
US5930746A (en) Parsing and translating natural language sentences automatically
Liao et al. Improving readability for automatic speech recognition transcription
Kanakaraddi et al. Survey on parts of speech tagger techniques
US20100088085A1 (en) Statistical machine translation apparatus and method
US20040243409A1 (en) Morphological analyzer, morphological analysis method, and morphological analysis program
US20140163951A1 (en) Hybrid adaptation of named entity recognition
US11157686B2 (en) Text sequence segmentation method, apparatus and device, and storage medium thereof
CN111611810A (en) Polyphone pronunciation disambiguation device and method
CN114580382A (en) Text error correction method and device
EP1290676A2 (en) Creating a unified task dependent language models with information retrieval techniques
US10282421B2 (en) Hybrid approach for short form detection and expansion to long forms
Na Conditional random fields for Korean morpheme segmentation and POS tagging
US20040186706A1 (en) Translation system, dictionary updating server, translation method, and program and recording medium for use therein
US20220366135A1 (en) Extended open information extraction system
Go et al. Using Stanford part-of-speech tagger for the morphologically-rich Filipino language
US20070129932A1 (en) Chinese to english translation tool
Liu et al. Language model augmented relevance score
US10083170B2 (en) Hybrid approach for short form detection and expansion to long forms
CN115204190B (en) Device and method for converting financial field terms into English
KR20120045906A (en) Apparatus and method for correcting error of corpus
US20210133394A1 (en) Experiential parser
JP5454763B2 (en) Device for associating words in a sentence pair and computer program therefor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant