CN108073292A - Intelligent word-combination method and apparatus, and device for intelligent word combination - Google Patents

Publication number
CN108073292A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610996202.5A
Other languages
Chinese (zh)
Other versions
CN108073292B (en
Inventor
费腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd
Priority to CN201610996202.5A
Publication of CN108073292A
Application granted
Publication of CN108073292B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0237 Character input methods using prediction or retrieval techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

An embodiment of the present invention provides an intelligent word-combination method and apparatus, and a device for intelligent word combination. The method comprises: obtaining the input content of a user; obtaining the words to be combined that correspond to the input content, together with the part of speech of each word to be combined; determining, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined, wherein the preset part-of-speech collocation rules describe collocation relations between parts of speech; determining the path score of each word-combination path according to the part-of-speech collocation scores between the adjacent words that the path contains; and obtaining word-combination candidates from the word-combination paths according to the path scores. The embodiment can improve the reasonableness and quality of word-combination candidates, so that even when intelligent word combination would otherwise fail, relatively reasonable candidates can still be provided, which improves the user's input efficiency.

Description

Intelligent word-combination method and apparatus, and device for intelligent word combination
Technical field
The present invention relates to the field of computer information input technology, and in particular to an intelligent word-combination method and apparatus, and a device for intelligent word combination.
Background
At present, devices that support interaction usually require the user to convey the intended operation to the device through an input method system. For example, the user enters an input string; the input method system converts the string into candidates in the corresponding language according to its preset standard mapping rules and displays them, and the candidate selected by the user is then committed to the screen.
When the dictionary contains no entry that the input string hits directly, the input method system can trigger its intelligent word-combination function. The existing scheme looks up bigram relations in a bigram library, computes the path probability of the word string in each word-combination scheme according to which bigram relations are hit, and returns the scheme with the highest path probability to the user as the first choice. Here, a bigram relation is a collocation relation between one word and another; for example, "weather-so hot", "I-know", "like-you", and "one hundred thousand-eight thousand" can all be bigram relations. The intelligent word-combination function is extremely important: the quality of its results directly determines the quality of the input method system and directly affects the user experience.
In practice, intelligent word combination involving numerals, measure words, or adverbs generally requires a very large number of bigram relations. However, storage space is limited, so only a limited number of bigram relations can be stored. In addition, the bigram relations in the bigram library are usually obtained by statistical learning, and it is hard to guarantee that they cover every case. Consequently, if no bigram relation in the library is hit during intelligent word combination, the combination fails. For example, if the library stores neither "ninety thousand-two thousand" nor "two thousand-yuan", then for the input string "jiuwanliangqianyuan" the adjacent words "ninety thousand" and "two thousand", and "two thousand" and "yuan", cannot hit any bigram relation in the library, and intelligent word combination fails. When that happens, existing schemes usually combine the words with the highest word frequency to obtain a candidate. For the input string "jiuwanliangqianyuan", the resulting candidate (rendered literally as "with regard to Wan Liangqian institutes") is clearly a low-quality, unreasonable candidate that is unlikely to match the user's input intention.
Summary of the invention
In view of the above problems, embodiments of the present invention propose an intelligent word-combination method, an intelligent word-combination apparatus, and a device for intelligent word combination that overcome the above problems or at least partly solve them. The embodiments can improve the reasonableness and quality of word-combination candidates, so that even when intelligent word combination would otherwise fail, relatively reasonable candidates can still be provided, which improves the user's input efficiency.
To solve the above problems, the invention discloses an intelligent word-combination method, comprising:
Obtaining the input content of a user;
Obtaining the words to be combined that correspond to the input content, and the part of speech of each word to be combined;
Determining, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined; wherein the preset part-of-speech collocation rules describe collocation relations between parts of speech;
Determining the path score of each word-combination path according to the part-of-speech collocation scores between the adjacent words that the path contains;
Obtaining word-combination candidates from the word-combination paths according to the path scores.
Optionally, the step of determining the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined comprises:
Determining, according to the part of speech of each word to be combined, the parts of speech of the adjacent words in the word-combination paths corresponding to the words to be combined;
When the part-of-speech collocation of the adjacent words satisfies a preset part-of-speech collocation rule, taking the score corresponding to that rule as the part-of-speech collocation score between the adjacent words.
Optionally, the score corresponding to a preset part-of-speech collocation rule is obtained as follows:
Obtaining, from a preset corpus, the part-of-speech collocation contents that satisfy the preset part-of-speech collocation rule;
Counting the collocation probability between the adjacent words in each part-of-speech collocation content;
Determining the score corresponding to the preset part-of-speech collocation rule according to the collocation probabilities between adjacent words in all of the part-of-speech collocation contents.
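The three sub-steps above can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how a collocation probability is estimated or how the probabilities are aggregated, so the sketch assumes the conditional probability of the right word given the left word, and the arithmetic mean over all matching word pairs. The names `rule_score`, `corpus`, and `pos_of`, and the English glosses used as tokens, are hypothetical.

```python
from collections import Counter


def rule_score(corpus, rule, pos_of):
    """Estimate a score for one part-of-speech collocation rule.

    corpus : list of tokenized sentences (lists of words)
    rule   : (left_pos, right_pos), e.g. ("numeral", "measure")
    pos_of : dict mapping each word to its part of speech
    Assumption: probability = count(left, right) / count(left as a left
    neighbour); rule score = mean of these probabilities.
    """
    pair_counts = Counter()   # adjacent pairs that satisfy the rule
    left_counts = Counter()   # how often each word appears as a left neighbour
    for sentence in corpus:
        for left, right in zip(sentence, sentence[1:]):
            left_counts[left] += 1
            if (pos_of.get(left), pos_of.get(right)) == rule:
                pair_counts[(left, right)] += 1
    if not pair_counts:
        return 0.0
    probs = [n / left_counts[left] for (left, _), n in pair_counts.items()]
    return sum(probs) / len(probs)


corpus = [
    ["ninety-thousand", "two-thousand", "yuan"],
    ["three", "piece", "person"],
    ["ninety-thousand", "yuan"],
]
pos_of = {
    "ninety-thousand": "numeral", "two-thousand": "numeral", "three": "numeral",
    "yuan": "measure", "piece": "measure", "person": "noun",
}
score = rule_score(corpus, ("numeral", "measure"), pos_of)
```

Other aggregations (for example a frequency-weighted mean or a log-probability) would fit the same description; the mean is chosen here only for brevity.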
Optionally, the input content includes an input string, and the method further comprises:
Segmenting the input string to obtain a corresponding segmentation result;
Searching a dictionary to obtain the words that match the segmentation result, as the words to be combined corresponding to the input string.
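As a rough illustration of segmentation followed by dictionary lookup, the sketch below enumerates every way of cutting an input string into keys of a small lexicon and then collects the words stored under each key. The lexicon contents, the English glosses, and the function name `segmentations` are assumptions made for the example; a real input method would use its own syllable table and dictionaries.

```python
def segmentations(input_string, lexicon):
    """Yield every way of cutting input_string into keys of lexicon."""
    if not input_string:
        yield []
        return
    for end in range(1, len(input_string) + 1):
        head = input_string[:end]
        if head in lexicon:
            for rest in segmentations(input_string[end:], lexicon):
                yield [head] + rest


# Hypothetical lexicon: coded segment -> words it can stand for.
lexicon = {
    "jiuwan": ["ninety-thousand", "just-play"],
    "liangqian": ["two-thousand", "liang-qian"],
    "yuan": ["yuan", "institute"],
}

cuts = list(segmentations("jiuwanliangqianyuan", lexicon))
# Words to be combined, one list per segment of the first segmentation.
words_to_combine = [lexicon[segment] for segment in cuts[0]]
```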
Optionally, the input content further includes the context corresponding to the input string, and the words to be combined corresponding to the input content include the words to be combined corresponding to the input string and the context.
Optionally, the step of determining the path score of a word-combination path according to the part-of-speech collocation scores between the adjacent words that the path contains comprises:
Obtaining the path score of the word-combination path according to the part-of-speech collocation scores between all adjacent words that the path contains; or
Obtaining the path score of the word-combination path according to the part-of-speech collocation scores between all adjacent words that the path contains and the scores of the bigram relations that the path hits.
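The two alternatives for computing a path score can be sketched in one function. The source does not specify how the individual scores are combined, so simple summation is assumed here; `path_score`, `rule_scores`, and `bigram_scores` are hypothetical names, and the numeric values are made up.

```python
def path_score(path, pos_of, rule_scores, bigram_scores=None):
    """Sum part-of-speech collocation scores over adjacent words in path.

    If bigram_scores is given, the score of every bigram relation the path
    hits is added as well (the second alternative). Summation is an
    assumption; the source only says the path score is derived from them.
    """
    total = 0.0
    for left, right in zip(path, path[1:]):
        total += rule_scores.get((pos_of[left], pos_of[right]), 0.0)
        if bigram_scores is not None:
            total += bigram_scores.get((left, right), 0.0)
    return total


pos_of = {"ninety-thousand": "numeral", "two-thousand": "numeral", "yuan": "measure"}
rule_scores = {("numeral", "numeral"): 0.5, ("numeral", "measure"): 0.25}
bigram_scores = {("two-thousand", "yuan"): 0.125}
path = ["ninety-thousand", "two-thousand", "yuan"]

pos_only = path_score(path, pos_of, rule_scores)
combined = path_score(path, pos_of, rule_scores, bigram_scores)
```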
Optionally, before the step of determining, according to the preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined, the method further comprises:
Searching a bigram library according to the adjacent words in the word-combination paths corresponding to the words to be combined, to obtain the bigram relations that match the adjacent words;
When the lookup in the bigram library misses, performing the step of determining, according to the preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined.
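A minimal sketch of this lookup-then-fallback order, for a single pair of adjacent words. Whether the fallback is applied per pair or per whole path is not stated in the source; the per-pair reading is assumed here, and all names and values are hypothetical.

```python
def adjacent_score(left, right, bigram_scores, pos_of, rule_scores):
    """Score one adjacent pair: bigram library first, POS rule on a miss."""
    hit = bigram_scores.get((left, right))
    if hit is not None:
        return hit  # the bigram library covers this pair
    # Miss: fall back to the preset part-of-speech collocation rule, if any.
    return rule_scores.get((pos_of.get(left), pos_of.get(right)), 0.0)


bigram_scores = {("I", "know"): 0.75}
pos_of = {"I": "pronoun", "know": "verb",
          "ninety-thousand": "numeral", "two-thousand": "numeral"}
rule_scores = {("numeral", "numeral"): 0.5}

from_library = adjacent_score("I", "know", bigram_scores, pos_of, rule_scores)
from_rule = adjacent_score("ninety-thousand", "two-thousand",
                           bigram_scores, pos_of, rule_scores)
```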
Optionally, the step of obtaining word-combination candidates from the word-combination paths according to the path scores comprises:
Sorting the path scores;
Selecting, according to the sorted path scores, the top N word-combination paths as the word-combination candidates.
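Sorting by path score and keeping the first N paths might look like the following; the function name and the sample scores are hypothetical.

```python
def top_n_candidates(scored_paths, n):
    """Return the n paths with the highest scores, best first.

    scored_paths is a list of (path, score) pairs.
    """
    ranked = sorted(scored_paths, key=lambda item: item[1], reverse=True)
    return [path for path, _ in ranked[:n]]


scored_paths = [
    (("just-play", "liang-qian", "institute"), 0.0),
    (("ninety-thousand", "two-thousand", "yuan"), 1.0),
    (("ninety-thousand", "two-thousand", "institute"), 0.5),
]
best_two = top_n_candidates(scored_paths, 2)
```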
Optionally, the preset part-of-speech collocation rules include at least one of: a collocation rule between numerals, a collocation rule between a numeral and a measure word, a collocation rule between an adverb and a verb, a collocation rule between an adverb and an adjective, a collocation rule between a verb and a noun, a collocation rule between an adjective and a noun, and a collocation rule between a measure word and a noun.
In another aspect, the invention discloses an intelligent word-combination apparatus, comprising:
A content receiving module, configured to obtain the input content of a user;
A word and part-of-speech acquisition module, configured to obtain the words to be combined that correspond to the input content, and the part of speech of each word to be combined;
A collocation score determining module, configured to determine, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined; wherein the preset part-of-speech collocation rules describe collocation relations between parts of speech;
A path score determining module, configured to determine the path score of each word-combination path according to the part-of-speech collocation scores between the adjacent words that the path contains; and
A word-combination candidate acquisition module, configured to obtain word-combination candidates from the word-combination paths according to the path scores.
Optionally, the collocation score determining module comprises:
A part-of-speech determination submodule, configured to determine, according to the part of speech of each word to be combined, the parts of speech of the adjacent words in the word-combination paths corresponding to the words to be combined; and
A score determination submodule, configured to, when the part-of-speech collocation of the adjacent words satisfies a preset part-of-speech collocation rule, take the score corresponding to that rule as the part-of-speech collocation score between the adjacent words.
Optionally, the apparatus further comprises a score acquisition module for obtaining the score corresponding to a preset part-of-speech collocation rule;
The score acquisition module comprises:
A part-of-speech collocation content submodule, configured to obtain, from a preset corpus, the part-of-speech collocation contents that satisfy the preset part-of-speech collocation rule;
A collocation probability statistics submodule, configured to count the collocation probability between the adjacent words in each part-of-speech collocation content; and
A score determination submodule, configured to determine the score corresponding to the preset part-of-speech collocation rule according to the collocation probabilities between adjacent words in all of the part-of-speech collocation contents.
Optionally, the input content includes an input string, and the apparatus further comprises:
A segmentation module, configured to segment the input string to obtain a corresponding segmentation result;
A dictionary lookup module, configured to search a dictionary to obtain the words that match the segmentation result, as the words to be combined corresponding to the input string.
Optionally, the input content further includes the context corresponding to the input string, and the words to be combined corresponding to the input content include the words to be combined corresponding to the input string and the context.
Optionally, the path score determining module comprises:
A first path score determination submodule, configured to obtain the path score of a word-combination path according to the part-of-speech collocation scores between all adjacent words that the path contains; or
A second path score determination submodule, configured to obtain the path score of a word-combination path according to the part-of-speech collocation scores between all adjacent words that the path contains and the scores of the bigram relations that the path hits.
Optionally, the apparatus further comprises:
A bigram library lookup module, configured to, before the collocation score determining module determines the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined according to the preset part-of-speech collocation rules and the part of speech of each word to be combined, search a bigram library according to the adjacent words in those word-combination paths to obtain the bigram relations that match the adjacent words, and to trigger the collocation score determining module when the lookup in the bigram library misses.
Optionally, the word-combination candidate acquisition module comprises:
A sorting submodule, configured to sort the path scores;
A selection submodule, configured to select, according to the sorted path scores, the top N word-combination paths as the word-combination candidates.
Optionally, the preset part-of-speech collocation rules include at least one of: a collocation rule between numerals, a collocation rule between a numeral and a measure word, a collocation rule between an adverb and a verb, a collocation rule between an adverb and an adjective, a collocation rule between a verb and a noun, a collocation rule between an adjective and a noun, and a collocation rule between a measure word and a noun.
In yet another aspect, the invention discloses a device for intelligent word combination, comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and are configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:
Obtaining the input content of a user;
Obtaining the words to be combined that correspond to the input content, and the part of speech of each word to be combined;
Determining, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined; wherein the preset part-of-speech collocation rules describe collocation relations between parts of speech;
Determining the path score of each word-combination path according to the part-of-speech collocation scores between the adjacent words that the path contains;
Obtaining word-combination candidates from the word-combination paths according to the path scores.
The embodiments of the present invention have the following advantages:
During intelligent word combination, the embodiments use preset part-of-speech collocation rules to determine the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined. Because a preset part-of-speech collocation rule describes a collocation relation between parts of speech, a stronger collocation relation generally yields a higher part-of-speech collocation score and a weaker one yields a lower score. The embodiments therefore use the part-of-speech collocation score as a basis for the path score, so that a word-combination path whose parts of speech collocate strongly scores higher than a path whose parts of speech collocate weakly, which raises the probability that the strongly collocating path is chosen as a word-combination candidate. In other words, using the part-of-speech collocation score as a basis for the path score can improve the reasonableness and quality of word-combination candidates, so that even when intelligent word combination would otherwise fail, relatively reasonable candidates can still be provided, which improves the user's input efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of embodiment one of an intelligent word-combination method of the present invention;
Fig. 2 is a flow chart of the steps of embodiment two of an intelligent word-combination method of the present invention;
Fig. 3 is a structural block diagram of an embodiment of an intelligent word-combination apparatus of the present invention;
Fig. 4 is a block diagram of a device 900 for intelligent word combination of the present invention; and
Fig. 5 is a structural diagram of a server in some embodiments of the present invention.
Detailed description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Method embodiment one
Referring to Fig. 1, a flow chart of the steps of embodiment one of an intelligent word-combination method of the present invention is shown, which may specifically include the following steps:
Step 101: obtain the input content of a user;
Step 102: obtain the words to be combined that correspond to the input content, and the part of speech of each word to be combined;
Step 103: determine, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined; wherein the preset part-of-speech collocation rules describe collocation relations between parts of speech;
Step 104: determine the path score of each word-combination path according to the part-of-speech collocation scores between the adjacent words that the path contains;
Step 105: obtain word-combination candidates from the word-combination paths according to the path scores.
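Steps 102 to 105 can be sketched end to end on the document's "jiuwanliangqianyuan" example. Everything below is illustrative: the candidate words are English glosses, the parts of speech assigned to the implausible candidates and the rule scores are assumptions, and the path score is assumed to be the sum of the collocation scores of adjacent pairs.

```python
from itertools import product

# Step 102 (assumed result): per segment, candidate words with their POS.
CANDIDATES = [
    [("ninety-thousand", "numeral"), ("just-play", "verb")],   # "jiuwan"
    [("two-thousand", "numeral"), ("liang-qian", "noun")],     # "liangqian"
    [("yuan", "measure"), ("institute", "noun")],              # "yuan"
]
# Hypothetical scores for two preset part-of-speech collocation rules.
RULE_SCORES = {("numeral", "numeral"): 0.5, ("numeral", "measure"): 0.5}


def best_paths(candidates, rule_scores, n=1):
    """Enumerate all paths, score them (steps 103-104), keep the top n (step 105)."""
    scored = []
    for path in product(*candidates):
        score = sum(
            rule_scores.get((left[1], right[1]), 0.0)
            for left, right in zip(path, path[1:])
        )
        scored.append((score, [word for word, _ in path]))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:n]


top = best_paths(CANDIDATES, RULE_SCORES)
```

With these toy scores the numeral-numeral-measure path ranks first even though no bigram relation was consulted, which is the behaviour the method aims for when the bigram library has no matching entry.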
The embodiments of the present invention can be applied to input method systems with various input modes, for example keyboard symbols, handwriting, and speech input; that is, the user can enter the content to be committed to the screen through coded character strings, handwriting features, and the like. Taking speech input as an example, the input method system can capture the speech signal entered by the user, convert the speech signal into text, and segment the text into words to be combined, which are then combined. The description below mainly uses the input mode of coded character strings (hereinafter, input strings) as the example; other input modes can be handled by analogy.
In the field of input method systems, whether for Chinese, Japanese, Korean, or other languages, the system converts the user's input string into candidates in the corresponding language, and the user then selects the content to be output to the application; the content output to the application through the commit operation is the committed content. When converting the input string into candidates, the entry corresponding to the input string can be looked up directly in the dictionary; if the lookup hits, the entry found can be used as a candidate. For example, looking up the input string "nihao" or "tianqihenhao" directly in the dictionary yields the entry "hello" or "the weather is fine". Optionally, the dictionary of the embodiments may include a system dictionary, a user dictionary, a cell dictionary, a cloud dictionary, and so on; the embodiments do not limit the specific dictionary.
In practice, however, there are many reasons why the dictionary may contain no entry that the input string hits directly. In particular, when the user wants to enter many words at once (such as a phrase or a long sentence), or wants to enter content that has not been entered before, the dictionary may contain no directly hit entry, and the input method system can trigger the intelligent word-combination function in such cases. For example, the dictionary may contain no directly hit entry when the user wants to enter "ninety-two thousand yuan" through the input string "jiuwanliangqianyuan", "ninety-two thousand" through the input string "jiuwanliangqian", "gently put down" through the input string "qingqingdifangxia", or "better understand the present invention" through the input string "genghaodilijiebenfam".
The existing intelligent word-combination scheme uses the bigram relations in the bigram library (collocation relations between one word and another) to combine words for the input string. However, intelligent word combination involving numerals, measure words, or adverbs generally requires a very large number of bigram relations. This not only places high demands on the size of the bigram library and on storage space, but also frequently causes intelligent word combination to fail because the coverage of the bigram relations is insufficient. Taking numerals as an example, the bigram library would need to store the collocation relations between all numerals; if the stored coverage is insufficient, intelligent word combination fails. Suppose the library stores a large number of bigram relations such as "ten thousand-one thousand", "twenty thousand-one thousand", "thirty thousand-one thousand", ..., "ninety thousand-one thousand", "twenty thousand-two thousand", ..., "ninety thousand-nine thousand", "one thousand-one hundred", ..., "nine thousand-nine hundred", but stores neither "ninety thousand-two thousand" nor "two thousand-two hundred"; then intelligent word combination still fails when the input string is "jiuwanliangqianwan".
To address the above problems with intelligent word combination involving numerals, measure words, or adverbs, the embodiments of the present invention introduce preset part-of-speech collocation rules and use them during intelligent word combination to determine the part-of-speech collocation score between adjacent words in the word-combination paths corresponding to the words to be combined. Because a preset part-of-speech collocation rule describes a collocation relation between parts of speech, a stronger collocation relation generally yields a higher part-of-speech collocation score and a weaker one yields a lower score. The embodiments therefore use the part-of-speech collocation score as a basis for the path score, so that a word-combination path whose parts of speech collocate strongly scores higher than a path whose parts of speech collocate weakly, which raises the probability that the strongly collocating path is chosen as a word-combination candidate. In other words, using the part-of-speech collocation score as a basis for the path score can improve the reasonableness and quality of word-combination candidates, so that even when intelligent word combination would otherwise fail, relatively reasonable candidates can still be provided, which improves the user's input efficiency.
In the embodiments of the present invention, optionally, the input content can include an input string, in which case the embodiments can search the dictionary to obtain the words to be combined corresponding to the input string. For example, if the input string is "jiuwanliangqianyuan", the corresponding words to be combined can include "ninety thousand", "two thousand", and "yuan", as well as other words with the same pronunciations (rendered literally as "just playing", "Liang Qian", "institute"), and so on.
In another optional embodiment of the present invention, the input content can include, in addition to the input string, the context corresponding to the input string. The preceding context applies to scenarios in which the user enters continuous content over several inputs. For example, the user wants to enter "82,340", first enters and commits "80,000", and then enters "liangqian"; "80,000" and the words corresponding to "liangqian" can then be used together as words to be combined. The following context applies to situations in which the user edits content that has already been committed. For example, the user first enters "today is fine", then moves the cursor to just before "sunny" and types the input string "feich"; the embodiments can then combine the words corresponding to "feich" with the following context "sunny". It should be understood that the embodiments do not limit the specific word-combination scenarios in which context is used.
In the embodiments of the present invention, the preset part-of-speech collocation rules can describe the collocation relation between any parts of speech, whether the same or different, and a rule can involve the collocation relation between two parts of speech or more than two. Optionally, the preset part-of-speech collocation rules may specifically include at least one of: a collocation rule between numerals, a collocation rule between a numeral and a measure word, a collocation rule between an adverb and a verb, a collocation rule between an adverb and an adjective, a collocation rule between a verb and a noun, a collocation rule between an adjective and a noun, and a collocation rule between a measure word and a noun. It should be understood that those skilled in the art can determine the required preset part-of-speech collocation rules according to practical needs, and that collocation relations between any parts of speech fall within the protection scope of the preset part-of-speech collocation rules of the embodiments.
The embodiments of the present invention can combine the words to be combined to obtain the corresponding word-combination paths. For example, each word-combination path can contain n words to be combined, denoted V_1, V_2, ..., V_i, ..., V_n. While combining the words, the embodiments can determine, according to the preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation score between adjacent words in the word-combination paths. Optionally, the part-of-speech collocation score between adjacent words can be the collocation score between the two adjacent words V_{i-1} and V_i, or the collocation score among the three words V_{i-1}, V_i, and V_{i+1}.
In an optional embodiment of the present invention, step 103 of determining the POS collocation score between adjacent words in the word-combination path corresponding to the words to be combined may specifically include: determining, according to the POS of each word to be combined, the POS of the adjacent words in the word-combination path; and, when the POS combination of the adjacent words satisfies a preset POS collocation rule, taking the score corresponding to that rule as the POS collocation score between the adjacent words. Suppose the input content corresponds to P words to be combined and each word-combination path includes n of them, where P is usually greater than n. The POS of the adjacent words in each word-combination path can then be determined from the POS of the P words to be combined. For example, the words to be combined for the input string "jiuwanliangqianyuan" may include "90,000", "2,000", "yuan", "just play", "Liang Qian", "institute" and so on. The POS of the adjacent words in each word-combination path can be obtained from the POS of all the words to be combined, for example the POS of the adjacent words in word-combination path 1, "90,000 + 2,000 + yuan", or in word-combination path 2, "just play + Liang Qian + institute".
In embodiments of the present invention, the score corresponding to a preset POS collocation rule may optionally be preset. For example, the input method system may preset the score based on experience, or the user may preset it according to the user's own needs.
In an optional embodiment of the present invention, the scores corresponding to the preset POS collocation rules may be divided into several score levels, where different levels represent the strength of the collocation relation between parts of speech. For example, the number of levels may be 3. Table 1 shows an example of preset POS collocation rules and their corresponding scores, where A > B > C. For instance, "90,000" and "2,000" are both numerals and the collocation relation between them is very strong, so the corresponding score may be A. The collocation relation between a quantifier and a noun, by contrast, is comparatively weak: the quantifier "platform" (a literal gloss of a Chinese measure word) can collocate with the noun "TV", but the collocation relation between the quantifier "platform" and the noun "people" is weaker. Optionally, A = 1, B = 0.7 and C = 0.4. It will be understood that those skilled in the art may determine the values of A, B and C according to practical application requirements; the embodiments of the present invention place no limitation on the specific score values corresponding to the preset POS collocation rules.
Table 1

Preset POS collocation rule                          Score
Collocation rule between numeral and numeral         A
Collocation rule between numeral and quantifier      A
Collocation rule between verb and noun               B
Collocation rule between adjective and noun          B
Collocation rule between adverb and verb             B
Collocation rule between adverb and adjective        B
Collocation rule between quantifier and noun         C
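The rule table can be pictured as a simple lookup. The following is a minimal sketch, not the patented implementation: the POS tag names, the `POS_RULES` dictionary and the `collocation_score` helper are hypothetical, the values A = 1, B = 0.7, C = 0.4 are the optional values given in the text, and treating each rule as an ordered (left, right) pair is an assumption.

```python
# Optional score levels from the text, with A > B > C.
A, B, C = 1.0, 0.7, 0.4

# Table 1 as a lookup keyed by (POS of left word, POS of right word).
# Treating the pairs as ordered is an assumption of this sketch.
POS_RULES = {
    ("numeral", "numeral"): A,
    ("numeral", "quantifier"): A,
    ("verb", "noun"): B,
    ("adjective", "noun"): B,
    ("adverb", "verb"): B,
    ("adverb", "adjective"): B,
    ("quantifier", "noun"): C,
}


def collocation_score(left_pos, right_pos):
    """Return the rule score for an adjacent POS pair, or 0.0 if no rule matches."""
    return POS_RULES.get((left_pos, right_pos), 0.0)
```

A pair that matches no rule simply contributes nothing, which mirrors the later worked example in which an unmatched pair "does not obtain a score".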
In an optional embodiment of the present invention, the score corresponding to a preset POS collocation rule may be obtained from statistics over a preset corpus. Accordingly, the process of obtaining the score may include: obtaining, from the preset corpus, POS collocation contents that satisfy the preset POS collocation rule; counting the collocation probability between the adjacent words in each POS collocation content; and determining the score corresponding to the preset POS collocation rule according to the collocation probabilities between the adjacent words in all the POS collocation contents.
In practical applications, the preset corpus may come from an existing corpus; for Chinese input, for example, the existing corpus may include a Chinese corpus. Alternatively, the preset corpus may come from well-known books, internet text, the input history recorded by the input method program, and so on. It will be understood that any corpus falls within the protection scope of the preset corpus of the embodiments of the present invention.
Embodiments of the present invention may obtain, from the preset corpus, the POS collocation contents that satisfy a preset POS collocation rule. For the collocation rule between numeral and numeral, for example, POS collocation contents such as "10,000 + 1,000", "20,000 + 1,000", "30,000 + 1,000", "90,000 + 1,000" and "20,000 + 2,000" may be obtained from the preset corpus. The collocation probability between the adjacent words in each POS collocation content may then be obtained statistically. Optionally, the collocation probability may be derived from the adjacent co-occurrence probability of the adjacent words: if segmenting the preset corpus yields Q sentences or word strings, and the POS collocation content occurs M times in those Q sentences or word strings, the corresponding adjacent co-occurrence probability is M/Q. It will be understood that the embodiments of the present invention place no limitation on the specific way the collocation probability is counted.
When determining the score corresponding to a preset POS collocation rule from the collocation probabilities between the adjacent words in all the POS collocation contents, the collocation probabilities may be averaged and the mean taken as the score; alternatively, a weighted average of the collocation probabilities may be computed and the result taken as the score. It will be understood that the embodiments of the present invention place no limitation on the specific process of determining the score from these collocation probabilities. In one application example of the present invention, for the collocation rule between numeral and numeral, the collocation probabilities between the adjacent words in all the corresponding POS collocation contents are high, so the corresponding score is also high. For the collocation rule between quantifier and noun, the collocation probability is relatively high in some POS collocation contents (such as the quantifier "platform" with the noun "TV", or the quantifier "a" with the noun "apple") and relatively low in others (such as the quantifier "platform" with the noun "people", or the quantifier "item" with the noun "people"), so the corresponding score is relatively low.
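The corpus-based scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the method fixed by the embodiment: the function names and the tiny `corpus` are hypothetical, M is counted as the number of segmented units in which the pair occurs adjacently at least once (so that M/Q stays within [0, 1]), and the weights in the weighted variant are supplied by the caller.

```python
def cooccurrence_probability(units, pair):
    """Return M/Q, where Q is the number of segmented units and M is the
    number of units in which `pair` occurs as two adjacent words."""
    q = len(units)
    if q == 0:
        return 0.0
    m = sum(
        1
        for words in units
        if any((a, b) == pair for a, b in zip(words, words[1:]))
    )
    return m / q


def rule_score(units, pairs, weights=None):
    """Average (or weighted-average) the collocation probabilities of all
    word pairs that satisfy one preset POS collocation rule."""
    probs = [cooccurrence_probability(units, p) for p in pairs]
    if not probs:
        return 0.0
    if weights is None:
        return sum(probs) / len(probs)
    return sum(w * p for w, p in zip(weights, probs)) / sum(weights)


# Hypothetical segmented corpus: each inner list is one sentence or word string.
corpus = [
    ["10,000", "1,000", "yuan"],
    ["20,000", "1,000"],
    ["buy", "apple"],
    ["30,000", "yuan"],
]
numeral_pairs = [("10,000", "1,000"), ("20,000", "1,000")]
numeral_rule_score = rule_score(corpus, numeral_pairs)
```

In practice the raw probabilities would be very small on a large corpus, so some rescaling into the A/B/C levels would presumably be needed; the text leaves that step open.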
In another optional embodiment of the present invention, the score corresponding to a preset POS collocation rule may be obtained from statistics over the bigram relations recorded in a bigram library. Specifically, multiple bigram relations that satisfy the preset POS collocation rule may be obtained from the bigram library, and the collocation probabilities between the two words of each bigram relation may be averaged to obtain the score corresponding to the rule. Taking the collocation rule between numeral and numeral as an example, all bigram relations that satisfy this rule, such as "10,000 + 1,000", "20,000 + 1,000", "30,000 + 1,000", "90,000 + 1,000" and "20,000 + 2,000", may be obtained from the bigram library, and the collocation probabilities between the two words of these bigram relations may be averaged. The embodiments of the present invention place no limitation on the specific process of obtaining the score from the bigram relations recorded in the bigram library.
The preset POS collocation rules have been described above mainly using Chinese as an example. It will be understood that those skilled in the art may, according to practical application requirements, set suitable preset POS collocation rules for languages other than Chinese, for example for the parts of speech of English, for the katakana and hiragana parts of speech in Japanese, or for the parts of speech of French. The collocation relation between any parts of speech of any language falls within the protection scope of the preset POS collocation rules of the embodiments of the present invention.
In an optional embodiment of the present invention, step 103 may have a corresponding trigger condition. Specifically, before step 103, the method may further include: searching a bigram library according to the adjacent words in the word-combination path corresponding to the words to be combined, to obtain a bigram relation that matches the adjacent words; and, when the search of the bigram library misses, performing step 103 of determining, according to the preset POS collocation rules and the POS of each word to be combined, the POS collocation score between the adjacent words in the word-combination path. Of course, step 103 may also be performed without any trigger condition, or may be performed when the search of the bigram library hits; in the latter case, the path score of the word-combination path may be obtained from both the POS collocation scores between all adjacent words in the path and the bigram relation scores hit by the path. It will be understood that the embodiments of the present invention place no limitation on the specific trigger condition of step 103.
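The miss-triggered variant can be sketched as a fallback lookup. Everything named here is hypothetical (the `adjacent_pair_score` helper, the dictionary-based bigram library, and the sample scores); the sketch only shows the control flow of consulting the bigram library first and applying the POS collocation rules on a miss.

```python
def adjacent_pair_score(left, right, bigram_library, pos_of, pos_rules):
    """Score one adjacent word pair.

    Returns ("bigram", score) when the bigram library records the pair,
    otherwise ("pos", score) from the POS collocation rules (0.0 if no rule fits).
    """
    bigram = bigram_library.get((left, right))
    if bigram is not None:
        return ("bigram", bigram)
    rule = pos_rules.get((pos_of[left], pos_of[right]), 0.0)
    return ("pos", rule)


# Hypothetical sample data for illustration only.
bigram_library = {("watch", "TV"): 0.9}
pos_of = {"watch": "verb", "TV": "noun", "90,000": "numeral", "2,000": "numeral"}
pos_rules = {("numeral", "numeral"): 1.0, ("verb", "noun"): 0.7}

hit = adjacent_pair_score("watch", "TV", bigram_library, pos_of, pos_rules)
miss = adjacent_pair_score("90,000", "2,000", bigram_library, pos_of, pos_rules)
```

The other variants mentioned in the text (always applying step 103, or applying it on a hit and combining both scores) would return both values instead of choosing one.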
Step 104 may determine the path score of a word-combination path according to the POS collocation scores, output by step 103, between the adjacent words included in the path. In an optional embodiment of the present invention, step 104 may include:
obtaining the path score of the word-combination path according to the POS collocation scores between all adjacent words included in the path; or
obtaining the path score of the word-combination path according to the POS collocation scores between all adjacent words included in the path and the bigram relation scores hit by the path.
In practical applications, the path score may be based on the POS collocation score alone or on a combination of the POS collocation score with other scores. Optionally, the other scores may include: the bigram relation score (that is, the score obtained when a bigram relation recorded in the bigram library is hit), the word frequency of each word to be combined in the word-combination path, the lexicon score (where the score of the user lexicon is higher than that of a non-user lexicon), and so on. When a combination is used, a weighted average of the POS collocation score and the other scores may be computed; for example, the POS collocation score, the bigram relation score, the word frequency and the lexicon score may each have a corresponding weight. It will be understood that those skilled in the art may determine the weights according to practical application requirements; for instance, the weights of the POS collocation score, the bigram relation score, the word frequency and the lexicon score may be 0.3, 0.4, 0.15 and 0.15 respectively. The embodiments of the present invention place no limitation on the specific weights.
In an optional embodiment of the present invention, to preserve the priority of bigram relations, the weight of the POS collocation score does not exceed the weight of the bigram relation score. Of course, the embodiments of the present invention place no limitation on the specific weights of the POS collocation score and the bigram relation score.
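The weighted combination can be sketched as below. The weights 0.3, 0.4, 0.15 and 0.15 are the example values from the text; the `path_score` helper, the component keys, and the assumption that every component score has already been normalised to a comparable range are hypothetical.

```python
# Example weights from the text: POS collocation, bigram relation,
# word frequency, lexicon.
WEIGHTS = {"pos": 0.3, "bigram": 0.4, "freq": 0.15, "lexicon": 0.15}


def path_score(components, weights=WEIGHTS):
    """Weighted average of the component scores of one word-combination path.

    `components` maps a component name to its score; missing components
    count as 0.0. All scores are assumed to be on a comparable scale.
    """
    # Optional constraint from the text: the POS collocation weight must not
    # exceed the bigram relation weight, so bigram relations keep priority.
    if weights["pos"] > weights["bigram"]:
        raise ValueError("POS collocation weight exceeds bigram relation weight")
    total = sum(weights.values())
    return sum(weights[k] * components.get(k, 0.0) for k in weights) / total


example = path_score({"pos": 1.0, "bigram": 0.0, "freq": 0.5, "lexicon": 1.0})
```

Dividing by the sum of the weights keeps the result a true weighted average even if a caller supplies weights that do not add up to 1.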
Step 105 may obtain word-combination candidates from the word-combination paths according to the path scores output by step 104. For example, the word-combination path with the highest path score may be selected as the word-combination candidate; or the word-combination paths whose path scores exceed a score threshold may be selected as word-combination candidates; or several word-combination paths with the highest path scores may be selected as word-combination candidates. Specifically, the path scores may be sorted, and the top N word-combination paths in the sorted result may be selected as word-combination candidates, where N is a natural number.
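The three selection strategies (single best, score threshold, top N) can be sketched with one helper. The `select_candidates` function and the sample data are hypothetical and only illustrate the sorting and filtering described above.

```python
def select_candidates(scored_paths, top_n=None, threshold=None):
    """Pick word-combination candidates from (path, score) pairs.

    Paths are ranked by descending score; an optional threshold keeps only
    paths scoring strictly above it, and an optional top_n keeps the first N.
    """
    ranked = sorted(scored_paths, key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [item for item in ranked if item[1] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [path for path, _ in ranked]


# Hypothetical scored paths for illustration.
scored = [("path 1", 2.0), ("path 2", 0.7), ("path 3", 1.2)]
best = select_candidates(scored, top_n=1)
above = select_candidates(scored, threshold=1.0)
```

Selecting the single highest-scoring path is the special case `top_n=1`.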
In summary, the intelligent word combination method of the embodiments of the present invention uses preset POS collocation rules during intelligent word combination to determine the POS collocation score between adjacent words in the word-combination path corresponding to the words to be combined. Because the preset POS collocation rules describe the collocation relation between parts of speech, a stronger collocation relation generally yields a higher POS collocation score and a weaker one yields a lower score. By using the POS collocation score as a basis for the path score, the embodiments give a word-combination path with strong POS collocation a higher path score than one with weak POS collocation, which raises the probability that the former is chosen as a word-combination candidate. This improves the reasonableness and quality of the word-combination candidates, so that relatively reasonable candidates can be provided even when intelligent word combination fails, which in turn improves the user's input efficiency.
Method Embodiment Two
Referring to Fig. 2, a flow chart of the steps of method embodiment two of the intelligent word combination method of the present invention is shown, which may specifically include the following steps:
Step 201: obtain the input content of the user. The input content may include an input string, or the input string and its corresponding context.
Step 202: segment the input string to obtain a corresponding segmentation result.
Step 203: search a lexicon to obtain the words that match the segmentation result, as the words to be combined corresponding to the input string.
Step 204: obtain the POS of each word to be combined.
Step 205: determine, according to the preset POS collocation rules and the POS of each word to be combined, the POS collocation score between adjacent words in the word-combination path corresponding to the words to be combined, where the preset POS collocation rules describe the collocation relation between parts of speech.
Step 206: determine the path score of the word-combination path according to the POS collocation scores between the adjacent words included in the path.
Step 207: obtain word-combination candidates from the word-combination paths according to the path scores.
In practical applications, the input string may be segmented according to the rules that govern it. If the input string is a pinyin string, it may be segmented according to syllable rules. An input string may have one or more segmentation schemes, each of which may include one or more substrings. For example, the input string "jiuwanliangqianyuan" may be segmented as "jiu'wan'liang'qian'yuan", and the input string "fangan" may be segmented as "fang'an" or "fan'gan".
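Enumerating the segmentation schemes of a pinyin string can be sketched as a small recursive search. This is only an illustration: the `segmentations` helper is hypothetical, and `SYLLABLES` is a deliberately tiny stand-in for a full pinyin syllable table.

```python
def segmentations(s, syllables):
    """Return every way to split `s` into a sequence of known syllables."""
    if not s:
        return [[]]
    results = []
    for end in range(1, len(s) + 1):
        head = s[:end]
        if head in syllables:
            # Extend each segmentation of the remainder with this syllable.
            for rest in segmentations(s[end:], syllables):
                results.append([head] + rest)
    return results


# Tiny illustrative syllable set; a real input method would use the full table.
SYLLABLES = {"fang", "fan", "an", "gan", "jiu", "wan", "liang", "qian", "yuan"}

fangan = ["'".join(seg) for seg in segmentations("fangan", SYLLABLES)]
jiuwan = ["'".join(seg) for seg in segmentations("jiuwanliangqianyuan", SYLLABLES)]
```

With this syllable set, "fangan" yields both schemes from the text, while the longer string has a single scheme.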
In practical applications, lexicons such as the system lexicon and the user lexicon may be searched to obtain the words to be combined corresponding to each substring. For example, the words to be combined corresponding to "jiu'wan" may include "90,000" and "just play"; those corresponding to "liang'qian" may include "2,000" and "Liang Qian"; and those corresponding to "yuan" may include "yuan" and "institute". The parts of speech of "90,000", "just play", "2,000", "Liang Qian", "yuan" and "institute" are numeral, verb, numeral, noun, quantifier and noun, respectively.
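The lookup and the resulting word-combination paths can be sketched as follows. The `LEXICON` layout and the `enumerate_paths` helper are hypothetical; the entries reuse the English glosses and parts of speech of the example, with "yuan" standing for the currency quantifier, and the grouping of syllables into the three substrings follows the example rather than any general rule.

```python
from itertools import product

# Hypothetical lexicon: substring -> list of (word, POS) entries.
LEXICON = {
    "jiu'wan": [("90,000", "numeral"), ("just play", "verb")],
    "liang'qian": [("2,000", "numeral"), ("Liang Qian", "noun")],
    "yuan": [("yuan", "quantifier"), ("institute", "noun")],
}


def enumerate_paths(substrings, lexicon):
    """Build every word-combination path by choosing one lexicon entry per
    substring; a substring without entries yields no paths at all."""
    slots = [lexicon.get(sub, []) for sub in substrings]
    return [list(path) for path in product(*slots)]


paths = enumerate_paths(["jiu'wan", "liang'qian", "yuan"], LEXICON)
```

Here P = 6 words to be combined give 8 paths of n = 3 words each, which illustrates why P is usually greater than n.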
While combining the words to be combined, the embodiments of the present invention determine, according to the preset POS collocation rules and the POS of each word to be combined, the POS collocation score between adjacent words in the word-combination path corresponding to the words to be combined.
To help those skilled in the art better understand the embodiments of the present invention, an example of the intelligent word combination method of the present invention is provided here. The example may specifically include the following steps:
Step S1: receive the input string "jiuwanliangqianyuan".
Step S2: segment the input string to obtain the segmentation result "jiu'wan'liang'qian'yuan".
Step S3: search the lexicon to obtain the words to be combined corresponding to the segmentation result.
Step S4: combine the words to be combined to obtain the corresponding word-combination paths. Assume word-combination path 1 is "90,000 + 2,000 + yuan" and word-combination path 2 is "just play + Liang Qian + institute".
Step S5: determine, according to the preset POS collocation rules and the POS of each word to be combined, the POS collocation score between adjacent words in the word-combination path corresponding to the words to be combined.
In practical applications, the preset POS collocation rules may be used to score word-combination path 1 and word-combination path 2. For "90,000 + 2,000 + yuan", the pair "90,000 + 2,000" satisfies the collocation rule between numeral and numeral and therefore obtains score A, and the pair "2,000 + yuan" satisfies the collocation rule between numeral and quantifier and therefore also obtains score A, so the POS collocation score of "90,000 + 2,000 + yuan" is 2A. For "just play + Liang Qian + institute", the pair "just play + Liang Qian" satisfies the collocation rule between verb and noun and therefore obtains score B, while the pair "Liang Qian + institute" satisfies no preset POS collocation rule and obtains no score, so the POS collocation score of "just play + Liang Qian + institute" is B.
Step S6: determine the path score of each word-combination path according to the POS collocation scores between the adjacent words included in the path, and obtain word-combination candidates from the word-combination paths according to the path scores.
Assuming that neither word-combination path 1 nor word-combination path 2 hits a bigram relation, the corresponding path scores are 2A and B respectively. Since 2A is much greater than B, the candidate "92,000 yuan" corresponding to word-combination path 1, "90,000 + 2,000 + yuan", may be taken as the word-combination candidate.
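The arithmetic of this worked example can be reproduced with a short sketch. The `pos_path_score` helper and the tag names are hypothetical, the values A = 1 and B = 0.7 are the optional values given earlier, and the path score is taken to be the plain sum of the pairwise scores because no bigram relation is hit.

```python
A, B = 1.0, 0.7  # optional score levels from the text

RULES = {
    ("numeral", "numeral"): A,
    ("numeral", "quantifier"): A,
    ("verb", "noun"): B,
}


def pos_path_score(path, rules):
    """Sum the rule scores over adjacent POS pairs; unmatched pairs add nothing."""
    tags = [pos for _, pos in path]
    return sum(rules.get(pair, 0.0) for pair in zip(tags, tags[1:]))


path1 = [("90,000", "numeral"), ("2,000", "numeral"), ("yuan", "quantifier")]
path2 = [("just play", "verb"), ("Liang Qian", "noun"), ("institute", "noun")]

score1 = pos_path_score(path1, RULES)  # A + A
score2 = pos_path_score(path2, RULES)  # B + nothing
winner = path1 if score1 > score2 else path2
```

With these values path 1 scores 2A = 2.0 and path 2 scores B = 0.7, so path 1 is selected, matching the conclusion of the example.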
The embodiments of the present invention use the POS collocation score as a basis for the path score of a word-combination path, so that a word-combination path with strong POS collocation has a higher path score than one with weak POS collocation, which raises the probability that the former is chosen as a word-combination candidate. In other words, using the POS collocation score as a basis for the path score improves the reasonableness and quality of the word-combination candidates, so that relatively reasonable candidates can be provided even when intelligent word combination fails, which in turn improves the user's input efficiency.
It should be noted that, for brevity, the method embodiments are described as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and that the actions involved are not necessarily required by the embodiments of the present invention.
Device Embodiment
Referring to Fig. 3, a structural block diagram of an input device embodiment of the present invention is shown, which may specifically include: a content receiving module 301, a word POS acquisition module 302, a collocation score determining module 303, a path score determining module 304 and a word-combination candidate acquisition module 305.
The content receiving module 301 is configured to obtain the input content of the user.
The word POS acquisition module 302 is configured to obtain the words to be combined corresponding to the input content and the POS of each word to be combined.
The collocation score determining module 303 is configured to determine, according to the preset POS collocation rules and the POS of each word to be combined, the POS collocation score between adjacent words in the word-combination path corresponding to the words to be combined, where the preset POS collocation rules describe the collocation relation between parts of speech.
The path score determining module 304 is configured to determine the path score of the word-combination path according to the POS collocation scores between the adjacent words included in the path.
The word-combination candidate acquisition module 305 is configured to obtain word-combination candidates from the word-combination paths according to the path scores.
Optionally, the collocation score determining module 303 may include:
a POS determination submodule, configured to determine, according to the POS of each word to be combined, the POS of the adjacent words in the word-combination path corresponding to the words to be combined; and
a score determination submodule, configured to take, when the POS combination of the adjacent words satisfies a preset POS collocation rule, the score corresponding to that rule as the POS collocation score between the adjacent words.
Optionally, the device may further include a score acquisition module configured to obtain the score corresponding to the preset POS collocation rule.
The score acquisition module may include:
a POS collocation content submodule, configured to obtain, from a preset corpus, the POS collocation contents that satisfy the preset POS collocation rule;
a collocation probability statistics submodule, configured to count the collocation probability between the adjacent words in each POS collocation content; and
a score determination submodule, configured to determine the score corresponding to the preset POS collocation rule according to the collocation probabilities between the adjacent words in all the POS collocation contents.
Optionally, the input content may include an input string, in which case the device may further include:
a segmentation module, configured to segment the input string to obtain a corresponding segmentation result; and
a lexicon search module, configured to search a lexicon to obtain the words that match the segmentation result, as the words to be combined corresponding to the input string.
Optionally, the input content may further include the context corresponding to the input string, in which case the words to be combined corresponding to the input content may include the words to be combined corresponding to the input string and the context.
Optionally, the path score determining module 304 may include:
a first path score determination submodule, configured to obtain the path score of the word-combination path according to the POS collocation scores between all adjacent words included in the path; or
a second path score determination submodule, configured to obtain the path score of the word-combination path according to the POS collocation scores between all adjacent words included in the path and the bigram relation scores hit by the path.
Optionally, the device may further include:
a bigram library search module, configured to search a bigram library, according to the adjacent words in the word-combination path corresponding to the words to be combined, for a bigram relation that matches the adjacent words, before the collocation score determining module 303 determines the POS collocation score between the adjacent words according to the preset POS collocation rules and the POS of each word to be combined, and to trigger the collocation score determining module 303 when the search of the bigram library misses.
Optionally, the word-combination candidate acquisition module 305 may include:
a sorting submodule, configured to sort the path scores; and
a selection submodule, configured to select, according to the sorted path scores, the top N word-combination paths as word-combination candidates.
Optionally, the preset POS collocation rules may include at least one of: a collocation rule between numeral and numeral, a collocation rule between numeral and quantifier, a collocation rule between adverb and verb, a collocation rule between adverb and adjective, a collocation rule between verb and noun, a collocation rule between adjective and noun, and a collocation rule between quantifier and noun.
Because the device embodiment is basically similar to the method embodiments, its description is relatively brief; for related details, refer to the corresponding parts of the method embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the identical or similar parts of the embodiments may be referred to one another.
As for the device in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
Fig. 4 is a block diagram of a device 900 for intelligent word combination according to an exemplary embodiment. For example, the device 900 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and so on.
Referring to Fig. 4, the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914 and a communication component 916.
The processing component 902 generally controls the overall operation of the device 900, such as operations associated with display, telephone calls, data communication, camera operation and recording. The processing component 902 may include one or more processors 920 to execute instructions so as to complete all or part of the steps of the methods described above. In addition, the processing component 902 may include one or more modules that facilitate interaction between the processing component 902 and other components. For example, the processing component 902 may include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support operation of the device 900. Examples of such data include instructions for any application or method operated on the device 900, contact data, phone book data, messages, pictures, videos and so on. The memory 904 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disc.
The power supply component 906 provides power to the various components of the device 900. The power supply component 906 may include a power management system, one or more power supplies, and other components associated with generating, managing and distributing power for the device 900.
The multimedia component 908 includes a screen that provides an output interface between the device 900 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 908 includes a front camera and/or a rear camera. When the device 900 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or may have focusing and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC) that is configured to receive external audio signals when the device 900 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signals may be further stored in the memory 904 or sent via the communication component 916. In some embodiments, the audio component 910 also includes a speaker for outputting audio signals.
The I/O interface 912 provides an interface between the processing component 902 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 914 includes one or more sensors that provide status assessments of various aspects of the device 900. For example, the sensor component 914 can detect the open/closed state of the device 900 and the relative positioning of components, such as the display and keypad of the device 900. The sensor component 914 can also detect a change in position of the device 900 or of one of its components, the presence or absence of user contact with the device 900, the orientation or acceleration/deceleration of the device 900, and a change in temperature of the device 900. The sensor component 914 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may further include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the device 900 and other devices. The device 900 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 916 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 916 further includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 900 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 904 including instructions, where the instructions can be executed by the processor 920 of the device 900 to carry out the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an intelligent terminal, enable the intelligent terminal to perform an intelligent word-forming method, the method comprising: obtaining input content of a user; obtaining the words to be combined that correspond to the input content and the part of speech of each word to be combined; determining, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined, wherein the preset part-of-speech collocation rules describe the collocation relations between parts of speech; determining the path score of each word-forming path according to the part-of-speech collocation scores between the adjacent words it contains; and obtaining word-forming candidates from the word-forming paths according to the path scores.
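The method steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described by the patent: the rule table `POS_RULES`, the fallback value `DEFAULT_SCORE`, the decision to multiply pair scores into a path score, and all function names are hypothetical. The source only states that the path score is determined from the collocation scores of adjacent words.

```python
from heapq import nlargest
from itertools import product

# Hypothetical rule table: (left part of speech, right part of speech) -> score.
POS_RULES = {
    ("numeral", "quantifier"): 0.9,
    ("quantifier", "noun"): 0.8,
    ("adjective", "noun"): 0.7,
    ("verb", "noun"): 0.6,
}
DEFAULT_SCORE = 0.1  # assumed score for adjacent pairs that match no rule


def collocation_score(left_pos, right_pos):
    """Score of one adjacent pair, looked up by the parts of speech only."""
    return POS_RULES.get((left_pos, right_pos), DEFAULT_SCORE)


def path_score(path):
    """Multiply the collocation scores of all adjacent pairs (an assumed aggregation)."""
    score = 1.0
    for (_, left_pos), (_, right_pos) in zip(path, path[1:]):
        score *= collocation_score(left_pos, right_pos)
    return score


def word_forming_candidates(slots, top_n=3):
    """slots holds, for each position, the alternative (word, part of speech) pairs."""
    paths = product(*slots)  # each combination is one word-forming path
    best = nlargest(top_n, paths, key=path_score)
    return ["".join(word for word, _ in path) for path in best]


slots = [
    [("一", "numeral")],
    [("个", "quantifier"), ("各", "pronoun")],
    [("人", "noun")],
]
print(word_forming_candidates(slots, top_n=1))  # ['一个人']
```

In this toy input, the path 一/个/人 scores 0.9 × 0.8, while 一/各/人 matches no rule and falls back to the default twice, so the first path is returned as the candidate.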
Fig. 5 is a structural diagram of a server in some embodiments of the present invention. The server 1900 may vary considerably depending on configuration or performance, and may include one or more central processing units (CPUs) 1922 (for example, one or more processors), a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) that store application programs 1942 or data 1944. The memory 1932 and the storage medium 1930 may provide transient or persistent storage. A program stored in the storage medium 1930 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations on the server. Furthermore, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and to execute, on the server 1900, the series of instruction operations in the storage medium 1930.
The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input/output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
Other embodiments of the invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
The intelligent word-forming method, the intelligent word-forming apparatus, and the device for intelligent word forming provided by the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. Meanwhile, those of ordinary skill in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (11)

  1. An intelligent word-forming method, characterized by comprising:
    obtaining input content of a user;
    obtaining the words to be combined that correspond to the input content and the part of speech of each word to be combined;
    determining, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined, wherein the preset part-of-speech collocation rules describe the collocation relations between parts of speech;
    determining the path score of each word-forming path according to the part-of-speech collocation scores between the adjacent words included in the word-forming path;
    obtaining word-forming candidates from the word-forming paths according to the path scores.
  2. The method according to claim 1, characterized in that the step of determining the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined comprises:
    determining, according to the part of speech of each word to be combined, the parts of speech of adjacent words in the word-forming paths corresponding to the words to be combined;
    when the part-of-speech collocation of the adjacent words satisfies a preset part-of-speech collocation rule, taking the score corresponding to that preset part-of-speech collocation rule as the part-of-speech collocation score between the adjacent words.
  3. The method according to claim 1 or 2, characterized in that the score corresponding to a preset part-of-speech collocation rule is obtained as follows:
    obtaining, from a preset corpus, the part-of-speech collocation contents that satisfy the preset part-of-speech collocation rule;
    counting the collocation probability between adjacent words in each part-of-speech collocation content;
    determining the score corresponding to the preset part-of-speech collocation rule according to the collocation probabilities between adjacent words in all the part-of-speech collocation contents.
  4. The method according to claim 1 or 2, characterized in that the input content comprises an input string, and the method further comprises:
    segmenting the input string to obtain a corresponding segmentation result;
    searching a dictionary to obtain the words that match the segmentation result, as the words to be combined corresponding to the input string.
  5. The method according to claim 4, characterized in that the input content further comprises the context corresponding to the input string, and the words to be combined corresponding to the input content comprise the words to be combined corresponding to the input string and the context.
  6. The method according to claim 1 or 2, characterized in that the step of determining the path score of each word-forming path according to the part-of-speech collocation scores between the adjacent words included in the word-forming path comprises:
    obtaining the path score of the word-forming path according to the part-of-speech collocation scores between all adjacent words included in the word-forming path; or
    obtaining the path score of the word-forming path according to the part-of-speech collocation scores between all adjacent words included in the word-forming path and the scores of the bigram relations hit by the word-forming path.
  7. The method according to claim 1 or 2, characterized in that, before the step of determining, according to the preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined, the method further comprises:
    searching a bigram library according to the adjacent words in the word-forming paths corresponding to the words to be combined, to obtain the bigram relations that match the adjacent words;
    when the search of the bigram library misses, performing the step of determining, according to the preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined.
  8. The method according to claim 1 or 2, characterized in that the step of obtaining word-forming candidates from the word-forming paths according to the path scores comprises:
    sorting the path scores;
    selecting, according to the sorted path scores, the top N word-forming paths from the word-forming paths as the word-forming candidates.
  9. The method according to claim 1 or 2, characterized in that the preset part-of-speech collocation rules comprise at least one of: a collocation rule between numeral and numeral, a collocation rule between numeral and quantifier, a collocation rule between adverb and verb, a collocation rule between adverb and adjective, a collocation rule between verb and noun, a collocation rule between adjective and noun, and a collocation rule between quantifier and noun.
  10. An intelligent word-forming apparatus, characterized by comprising:
    a content receiving module, configured to obtain input content of a user;
    a word and part-of-speech obtaining module, configured to obtain the words to be combined that correspond to the input content and the part of speech of each word to be combined;
    a collocation score determining module, configured to determine, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined, wherein the preset part-of-speech collocation rules describe the collocation relations between parts of speech;
    a path score determining module, configured to determine the path score of each word-forming path according to the part-of-speech collocation scores between the adjacent words included in the word-forming path; and
    a word-forming candidate obtaining module, configured to obtain word-forming candidates from the word-forming paths according to the path scores.
  11. A device for intelligent word forming, characterized by comprising a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, and the one or more programs include instructions for performing the following operations:
    obtaining input content of a user;
    obtaining the words to be combined that correspond to the input content and the part of speech of each word to be combined;
    determining, according to preset part-of-speech collocation rules and the part of speech of each word to be combined, the part-of-speech collocation scores between adjacent words in the word-forming paths corresponding to the words to be combined, wherein the preset part-of-speech collocation rules describe the collocation relations between parts of speech;
    determining the path score of each word-forming path according to the part-of-speech collocation scores between the adjacent words included in the word-forming path;
    obtaining word-forming candidates from the word-forming paths according to the path scores.
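The rule-score estimation of claim 3 and the bigram-miss fallback of claim 7 can be illustrated with the following Python sketch. It is an illustration only: the claims do not define how a collocation probability is computed or how probabilities are combined into a rule score, so the conditional probability P(right word | left word) and the mean used here are assumptions, as are the function names `rule_score` and `adjacent_score` and the dictionary-based bigram library.

```python
from collections import Counter


def rule_score(tagged_corpus, rule):
    """Estimate one score for a part-of-speech rule such as ("numeral", "quantifier")."""
    left_counts = Counter()  # how often each word appears as the left member of a pair
    pair_counts = Counter()  # adjacent word pairs whose parts of speech satisfy the rule
    for sentence in tagged_corpus:
        for (w1, p1), (w2, p2) in zip(sentence, sentence[1:]):
            left_counts[w1] += 1
            if (p1, p2) == rule:
                pair_counts[(w1, w2)] += 1
    if not pair_counts:
        return 0.0
    # Assumed definition: collocation probability of a pair = P(right word | left word).
    probs = [n / left_counts[w1] for (w1, _), n in pair_counts.items()]
    # Assumed aggregation: the rule score is the mean over all matching pairs.
    return sum(probs) / len(probs)


def adjacent_score(left, right, bigram_library, rule_scores, default=0.0):
    """left and right are (word, part of speech); use the rule score only on a bigram miss."""
    hit = bigram_library.get((left[0], right[0]))
    if hit is not None:
        return hit
    return rule_scores.get((left[1], right[1]), default)


corpus = [
    [("two", "numeral"), ("cups", "quantifier"), ("tea", "noun")],
    [("two", "numeral"), ("people", "noun")],
]
rule = ("numeral", "quantifier")
scores = {rule: rule_score(corpus, rule)}
print(adjacent_score(("two", "numeral"), ("cups", "quantifier"), {}, scores))
```

With an empty bigram library the lookup misses and the rule score estimated from the toy corpus is used; passing a library that contains the pair ("two", "cups") would return the stored bigram score instead.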
CN201610996202.5A 2016-11-11 2016-11-11 Intelligent word forming method and device for intelligent word forming Active CN108073292B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610996202.5A CN108073292B (en) 2016-11-11 2016-11-11 Intelligent word forming method and device for intelligent word forming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610996202.5A CN108073292B (en) 2016-11-11 2016-11-11 Intelligent word forming method and device for intelligent word forming

Publications (2)

Publication Number Publication Date
CN108073292A true CN108073292A (en) 2018-05-25
CN108073292B CN108073292B (en) 2021-10-15

Family

ID=62153729

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610996202.5A Active CN108073292B (en) 2016-11-11 2016-11-11 Intelligent word forming method and device for intelligent word forming

Country Status (1)

Country Link
CN (1) CN108073292B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664143A (en) * 2018-09-06 2018-10-16 上海二三四五网络科技有限公司 A kind of control method and control device handling context association input in input method system
CN110209765A (en) * 2019-05-23 2019-09-06 武汉绿色网络信息服务有限责任公司 A kind of method and apparatus by semantic search key
CN110309513A (en) * 2019-07-09 2019-10-08 北京金山数字娱乐科技有限公司 A kind of method and apparatus of context dependent analysis
CN110781288A (en) * 2019-10-30 2020-02-11 安阳师范学院 Method and device for composing words by Chinese characters
CN110908523A (en) * 2018-09-14 2020-03-24 北京搜狗科技发展有限公司 Input method and device
CN112987941A (en) * 2019-12-17 2021-06-18 北京搜狗科技发展有限公司 Method and device for generating candidate words

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101013443A (en) * 2007-02-13 2007-08-08 北京搜狗科技发展有限公司 Intelligent word input method and input method system and updating method thereof
US20120297332A1 (en) * 2011-05-20 2012-11-22 Microsoft Corporation Advanced prediction
CN104182059A (en) * 2013-05-23 2014-12-03 华为技术有限公司 Generation method and system of natural language
CN104423623A (en) * 2013-09-02 2015-03-18 联想(北京)有限公司 To-be-selected word processing method and electronic equipment
CN104850241A (en) * 2015-05-28 2015-08-19 北京奇点机智信息技术有限公司 Mobile terminal and text input method thereof

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664143A (en) * 2018-09-06 2018-10-16 上海二三四五网络科技有限公司 A kind of control method and control device handling context association input in input method system
CN110908523A (en) * 2018-09-14 2020-03-24 北京搜狗科技发展有限公司 Input method and device
CN110209765A (en) * 2019-05-23 2019-09-06 武汉绿色网络信息服务有限责任公司 A kind of method and apparatus by semantic search key
CN110309513A (en) * 2019-07-09 2019-10-08 北京金山数字娱乐科技有限公司 A kind of method and apparatus of context dependent analysis
CN110781288A (en) * 2019-10-30 2020-02-11 安阳师范学院 Method and device for composing words by Chinese characters
CN112987941A (en) * 2019-12-17 2021-06-18 北京搜狗科技发展有限公司 Method and device for generating candidate words
CN112987941B (en) * 2019-12-17 2024-02-13 北京搜狗科技发展有限公司 Method and device for generating candidate words

Also Published As

Publication number Publication date
CN108073292B (en) 2021-10-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant