CN1043542C - A kanji conversion apparatus - Google Patents

A kanji conversion apparatus

Publication number
CN1043542C
CN1043542C CN93119055A CN93119055A CN1043542C CN 1043542 C CN1043542 C CN 1043542C CN 93119055 A CN93119055 A CN 93119055A CN 93119055 A CN93119055 A CN 93119055A CN 1043542 C CN1043542 C CN 1043542C
Authority
CN
China
Prior art keywords
word
syllable
conversion
chinese
register
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN93119055A
Other languages
Chinese (zh)
Other versions
CN1093184A (en
Inventor
周峻慧
谢明
林启杆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd

Landscapes

  • Document Processing Apparatus (AREA)
  • Machine Translation (AREA)

Abstract

The object is to improve the rate at which a KANJI conversion device, used for preparing Chinese sentences, converts input correctly into the KANJI (Chinese characters) the user requires. From the input phonetic character string, a syllable string in which no syllable has yet been converted is segmented as the object of the next KANJI conversion. A dictionary retrieval section 17 then searches a dictionary section 18 for a corresponding word, using the syllable string as the retrieval key. If there is no corresponding word, a consecutive word detecting section 20 detects whether the 1st syllable is a consecutive word (phrase word) such as a preposition or conjunction. If it is, the dictionary retrieval section 17 takes the syllables from the 2nd syllable onward as the retrieval key and searches the dictionary section 18 for a word that, joined to the consecutive word, forms a word with more KANJI. If such a word is found, an expansion word generation section 19 combines it with the consecutive word to make an expansion word with more KANJI. A conversion processing section 16 then converts the input into the expansion word in preference to the word retrieved from the dictionary.

Description

Chinese character conversion apparatus
The present invention relates to a Chinese character conversion apparatus that converts a phonetic character string into Chinese characters.
More than 10,000 Chinese characters are used in Chinese text. How to input the characters an author wants accurately and quickly, and how to convert the input into Chinese characters, is the most important problem in computer processing of Chinese, Chinese word processors included. Input means for conventional Chinese character conversion apparatuses include speech recognition, character recognition, and the keyboard; of these, keyboard input is the most accurate and reliable and is widely used. Keyboard input of Chinese characters is divided into input based on the pronunciation of a character and input based on its shape. Shape-based input requires the user to memorize input rules in advance, which takes considerable time, and it remains slow until the user becomes accustomed to it. Pronunciation-based input, on the other hand, is also widely adopted in Japanese word processors; because it is natural and easy to learn, it is expected to become the mainstream of Chinese character input in the future. The present invention therefore concerns a Chinese character conversion apparatus that adopts pronunciation-based input.
A conventional pronunciation-input Chinese character conversion apparatus is disclosed in Taiwan patent application No. 75105838. Fig. 6 is a block diagram of that apparatus. 100 is an input section for entering the phonetic characters (Pinyin, Zhuyin phonetic symbols, romanized spelling, and so on) of the Chinese characters the author wants; a string of any length (any number of phonetic characters) can be entered. 180 is a dictionary section in which phonetic character strings and the conversion words corresponding to them are registered in advance (permanently stored). When several Chinese words correspond to the same phonetic character string, they are arranged by the total stroke count of the word and by frequency of use or, for words consisting of a single character, by the frequency of use of that character, by character code order, and so on, and are converted with that priority. Of course, if the conversion result is not what the user wants, the next word or character can be output in order by a separate operation, just as in a Japanese word processor. A "phonetic character string" is, by nature, several phonetic characters treated as one unit that in principle converts to one word or one Chinese character, hence the term "string"; the concept also includes a single phonetic character. The same applies to "syllable string" and similar terms used below. Likewise, "word" includes words of a single Chinese character and, needless to say, is not limited to native Chinese words: it also covers words such as "Japan" and "Tokyo". 140 is an NCHAR register that stores the number of syllables in the input phonetic character string. 120 and 130 are a PTR register and an NP register, respectively, used when the phonetic character string is converted into words. The PTR register 120 stores the position in the input string from which conversion starts; the NP register 130 stores the length of the word being converted, that is, the number of Chinese characters or syllables constituting the word (in Chinese, one character is in principle one syllable). 150 is a comparing section: once conversion has been attempted for all words of a given length (a given number of constituent characters), it subtracts 1 from the value of the NP register 130 and activates the conversion control section, so that words with one fewer constituent character become the next conversion targets. 160 is a conversion control section: it moves the position set in the PTR register 120 step by step from the start of the input string toward its end, checks whether any syllable in the window, whose length is the syllable count set in the NP register 130, has already been converted and, if none has and a corresponding registered word exists in the dictionary section 180, converts the window into that word. 170 is a dictionary retrieval section that searches the dictionary section 180 using the syllable string sent from the conversion control section 160 as the key. 190 is an output section that outputs the result of the conversion by the conversion control section 160.
With this configuration, the Chinese words corresponding to the phonetic character string entered at the input section are converted and displayed in sequence by the so-called longest-match method, in which the number of constituent characters of a word is the first priority and earlier input syllables are the second priority, and a Chinese text is thus produced.
In the above apparatus, however, the unit of conversion is a word registered in the dictionary section, and conversion follows the longest-match method. Consequently, when candidate words have the same length (number of constituent characters) and a syllable in the input string forms one word with the syllable before it (the preceding word) and another with the syllable after it (the following word), the preceding word, which corresponds to the earlier input, is converted first. Only then are the remaining syllables of the following word, those not shared with the preceding word, converted, in most cases one character at a time. Because this conversion targets only the leftover syllables, and the dictionary holds several characters with the same syllable, misconversions such as the following occur. For example, suppose the user wants to input "having a day". The character for "a" has the homophone "benefit", so the phonetic character string input for "having a day" matches both the preceding word "useful" and the following word "a day". The preceding word "useful" is converted first, then "day", and the misconversion "useful day" results. Similarly, when "killing a person is crime" is intended, "criminal" has the homophone "model" and "Yes" has the homophone "showing"; since "showing" and "model" together form the two-character word "demonstration", there are a preceding word "demonstration" and a following word "crime". The preceding word "demonstration" is converted first, and the remaining character of "crime" has a homophone that is ranked above it, so a misconversion such as "demonstration of killing a person" occurs. Likewise, the Chinese for "its feature" can be misconverted into "peculiar levying". Registering in the dictionary in advance every Chinese expression that might be misconverted in this way is difficult in practice. A Chinese character conversion apparatus that prevents such misconversion is therefore desired, and the present invention was made to solve this problem.
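The longest-match behavior described above, and the first misconversion example, can be sketched as follows. This is a minimal Python illustration, not the cited apparatus: the function name `longest_match`, the toy dictionary, the Pinyin spellings, and the Chinese characters (reconstructed from the glosses "useful", "a day", and "day") are all assumptions made for the example.

```python
def longest_match(syllables, word_dict, max_len=8):
    """Prior-art style conversion: longer words first, earlier syllables first."""
    nchar = len(syllables)
    out = [None] * nchar      # out[i]: word emitted starting at syllable i
    done = [False] * nchar    # done[i]: syllable i has already been converted
    for np_ in range(min(nchar, max_len), 0, -1):   # word length, longest first
        ptr = 0                                     # 0-based start position
        while ptr + np_ <= nchar:
            window = tuple(syllables[ptr:ptr + np_])
            word = None
            if not any(done[ptr:ptr + np_]):
                word = word_dict.get(window)
            if word is None:
                ptr += 1                            # try the next start position
            else:
                out[ptr] = word
                done[ptr:ptr + np_] = [True] * np_
                ptr += np_                          # continue after the converted word
    # Syllables that never matched are passed through unconverted.
    return "".join(
        out[i] if out[i] is not None else syllables[i]
        for i in range(nchar)
        if out[i] is not None or not done[i]
    )


# Hypothetical toy dictionary (assumed entries, top-ranked candidate only).
WORDS = {
    ("you", "yi"): "有益",   # "useful"
    ("yi", "tian"): "一天",  # "a day"
    ("tian",): "天",         # "day"
}

print(longest_match(["you", "yi", "tian"], WORDS))
```

With these assumed entries the call yields 有益天 ("useful day") instead of the intended "having a day": the preceding word consumes the shared syllable, and only the leftover syllable is converted on its own.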
To achieve the above object, the present invention comprises:
(1) a syllable segmentation section that, for the syllables of the input phonetic character string not yet converted, or a part of them, cuts out the syllable string that becomes the current conversion target, giving first priority to conversion into the word with the most characters and second priority to earlier input syllables, reducing the number of target syllables one at a time and shifting the target syllables successively toward the end of the string;
(2) a dictionary section in which phonetic character strings and the Chinese words corresponding to them are registered in advance;
(3) a dictionary retrieval section that searches the dictionary section using the syllable string cut out by the syllable segmentation section as the retrieval key and detects the Chinese word;
(4) a phrase word detecting section that checks, in a prescribed order, whether the first syllable of the syllable string cut out by the syllable segmentation section is a phrase word and, if so, detects that phrase word;
(5) an expansion word generating section that, when the phrase word detecting section detects a phrase word, has the dictionary retrieval section search the dictionary section using the syllables from the second syllable of the current target string onward as the retrieval key and, if a corresponding word exists, combines the phrase word with that word to generate an expansion word with more characters; and
(6) a conversion processing section that controls conversion so that the expansion word generated by the expansion word generating section is converted in preference to the word retrieved by the dictionary retrieval section.
With the above structure, the present invention operates as follows. The syllable segmentation section cuts out the syllable string that is the current conversion target from the not-yet-converted syllables of the input phonetic character string, or a part of them, giving first priority to the word with the most characters and second priority to earlier input syllables, gradually reducing the number of target syllables and shifting the target toward the end of the string. Phonetic character strings and the Chinese words corresponding to them are registered in advance in the dictionary section. The dictionary retrieval section uses the cut-out syllable string as the retrieval key, searches the dictionary section, and retrieves the corresponding Chinese word. The phrase word detecting section checks, in a prescribed order, whether the first syllable of the cut-out syllable string is a phrase word and, if so, detects it. When a phrase word is detected, the expansion word generating section has the dictionary retrieval section search the dictionary section using the syllables from the second syllable of the current target string onward as the retrieval key and, if a corresponding word exists, combines the phrase word with that word to generate an expansion word. The conversion processing section converts the expansion word generated by the expansion word generating section in preference to the word retrieved by the dictionary retrieval section.
The present invention is described below with reference to an embodiment.
Fig. 1 is a block diagram of one embodiment of the Chinese character conversion apparatus of the present invention. Figs. 2 to 4 are processing flowcharts of this embodiment. In Fig. 1, 10 is an input section for entering the phonetic characters (Pinyin, Zhuyin phonetic symbols, romanized spelling, and so on) corresponding to the Chinese characters and text the author wants to convert. 18 is a dictionary section in which phonetic character strings and the Chinese words corresponding to them are registered in advance (permanently stored). When several words correspond to one phonetic character string, they are arranged in descending order of frequency of use and converted with that priority. 14 is an NCHAR register that holds the number of syllables in the input phonetic character string. Because in Chinese one character corresponds to one syllable, the number of syllables in the input string equals the number of characters to be converted. 12 and 13 are a PTR register and an NP register, respectively, used when the phonetic character string is converted into words. The PTR register 12 stores the position in the input string from which conversion is to start. The NP register 13 stores the length of the word that is the conversion target, that is, the number of characters (syllables) constituting that word. In other words, if the contents of the two registers are ptr and np, the np consecutive syllables starting from the ptr-th syllable of the input string form the retrieval key for the dictionary search (hereinafter np, ptr, and nchar denote the values of the NP, PTR, and NCHAR registers, respectively). 11 is a register initialization section. Its operation is detailed later; it counts the number of syllables nchar in the input string and sets that value in the NCHAR register 14. If this value is larger than max, the number of characters in the longest word registered in the dictionary section 18 (8 in this embodiment), it sets max in the NP register 13; if the value is 8 or less, it sets the syllable count of the input string in the NP register 13. It also sets the PTR register 12 to 1. 15 is a comparing section that continually checks whether (np + ptr) > (nchar + 1) holds and, if so, subtracts 1 from the NP register 13 and resets the PTR register 12 to 1. In this way words of a single character can also become conversion targets, and conversion for each target word length starts again from the beginning of the input string. 21 is a phrase word dictionary section that stores the pronunciations of specific phrase words together with the phrase words themselves. 20 is a phrase word detecting section that searches the phrase word dictionary section 21 using the ptr-th syllable of the input string as the retrieval key. 19 is an expansion word generating section: when the phrase word detecting section 20 detects a phrase word, it has the dictionary retrieval section 17 check whether a word corresponding to the np-1 syllables starting from the (ptr+1)-th syllable of the input string is registered in the dictionary section 18 and, if so, combines that word with the phrase word to generate the word the user originally wanted, namely the expansion word. 16 is a conversion processing section. Its operation is detailed later; it checks whether any of the np consecutive syllables starting from the ptr-th syllable of the input string has already been converted. If so, it adds 1 to the PTR register 12 so that the following unconverted syllables are processed. If none has been converted and a corresponding word exists, it converts the syllables into that word and adds np to ptr; if no corresponding word exists, it adds 1 to ptr so that the next syllable string is processed in sequence. 17 is a dictionary retrieval section that uses the syllables sent from the conversion processing section 16 or the expansion word generating section 19 as the retrieval key and fetches the matching word from the dictionary section 18; if there are several, it fetches the one arranged first as the most likely, and returns the fetched word to the conversion processing section 16 or the expansion word generating section 19. 22 is an output section that outputs the result of the conversion by the conversion processing section 16.
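The way the comparing section 15 steers the two registers can be sketched as a generator that lists the (np, ptr) pairs examined when nothing is converted. This is only an illustration under that simplifying assumption; the name `windows` is hypothetical, while the limit of 8 comes from the embodiment.

```python
def windows(nchar, max_len=8):
    """Yield (np, ptr) in the order visited when no window is ever converted."""
    np_ = min(nchar, max_len)   # initial NP: input length, capped at max
    ptr = 1                     # PTR is 1-based, as in the text
    while np_ > 0:
        yield np_, ptr
        ptr += 1                        # no conversion: move one syllable on
        if np_ + ptr > nchar + 1:       # the comparing section's test
            np_ -= 1                    # try words one character shorter
            ptr = 1                     # and restart from the first syllable


print(list(windows(3)))
```

For a three-syllable input this visits one window of length 3, two of length 2, and three of length 1, which is the longest-first, earliest-first order described above.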
The operation flow of this embodiment is described below. Figs. 2 to 4 were originally a single figure, divided into three because of the limited drawing size. Fig. 2 shows the flow centered on register initialization; the operation is described with reference to it.
(S1) First, clear the register that counts syllables to zero.
(S2) Input a phonetic character.
(S3) Check whether the phonetic character just input is a tone key. If it is, go to (S4); if not, go to (S5).
(S4) Since the number of input syllables has increased, add 1 to the register that counts syllables and return to (S2).
(S5) Check whether the phonetic character just input is the end-of-input key. If it is, go to (S6), from which Chinese character conversion becomes possible. If it is not, return to (S2).
(S6) Check whether the number of input syllables exceeds 8. If it is 9 or more, go to (S7). If it is 8 or less, go to (S8).
(S7) Set the NP register to 8.
(S8) Set the NP register to the number of input syllables.
(S9) Set the NCHAR register to the number of input syllables.
(S10) Set the PTR register to 1.
Initialization is thus complete: conversion will start from the beginning of the input phonetic character string, with the words that have the most syllables, that is, the most constituent characters, as the first conversion targets.
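Steps (S1) to (S10) can be sketched in Python as below. The key codes are assumptions made for the illustration: the text does not say which keys serve as tone keys or as the end-of-input key, so the digits "1" to "5" and a newline stand in for them, and the function names are hypothetical.

```python
TONE_KEYS = {"1", "2", "3", "4", "5"}  # assumed tone keys, one closes each syllable
END_KEY = "\n"                         # assumed end-of-input key
MAX_WORD_LEN = 8                       # longest dictionary word in the embodiment


def count_syllables(keys):
    """S1-S5: count tone keys until the end-of-input key is seen."""
    count = 0                      # S1
    for key in keys:               # S2
        if key in TONE_KEYS:       # S3
            count += 1             # S4
        elif key == END_KEY:       # S5
            break
    return count


def init_registers(syllable_count, max_len=MAX_WORD_LEN):
    """S6-S10: derive the initial NP, NCHAR and PTR values."""
    np_ = max_len if syllable_count > max_len else syllable_count  # S6-S8
    return {"np": np_, "nchar": syllable_count, "ptr": 1}          # S9, S10


keys = list("sha1ren2shi4fan4zui4de5\n")
print(init_registers(count_syllables(keys)))
```

Capping NP at the longest dictionary word avoids searching for words that cannot exist, so a long input starts directly with eight-syllable windows.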
Next, Figs. 3 and 4 show the flow centered on the conversion processing section; the operation is described with reference to them.
(S11) Cut out np consecutive syllables starting from the ptr-th syllable of the phonetic character string.
(S12) Check whether any of the syllables cut out in (S11) has already been converted. If so, go to (S16); the syllables in the window that are still unconverted become conversion targets at a later stage, when np is smaller. If none has been converted, go to (S13).
(S13) Search whether a word corresponding to the syllables cut out in (S11) is stored in the dictionary section. If so, go to (S14); if not, go to (S17).
(S14) Convert the np consecutive syllables starting from the ptr-th syllable of the input phonetic character string into the corresponding word found in (S13), then go to (S15).
(S15) Add np to the PTR register, so that the syllable string immediately following the one just converted becomes the next conversion target.
(S16) Add 1 to the PTR register, so that the next syllable string with the same number of syllables becomes the conversion target.
(S17) Compare the current value of the NP register with 2. If it is larger than 2, go to (S18); if it is smaller, go to (S16).
(S18) Check, in the prescribed order, whether the first of the syllables cut out in (S11) is the pronunciation of a phrase word. If it is, go to (S19); if not, go to (S16).
(S19) Check whether a word corresponding to the np-1 syllables starting from the second of the syllables cut out in (S11) is registered in the dictionary section. If it is, go to (S20); if not, go to (S16).
(S20) Join the phrase word detected in (S18) with the word found in (S19) into a single word, the expansion word; convert the np consecutive syllables starting from the ptr-th input syllable into the expansion word; then go to (S15).
(S21) Compare (np + ptr) with (nchar + 1). If the former is larger, go to (S22); otherwise, go to (S11).
(S22) Subtract 1 from the NP register 13, set the PTR register 12 to 1, and go to (S23). This makes words with one fewer constituent character the next conversion targets, starting again from the beginning of the input syllable string.
(S23) Check whether the value of the NP register 13 is zero. If it is, end the conversion of the input phonetic character string. If it is not, return to (S11).
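Steps (S11) to (S23) can be condensed into the following Python sketch. It is an illustration under stated assumptions, not the patented implementation: the names `convert`, `word_dict`, and `phrase_table` are hypothetical, the dictionary keeps only the top-ranked candidate per key, an NP value of exactly 2 is sent to (S16), and the sample entries use Pinyin without tone marks and Chinese characters reconstructed from the glosses in the text.

```python
MAX_WORD_LEN = 8  # characters in the longest dictionary word (embodiment value)


def convert(syllables, word_dict, phrase_table, max_len=MAX_WORD_LEN):
    nchar = len(syllables)
    out = [None] * nchar           # out[i]: text emitted starting at syllable i
    done = [False] * nchar         # done[i]: syllable i already converted
    np_ = min(nchar, max_len)      # S6-S8
    ptr = 1                        # S10 (1-based, as in the text)
    while np_ > 0:                 # S23
        lo, hi = ptr - 1, ptr - 1 + np_
        window = tuple(syllables[lo:hi])                  # S11
        word = None
        if not any(done[lo:hi]):                          # S12
            word = word_dict.get(window)                  # S13
            if word is None and np_ > 2:                  # S17 (assumed: 2 -> S16)
                phrase = next(                            # S18, in table order
                    (w for s, w in phrase_table if s == window[0]), None)
                rest = None
                if phrase is not None:
                    rest = word_dict.get(window[1:])      # S19
                if rest is not None:
                    word = phrase + rest                  # S20: expansion word
        if word is not None:
            out[lo] = word                                # S14 / S20
            done[lo:hi] = [True] * np_
            ptr += np_                                    # S15
        else:
            ptr += 1                                      # S16
        if np_ + ptr > nchar + 1:                         # S21
            np_ -= 1                                      # S22
            ptr = 1
    # Syllables that never matched are passed through unconverted.
    return "".join(
        out[i] if out[i] is not None else syllables[i]
        for i in range(nchar)
        if out[i] is not None or not done[i]
    )


# Assumed sample data for the input "sha ren shi fan zui de".
WORDS = {
    ("sha", "ren"): "杀人",  # "kills a person"
    ("ren", "shi"): "人事",
    ("shi", "fan"): "示范",  # "demonstration"
    ("fan", "zui"): "犯罪",  # "crime"
    ("de",): "的",
}
PHRASES = [("shi", "是")]    # phrase word glossed as "Yes"
SYLLABLES = ["sha", "ren", "shi", "fan", "zui", "de"]

print(convert(SYLLABLES, WORDS, PHRASES))   # with phrase word extension
print(convert(SYLLABLES, WORDS, []))        # plain longest match, for comparison
```

With the phrase table, the three-syllable pass converts "shi fan zui" as one expansion word before the two-syllable pass can pick "demonstration"; with an empty table the same loop reproduces the misconversion of the prior art.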
The phrase word dictionary section, a key part of the present invention, is described below.
Fig. 5 is a schematic diagram of the data structure of the phrase word dictionary section of this embodiment. The dictionary consists of a correspondence table between the pronunciations of basic phrase words and the phrase words themselves. In this embodiment, Chinese prepositions (" ", "from", "general", etc.), conjunctions ("then", "if", "with", "with", etc.), affirmative and negative words ("Yes", "no", "having", etc.), and demonstratives ("its", "being somebody's turn to", etc.) are all treated as phrase words. In Chinese, these phrase words attach to a word to form a word with more characters. In this specification a word formed in this way is called an expansion word; examples are "from three years old", "in Taipei", "unnecessary", and "having one day". In this embodiment such an expansion word is regarded as a single word (a so-called idiom) and is converted preferentially by this Chinese character conversion apparatus, which adopts the longest-match method.
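A correspondence table of this kind, together with the generation of an expansion word from it, might look like the sketch below. The table layout, the function names, the Pinyin spellings, and the Chinese characters (reconstructed from the glosses "from", "no", "having", "Yes", "a day", and "unnecessary") are assumptions for illustration; Fig. 5 itself is not reproduced here.

```python
# Correspondence table: (pronunciation, phrase word), scanned in table order.
PHRASE_TABLE = [
    ("cong", "从"),  # "from"
    ("bu", "不"),    # "no"
    ("you", "有"),   # "having"
    ("shi", "是"),   # "Yes"
]


def detect_phrase_word(sound, table=PHRASE_TABLE):
    """Return the first phrase word whose pronunciation matches, else None."""
    for pronunciation, phrase_word in table:
        if pronunciation == sound:
            return phrase_word
    return None


def make_expansion_word(window, word_dict, table=PHRASE_TABLE):
    """Join a leading phrase word with the dictionary word for the remaining syllables.

    `window` is assumed to be a non-empty sequence of syllables.
    """
    phrase_word = detect_phrase_word(window[0], table)
    if phrase_word is None:
        return None
    rest = word_dict.get(tuple(window[1:]))
    if rest is None:
        return None
    return phrase_word + rest


# Assumed ordinary dictionary entries.
WORDS = {("yi", "tian"): "一天", ("bi", "yao"): "必要"}

print(make_expansion_word(("you", "yi", "tian"), WORDS))
print(make_expansion_word(("bu", "bi", "yao"), WORDS))
```

Keeping the phrase words in a small separate table means an expansion word never has to be registered in the main dictionary: it is assembled on demand from one phrase word and one existing entry.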
The above configuration of this embodiment is explained below using concrete words as examples. (Because this application is filed through the electronic filing system, the usable characters are restricted. For ease of understanding, this specification uses "sha", "ren/", "shi", "fan", "zui", and "de." in place of the phonetic characters and marks shown in Fig. 7(a), (b), (c), (d), (e), and (f), respectively. In keeping with the rules of the electronic filing system, and although the invention concerns Chinese word processors and the like, the cumbersome use of large numbers of external (non-standard) characters is avoided.)
The operation for the input character string "sha ren/ shi fan zui de" is now described. All the words stored in the dictionary that correspond to this input string and could be converted are listed below.
"sha ren/": "kills a person"
"ren/ shi": "occurrences in human life"
"shi fan": "demonstration"
"fan zui": "crime"
The flow is now followed step by step.
When the above phonetic character string is input (S2), the register initialization section segments it into syllables according to the tone signals and obtains the syllable count (S3-S5). Since the input string has six syllables, the count 6 is set in the NCHAR register (S9). Because this value is smaller than the number of characters in the longest word in the dictionary section (8 in this embodiment) (S6), 6 is set as the initial value of the NP register (S8), and the PTR register is set to 1 (S10). The NP register holds the number of characters of the word sought in the current conversion pass. With the initial value 6, words of six syllables are the first conversion targets. The value ptr of the PTR register is the start position of the current dictionary search and designates the ptr-th syllable of the input phonetic character string. With the initial value 1, conversion targets are taken in order starting from the first syllable of the phonetic character string.
After the initial values of the PTR, NP, and NCHAR registers have been set from the input syllable count, the conversion processing section first cuts out, according to the PTR and NP values, the six consecutive syllables from the 1st to the 6th, "sha ren/ shi fan zui de", of the phonetic character string sent from the input section (S11), and checks whether any of them has already been converted (S12). The string has only just been input and none of its syllables has been converted, so all six syllables are sent to the dictionary retrieval section as the retrieval key and the dictionary section is searched (S13). No corresponding word exists and the current NP value is larger than 2 (S17), so the first syllable is checked against the phrase word pronunciations (S18). The first syllable is found not to be a phrase word, so the PTR register is simply incremented by 1 (S16). The comparing section now finds that (np + ptr) is larger than (nchar + 1) (S21), which means that no further syllable string can be cut out for six-syllable words. The NP register is therefore decremented to 5 and the PTR register is reset to 1 (S22). Since the NP value is not zero (S23), the next round of cutting out and conversion begins.
In the same way, with the PTR register at 1 and the NP register at 5, "sha ren/ shi fan zui" is cut out first; it cannot be converted, so ptr is incremented to 2 (S16). The string "ren/ shi fan zui de" cut out next (S11) likewise contains no converted syllable (S12), has no corresponding word in the dictionary section (S13), and does not begin with a phrase word pronunciation (S18), so no conversion takes place and the PTR register is incremented again, to 3 (S16). The comparing section then finds (np + ptr) > (nchar + 1) (S21): no further syllable string can be cut out for words of five characters. The NP register is therefore decremented to 4 and the PTR register is reset to 1 (S22). Since the NP value is not zero (S23), the next round of cutting out and conversion begins.
When the NP value is 4, the situation is the same as for 5. According to the PTR and NP values, the four-syllable strings "sha ren/ shi fan" and "ren/ shi fan zui" are cut out in turn from the input phonetic character string (S11); none of their syllables has been converted (S12), so the dictionary section is searched with them as retrieval keys. No corresponding words exist (S13) and their first syllables are not phrase word pronunciations (S18), so these syllables are not converted. At this stage the PTR value 2 is incremented by 1 again (S16). With ptr at 3, the third string "shi fan zui de" is cut out (S11); its syllables are also still unconverted (S12). No word corresponding to this syllable string is stored in the dictionary section (S13), and the NP value is larger than 2 (S17), so the first syllable is checked against the phrase word pronunciations (S18). The pronunciation "shi" is registered in the phrase word dictionary, so the dictionary section is searched for a word corresponding to the remaining syllables "fan zui de" (S19). No such word is registered, so no expansion word is generated and no conversion takes place. The PTR register is then incremented to 4 (S16). The comparing section finds (np + ptr) > (nchar + 1) (S21): no further syllable string can be cut out for four-syllable words. The NP register is therefore decremented to 3 and the PTR register is reset to 1 (S22). Since the NP value is not zero (S23), the next round of cutting out and conversion begins.
When the value of the NP register is 3, processing is the same as for 6, 5 and 4. According to the values of the PTR and NP registers, the consecutive syllable strings "sha ren/ shi" and "ren/ shi fan" are cut out in turn from the input phonetic character string (S11). None of these syllables has been converted yet (S12), so the dictionary portion is searched with them as the retrieval key, but no corresponding word exists (S13), and because their first syllables are not readings of phrase words (S18), no Chinese character conversion is performed. The PTR register is simply incremented by 1 each time (S16). The string "shi fan zui" that is cut out when the PTR register is 3 (S11) has not been converted either (S12), and no corresponding word exists in the dictionary portion (S13); because the value of the NP register is greater than 2 (S17), it is again checked whether the first syllable is the reading of a phrase word (S18). This time the phrase word dictionary portion contains the word 是 ("is") for the reading "shi". The dictionary portion is therefore searched for a word corresponding to the remaining syllables "fan zui" (S19). The corresponding word 犯罪 ("crime") is registered, so the phrase word 是 is combined with it to generate the expansion word 是犯罪 ("is a crime") (S20). In this way "shi fan zui" is converted into the expansion word 是犯罪. The value of the NP register is then added to the PTR register, so ptr becomes 6 (S15). The comparing section therefore finds (np+ptr)>(nchar+1) (S21), which means that no further syllable string can be cut out for the conversion of a 3-syllable word. The NP register is decremented by 1 to 2, and the PTR register is reset to 1 (S22). Since the value of the NP register is not zero (S23), processing continues with the cutting out and Chinese character conversion of the next consecutive syllables.
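The phrase word check in steps S17 to S20 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `make_expansion_word`, the two toy tables and the representation of syllables as list items are all hypothetical, and the tables hold only the entries mentioned in the example (是 "is", 犯罪 "crime").

```python
# Hypothetical toy tables; the real dictionary contents are not given in the text.
PHRASE_WORDS = {"shi": "是"}        # phrase word dictionary: reading -> character
DICTIONARY = {"fan zui": "犯罪"}    # ordinary word dictionary: reading -> word


def make_expansion_word(syllables):
    """Return the expansion word for a cut-out syllable window, or None.

    Follows steps S17-S20: the window must be longer than 2 syllables,
    its first syllable must be the reading of a phrase word, and the
    remaining syllables must form a word registered in the dictionary.
    """
    if len(syllables) <= 2:                            # S17: NP must exceed 2
        return None
    phrase = PHRASE_WORDS.get(syllables[0])            # S18: phrase word?
    if phrase is None:
        return None
    rest = DICTIONARY.get(" ".join(syllables[1:]))     # S19: look up the rest
    if rest is None:
        return None
    return phrase + rest                               # S20: combine the two


print(make_expansion_word(["shi", "fan", "zui"]))          # 是犯罪
print(make_expansion_word(["shi", "fan", "zui", "de"]))    # None
```

The second call mirrors the NP = 4 case above: "shi" is a phrase word, but "fan zui de" is not registered, so no expansion word is produced.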
When the value of the NP register is 2 and the value of the PTR register is 1, the consecutive syllable string "sha ren/" is cut out (S11). Because it has not been converted yet (S12), the dictionary search part searches the dictionary portion with it as the retrieval key. The corresponding word 杀人 ("to kill a person") is stored in the dictionary portion, so "sha ren/" is converted into 杀人, and the value of the NP register is added to the PTR register, which becomes 3 (S15). The comparing section now finds that (np+ptr) is smaller than (nchar+1) (S21), which means that further syllable strings can still be cut out for the conversion of 2-character words, so the consecutive syllable strings "shi fan", "fan zui" and "zui de" are cut out in turn (S11). Each of them, however, contains syllables that have already been converted (S12), so no processing is performed on them.
Next, when the value of the NP register is 1, the syllable that is cut out has already been converted whenever the value of the PTR register is 1, 2, 3, 4 or 5, so no processing is performed. When the value of the PTR register is 6, the syllable "de" cut out at this stage has not been converted yet, so it is passed to the dictionary search part and the dictionary portion is searched. In Chinese text, 的 is the most frequently used of the Chinese characters read "de" and is ranked first, so it is the first candidate for conversion, and "de" is converted into 的. The value of the NP register is then added to the PTR register, which becomes 7 (S15). The comparing section finds that (np+ptr) is greater than (nchar+1) (S21), so the NP register is decremented by 1 and becomes zero. The conversion process portion then judges that Chinese character conversion of the input phonetic character string is complete and sends the conversion result 杀人是犯罪的 ("killing a person is a crime") to the output part.
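The whole walkthrough, from NP = 6 down to NP = 0, can be summarised as a short Python sketch. It is a reading of steps S11 to S23 under several assumptions that are not stated in the text: the function name `convert` and the toy tables are hypothetical, NP is assumed to start at the number of input syllables (as in the 6-syllable example), each reading maps only to its highest-ranked candidate, and syllables that never match are left as typed.

```python
def convert(syllables, dictionary, phrase_words):
    """Convert a list of syllables to Chinese text, longest windows first."""
    nchar = len(syllables)        # NCHAR register: number of input syllables
    np_len = nchar                # NP register: window length (assumed start value)
    words = [None] * nchar        # converted word, stored at its start index
    done = [False] * nchar        # True once a syllable has been converted

    while np_len > 0:                                  # S23: stop when NP is zero
        ptr = 1                                        # S22: PTR restarts at 1
        while np_len + ptr <= nchar + 1:               # S21: window still fits
            lo, hi = ptr - 1, ptr - 1 + np_len
            window = syllables[lo:hi]                  # S11: cut out NP syllables
            word = None
            if not any(done[lo:hi]):                   # S12: nothing converted yet
                word = dictionary.get(" ".join(window))            # S13
                if word is None and np_len > 2:                     # S17
                    phrase = phrase_words.get(window[0])            # S18
                    if phrase is not None:
                        rest = dictionary.get(" ".join(window[1:]))  # S19
                        if rest is not None:
                            word = phrase + rest                    # S20
            if word is None:
                ptr += 1                               # S16: slide by one syllable
            else:
                words[lo] = word
                done[lo:hi] = [True] * np_len
                ptr += np_len                          # S15: skip the converted word
        np_len -= 1                                    # S22: try shorter windows

    # Emit converted words in order; unmatched syllables are kept as typed.
    return "".join(
        words[i] if words[i] is not None else syllables[i]
        for i in range(nchar)
        if words[i] is not None or not done[i]
    )


# Hypothetical toy tables holding only the entries used in the example.
DICTIONARY = {"sha ren/": "杀人", "fan zui": "犯罪", "de": "的"}
PHRASE_WORDS = {"shi": "是"}

print(convert("sha ren/ shi fan zui de".split(), DICTIONARY, PHRASE_WORDS))
```

With these tables the call prints 杀人是犯罪的, the result reached in the walkthrough: the expansion word 是犯罪 is fixed at NP = 3, before the 2-syllable pass gets a chance to match any word beginning at "shi".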
The present invention has been described above with reference to an embodiment, but it is of course not limited in any way by that embodiment. For example:
(1) Conversion of the entire input phonetic character string into Chinese characters need not begin only after the end-of-input key has been pressed. Each time a syllable is input, that is, each time a tone key is input, the phonetic character string from the position where the previous input ended up to the currently input characters may be converted.
(2) Instead of storing the phrase words independently in a phrase word dictionary, the phrase words may be identified by a mark and stored, together with that mark, in the dictionary portion.
(3) Depending on manufacturing circumstances and the like, each component recited in the claim may be physically divided into several parts, several components may be physically merged into one part, or they may be combined as appropriate. For example, in the embodiment the PTR register, the NP register, the comparing section and part of the processing of the conversion process portion 16 together correspond to the processing of the syllable cut-out portion.
(4) The notion of phonetic characters is not limited to the phoneme characters, syllabic characters and the like given as examples; it also includes phonetic symbols, and scripts such as Japanese kana and Korean hangul can also be objects of Chinese character conversion.
(5) As regards applications, the invention of course also covers kanji word processors for Japanese and the like.
(6) A learning function and the like may of course also be added.
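Variation (2), in which phrase words carry a mark inside the ordinary dictionary rather than living in a separate phrase word dictionary, might look like the following Python sketch. The `Entry` class, the `is_phrase_word` field and the `lookup_phrase_word` helper are hypothetical names chosen for illustration; the text does not specify how the mark is stored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One dictionary entry: the word plus the phrase word mark."""
    word: str
    is_phrase_word: bool = False   # the "mark" of variation (2), assumed boolean


# Hypothetical single dictionary that replaces the separate phrase word dictionary.
DICTIONARY = {
    "shi": Entry("是", is_phrase_word=True),
    "fan zui": Entry("犯罪"),
    "sha ren/": Entry("杀人"),
}


def lookup_phrase_word(syllable, dictionary=DICTIONARY):
    """Return the phrase word for a reading, or None if it is not marked."""
    entry = dictionary.get(syllable)
    if entry is not None and entry.is_phrase_word:
        return entry.word
    return None


print(lookup_phrase_word("shi"))        # 是
print(lookup_phrase_word("fan zui"))    # None
```

One lookup table then serves both the ordinary dictionary search and the phrase word detection, at the cost of an extra field per entry.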
As described above, according to the present invention, when an input syllable string is converted into Chinese text (Chinese characters), the syllables to be converted are checked for the reading of a phrase word, that is, a character that functions as a Chinese preposition, conjunction, affirmative word, negative word, designating word or the like. If such a reading is present and the syllables that follow it form a word which, together with the phrase word, makes up a word with more Chinese characters, the phrase word and that word are combined into an expansion word with more Chinese characters and more syllables, and this expansion word becomes the conversion object. Because Chinese character conversion gives priority to words made up of more Chinese characters, difficulties that arose under the conventional principle, which gave priority only to the number of Chinese characters and to the word that comes first, such as the mis-conversion of "having one day", "killing a person is a crime" and "its feature", can all be solved easily. The accuracy of Chinese character conversion when composing Chinese text is therefore high, and the technical effect is considerable.
Fig. 1 is a configuration diagram of an embodiment of the Chinese character conversion apparatus of the present invention;
Fig. 2 is a flowchart of the register initialization in the above embodiment;
Fig. 3 shows two flowcharts centered on the conversion process portion in the above embodiment;
Fig. 4 is a flowchart centered on the operation of the conversion process portion in the above embodiment;
Fig. 5 is a schematic diagram of the data structure of the phrase word dictionary in the above embodiment;
Fig. 6 is a configuration diagram of a conventional Chinese character conversion apparatus;
Fig. 7 is a list of the phonetic character strings used in explaining the embodiment.
The reference numerals in the figures have the following meanings. 10: input part, 11: register initialization portion, 12: PTR register, 13: NP register, 14: NCHAR register, 15: comparing section, 16: conversion process portion, 17: dictionary search part, 18: dictionary portion, 19: expansion word generating unit, 20: phrase word detecting element, 21: phrase word dictionary portion, 22: output part.

Claims (1)

1. A Chinese character conversion apparatus, characterized in that it comprises:
a syllable cut-out portion which, from the part of the syllables of the input phonetic character string that has not yet been converted, cuts out syllables as the current object of Chinese character conversion, starting from the maximum convertible word length and successively decreasing the number of syllables;
a dictionary portion in which phonetic character strings and their corresponding Chinese words are stored in advance;
a dictionary search part which uses the syllables cut out by the syllable cut-out portion as a retrieval key and retrieves Chinese words from the dictionary portion;
a phrase word detecting element which detects whether the first syllable of the syllables cut out for conversion by the syllable cut-out portion is a phrase word;
an expansion word generating unit which, when the phrase word detecting element has detected a phrase word and the dictionary search part, searching the dictionary portion with the remaining syllables from the second syllable onward as the retrieval key, has found a corresponding word, combines the phrase word and that word into a larger expansion word; and
a conversion process portion which performs control so that the expansion word generated by the expansion word generating unit is converted in preference to the word retrieved by the dictionary search part.
CN93119055A 1993-04-01 1993-10-22 A kanji conversion apparatus Expired - Fee Related CN1043542C (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP05-075912 1993-04-01
JP075912/93 1993-04-01
JP07591293A JP3234338B2 (en) 1993-04-01 1993-04-01 Kanji conversion device

Publications (2)

Publication Number Publication Date
CN1093184A CN1093184A (en) 1994-10-05
CN1043542C true CN1043542C (en) 1999-06-02

Family

ID=13590022

Family Applications (1)

Application Number Title Priority Date Filing Date
CN93119055A Expired - Fee Related CN1043542C (en) 1993-04-01 1993-10-22 A kanji conversion apparatus

Country Status (2)

Country Link
JP (1) JP3234338B2 (en)
CN (1) CN1043542C (en)

Also Published As

Publication number Publication date
JP3234338B2 (en) 2001-12-04
JPH06290183A (en) 1994-10-18
CN1093184A (en) 1994-10-05

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee