CN1614584A - Electronic dictionary and its data structure forming method and spelling information determining method - Google Patents

Electronic dictionary and its data structure forming method and spelling information determining method

Info

Publication number
CN1614584A
Authority
CN
China
Prior art keywords
phonetic
character
phrase
acquiescence
chinese
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200310114889
Other languages
Chinese (zh)
Inventor
杨大为
金浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc
Priority to CN 200310114889
Publication of CN1614584A

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

A Chinese electronic dictionary and a method for forming its data structure. A default pinyin is determined for each Chinese character, and a non-default pinyin table is created for each Chinese character that has multiple pinyin readings. For a phrase that contains such a character and whose pinyin cannot be formed from the default pinyin of each of its characters, auxiliary pinyin information is formed that indicates which pinyin in the non-default pinyin table is selected.

Description

Electronic dictionary, data structure formation method thereof, and pinyin information determination method
Technical field
The present invention relates to a data structure formation method used in a Chinese electronic dictionary that stores pinyin information for Chinese phrases, to a Chinese electronic dictionary comprising a memory that stores the data structure so formed, and to a method, used in said Chinese electronic dictionary, for determining the pinyin of a Chinese phrase.
Background art
Electronic dictionaries have come into widespread use in recent years. An electronic dictionary is a device equipped with a memory that stores a large amount of phrase information and index information, so that information about a Chinese phrase can be looked up.
In most electronic dictionary systems, pinyin information is a very important part of the phrase information. In general, each Chinese phrase in an electronic dictionary has its own pinyin information, and that information is stored directly in the dictionary. For example, the pinyin of the phrase 附庸风雅 ("mingling with men of letters and posing as a lover of culture") is "fu4yong1feng1ya3". In a traditional electronic dictionary, the pinyin information "fu4yong1feng1ya3" is therefore stored directly after this phrase and occupies 16 bytes of storage.
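The 16-byte figure can be checked with a minimal sketch. It assumes one byte per character of the tone-numbered pinyin string (single-byte ASCII); the source does not name the encoding, and the function name is hypothetical.

```python
# Traditional storage: the tone-numbered pinyin string is kept verbatim
# after the phrase. One byte per character (ASCII) is an assumption.
def traditional_pinyin_size(pinyin: str) -> int:
    """Return the number of bytes needed to store the pinyin string as ASCII."""
    return len(pinyin.encode("ascii"))


size = traditional_pinyin_size("fu4yong1feng1ya3")
print(size)
```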
With this storage method, the pinyin information of phrases occupies a considerable part of the dictionary's storage space. Because of constraints on capacity and weight, however, the storage space of an electronic dictionary is generally limited.
Using memory more efficiently therefore becomes very important. Reducing the storage required by the dictionary is especially valuable when the electronic dictionary is installed in a device with very limited memory, such as a PDA.
Summary of the invention
An object of the present invention is to solve the above problem by providing a data structure formation method for a Chinese electronic dictionary in which auxiliary pinyin information of phrases is stored. The invention also provides a Chinese electronic dictionary comprising a memory that stores the data structure so formed, so that pinyin information is stored more effectively and storage space is used more efficiently.
Another object of the present invention is to solve the above problem by providing a method for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary that stores auxiliary pinyin information of phrases, thereby improving storage efficiency.
To achieve these objects, according to one aspect of the present invention, there is provided a data structure formation method for the pinyin information of phrases in a Chinese electronic dictionary, comprising the steps of: determining a default pinyin for each Chinese character; creating a non-default pinyin table for each Chinese character that has multiple pinyin readings; and forming auxiliary pinyin information which, when the pinyin of a phrase cannot be formed from the default pinyin of each Chinese character in the phrase, indicates the pinyin selected from the non-default pinyin table for each character in the phrase that has multiple pinyin readings.
To achieve these objects, according to another aspect of the present invention, there is provided an electronic dictionary having a memory, the memory comprising: a default pinyin for each Chinese character; a non-default pinyin table for each Chinese character that has multiple pinyin readings; and auxiliary pinyin information which, when the pinyin of a phrase cannot be formed from the default pinyin of each Chinese character in the phrase, indicates the pinyin selected from the non-default pinyin table for each character in the phrase that has multiple pinyin readings.
To achieve these objects, according to a further aspect of the present invention, there is provided a method for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary that stores auxiliary pinyin information of phrases, the method comprising the steps of: obtaining the auxiliary pinyin information, which, when the pinyin of the phrase cannot be formed from the default pinyin of each Chinese character in the phrase, indicates the pinyin selection for each character in the phrase that has multiple pinyin readings; and selecting, according to the obtained auxiliary pinyin information, the pinyin of each character whose pinyin cannot be given directly by its default pinyin.
The present invention also provides a computer program recorded on at least one computer-readable medium, comprising program code for performing the above data structure formation method for a Chinese electronic dictionary and the above method for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary that stores auxiliary pinyin information of phrases.
Description of drawings
Other objects, features and advantages of the present invention will become clearer from the following detailed description of the preferred embodiments. The accompanying drawings, which form a part of this specification, illustrate embodiments of the invention and, together with the description, serve to explain its principles. In the drawings:
Fig. 1 is a flowchart of the data structure formation method for a Chinese electronic dictionary according to the present invention;
Fig. 2 is a schematic example of the default pinyin and non-default pinyin of some Chinese characters stored in a Chinese electronic dictionary according to the present invention;
Fig. 3 is a schematic example of the data structure of the auxiliary pinyin information of some phrases stored in a Chinese electronic dictionary according to the present invention;
Fig. 4 compares, for the Chinese phrases shown in Fig. 3, the storage occupied by traditional pinyin information with that occupied by auxiliary pinyin information according to the present invention; and
Fig. 5 is a flowchart of the method according to the present invention for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary.
Embodiment
The preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Most Chinese characters have only one pinyin; for example, the pinyin of the character 饼 ("cake") is "bing3". Some Chinese characters, however, have several; for example, the character 差 ("poor") has the readings "cha4", "cha1", "chai1" and "ci1". In some cases a character takes the neutral tone within a phrase; for example, the pinyin of the character 热 ("heat") is "re4" and the pinyin of the character 闹 ("make a noise") is "nao4", but the pinyin of the phrase 热闹 ("liven up") is "re4nao5".
To handle these cases, the Chinese electronic dictionary according to the present invention stores, for every Chinese character (including characters with multiple pinyin readings), one pinyin called the "default pinyin". For each character with multiple pinyin readings, a lookup table called the "non-default pinyin table" is additionally created and stored in the dictionary.
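The two per-character tables can be pictured with a minimal sketch. The dictionary names and layout are hypothetical, the Chinese characters are inferred from the pinyin given in the text, and treating "cha4" (the first reading listed) as the default is an assumption; the actual contents are those of Fig. 2, which is not reproduced here.

```python
# Hypothetical in-memory layout; names and contents are illustrative only.
# Every character has exactly one default pinyin.
DEFAULT_PINYIN = {
    "饼": "bing3",  # "cake"
    "差": "cha4",   # "poor"; assumed default = first reading listed in the text
    "热": "re4",    # "heat"
    "闹": "nao4",   # "make a noise"
}

# Only characters with several readings get a non-default list.
# List position 0 holds the first non-default pinyin, position 1 the second, ...
NON_DEFAULT_PINYIN = {
    "差": ["cha1", "chai1", "ci1"],
}


def readings(ch: str) -> list[str]:
    """All readings of a character, default first."""
    return [DEFAULT_PINYIN[ch]] + NON_DEFAULT_PINYIN.get(ch, [])
```

A character with a single reading needs no entry in the second table, so that table stays small.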
When a single Chinese character is displayed, its default pinyin is shown on the screen together with the character.
For a Chinese phrase containing more than one Chinese character, however, the method according to the present invention stores only the auxiliary pinyin information of the phrase in the Chinese electronic dictionary. This auxiliary pinyin information takes up only a few bytes of storage, or none at all. Unlike a traditional electronic dictionary, which stores the pinyin information of each phrase directly, this structure uses the dictionary's storage space efficiently.
The data structure formation method used in a Chinese electronic dictionary that stores only the auxiliary pinyin information of phrases is described below with reference to the flowchart in Fig. 1.
Fig. 1 is a flowchart of the data structure formation method in a Chinese electronic dictionary that stores auxiliary pinyin information of phrases according to the present invention.
As shown in Fig. 1, the processing flow starts at step SP101. At step SP102, the number N of Chinese characters in the phrase whose auxiliary pinyin information is to be determined is counted.
Then, at step SP103, it is determined whether N is greater than 1. If N is not greater than 1, the flow advances to step SP114, which indicates that no additional auxiliary pinyin information needs to be stored for this word, and the flow ends.
If it is determined at step SP103 that N is greater than 1, the flow advances to step SP104. At step SP104, it is judged whether every Chinese character in the phrase takes only its default pinyin, with no other choice possible. If so, the flow advances to step SP106. At step SP106, the auxiliary pinyin information of the phrase is set to empty; the flow then advances to step SP114 and returns, so that other phrases to be stored in the Chinese electronic dictionary can be processed. In other words, if the pinyin of a phrase can be formed from the "default pinyin" of its characters, no pinyin-related information is stored for the phrase at all, which saves storage space in the electronic dictionary.
If, however, it is determined at step SP104 that not every Chinese character in the phrase takes only its default pinyin, that is, at least one character in the phrase has multiple pinyin readings or takes the neutral tone in this phrase, the flow advances to step SP105.
At step SP105, the counter M, which represents the position of a character within the phrase, is initialized to 1.
Next, at step SP107, the M-th character of the phrase is obtained. At step SP108, it is judged whether the pinyin of this M-th character is its default pinyin, with no other choice possible. If the pinyin of the M-th character is its default pinyin, the auxiliary pinyin information of the phrase is left unchanged and the flow advances to step SP111.
If it is determined at step SP108 that the pinyin of the M-th character is not its default pinyin and another choice applies, that is, the character has multiple pinyin readings or takes the neutral tone in this phrase, the flow advances to step SP109. At step SP109, the M-th bit of the first N bits of the phrase's auxiliary pinyin information is set to 1. Here the phrase is assumed to contain N Chinese characters, and the first N bits of the auxiliary pinyin information indicate which characters of the phrase do not take their default pinyin.
Then, at step SP110, the auxiliary pinyin information of the M-th character is set in the following format. In the auxiliary pinyin information, the first N bits, which mark the characters whose pinyin is not the default pinyin, are followed by the auxiliary pinyin information of each such character. Four bits are enough to hold the auxiliary pinyin information of one Chinese character. Of these 4 bits, the first 3 select the character's pinyin from the "non-default pinyin table", and the last bit is the "neutral-tone flag". Once the auxiliary pinyin information of this character has been set, the flow advances to step SP111.
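The 4-bit per-character field can be sketched as follows. The function name is hypothetical, and the reading that a 3-bit value of 0 means "keep the default pinyin" while values 1 to 7 pick the first to seventh non-default pinyin is an assumption inferred from the bit strings quoted later in the text.

```python
def pack_char_field(table_index: int, neutral_tone: bool) -> str:
    """Build the 4-bit field for one character that deviates from its default.

    table_index:  0 = keep the default pinyin (assumed), 1..7 = position in the
                  non-default pinyin table; stored in the first 3 bits.
    neutral_tone: True sets the trailing neutral-tone flag bit.
    """
    if not 0 <= table_index <= 7:
        raise ValueError("table index must fit in 3 bits")
    return format(table_index, "03b") + ("1" if neutral_tone else "0")


print(pack_char_field(0, True))   # default pinyin, neutral-tone flag set
print(pack_char_field(1, False))  # first non-default pinyin, flag clear
```

Three selector bits limit a character to seven non-default readings under this interpretation.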
At step SP111, M is incremented by 1. At step SP112, M is compared with N. If M is not greater than N, the flow returns to step SP107 to process the next character of the phrase in the same way as described above.
If M is greater than N, the flow advances to step SP113. At step SP113, the auxiliary pinyin information of the phrase is adjusted so that it occupies a whole number of bytes: if the length of the effective information is not a whole number of bytes, invalid padding bits are appended at its end.
The auxiliary pinyin information so formed is then stored after the phrase, and at step SP114 the flow returns to continue with the other phrases to be stored in the Chinese electronic dictionary.
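The flow of Fig. 1 can be condensed into a short sketch. This is an illustration under stated assumptions, not the patented implementation: the function name and input format are hypothetical, the bit string is returned as text for readability, and zero is assumed as the value of the invalid padding bits.

```python
def encode_aux(chars):
    """Form the auxiliary pinyin bit string for one phrase.

    chars: one entry per character; None means the character takes its default
           pinyin, otherwise a (table_index, neutral_tone) pair.
    """
    n = len(chars)                                   # SP102: count characters
    if n <= 1 or all(c is None for c in chars):      # SP103 / SP104
        return ""                                    # SP106: nothing to store
    head = "".join("0" if c is None else "1" for c in chars)   # SP109: N flag bits
    body = ""
    for c in chars:                                  # SP107..SP112: per character
        if c is not None:
            index, neutral = c
            body += format(index, "03b") + ("1" if neutral else "0")  # SP110
    bits = head + body
    bits += "0" * (-len(bits) % 8)                   # SP113: pad to whole bytes
    return bits


# Two characters, second one takes the neutral tone of its default pinyin.
print(encode_aux([None, (0, True)]))
```

Calling `encode_aux([(1, False), (1, False), None, None])` gives a 16-bit string: 4 flag bits, two 4-bit fields and 4 padding bits.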
Examples of the auxiliary pinyin information of some phrases stored in a Chinese electronic dictionary according to the present invention are shown in Fig. 2 and Fig. 3.
Fig. 2 shows the default and non-default pinyin of some Chinese characters stored in a Chinese electronic dictionary according to the present invention. Fig. 3 shows the data structure of the auxiliary pinyin information of some phrases stored in the dictionary.
Referring to Fig. 2 and Fig. 3, for the phrase "bagpipe", the pinyin of both characters is the "default pinyin", so its auxiliary pinyin information is empty. For the phrase 热闹 ("liven up"), the pinyin of the character 闹 ("make a noise") within the phrase is obtained by changing its default pinyin "nao4" to the neutral tone, so the auxiliary pinyin information is "01000100". For the phrase "hand drill", the pinyin of the character "drill" is the first "non-default pinyin", so the auxiliary pinyin information of the phrase is "00100100".
For the phrase 差强人意 ("barely satisfactory"), the pinyin of the character 差 ("poor") is the first "non-default pinyin" and the pinyin of the character 强 ("by force") is the second "non-default pinyin"; the auxiliary pinyin information of the character 差 is "0010", and that of the character 强 is given as "0010". In addition, the first two bits of the auxiliary pinyin information are set to "1", so that 12 effective bits in total represent the pinyin information of the phrase. Four invalid bits therefore have to be appended after these 12 effective bits so that the auxiliary pinyin information occupies a whole number of bytes, giving "1100001000100000" as the auxiliary pinyin information of the phrase.
The "default pinyin" and "non-default pinyin" of the Chinese characters mentioned above are shown as examples in Fig. 2, and the auxiliary pinyin information of the Chinese phrases mentioned above is shown as examples in Fig. 3.
Fig. 4 compares, for the example Chinese phrases shown in Fig. 3, the storage occupied by traditional pinyin information with that occupied by auxiliary pinyin information according to the present invention.
Referring to Fig. 4, the new pinyin information (auxiliary pinyin information) of the phrase "bagpipe" is empty and occupies 0 bytes of storage, whereas its original size in a traditional electronic dictionary is 8 bytes; 8 bytes are therefore saved. For the phrase "liven up", the new pinyin information (auxiliary pinyin information) is "01000100", which occupies 1 byte, whereas its original size in a traditional electronic dictionary is 7 bytes; 6 bytes are therefore saved.
Similarly, the auxiliary pinyin information of the phrase "hand drill" is "00100100", which occupies 1 byte, saving 13 bytes compared with its original size of 14 bytes in a traditional electronic dictionary. For the phrase "barely satisfactory", the auxiliary pinyin information is "1100001000100000", which occupies 2 bytes, saving 15 bytes compared with its original size of 17 bytes.
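The savings quoted for the four example phrases can be recomputed from the sizes and bit strings given in the text. The sketch below only does that arithmetic; the variable and function names are hypothetical.

```python
# (phrase gloss, traditional size in bytes, auxiliary bit string), as quoted in the text
examples = [
    ("bagpipe", 8, ""),
    ("liven up", 7, "01000100"),
    ("hand drill", 14, "00100100"),
    ("barely satisfactory", 17, "1100001000100000"),
]


def saved_bytes(traditional: int, aux_bits: str) -> int:
    """Bytes saved when the padded auxiliary bit string replaces the pinyin string."""
    return traditional - len(aux_bits) // 8


savings = {name: saved_bytes(trad, bits) for name, trad, bits in examples}
print(savings)
```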
As these comparisons show, the storage method according to the present invention stores pinyin information more effectively and improves the efficiency with which the memory is used. For example, the basic dictionary for the NLP (natural language processing) module proposed by the inventors contains about 70000 phrases in total, the longest of which contains 4 Chinese characters. Of these phrases, 33525 contain no Chinese character with multiple pinyin readings, and 28014 contain exactly one such character. This means that at least 33525 phrases do not need to have their pinyin information stored, and the auxiliary pinyin information of 28014 phrases occupies only 1 byte. Compared with the traditional method, the storage space used for pinyin information is therefore significantly reduced.
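The 1-byte figure follows from the layout described earlier: N flag bits plus 4 bits per deviating character, rounded up to whole bytes. A minimal sketch of that size calculation, with a hypothetical function name:

```python
import math


def aux_size_bytes(n_chars: int, n_non_default: int) -> int:
    """Bytes of auxiliary pinyin information for one phrase.

    n_chars:       number of characters N in the phrase
    n_non_default: number of characters that do not take their default pinyin
    """
    if n_chars <= 1 or n_non_default == 0:
        return 0  # nothing is stored
    return math.ceil((n_chars + 4 * n_non_default) / 8)


# Phrases of 2 to 4 characters with a single deviating character fit in one byte.
print([aux_size_bytes(n, 1) for n in range(2, 5)])
```

With at most 4 characters per phrase, one deviating character needs at most 4 + 4 = 8 bits, which is why such phrases never exceed a single byte.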
Next, the method for determining the pinyin of Chinese phrases whose auxiliary pinyin information has been stored in the electronic dictionary in advance, as described above, is explained with reference to Fig. 5.
Fig. 5 is a flowchart of the method according to the present invention for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary that stores auxiliary pinyin information of phrases.
As shown in Fig. 5, the processing flow starts at step SP201. At step SP202, the number N of Chinese characters in the phrase whose pinyin information is to be determined is counted.
Then, at step SP203, it is determined whether N is greater than 1. If N is not greater than 1, the phrase contains only one Chinese character, and the flow advances to step SP216. At step SP216, the default pinyin information of that character is obtained. The flow then advances to step SP214 and ends.
If it is determined at step SP203 that N is greater than 1, the flow advances to step SP204. At step SP204, it is judged whether every Chinese character in the phrase takes only its default pinyin, with no other possibility; that is, whether the auxiliary pinyin information stored in the electronic dictionary is empty. If it is empty, every character in the phrase takes its default pinyin, and the flow advances to step SP206. At step SP206, the pinyin information of the phrase is obtained simply by concatenating the default pinyin of each character; the flow then advances to step SP214, completing the pinyin determination.
If, however, it is determined at step SP204 that not every Chinese character in the phrase takes its default pinyin, that is, at least one character in the phrase has multiple pinyin readings or takes the neutral tone in this phrase, the flow advances to step SP205.
At step SP205, the counter M, which represents the position of a character within the phrase, is set to 1.
Next, at step SP207, the M-th character of the phrase is obtained. At step SP208, it is judged whether this M-th character takes only its default pinyin, with no other possibility; that is, whether the M-th bit of the first N bits of the phrase's auxiliary pinyin information is 1. Here the Chinese phrase is assumed to contain N Chinese characters, and the first N bits of its auxiliary pinyin information indicate which characters of the phrase do not take their default pinyin.
If it is determined at step SP208 that the M-th bit of the first N bits of the auxiliary pinyin information is 0, the pinyin of the M-th character is its default pinyin and can be obtained from it directly. The flow advances to step SP215, where the default pinyin of the M-th character is obtained, and then to step SP211, described below.
If, however, it is determined at step SP208 that the M-th bit of the first N bits of the auxiliary pinyin information is 1, the pinyin of the M-th character is not its default pinyin and another choice applies; that is, the character has multiple pinyin readings or takes the neutral tone in this phrase. The flow advances to step SP209.
At step SP209, the auxiliary pinyin information for the M-th character is extracted from the auxiliary pinyin information of the phrase.
Then, at step SP210, the pinyin information of the M-th character is obtained from its auxiliary pinyin information. As described above, the first N bits of the auxiliary pinyin information, which mark the characters whose pinyin is not the default pinyin, are followed by the auxiliary pinyin information of each such character. Four bits are enough to determine the pinyin information of one character: the first 3 bits select the character's pinyin from the "non-default pinyin table", and the last bit is the "neutral-tone flag". That is, if the first 3 bits are not all 0, the pinyin information of the character is selected from the "non-default pinyin table" according to their value. If the 4th bit is 1, the neutral-tone flag is set, and the pinyin information of the character is obtained by changing its default pinyin to the neutral tone. After this processing, the flow advances to step SP211.
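Step SP210 can be sketched for a single character as follows. The function name is hypothetical. Two points are assumptions rather than statements from the text: the neutral tone is written by replacing the trailing tone digit with "5" (as in "re4nao5"), and when both a table selection and the flag are present the flag is applied to the selected reading.

```python
def apply_char_field(field: str, default: str, non_default: list) -> str:
    """Resolve one character's pinyin from its 4-bit auxiliary field.

    field:       4-character bit string, e.g. "0010"
    default:     the character's default pinyin, e.g. "cha4"
    non_default: the character's non-default pinyin table, first entry first
    """
    index = int(field[:3], 2)               # first 3 bits: table selector
    pinyin = non_default[index - 1] if index != 0 else default
    if field[3] == "1":                     # 4th bit: neutral-tone flag
        pinyin = pinyin[:-1] + "5"          # assumed notation: tone digit -> 5
    return pinyin


print(apply_char_field("0001", "nao4", []))
print(apply_char_field("0010", "cha4", ["cha1", "chai1", "ci1"]))
```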
At step SP211, M is incremented by 1. At step SP212, M is compared with N. If M is not greater than N, the flow returns to step SP207 and the remaining characters of the phrase are processed in the same way as described above; a detailed description is omitted here.
If M is greater than N, the flow advances to step SP213. At step SP213, the pinyin information of the phrase is obtained by concatenating the pinyin information of each character obtained through the above processing.
The processing flow for determining the pinyin information of the phrase then ends and returns at step SP214.
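The whole flow of Fig. 5 can be put together in a compact sketch. It is an illustration only: the function and table names are hypothetical, the table contents are inferred from the pinyin quoted in the text, the auxiliary information is passed as a text bit string, and the neutral tone is written as tone digit 5 by assumption.

```python
DEFAULT = {"热": "re4", "闹": "nao4", "差": "cha4", "饼": "bing3"}
NON_DEFAULT = {"差": ["cha1", "chai1", "ci1"]}  # assumed table order


def decode_phrase(chars: str, aux_bits: str) -> str:
    """Determine the pinyin of a phrase from its stored auxiliary bit string."""
    n = len(chars)                                   # SP202
    if n <= 1 or not aux_bits:                       # SP203 / SP204
        return "".join(DEFAULT[c] for c in chars)    # SP216 / SP206
    pos, out = n, []                                 # fields start after N flag bits
    for ch, flag in zip(chars, aux_bits[:n]):        # SP207..SP212
        if flag == "0":
            out.append(DEFAULT[ch])                  # SP215: default pinyin
            continue
        field = aux_bits[pos:pos + 4]                # SP209: this character's field
        pos += 4
        index = int(field[:3], 2)                    # SP210: table selector
        pinyin = NON_DEFAULT[ch][index - 1] if index else DEFAULT[ch]
        if field[3] == "1":                          # neutral-tone flag
            pinyin = pinyin[:-1] + "5"
        out.append(pinyin)
    return "".join(out)                              # SP213: concatenate


print(decode_phrase("热闹", "01000100"))
```

Trailing padding bits are never read, because the loop consumes exactly one 4-bit field per set flag bit.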
As described above, the method according to the present invention for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary that stores auxiliary pinyin information of phrases makes it easy to determine the pinyin information of a phrase, and the storage space used for pinyin information is significantly reduced compared with the traditional method.
It should be noted that although the present invention has been described in the context of a fully functional implementation of the method, those skilled in the art will appreciate that the processes of the present invention can be distributed in the form of instructions on a computer-readable medium and in a variety of other forms of functional descriptive material, and that the present invention applies equally regardless of the particular type of signal-bearing medium actually used to carry out the distribution. Examples of computer-readable media include recordable-type media such as floppy disks, hard disk drives, RAM, CD-ROMs and DVD-ROMs, and transmission-type media such as digital and analog communication links, and wired or wireless communication links using transmission forms such as radio-frequency and light-wave transmissions. The computer-readable media may take the form of coded formats that are decoded for actual use in a particular data processing system. Functional descriptive material is information that imparts functionality to a machine; it includes, but is not limited to, computer programs, instructions, rules, facts, definitions of computable functions, objects and data structures.
The description of the present invention has been presented for purposes of illustration and description and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art. The embodiments were chosen and described in order to best explain the principles of the invention and its practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.

Claims (16)

1. A data structure formation method for the pinyin information of phrases in a Chinese electronic dictionary, comprising the steps of:
determining a default pinyin for each Chinese character;
creating a non-default pinyin table for each Chinese character that has multiple pinyin readings; and
forming auxiliary pinyin information which, when the pinyin of a phrase cannot be formed from the default pinyin of each Chinese character in the phrase, indicates the pinyin selected from the non-default pinyin table for each Chinese character in the phrase that has multiple pinyin readings.
2. The data structure formation method for the pinyin information of phrases as claimed in claim 1, characterized in that: if the pinyin of every character in the phrase is its default pinyin, the auxiliary pinyin information is empty.
3. The data structure formation method for the pinyin information of phrases as claimed in claim 1, characterized in that: if the phrase contains at least one character whose pinyin is not its default pinyin, the first N bits of the auxiliary pinyin information indicate which characters do not take their default pinyin, and the following bits hold the auxiliary pinyin information of each character whose pinyin is not its default pinyin, where N is the number of characters in the phrase.
4. The data structure formation method for the pinyin information of phrases as claimed in claim 3, characterized in that: if the pinyin of the m-th character in the phrase is not its default pinyin, the m-th bit of the first N bits of the auxiliary pinyin information is set to 1; and 4 bits are used to hold the auxiliary pinyin information of each character whose pinyin is not its default pinyin, of which 3 bits are used to select the character's pinyin from the non-default pinyin table and 1 bit is a neutral-tone flag indicating that the pinyin of the character is to be changed from its default pinyin to the neutral tone.
5. The data structure formation method for the pinyin information of phrases as claimed in claim 4, characterized in that: if the length of the effective information in the auxiliary pinyin information is not an integral multiple of a byte, invalid information is appended to the end of the auxiliary pinyin information.
6. A Chinese electronic dictionary having a memory, characterized in that the memory comprises:
a default pinyin for each Chinese character;
a non-default pinyin table for each Chinese character that has multiple pinyin readings; and
auxiliary pinyin information which, when the pinyin of a phrase cannot be formed from the default pinyin of each Chinese character in the phrase, indicates the pinyin selected from the non-default pinyin table for each Chinese character in the phrase that has multiple pinyin readings.
7. The Chinese electronic dictionary as claimed in claim 6, characterized in that: if the pinyin of every character in the phrase is its default pinyin, the auxiliary pinyin information is empty.
8. The Chinese electronic dictionary as claimed in claim 6, characterized in that: if the phrase contains at least one character whose pinyin is not its default pinyin, the first N bits of the auxiliary pinyin information indicate which characters do not take their default pinyin, and the following bits hold the auxiliary pinyin information of each character whose pinyin is not its default pinyin, where N is the number of characters in the phrase.
9. The Chinese electronic dictionary as claimed in claim 8, characterized in that: if the pinyin of the m-th character in the phrase is not its default pinyin, the m-th bit of the first N bits of the auxiliary pinyin information is set to 1; and 4 bits are used to hold the auxiliary pinyin information of each character whose pinyin is not its default pinyin, of which 3 bits are used to select the character's pinyin from the non-default pinyin table and 1 bit is a neutral-tone flag indicating that the pinyin of the character is to be changed from its default pinyin to the neutral tone.
10. The Chinese electronic dictionary as claimed in claim 9, characterized in that: if the length of the effective information in the auxiliary pinyin information is not an integral multiple of a byte, invalid information is appended to the end of the auxiliary pinyin information.
11. A method for determining the pinyin of a Chinese phrase in a Chinese electronic dictionary, the Chinese electronic dictionary storing auxiliary pinyin information of phrases, the method comprising the steps of:
obtaining the auxiliary pinyin information, which indicates, when the pinyin of the phrase cannot be formed from the default pinyin of each Chinese character in the phrase, how the pinyin of a Chinese character in the phrase that has multiple pinyin is selected; and
selecting, according to the obtained auxiliary pinyin information, the pinyin of each character whose pinyin cannot be represented directly by its default pinyin.
12. The method for determining the pinyin of a Chinese phrase as claimed in claim 11, characterized in that: for a character whose pinyin is not its default pinyin, if the neutral-tone flag is set in the auxiliary pinyin information of the character, the pinyin of the character is obtained by changing the default pinyin of the character to the neutral tone; otherwise, the pinyin of the character is selected from the non-default pinyin table according to the selection information contained in the auxiliary pinyin information of the character.
13. The method for determining the pinyin of a Chinese phrase as claimed in claim 12, characterized in that: the default pinyin of each Chinese character is stored in the Chinese electronic dictionary in advance.
14. The method for determining the pinyin of a Chinese phrase as claimed in claim 12, characterized in that: the non-default pinyin table for Chinese characters that have multiple pinyin is created in advance and stored in the Chinese electronic dictionary.
15. A computer program product recorded on at least one computer-readable medium, comprising functional descriptive material that, when used by a computer, causes the computer to perform the method steps of any one of claims 1-5.
16. A computer program product recorded on at least one computer-readable medium, comprising functional descriptive material that, when used by a computer, causes the computer to perform the method steps of any one of claims 11-14.
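
The bit layout recited in the claims (N flag bits, then one 4-bit field per flagged character, then padding to a byte boundary) can be illustrated with a short round-trip sketch. This is not the applicant's implementation. The claims do not fix several details, so the following are assumptions made only for illustration: bits are packed most-significant-bit first, the 3-bit selector precedes the 1-bit neutral-tone flag inside each 4-bit field, padding bits are zero, pinyin is written with a trailing tone digit where 5 means the neutral tone, and the decoder takes N from the length of the phrase. The names `encode_aux`, `decode_aux`, `to_neutral`, `DEFAULT` and `NON_DEFAULT` are hypothetical, and the choice of which reading is the "default" in the sample tables is likewise an assumption.

```python
from typing import Dict, List, Sequence

NEUTRAL_TONE = "5"  # assumed notation: trailing tone digit, 5 = neutral tone

# Illustrative tables only; which reading counts as "default" is an assumption.
DEFAULT: Dict[str, str] = {
    "银": "yin2", "行": "xing2", "桌": "zhuo1",
    "子": "zi3", "中": "zhong1", "国": "guo2",
}
NON_DEFAULT: Dict[str, List[str]] = {
    "行": ["hang2", "hang4", "heng2"],
    "中": ["zhong4"],
}


def to_neutral(pinyin: str) -> str:
    """Swap the trailing tone digit for the assumed neutral-tone digit."""
    return pinyin.rstrip("12345") + NEUTRAL_TONE


def encode_aux(chars: str, pinyin: Sequence[str]) -> bytes:
    """Build the auxiliary pinyin information for one phrase."""
    flags: List[int] = []   # first N bits: 1 = character uses a non-default pinyin
    fields: List[int] = []  # one 4-bit field per flagged character
    for ch, py in zip(chars, pinyin):
        if py == DEFAULT[ch]:
            flags.append(0)
        elif py == to_neutral(DEFAULT[ch]):
            flags.append(1)
            fields.append(0b0001)      # selector unused, neutral-tone flag set
        else:
            index = NON_DEFAULT[ch].index(py)
            if index > 7:
                raise ValueError("a 3-bit selector can address only 8 entries")
            flags.append(1)
            fields.append(index << 1)  # selector in the high 3 bits, flag clear
    if not fields:
        return b""                     # all default: empty auxiliary information
    bits = flags + [(f >> s) & 1 for f in fields for s in (3, 2, 1, 0)]
    bits += [0] * (-len(bits) % 8)     # pad to a whole number of bytes
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )


def decode_aux(chars: str, aux: bytes) -> List[str]:
    """Recover the phrase pinyin from its characters and auxiliary information."""
    if not aux:
        return [DEFAULT[ch] for ch in chars]
    bits = [(byte >> s) & 1 for byte in aux for s in range(7, -1, -1)]
    pos = len(chars)                   # 4-bit fields start after the N flag bits
    result: List[str] = []
    for ch, flag in zip(chars, bits):
        if not flag:
            result.append(DEFAULT[ch])
            continue
        selector = (bits[pos] << 2) | (bits[pos + 1] << 1) | bits[pos + 2]
        neutral = bits[pos + 3]
        pos += 4
        result.append(to_neutral(DEFAULT[ch]) if neutral else NON_DEFAULT[ch][selector])
    return result


print(encode_aux("银行", ["yin2", "hang2"]).hex())                 # 40
print(decode_aux("桌子", encode_aux("桌子", ["zhuo1", "zi5"])))     # ['zhuo1', 'zi5']
```

Under these assumptions a two-character phrase with one non-default reading needs 2 + 4 = 6 valid bits, so two padding bits bring it to a single byte, while a phrase whose characters all take their default pinyin stores no auxiliary information at all.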

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200310114889 CN1614584A (en) 2003-11-07 2003-11-07 Electronic dictionary and its data structure forming method and spelling information determining method

Publications (1)

Publication Number Publication Date
CN1614584A true CN1614584A (en) 2005-05-11

Family

ID=34760242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200310114889 Pending CN1614584A (en) 2003-11-07 2003-11-07 Electronic dictionary and its data structure forming method and spelling information determining method

Country Status (1)

Country Link
CN (1) CN1614584A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102033859A (en) * 2009-09-28 2011-04-27 佳能株式会社 Method and system for compressing dictionary and processing words, text-to-speech system and electronic equipment
CN102033859B (en) * 2009-09-28 2013-04-10 佳能株式会社 Method and system for compressing dictionary and processing words, text-to-speech system and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Open date: 20050511