WO2006115825A2

WO2006115825A2 - Abbreviated handwritten ideographic entry phrase by partial entry

Info

Publication number: WO2006115825A2
Application number: PCT/US2006/014051
Authority: WO
Inventors: Jianchao Wu; Lu Zhang; Jenny Huang-Yu Lai; Michael R. Longe
Original assignee: Tegic Communications, Inc.
Priority date: 2005-04-25
Filing date: 2006-04-14
Publication date: 2006-11-02
Also published as: CN1862472A; TWI313425B; KR20080007261A; CN1862472B; JP5037491B2; TW200703063A; WO2006115825A3; JP2008539477A

Abstract

A computer receives incomplete entry of an intended ideographic language multi-character phrase via a user input tool. The incomplete entry is incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase. The computer receives user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase. The computer compares the incompletely entered phrase as represented by the received user input and any skipped characters to a prescribed vocabulary of reference phrases, and identifies any reference phrases that include the incompletely entered phrase in the indicated order of intended characters. The computer presents human-readable output of the identified reference phrases to the user.

Description

ABBREVIATED HANDWRITING ENTRY OF IDEOGRAPHIC LANGUAGE PHRASE BY SUBMITTING LESS-THAN-ALL CHARACTERS AND/OR LESS-THAN-ALL STROKES OF ANY GIVEN CHARACTER(S)

BACKGROUND OF THE INVENTION

1. Field of the Invention

[1001] The present invention relates to computer-driven systems for users to enter ideographic language text. More particularly, the invention concerns a computer-driven system for the abbreviated user input of ideographic language phrases by submitting less-than-all characters and/or less-than-all strokes of any given character(s).

2. Description of the Related Art

[1002] Digital devices are widespread today. People commonly use desktop computers, notebook computers, cell phones, personal data assistants (PDAs), and many more such devices. Broadly, each of these is a different implementation of a computer. In these devices, it is essential to provide the human user with a suitably reliable, convenient, expedient method of submitting input to the computer. For this reason, engineers have developed an incredible variety of keyboards, mice, trackballs, joysticks, digitizing surfaces, speech recognition systems, eye gaze tracking systems, and many more.

[1003] User entry of ideographic languages has always been a special challenge. Unlike English, where words are broken down to an alphabet of twenty-six constituent letters, written languages such as Chinese use thousands of different characters. Engineers have approached this challenge by developing solutions employing many different technologies. Some examples include huge character-based keyboards, intricate computer menu systems for use with normal keyboards and mice, stick and button entry tools, handwriting digitizers, and many more.

[1004] For many, handwriting digitizers are the tool of choice, offering the convenience and natural feel of handwriting input. Still, it has been nearly impossible to overcome the slow speed of known handwriting digitizers, owing to the complexity of Chinese and other ideographic languages where some characters contain more than twenty strokes. In addition, individual variations in style and slant add to the challenge, making handwriting recognition on computers and mobile devices prone to error, forcing the user to write even more clearly and carefully, and further reducing the speed of input.

SUMMARY OF THE INVENTION

[1005] A computer receives incomplete entry of an intended ideographic language multi-character phrase via a user input tool. The incomplete entry is incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase. The computer receives user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase. The computer compares the incompletely entered phrase as represented by the received user input and any skipped characters to a prescribed vocabulary of reference phrases, and identifies any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters. The computer presents human-readable output of the identified reference phrases to the user.

BRIEF DESCRIPTION OF THE DRAWINGS

[1006] FIGURE 1A is a block diagram of the hardware components and interconnections of an ideographic language handwriting entry and processing system.

[1007] FIGURE 1B illustrates various examples of stored maps and vocabularies.

[1008] FIGURE 2 is a block diagram of a digital data processing machine.

[1009] FIGURE 3 shows an exemplary signal-bearing medium.

[1010] FIGURE 4 is a perspective view of an exemplary logic circuit.

[1011] FIGURE 5 is a flowchart of a general operational sequence for receiving and processing of an incomplete ideographic language phrase.

[1012] FIGURE 6 is a diagram of an exemplary incomplete ideographic language phrase including various incomplete characters and various delimiters.

[1013] FIGURE 7 is a diagram showing an exemplary sequence for comparing an intended character sequence to a phrase vocabulary, and identifying matching reference phrases.

[1014] FIGURE 8 is a flowchart showing a more specific operational sequence for receiving and processing of an incomplete ideographic language phrase.

DETAILED DESCRIPTION

[1015] The nature, objectives, and advantages of the invention will become more apparent to those skilled in the art after considering the following detailed description in connection with the accompanying drawings.

HARDWARE COMPONENTS & INTERCONNECTIONS

Overall Structure

[1016] One aspect of the present disclosure concerns an ideographic language handwriting entry and processing system. This system may be embodied by various hardware components and interconnections, with one example being described by the system 100 of FIGURE 1A.

[1017] More particularly, with reference to FIGURE 1A, the system 100 includes a display 102, data entry tool 104, processor 106, and storage 108. In one example, the display 102 comprises a relatively small LCD display of a PDA. However, the display 102 may be implemented by another size or configuration of LCD display, CRT, plasma display, or any other device for receiving a machine-readable input signal and providing a human-readable output. In one example, the data entry tool 104 comprises a handwriting digitizing component of a PDA. In this respect, the tool 104 may include a handwriting digitizing surface such as a touch screen, digitizing pad, or virtually any other digitizing surface configured to receive a user's handwriting submitted with stylus, pen, pencil, finger, etc. In addition, or as an alternative, the tool 104 may include a different gesture-input tool such as a mouse, roller ball stylus, track ball, pointing stick, or other mechanism appropriate to the application at hand. Furthermore, the tool 104 may include a combination of the foregoing devices. In one example, where the tool 104 is implemented as a handwriting digitizing surface, the display 102 and tool 104 may be co-located such that the digitizing surface overlies the display 102.

[1018] In one example, the storage 108 comprises micro-sized flash memory of the type used in compact applications such as PDAs. However, the storage 108 may be implemented by a variety of hardware such as magnetic media (such as tape or disk storage), firmware, electronic non-volatile memory (such as ROM or EPROM or flash PROM or EEPROM), volatile memory such as RAM, optical storage, and virtually any apparatus for storing machine-readable data as appropriate to the application discussed herein. As to data structure, components in the storage 108 may be implemented by linked lists, lookup tables, relational databases, or any other useful data structure.

[1019] As illustrated, the storage 108 includes certain subcomponents, namely, a phrase vocabulary 110, character vocabulary 111, character stroke map 112, and character phonetic map 113. The system 100 may be configured to implement one or multiple ideographic character sets, but for ease of explanation it is described in the context of a single installed character set such as simplified Chinese. For the installed character set, then, the vocabulary 110 contains a listing of ideographic phrases. This listing may be taken or derived from various known standards, extracted from a corpus, etc. The vocabulary 110 may be fixed at manufacture of the system 100, or downloaded upon installation or boot-up or reconfiguration or another suitable time. The vocabulary 110 may self-update to gather new phrases from time to time, by consulting users' previous input, the Internet, a wireless network, or another source.

[1020] In FIGURE 1B, ref. 152a (and alternatively ref. 152b) shows two exemplary Chinese language phrases that could reside in the vocabulary 110. The two phrases of 152a are examples of a simplified Chinese character set. The two phrases of 152b are examples of a traditional Chinese character set. In one example, where simplified Chinese is the installed character set, then the vocabulary 110 would include the characters of 152a.

[1021] Analogous to the phrase vocabulary 110, the character vocabulary 111 contains a listing of recognized ideographic characters, for the installed character set. In one example, where simplified Chinese is the installed character set, the vocabulary 111 may contain individual characters of 152a.

[1022] Optionally, one or both of the vocabularies 110-111 may include data regarding usage frequency of the characters or phrases. This data may be contained in the vocabularies 110-111 or stated elsewhere with appropriate links to the related characters and/or phrases in the vocabularies 110-111. In one embodiment, the usage frequency is stated in a linguistic model (not shown), which broadly indicates usage frequency of characters (and/or phrases) relative to other characters (and/or phrases), or the probability that the user intends to select that character (or phrase) next. Frequency may be determined by the number of occurrences of the character in written text or in conversation; by the grammar of the surrounding sentence; by its occurrence following the preceding character or characters; by the context in which the system is currently being used, such as typing names into a phonebook application; by its repeated or recent use in the system (the user's own frequency or that of some other source of text); or by any combination thereof. In addition, a character may be prioritized by the probability that a matching component occurs in the character at that point in the entered stroke sequence. In another embodiment, usage frequency is based on the usage of characters or phrases by a particular user, or in a particular context, such as a message or article being composed by the user. In this example, frequently used characters or phrases become more likely characters or phrases.

[1023] For some or all of the characters in the vocabulary 111, the character stroke map 112 includes a cross-reference between that character and a listing of its constituent strokes. Optionally, the map 112 includes position information relative to other strokes and alternative strokes in shape (font) and order. Further, the map 112 may accommodate one or multiple stroke orders for a given character.
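By way of illustration only, and without limitation, the cross-reference held in the map 112 might be sketched as a lookup table keyed by character. The stroke codes, the names STROKE_MAP, stroke_orders, and stroke_count, and the second stroke order shown for one character are assumptions made for this sketch; the disclosure does not prescribe any particular encoding or data structure.

```python
# Hypothetical stroke codes; the disclosure does not prescribe any encoding.
H, V, L, R = "horizontal", "vertical", "left-falling", "right-falling"

# Sketch of the character stroke map 112: each character of the vocabulary 111
# is cross-referenced to one or more accepted stroke orders.
STROKE_MAP = {
    # The second order is a hypothetical variant, included only to show
    # that a character may carry more than one recorded stroke order.
    "十": [(H, V), (V, H)],
    "人": [(L, R)],
    "大": [(H, L, R)],
    "木": [(H, V, L, R)],
}


def stroke_orders(character):
    """Return every recorded stroke order for a character (empty if unmapped)."""
    return STROKE_MAP.get(character, [])


def stroke_count(character):
    """Return the stroke count of the first recorded order, or 0 if unmapped."""
    orders = stroke_orders(character)
    return len(orders[0]) if orders else 0
```

A lookup table is used here because it is one of the data structures the description names for the storage 108; a linked list or relational table could serve equally well.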

[1024] In FIGURE 1B, ref. 154 shows an exemplary Chinese character and its constituent strokes. In this example, the character of 154 is listed in the character vocabulary 111 and the constituent strokes are listed in the map 112 in association with the character. In this example, the map 112 may also include one or more stroke orders, specifying an order of entering the strokes of 154.

[1025] Similarly, the map 113 includes a cross-reference between each character and a phonetic representation, that is, phonetic symbols, ruby characters, or other written symbols representing pronunciation of the given character according to a recognized system such as Bopomofo, Pinyin, etc. In the illustrated example, the phonetic representations of the map 113 include the appropriate tone(s) in addition to the phonetic symbol(s). In FIGURE 1B, refs. 156-158 show a sample traditional Chinese character 154 and its representation according to Bopomofo symbol (156) and tone (158) representation.

[1026] As an alternative to the setup 100, applications with sufficiently large storage 108 may omit the character vocabulary 111. In this embodiment, each phrase in the vocabulary 110 is broken down directly into constituent strokes (112) and phonetic information (113).

[1027] Referring to FIGURE 1A, one example of the processor 106 is a digital data processing entity of the type utilized in PDAs. However, in a more general sense, the function of the processor 106 may be implemented by one or more hardware devices, software devices, a portion of one or more hardware or software devices, or a combination of the foregoing without limitation. The makeup of some illustrative subcomponents is described in greater detail below, with reference to an exemplary digital data processing apparatus, logic circuit, and signal bearing medium.

Exemplary Digital Data Processing Apparatus

[1028] As mentioned above, data processing entities such as the processor 106 may be implemented in various forms. One example is a digital data processing apparatus, as exemplified by the hardware components and interconnections of the digital data processing apparatus 200 of FIGURE 2.

[1029] The apparatus 200 includes a processor 202, such as a microprocessor, personal computer, workstation, controller, microcontroller, state machine, or other processing machine, coupled to digital data storage 204. In the present example, the storage 204 includes a fast-access storage 206, as well as nonvolatile storage 208. The fast-access storage 206 may comprise random access memory ("RAM"), and may be used to store the programming instructions executed by the processor 202. The nonvolatile storage 208 may comprise, for example, battery backup RAM, EEPROM, flash PROM, one or more magnetic data storage disks such as a "hard drive", a tape drive, or any other suitable storage device. The apparatus 200 also includes an input/output 210, such as a line, bus, cable, electromagnetic link, or other means for the processor 202 to exchange data with other hardware external to the apparatus 200.

[1030] Despite the specific foregoing description, ordinarily skilled artisans (having the benefit of this disclosure) will recognize that the apparatus discussed above may be implemented in a machine of different construction, without departing from the scope of the invention. As a specific example, one of the components 206, 208 may be eliminated; furthermore, the storage 204, 206, and/or 208 may be provided on-board the processor 202, or even provided externally to the apparatus 200.

Signal-Bearing Media

[1031] In contrast to the digital data processing apparatus described above, a different aspect of this disclosure concerns one or more signal-bearing media tangibly embodying a program of machine-readable instructions executable by such a digital processing apparatus. In one example, the machine-readable instructions are executable to carry out various functions related to this disclosure, such as the operations described in greater detail below. In another example, the instructions upon execution serve to install a software program upon a computer, where such software program is independently executable to perform other functions related to this disclosure, such as the operations described below.

[1032] In any case, the signal-bearing media may take various forms. In the context of FIGURE 2, such signal-bearing media may comprise, for example, the storage 204 or another signal-bearing medium, such as a magnetic data storage diskette 300 (FIGURE 3), directly or indirectly accessible by a processor 302. Whether contained in the storage 206, diskette 300, or elsewhere, the instructions may be stored on a variety of machine-readable data storage media. Some examples include direct access storage (e.g., a conventional "hard drive", redundant array of inexpensive disks ("RAID"), or another direct access storage device ("DASD")), serial-access storage such as magnetic or optical tape, electronic non-volatile memory (e.g., ROM, EPROM, flash PROM, or EEPROM), battery backup RAM, optical storage (e.g., CD-ROM, WORM, DVD, digital optical tape), or other suitable computer readable media.

Logic Circuitry

[1033] In contrast to the signal-bearing media and digital data processing apparatus discussed above, a different embodiment of this disclosure uses logic circuitry instead of computer-executed instructions to implement data processing entities such as the processor 106. FIGURE 4 shows an example of the logic circuitry in the form of an integrated circuit 400.

[1034] Depending upon the particular requirements of the application in the areas of speed, expense, tooling costs, and the like, this logic may be implemented by constructing an application-specific integrated circuit (ASIC) having thousands of tiny integrated transistors. Such an ASIC may be implemented with CMOS, TTL, VLSI, or another suitable construction. Other alternatives include a digital signal processing chip (DSP), discrete circuitry (such as resistors, capacitors, diodes, inductors, and transistors), field programmable gate array (FPGA), programmable logic array (PLA), programmable logic device (PLD), and the like.

OPERATION

[1035] Having described the structural features of the present disclosure, the operational aspect of the disclosure will now be described.

Overall Sequence of Operation

[1036] FIGURE 5 shows a sequence 500 to illustrate one example of the method aspect of this disclosure. For ease of explanation, but without any intended limitation, the example of FIGURE 5 is described in the context of the system 100 described above.

[1037] Broadly, this sequence serves to process user entry of a phrase of an ideographic writing system that uses characters to represent individual words or morphemes. The notion of "ideographic" characters herein is used without limitation, and includes Chinese pictograms, Chinese ideograms proper, Chinese indicatives, Chinese sound-shape compounds (phonologograms), Japanese characters (Kanji), Korean characters (Hanja), etc. Furthermore, the system 100 may be implemented to a particular standard, such as traditional Chinese characters, simplified Chinese characters, or another standard, or to a particular font, such as Chinese Song. And, although the term "ideographic" is used in these examples, this is used without any intended limitation and shall include a variety of many different logographic, ideographic, lexigraphic, morpho-syllabic, or other such writing systems that use characters to represent individual words, concepts, syllables, morphemes, etc.

User Input

[1038] In step 502, the system 100 receives user input. Via the data entry tool 104, and more particularly the handwriting digitizing surface, the user enters and the system 100 receives incomplete entry of an intended ideographic language multi-character phrase. The user enters the characters of the phrase by handwriting strokes upon the digitizing surface 104.

[1039] Optionally, in step 502 the user may enter supplementary phonetic information such as any phonetic symbols or tones describing the pronunciation of characters in the intended phrase. This step is optional, however, as the routine 500 may function using stroke input exclusively. On the other hand, the user may substitute phonetic input for stroke input as to some or all of the user's intended characters.

[1040] The user's entry is incomplete in one or both of the following respects. In one example, the entry includes less-than-all characters in the intended phrase. In a five character phrase, for instance, characters two and four might be missing completely. In a different example, the entry is incomplete because it includes less-than-all strokes of one or more characters in the intended phrase, rendering those characters incomplete. In a five character phrase, for instance, the user's entry may fail to include certain strokes of the second, third, and fourth characters.

[1041] Also occurring in step 502, the user designates one or more character delimiters, and the system 100 receives this information. Broadly, the delimiters serve to specify where one character ends and the next character begins. Accordingly, the operation of the user designating a delimiter to the system 100 may be carried out in many different ways. Delimiters may be explicit or implicit. Explicit delimiters require the user to take some action that is not otherwise required. Implicit delimiters assume character boundaries based on some characteristic of the mere act of entering the characters.

[1042] One example of an explicit delimiter is where the user handwrites some or all strokes of a desired character, and then presses a soft or hard key such as the "enter" key, a prescribed numeric key (such as "4"), a letter key, or another key. Another example of an explicit delimiter is where the user double taps the stylus on the handwriting digitizing surface 104 or performs another distinguishing gesture.

[1043] Another example of an explicit delimiter is described as follows. Here, the processor 106 causes the display 102 to sequentially show a sequence of boxes, each box for receiving a different character (in order) of the intended character sequence. After each character is entered, the box is blanked-out, changed in color, removed, or otherwise changed in appearance to signal receipt of a new character. Thus, each box is designed to receive user entry of a single character, or "passing" on that character to submit a blank. The order of presenting the boxes impliedly corresponds to the order of the characters in the phrase. In order to progress from one box to the next, the system 100 requires an explicit delimiter such as a user command, distinctive gesture, menu entry, checkbox entry, stylus tap, or other affirmative action of delimiter entry. A final delimiter, to indicate the end of the phrase, may be entered or not, depending upon the details of phrase matching as discussed in detail below.

[1044] In the case of implicit delimiters, the processor 106 may cause the display 102 to show graphics that automatically prompt, encourage, format, or regulate the user's entry of characters. Accordingly, one example of an implicit delimiter is where the processor 106 causes the display 102 to present a structure including a number of boxes. FIGURE 6 shows an example of a structure 602 including five boxes 604, 606, 608, 610, 612. By implication, each box corresponds to one character. Therefore, if the user enters character strokes in the box 604, then the processor 106 assumes that these strokes pertain to the first character in the phrase. If the user enters character strokes in the box 606, then the processor 106 assumes that these strokes pertain to the second character in the phrase.

[1045] In this example, the system 100 does not require any explicit delimiters to determine the end of the intended phrase. Namely, in subsequent steps where the system 100 interprets the user's input (discussed in greater detail below), these are performed to obviate the need for express delimiters. In one case, the system 100 assumes that the number of characters in the intended phrase is indicated by the right-most box with any user input. In the illustrated example, the system 100 assumes that a two-character phrase is intended, since the box 606 is the right-most box containing any user input. In another case, the system 100 considers boxes with user input as a basis for searching for phrases starting with those characters, ending with those characters, or containing those characters within the phrase. Furthermore, as discussed below, the system 100 may continually revise its search for phrases potentially matching the user's input; here, the user keeps entering strokes and phonetic data until the system 100 presents the desired phrases in a suggested output list. At any rate, no explicit delimiter is required.
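As a minimal sketch of the box-based implicit delimiter described above, and assuming (purely for illustration) that each box is represented as a list of entered strokes, the right-most box containing any input may be taken to fix the phrase length, with empty boxes to its left treated as blanks. The function name slots_from_boxes and the use of None to mark a blank are hypothetical.

```python
def slots_from_boxes(boxes):
    """Convert per-box stroke lists into character slots.

    boxes: one list of strokes per displayed box, in phrase order;
           an empty list means the user wrote nothing in that box.
    Returns a list whose length is set by the right-most box that
    contains input. Empty boxes to its left become None (a blank).
    """
    filled = [i for i, strokes in enumerate(boxes) if strokes]
    if not filled:
        return []  # nothing entered yet
    last = filled[-1]
    return [strokes if strokes else None for strokes in boxes[: last + 1]]
```

For instance, with five boxes where only the first and third contain strokes, the sketch yields three slots with a blank in the middle, which a later matching step could treat as matching any character.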

[1046] In a different embodiment of FIGURE 6, a final explicit delimiter is required for the user to designate which character or box is the last one in the phrase. This delimiter may be entered, for example, by utilizing a pull-down menu, performing a distinct gesture in or in association with the last box, operating a GUI to delete or mark-out the box after the last character, or any one of a thousand different affirmative acts that might be performed to enter a final delimiter.

[1047] In another example of FIGURE 6, the boxes in structure 602 are not displayed to the user, but the user enters the next character stroke a sufficient distance away from the preceding strokes to indicate that a new character is being started. The necessary distance may be predetermined or proportional to the size or area of the strokes being entered. Here, the distance between characters serves as an implicit delimiter. More subtly, the system 100 may detect that the user has moved the mouse cursor or stylus, e.g. as used with a Tablet PC, from the current character entry region to an adjacent character entry region to start the next character, or has moved it to a phrase selection list in preparation of step 820 as described below.
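The distance-based implicit delimiter described above might be sketched as follows, by way of example only. The sketch assumes left-to-right writing, represents each stroke solely by its horizontal extent, and starts a new character when the gap from the preceding strokes exceeds a fraction of their width; the function name split_by_gap and the parameters ratio and min_width are assumptions of this sketch, not values taken from the disclosure.

```python
def split_by_gap(strokes, ratio=0.5, min_width=5.0):
    """Group strokes into characters using horizontal gaps.

    strokes: (x_min, x_max) extents in the order written.
    ratio: gap threshold as a fraction of the current character's width
           (the "proportional to the size" variant).
    min_width: floor on the width, so that a single narrow stroke does
               not make every small gap look like a character boundary.
    """
    groups = []
    for x_min, x_max in strokes:
        if groups:
            current = groups[-1]
            left = min(s[0] for s in current)
            right = max(s[1] for s in current)
            width = max(right - left, min_width)
            if x_min - right > ratio * width:
                groups.append([(x_min, x_max)])  # far enough: new character
            else:
                current.append((x_min, x_max))   # same character continues
        else:
            groups.append([(x_min, x_max)])
    return groups
```

A fixed threshold (the "predetermined" variant) could be obtained by comparing the gap to a constant instead of to ratio times the width.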

[1048] As another example of an implicit delimiter, the system 100 may assume that user entry of tonal information for a character constitutes a delimiter for that character. As an additional example of an implicit delimiter, the system 100 may recognize that the user has paused for a period of time before starting a next character. The necessary period of time may be predetermined or proportional to the speed of entry of the strokes.

Determine Relative Positions

[1049] After user entry of the incomplete phrase and its delimiters (step 502), the processor 106 uses the delimiters to determine the relative positions of the received characters and any interstitial blanks in the intended phrase and the total number of characters in the intended phrase (step 504). The following example is given, where one prescribed explicit delimiter is a stylus tap, and all user inputs are made in a single region of a handwriting digitizing surface 104. Here, the system 100 receives a first group of character strokes, then tap-tap, then a second group of character strokes, then tap-tap-tap. In step 504, the processor 106 interprets the user entry as follows: (1) a first character, to which the first group of strokes corresponds, (2) a second character, which is blank, (3) a third character, to which the second group of strokes corresponds, (4) a fourth character, which is blank, and (5) a fifth character, which is blank. Therefore, in this example of step 504, the processor 106 determines that there are five characters in the user's intended phrase. In another variation of this example, the three trailing taps are omitted and the processor 106 later identifies (708) words and phrases that are at least three characters long.
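The tap-delimited example above can be sketched as a short routine, offered as an illustration rather than a definitive implementation. It assumes the input arrives as a flat list of events in which each stroke is any value other than a hypothetical TAP marker, and that each tap closes the current character position, so that a tap with no preceding strokes yields a blank. The names TAP and parse_entry are assumptions of this sketch.

```python
TAP = "TAP"  # hypothetical marker for the stylus-tap delimiter


def parse_entry(events):
    """Split a stream of strokes and taps into character slots (step 504).

    Each tap closes the current position. If no strokes were entered since
    the previous tap, that position is a blank (None). Strokes left over
    after the last tap form a final position with no closing delimiter.
    """
    slots, current = [], []
    for event in events:
        if event == TAP:
            slots.append(current if current else None)
            current = []
        else:
            current.append(event)
    if current:
        slots.append(current)
    return slots


# First stroke group, tap-tap, second stroke group, tap-tap-tap.
example = ["s1", "s2", TAP, TAP, "s3", TAP, TAP, TAP]
positions = parse_entry(example)
```

Under these assumptions the example yields five positions: strokes, blank, strokes, blank, blank. Omitting the three trailing taps yields three positions, consistent with the variation in which phrases of at least three characters are later identified.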

Find Matching Reference Phrases

[1050] In step 506, the processor 106 compares the incomplete phrase (as represented by the received characters, blanks, and total number of characters if known) to reference phrases in the phrase vocabulary 110 in order to identify any reference phrases that include the incomplete phrase.

[1051] The sequence 700 describes one exemplary sequence for carrying out step 506, as illustrated by FIGURE 7. Broadly, the sequence 700 performs a character-by-character comparison between the user-submitted incomplete phrase and each phrase in the stored vocabulary 110. More specifically, for the first character in the user's intended phrase, the processor 106 attempts to find a match between the user's input (e.g., stroke and/or phonetic input) and the maps 112-113 to identify which of the known characters 111 could possibly represent a completion of the user's actions. In other words, step 702 compares the user's entry for the first character to the maps 112-113 to find characters that could possibly represent the user's entry (or to exclude characters that could not). If the user entered all strokes corresponding to a particular character, then the best match output of step 702 would be that single character. The processor 106 repeats step 702 (as shown by 702a) for each character in the user-entered phrase. If the user skipped a character (indicating a "blank"), then all characters in the maps 112-113 might conceivably match the user's entry for that character.

[1052] In addition to evaluating the strokes and phonetic data, step 702 may further consider the stroke order for embodiments where such information is included in the map 112. For instance, step 702 may rule-out characters from 111 where the stroke order observed by the user is not consistent with any of the one or more stroke orders recorded in the map 112.
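One possible reading of step 702, including the optional stroke-order check, is sketched below without any intended limitation. The sketch assumes that a partial entry is consistent with a character when the entered strokes appear, in order, within one of that character's recorded stroke orders (or, with order ignored, when they form a sub-collection of its strokes). The stroke codes, the sample map, and the names is_subsequence and candidates are hypothetical.

```python
from collections import Counter

# Hypothetical excerpt of the stroke map 112, using made-up stroke codes:
# h = horizontal, v = vertical, l = left-falling, r = right-falling.
STROKE_MAP = {
    "十": [("h", "v")],
    "人": [("l", "r")],
    "大": [("h", "l", "r")],
    "木": [("h", "v", "l", "r")],
}


def is_subsequence(entered, full):
    """True if the entered strokes occur within full in the same relative order."""
    remaining = iter(full)
    return all(stroke in remaining for stroke in entered)


def candidates(entered, stroke_map=STROKE_MAP, use_order=True):
    """Return the characters that could complete a partial entry (step 702).

    entered: tuple of stroke codes, or None for a skipped (blank) character.
    use_order: when True, rule out characters whose recorded stroke orders
               are inconsistent with the order in which strokes were entered.
    """
    if entered is None:
        return set(stroke_map)  # a blank matches every known character
    matches = set()
    for character, orders in stroke_map.items():
        for order in orders:
            if use_order:
                ok = is_subsequence(entered, order)
            else:
                # Order ignored: every entered stroke must be available.
                ok = not (Counter(entered) - Counter(order))
            if ok:
                matches.add(character)
                break
    return matches
```

With the sample map, entering a horizontal stroke followed by a vertical stroke leaves two candidates, while entering them in the reverse order leaves none unless the stroke-order check is disabled.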

[1053] In step 704, the processor 106 records the matches resulting from step 702. In step 706, the processor 106 generates all potential character sequences that can be formed from the matches of step 704. The following example is given, where "A," "B" and "C" are arbitrary alphabetic representations of ideographic characters employed for the sake of convenient discussion. With a two character sequence, if the matches for the first character were "A" and "B", and the best match for the second character is only "C," then the possible character sequences (706) include AC and BC. If there are matching characters, and if an optional linguistic model is utilized to order characters 110 according to frequency of use (as discussed above), then step 706 may favor, order, rank, or limit the output selections according to an order dictated by frequency of use.

[1054] In step 708, the processor 106 takes a character sequence of step 706 and compares it to the phrase vocabulary 110 to see if there is a match. This process is repeated (708A) for every possible character sequence from step 706. As one example of step 708, comparison of the user's intended character sequence may be limited to phrases in the vocabulary 110 with the same number of characters. In this example, when the user specifies a five character sequence, the only outputs of step 708 are five character sequences from the vocabulary 110. In a different example, the processor 106 searches the vocabulary 110 for phrases that begin with the user's intended character sequence (but include a greater number of following characters), or phrases that end with the user's intended character sequence (but include a greater number of preceding characters), and/or phrases that contain unspecified preceding and following characters with the user's intended character sequence somewhere in the middle.
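Steps 706 and 708 as described above might be sketched as follows, using the same arbitrary letters "A," "B," and "C" in place of ideographic characters. This is a minimal sketch under stated assumptions: the function names, the mode parameter, and the sample vocabulary are hypothetical, and the "prefix," "suffix," and "contains" modes here also accept a phrase equal to the sequence.

```python
from itertools import product


def candidate_sequences(per_position):
    """Step 706: form every sequence from the per-position character matches."""
    return ["".join(chars) for chars in product(*per_position)]


def match_phrases(sequences, vocabulary, mode="exact"):
    """Step 708: compare each candidate sequence to the phrase vocabulary.

    mode: "exact"    - phrase has the same characters and length,
          "prefix"   - phrase begins with the sequence,
          "suffix"   - phrase ends with the sequence,
          "contains" - sequence appears anywhere in the phrase.
    """
    tests = {
        "exact": lambda phrase, seq: phrase == seq,
        "prefix": lambda phrase, seq: phrase.startswith(seq),
        "suffix": lambda phrase, seq: phrase.endswith(seq),
        "contains": lambda phrase, seq: seq in phrase,
    }
    test = tests[mode]
    return [p for p in vocabulary if any(test(p, s) for s in sequences)]


# Matches for the first character are A and B; the second is only C.
sequences = candidate_sequences([["A", "B"], ["C"]])
vocabulary = ["AC", "BCD", "XAC", "BD"]  # hypothetical phrase vocabulary
```

Because a blank position may match every known character, the number of generated sequences can grow quickly; an implementation might therefore prune sequences against the vocabulary as they are built rather than enumerating them all first.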

[1055] As an alternative to steps 702-708, these steps may be combined into a single operation of comparing the user input directly to a phrase vocabulary and associated map of phrases to constituent stroke and/or phonetic data. In this embodiment, the character vocabulary 111 is omitted, and the phrase vocabulary 110 is broken down into constituent strokes (112) and phonetic information (113).

[1056] In step 710, the processor 106 records, outputs, temporarily stores, recognizes, flags, or otherwise notes the matches resulting from step 708.
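The combined alternative to steps 702-708 noted above, in which user input is compared directly to phrases broken down into constituent strokes, might be sketched as follows. The sample phrases were chosen only because their characters have few strokes and are not drawn from the disclosure; the stroke codes, the ordered-subsequence test, the same-length restriction, and the names PHRASE_STROKES, slot_matches, and matching_phrases are all assumptions of this sketch.

```python
# Hypothetical phrase-to-stroke map: each phrase is stored directly as a
# list of per-character stroke tuples, with no separate character vocabulary.
PHRASE_STROKES = {
    "大人": [("h", "l", "r"), ("l", "r")],
    "木人": [("h", "v", "l", "r"), ("l", "r")],
}


def slot_matches(entered, full):
    """A blank (None) matches anything; otherwise the entered strokes must
    occur within the character's strokes in the same relative order."""
    if entered is None:
        return True
    remaining = iter(full)
    return all(stroke in remaining for stroke in entered)


def matching_phrases(slots, phrase_strokes=PHRASE_STROKES):
    """Return phrases whose characters are each consistent with the entry.

    slots: one entry per intended character, either a tuple of stroke codes
           or None for a skipped character. Only phrases with the same
           number of characters are considered in this sketch.
    """
    return [
        phrase
        for phrase, chars in phrase_strokes.items()
        if len(chars) == len(slots)
        and all(slot_matches(s, full) for s, full in zip(slots, chars))
    ]
```

Comparing slot by slot against each stored phrase avoids generating every candidate character sequence, at the cost of storing stroke data per phrase, which is consistent with the description's note that this arrangement suits applications with sufficiently large storage.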

Present Output

[1057] Referring again to FIGURE 5, the output from step 710 (FIGURE 7) provides the matching reference phrases referred to in step 506. Next, step 508 displays an output of the phrase(s) from step 506. If there is only one potentially intended phrase, this gives the user an opportunity to confirm that the user intended to enter this phrase. If there are multiple phrases, the system 100 affords the user an interactive opportunity in step 508 to select from this list, for example by using the data entry tool 104.

[1058] If there are multiple phrases, and if an optional linguistic model is utilized to order proposed phrases from 110 according to frequency of use (as discussed above), then step 508 may order the output selections according to such order. In either case (i.e., one or multiple phrases), the system 100 receives the user's confirmation or selection of a phrase via action such as tapping the stylus, pressing a button, or performing another conveniently abbreviated action. In one embodiment, one of the phrases is selected by default and certain user actions are interpreted as confirming the default phrase. For example, choosing a punctuation symbol, or beginning to write a new phrase starting in box 604 of FIGURE 6, confirms the default phrase.

[1059] In one embodiment, the steps 504, 506, 508 are repeated each time the user submits a new stroke, phonetic item, or delimiter, or other user input of step 502. Thus, the system 100 continually revises its search for phrases potentially matching the user's input. In this context, the user may continue entering strokes and/or phonetic data (502) until the system 100 presents (508) the intended phrase in the suggested output list.
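The frequency-based ordering and default selection described for step 508 might be sketched as follows. The frequency table, the function names, and the convention that `user_choice=None` models a confirming action (such as choosing a punctuation symbol) are all assumptions made for illustration.

```python
def order_by_frequency(phrases, frequency):
    """Order matched phrases most-frequent first.

    Phrases missing from the (hypothetical) frequency table count as 0;
    sorted() is stable, so ties keep their original order.
    """
    return sorted(phrases, key=lambda p: -frequency.get(p, 0))


def resolve(phrases, frequency, user_choice=None):
    """Return the phrase the user ends up with.

    user_choice=None models an action that confirms the default (top) phrase;
    an integer models an explicit selection from the displayed list.
    """
    ranked = order_by_frequency(phrases, frequency)
    if not ranked:
        return None
    return ranked[0] if user_choice is None else ranked[user_choice]


# Hypothetical matches from step 506 and hypothetical frequency-of-use counts.
matches = ["AC", "BC", "AD"]
frequency = {"AC": 5, "BC": 9}

ranked = order_by_frequency(matches, frequency)
default_phrase = resolve(matches, frequency)
second_phrase = resolve(matches, frequency, user_choice=1)
```

In an embodiment that repeats steps 504-508 after every input, a routine like `order_by_frequency` would simply be re-run on each revised match list.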

Specific Example

[1060] FIGURE 8 shows a more specific sequence 800 to illustrate an example of how the operations 500 may be carried out in practice. In step 802, the user enters (and the system 100 receives) an indication of which character is about to be entered, that is, the order of the character in the phrase to be entered. This character is referred to as the "current" character. In one example, the user's indication of the current character may be implied. For instance, the processor 106 may assume that the user's first actions pertain to the first character in the phrase to be entered. In another example, the user satisfies step 802 by using a pull-down menu or other GUI feature to indicate that the next character to be entered will be the first, or third, or another character.

[1061] The user-performed act of entering the current character occurs in step 804, and includes entry of character stroke inputs, phonetic information describing the pronunciation of the current character, or both. Stroke inputs are made by handwriting the strokes with a stylus on the handwriting digitizing tool 104. Phonetic inputs are made by handwriting alphabetic characters, phonetic symbols, ruby characters, tonal inputs, or other symbols indicating the desired phonetic input. Tonal inputs are made by handwriting the desired tonal input, using a pull-down menu or other GUI device.

[1062] As discussed in greater detail below, step 804 is repeated for each character in an intended phrase. The overall input as to the phrase may include stroke information only, phonetic information only, or a mix of the two. Further, the user may leave one or more characters blank. The use of phonetic information is optional, however, as the routine 800 may function using stroke input exclusively.

[1063] As mentioned above, user entries (804) may be made via sequentially presented screens, all-at-once screens (as in FIGURE 6), or another arrangement. Optionally, the all-at-once embodiment (FIGURE 6) may include one box to receive stroke and phonetic inputs, or separate boxes for different aspects of character entry.

[1064] As part of step 804 (or step 810, below), the processor 106 applies one or more suitable handwriting recognition routines to the user entries. For example, the processor 106 may calculate different probabilities that each entered stroke or phonetic input matches one of the recognized entries from the maps 112-113. Handwriting recognition applicable to ideographic language characters is discussed in greater detail in the following reference: U.S. Patent No. 6,970,599, the entirety of which is incorporated herein by reference.

[1065] In one example of step 804, the user handwrites a Chinese language character in one of the boxes 604, 606, 608, 610, 612 shown on the display 102. In addition to satisfying step 804, this act also impliedly satisfies the designation of step 802 since the entered character's order in the phrase is indicated by choice of box. The user's handwriting entry in this example is detected by the handwriting digitizing feature of the tool 104, which overlays the display 102.

[1066] In step 806, the user enters a delimiter after completing character entry (804). As described above in detail, delimiter entry may occur by virtually any means that is adequate to carry out the broad function of specifying where one character ends and the next character begins. Accordingly, the delimiter of step 806 may be implicit or explicit.

[1067] In one embodiment, after step 806 the routine 800 proceeds to step 810. As discussed below, however, step 810 may be skipped, in which case the sequence 800 resumes at step 812. In step 810 as illustrated, the processor 106 performs a matching operation to effectively rule out characters in the vocabulary 111 that are inconsistent with user input from steps 804, 806. The remaining characters in the vocabulary 111 might represent characters that the user has entered or was working toward entering in step 804. The details of step 810, as discussed above, generally involve matching the user's stroke entries from step 804 against the map 112. This is performed in order to identify known characters 111 that the user could have intended. If the user entered phonetic information in step 804, then step 810 considers this additional input in matching the user's overall entry to one or more of the characters 111 via the map 113. Step 810 may also consider stroke order if such information is implemented in the map 112, for instance. Step 810 considers single and multi-radical characters, for example, where a user has entered a stroke that is present in some single-radical characters and other multi-radical characters. Furthermore, if the current character is blank or missing, then all characters of the vocabulary 111 are potentially matching.
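As a rough illustration of the matching in step 810, the following Python sketch rules out characters whose stroke listing is inconsistent with a partial stroke entry. The stroke codes ("h", "v", "d"), the tiny `STROKE_MAP` standing in for map 112, and the `use_order` flag are hypothetical; a real system would work from handwriting recognition output rather than clean stroke codes.

```python
from collections import Counter

# Hypothetical stand-in for map 112: character -> constituent strokes in order.
STROKE_MAP = {
    "A": ["h", "v", "h"],
    "B": ["h", "v", "d"],
    "C": ["d", "d"],
}


def matching_characters(entered, stroke_map, use_order=True):
    """Return characters that could be a completion of the entered strokes.

    An empty entry models a blank character: every character still matches.
    With use_order=True the entered strokes must be a leading run of the
    character's strokes; otherwise they only need to be contained in it.
    """
    if not entered:
        return sorted(stroke_map)
    if use_order:
        n = len(entered)
        return sorted(c for c, s in stroke_map.items() if s[:n] == list(entered))
    need = Counter(entered)
    # Counter subtraction keeps only positive counts, so an empty result
    # means every entered stroke is available in the character.
    return sorted(c for c, s in stroke_map.items() if not (need - Counter(s)))


ordered = matching_characters(["h", "v"], STROKE_MAP)
blank = matching_characters([], STROKE_MAP)
unordered = matching_characters(["d"], STROKE_MAP, use_order=False)
```

The ordered and unordered variants correspond to the option, mentioned above, of considering stroke order only when the map carries that information.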

[1068] In one embodiment, after step 810 the routine 800 proceeds to steps 804 or 812, depending on whether a "final" delimiter is entered. Since the routine 800 is aimed at processing incomplete character sequences, it might not always be otherwise clear when the user has finished making his/her phrase entry. Therefore, the processor 106 may recognize one or more prescribed affirmative or implied user inputs to indicate when the user has finished entering his/her phrase. The final delimiter may comprise any suitable signal to achieve this purpose. In one example, delimiters are entered after each character, and the final delimiter comprises a further, distinctive user action that punctuates and thereby concludes the user's entire phrase entry.

[1069] As a specific example, when the system 100 receives characters entered one-after-another via the same box or other area of the display 102, with adjacent characters punctuated by prescribed user-entered tap sequences, the system 100 may recognize a different distinctive tap sequence to indicate the last character. In contrast, with the multi-box embodiment of FIGURE 6, the system 100 may recognize an "all-complete" signal indicated by user completion of a check box, pull-down menu, distinctive stylus tap, etc.

[1070] In the presently described embodiment, if the delimiter (806) was non-final, then step 810 returns (810a) to step 804 to receive user entry of another character. When the final delimiter is received (810b), step 810 progresses to step 812.

[1071] At step 812, the user's entry may be incomplete in one or both of the following respects. In one respect, the entry may include less-than-all characters in the intended phrase. For example, in a five character phrase, characters two and four might be missing. In a different respect, the entry might be incomplete because it includes less-than-all strokes of one or more incomplete characters in the intended phrase. For example, the user's entry may fail to include certain strokes of the second, third, and fourth characters of a five character phrase.

[1072] In step 812, the processor 106 uses the delimiters to determine the relative positions of the received characters and any interstitial blanks in the intended phrase and the total number of characters in the intended phrase. Examples of this operation are discussed above in the context of step 504. In step 814, the processor 106 generates all potential character sequences that can be formed using the matches of step 810, observing the relative positions, order, and character count of step 812. Examples of this operation are discussed in detail above in the context of step 706.
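One way to picture step 812 is as splitting a stream of input events at each delimiter. The sketch below assumes, purely for illustration, that strokes arrive as short codes, that "|" marks an ordinary delimiter, and that "#" marks the final delimiter; none of these encodings comes from the disclosure.

```python
def split_by_delimiters(events, delim="|", final="#"):
    """Split an event stream into per-character slots (sketch of step 812).

    Each delimiter closes the current slot; a slot with no strokes is a blank
    (skipped) character. Input after the final delimiter is ignored, and a
    trailing slot with no closing delimiter is dropped.
    """
    slots, current = [], []
    for event in events:
        if event in (delim, final):
            slots.append(current)
            current = []
            if event == final:
                break
        else:
            current.append(event)
    return slots


# Character one has two strokes, character two is skipped, character three
# has one stroke, and "#" ends the phrase.
events = ["h", "v", "|", "|", "d", "#"]

slots = split_by_delimiters(events)
total_characters = len(slots)
blank_positions = [i for i, strokes in enumerate(slots) if not strokes]
```

The slot list yields the three pieces of information named above: relative positions, interstitial blanks, and the total character count.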

[1073] In step 816, which follows step 814, the processor 106 compares each possible character sequence (from 814), as represented by the received characters, blanks, and total number of characters, to the reference phrases in the phrase vocabulary 110 in order to identify any reference phrases that include the possible character sequences. In other words, step 816 evaluates each possible character sequence from step 814 against the phrase vocabulary 110 in order to rule out vocabulary 110 entries that could not possibly represent the generated character sequences from 814. Examples of this operation (816) are given above in the context of steps 506 and 700.
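The ruling-out performed in step 816 can be approximated without enumerating every sequence, by checking each vocabulary phrase position by position. This is a swapped-in shortcut that gives the same result as generating all sequences for the same-length case; it is not presented in the disclosure. The function name, the use of `None` for a blank, and the sample data are assumptions.

```python
def phrases_consistent(candidates_per_slot, vocabulary):
    """Keep phrases whose every position agrees with the user's entry.

    candidates_per_slot holds, for each position, the set of characters that
    survived character matching, or None where the user left a blank.
    Only phrases with the same character count are considered.
    """
    count = len(candidates_per_slot)
    result = []
    for phrase in vocabulary:
        if len(phrase) != count:
            continue
        if all(
            allowed is None or ch in allowed
            for ch, allowed in zip(phrase, candidates_per_slot)
        ):
            result.append(phrase)
    return result


# Position one could be "A" or "B", position two was skipped, position three
# can only be "C". The vocabulary is a hypothetical stand-in for 110.
candidates = [{"A", "B"}, None, {"C"}]
vocabulary = ["AXC", "BYC", "AXD", "AC", "CXC"]

surviving = phrases_consistent(candidates, vocabulary)
```

Because a blank matches every character, enumerating sequences explicitly would multiply the candidate count by the vocabulary size for each blank; the positional check avoids that growth.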

[1074] Next, in step 818 the display 102 presents an output of all resultant reference phrases from step 816 for the user to review and identify his/her intended phrase. This gives the user an opportunity (step 820) to view and confirm the proposed phrase(s) that the system 100 outputs. In addition, if step 816 resulted in multiple potentially intended phrases, step 820 enables the user to select the one that the user actually meant. The user may identify (step 820) the desired phrase by tapping the stylus, pressing a button, or performing another conveniently abbreviated user action.

[1075] Optionally, the routine 800 may facilitate user editing of the phrase as entered so-far, for example in step 820 or an earlier step. In this operation, the processor 106 makes a sequential, all-at-once, or other presentation of the incomplete character phrase as entered, and thereafter facilitates user additions, changes, and/or deletions of one or more entered characters and/or their subcomponents. Along this line, a different embodiment of the sequence 800 may be responsive to user input to generate a valid longer phrase by matching single characters and two or three character "words" and stringing them together, even if that phrase is not present in the phrase vocabulary 110.

[1076] In a different embodiment, the sequence 800 may be implemented without requiring any final delimiter 810b. In this embodiment, users progress from step 802 through 818 and optionally go back (818a) to 802 until the intended phrase is displayed at 818 and selected by the user. In this embodiment, step 810 may be incorporated into step 814 (or skipped entirely if the storage 108 includes phrase 110 to stroke/phonetics 112/113 mapping).

ENHANCED USE OF RADICAL INFORMATION

[1077] Optionally, the foregoing system may be implemented with enhanced utilization of ideographic language character subcomponents such as semantic character subcomponents (called "radicals") and phonetic character components (not to be confused with written symbols separate from the character, and representing pronunciation of the given character according to a recognized system such as Bopomofo, Pinyin, etc.). This takes advantage of certain natural user tendencies, for example, the tendency to input a complete subcomponent even when entering less-than-all strokes.

[1078] In the exemplary system, the storage 108 is configured such that the character vocabulary 111 is further broken down into character subcomponents (e.g., radicals). In this example, the stroke map 112 includes a cross-reference between each character subcomponent and a listing of its constituent strokes. Similarly, the map 113 includes a cross-reference between each character subcomponent and a phonetic representation, that is, phonetic symbols, ruby characters, or other written symbols representing pronunciation of the given character according to a recognized system such as Bopomofo, Pinyin, etc.

[1079] Optionally, reference character subcomponents may be ranked according to, for example, frequency of general appearance among ideographic characters in the language, frequency of use of each character containing each radical, score from a handwriting recognition engine, frequency of use of that character subcomponent by the user, etc.

[1080] This context provides one example of a system 100 that supports greater utilization of ideographic language character subcomponents. In one exemplary usage, the operator uses explicit or implicit delimiters similar to step 502. Delimiters may also be used to indicate character subcomponents. The boxes 604-612, for example, may be used to receive character subcomponents instead of, or in addition to, characters; the boxes 604-612 may be further subdivided with guide lines or other cues for entering a subcomponent. In addition, in step 502 a small number of subcomponents that match the user input for a character thus far may be displayed, such as in a pop-up list, allowing the user to select a particular interpretation of the subcomponent entry, with that selection action serving as an explicit delimiter. Then, step 504 determines the relative positions of character subcomponents and blanks, and total character subcomponent count. The use of character subcomponents may help expedite step 506, namely the character match (step 702), by acting early to rule out inapplicable characters that do not include the radical that the user entered. In another embodiment, step 506 gives greater priority to matched phrases in which the character at a position that was partially entered contains one of the possible subcomponents matched by that partial entry.
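The two uses of subcomponents just described, early ruling-out of characters and prioritizing of matched phrases, might look like the following in outline. The subcomponent labels ("r1", "r2", "r3"), the `SUBCOMPONENTS` table, and both function names are hypothetical.

```python
# Hypothetical breakdown of the character vocabulary into subcomponents.
SUBCOMPONENTS = {
    "A": {"r1", "r2"},
    "B": {"r1"},
    "C": {"r3"},
}


def filter_by_subcomponent(characters, entered, submap):
    """Rule out characters that do not contain the entered subcomponent."""
    return [c for c in characters if entered in submap.get(c, set())]


def prioritize(phrases, position, possible_subs, submap):
    """Move phrases whose character at `position` contains any of the
    possible subcomponents to the front; sorted() is stable, so the
    remaining order is preserved. Assumes every phrase is long enough
    to have a character at `position`.
    """
    def key(phrase):
        return 0 if submap.get(phrase[position], set()) & possible_subs else 1
    return sorted(phrases, key=key)


kept = filter_by_subcomponent(["A", "B", "C"], "r1", SUBCOMPONENTS)
reordered = prioritize(["CA", "AB", "BA"], 0, {"r1"}, SUBCOMPONENTS)
```

The first function corresponds to the hard filter in the character match; the second corresponds to the softer embodiment that only raises the priority of phrases consistent with the partial entry.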

[1081] In one example, step 818 may present the user the recognized phrases with character subcomponents indicated, providing the user with an opportunity to select one of the phrases or edit the phrase as recognized so far by character, character subcomponent, etc.

OTHER EMBODIMENTS

[1082] While the foregoing disclosure shows a number of illustrative embodiments, it will be apparent to those skilled in the art that various changes and modifications can be made herein without departing from the scope of the invention as defined by the appended claims. Accordingly, the disclosed embodiments are representative of the subject matter which is broadly contemplated by the present invention, and the scope of the present invention fully encompasses other embodiments which may become obvious to those skilled in the art, and that the scope of the present invention is accordingly to be limited by nothing other than the appended claims.

[1083] All structural and functional equivalents to the elements of the above-described embodiments that are known or later come to be known to those of ordinary skill in the art are expressly incorporated herein by reference and are intended to be encompassed by the present claims. Moreover, it is not necessary for a device or method to address each and every problem sought to be solved by the present invention, for it to be encompassed by the present claims. Furthermore, no element, component, or method step in the present disclosure is intended to be dedicated to the public regardless of whether the element, component, or method step is explicitly recited in the claims. No claim element herein is to be construed under the provisions of 35 U.S.C. 112, sixth paragraph, unless the element is expressly recited using the phrase "means for" or, in the case of a method claim, the phrase "step for."

[1084] Furthermore, although elements of the invention may be described or claimed in the singular, reference to an element in the singular is not intended to mean "one and only one" unless explicitly so stated, but shall mean "one or more". Additionally, ordinarily skilled artisans will recognize that operational sequences must be set forth in some specific order for the purpose of explanation and claiming, but the present invention contemplates various changes beyond such specific order.

[1085] In addition, those of ordinary skill in the relevant art will understand that information and signals may be represented using a variety of different technologies and techniques. For example, any data, instructions, commands, information, signals, bits, symbols, and chips referenced herein may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, other items, or a combination of the foregoing.

[1086] Moreover, ordinarily skilled artisans will appreciate that any illustrative logical blocks, modules, circuits, and process steps described herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

[1087] The various illustrative logical blocks, modules, and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

[1088] The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor.

[1089] The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

WHAT IS CLAIMED IS:

1. Circuitry of multiple interconnected electrically conductive elements configured to perform operations to process user entry of an ideographic language phrase, the operations comprising: receiving user input, comprising operations of: via a user input tool, receiving incomplete entry of an intended ideographic language multi-character phrase, the incomplete entry being incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase; receiving user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase; comparing the incompletely entered phrase as represented by the received user input and any skipped characters to a prescribed vocabulary of reference phrases, and identifying any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters; presenting human-readable output of the identified reference phrases to the user.

2. The circuitry of claim 1, the incomplete entry being further incomplete in the following respect: the incomplete entry includes user input representing some but less-than-all characters of the intended phrase.

3. The circuitry of claim 1, where said less-than-all strokes comprise at least one of: zero strokes of an intended character; some but not all strokes of an intended character.

4. The circuitry of claim 1, where the operation of receiving user input comprises receiving phonetic information describing one or more intended characters, the user input being free of any input as to character stroke.

5. The circuitry of claim 1, where the operation of receiving incomplete entry of an intended ideographic language multi-character phrase comprises: receiving user input representing phonetic information describing one or more intended characters.

6. The circuitry of claim 5, where: the comparing operation further comprises, for each intended character of the intended phrase, examining any user-entered phonetic information against a mapping between reference characters and their constituent phonetic information to rule out reference characters whose phonetic information is inconsistent with user entry as to that reference character.

7. The circuitry of claim 1, where the operation of receiving user input comprises receiving character stroke entries describing one or more intended characters.

8. The circuitry of claim 1, where the user input is received as a sequence of user submissions; where the comparing operation is repeated for each submission; where the presenting operation is repeated each time that repetition of the comparing operation revises the identified reference phrases.

9. The circuitry of claim 1, where the operation of receiving user designation of one or more delimiters includes receiving a delimiter indicating total number of characters in the intended phrase.

10. The circuitry of claim 1, where: the operation of comparing the incompletely entered phrase further comprises comparing an order in which character strokes have been entered by the user to a prescribed stroke order for characters in the reference phrases.

11. The circuitry of claim 1, where the operations further comprise: ordering the identified reference phrases according to prescribed frequency of use criteria before presenting the human-readable output of the identified reference phrases.

12. The circuitry of claim 1, where the comparing operation comprises: for each intended character of the intended phrase, examining any user-entered strokes against a mapping between reference characters and their constituent strokes to identify reference characters that possibly represent a completion of the user-entered strokes; preparing a list of potential phrases by generating different combinations of identified reference characters, where each combination includes one identified reference character for each intended character of the intended phrase in a same order as occurrence of the intended characters in the intended phrase; comparing each potential phrase in the list to the prescribed vocabulary of reference phrases, and identifying any reference phrases that include therein the potential phrase.

13. The circuitry of claim 12, where the examining operation further comprises examining an order of user-entered strokes against a mapping between reference characters and order of strokes to rule out reference characters that could not represent a completion of user-entered strokes.

14. The circuitry of claim 1, the operations further comprising: facilitating user selection of the intended phrase from the output.

15. The circuitry of claim 1, where the operations further include an operation of categorizing strokes of the incomplete entry of intended ideographic language multi-character phrase according to predetermined stroke categories; where the comparing operation further includes comparing the stroke categories of the incomplete entry of intended ideographic language multi-character phrase to stroke categories of the prescribed vocabulary of reference phrases.

16. The circuitry of claim 1, where the operation of receiving user input includes, via a user input tool receiving entry of character subcomponents of distinct phonetic or semantic meaning; where the comparing operation further comprises, for each intended character of the intended phrase, ruling out reference characters inconsistent with any user-entered character subcomponents of the intended character.

17. Circuitry of multiple interconnected electrically conductive elements configured to perform operations to process user entry of an ideographic language phrase, the operations comprising: receiving digitized user handwriting entry of a partial multi-character ideographic language phrase that forms an intended multi-character ideographic language phrase except for at least one of: (1) one or more strokes of one or more characters missing, (2) one or more characters missing completely; receiving user designation of delimiters specifying relative positions of characters of the intended multi-character ideographic language phrase; utilizing the delimiters to determine relative positions of the received characters and any missing characters in the partial phrase; comparing the partial phrase including the relative positions to multi-character reference entries in a predetermined phrase list to identify any reference entries for which the partial phrase forms a constituent part with same relative character positions; presenting a human-readable output of any identified reference entries to the user.

18. At least one computer readable storage medium tangibly embodying a program of machine-readable instructions executable by a digital processing apparatus to perform operations to process user entry of an ideographic language phrase, the operations comprising: receiving user input, comprising operations of: via a user input tool, receiving incomplete entry of an intended ideographic language multi-character phrase, the incomplete entry being incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase; receiving user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase; comparing the incompletely entered phrase as represented by the received user input and any skipped characters to a prescribed vocabulary of reference phrases, and identifying any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters; presenting human-readable output of the identified reference phrases to the user.

19. A method of processing user entry of an ideographic language phrase, the operations comprising: receiving user input, comprising operations of: via a user input tool, receiving incomplete entry of an intended ideographic language multi-character phrase, the incomplete entry being incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase; receiving user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase; comparing the incompletely entered phrase as represented by the received user input and any skipped characters to a prescribed vocabulary of reference phrases, and identifying any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters; presenting human-readable output of the identified reference phrases to the user.

20. A computer-driven system for the abbreviated user input of ideographic language phrases by submitting less-than-all characters, less-than-all parts of any given characters, or both, the system comprising: a display; a user input tool; a phrase vocabulary containing a prescribed vocabulary of reference phrases made up of ideographic characters, and an identification of constituent strokes of at least one of: phrases in the phrase vocabulary, characters of phrases in the phrase vocabulary; a processor programmed to perform operations to process user entry of an ideographic language phrase, the operations comprising: receiving user input, comprising operations of: via the user input tool, receiving incomplete entry of an intended ideographic language multi-character phrase, the incomplete entry being incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase; via the user input tool, receiving user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase; comparing the incompletely entered phrase as represented by the received user input and any skipped characters to phrase vocabulary, and identifying any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters; operating the display to present human-readable output of the identified reference phrases to the user.

21. A computer-driven system for the abbreviated user input of ideographic language phrases by submitting less-than-all characters, less-than-all parts of any given characters, or both, the system comprising: display means for presenting human-readable information; user input means for receiving user input; phrase vocabulary means for containing a prescribed vocabulary of reference phrases made up of ideographic characters, and an identification of constituent strokes of at least one of: phrases in the phrase vocabulary, characters of phrases in the phrase vocabulary; processing means for performing operations to process user entry of an ideographic language phrase, the operations comprising: receiving user input, comprising operations of: via the user input means, receiving incomplete entry of an intended ideographic language multi-character phrase, the incomplete entry being incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase; via the user input means, receiving user designation of one or more delimiters indicating an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase; comparing the incompletely entered phrase as represented by the received user input and any skipped characters to phrase vocabulary means, and identifying any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters; operating the display means to present human-readable output of the identified reference phrases to the user.

22. Circuitry of multiple interconnected electrically conductive elements configured to perform operations to process user entry of an ideographic language phrase, the operations comprising: receiving user input, comprising operations of: via a user input tool, receiving incomplete entry of an intended ideographic language multi-character phrase, the incomplete entry being incomplete in at least the following respect: the incomplete entry includes less-than-all strokes of one or more characters of the intended phrase; receiving user designation of one or more delimiters indicating at least one of the following: an order of intended characters in the intended phrase and any skipped characters in the incompletely entered phrase; an order, within the intended phrase, of intended character subcomponents of distinct phonetic or semantic meaning, and any skipped character subcomponents in the incompletely entered phrase; comparing the incompletely entered phrase as represented by the received user input and any skipped characters or character subcomponents to a prescribed vocabulary of reference phrases, and identifying any reference phrases that include therein the incompletely entered phrase in the indicated order of intended characters or character subcomponents; presenting human-readable output of the identified reference phrases to the user.