KR20060046392A - Handwritten input for Asian languages

Info

Publication number: KR20060046392A
Application number: KR1020050049347A
Authority: KR
Inventors: Dong Li; Dong-Hui Zhang; Yong Zhang
Original assignee: Microsoft Corporation
Priority date: 2004-06-10
Filing date: 2005-06-09
Publication date: 2006-05-17
Also published as: KR101159323B1; SG118351A1; JP2006040263A

Abstract

Systems and processes are described that help users enter information in Asian languages. In certain aspects, input for Simplified Chinese and other languages is described in connection with handwritten input.

Handwritten input, Asian languages, Input Method Editor (IME), characters, Pinyin, phonetic

Description

HANDWRITTEN INPUT FOR ASIAN LANGUAGES

Various aspects of the invention are illustrated in the accompanying drawings.

FIGS. 1 and 2 illustrate general-purpose computing environments that support one or more aspects of the invention.

FIGS. 3 and 4 illustrate various hardware user interface devices that may be used with aspects of the present invention.

FIGS. 5 through 8 illustrate various user interfaces in accordance with aspects of the present invention.

FIG. 9 illustrates a user interface for entering handwritten information in accordance with embodiments of the present invention.

FIGS. 10 and 11 show examples of handwritten information.

FIGS. 12 and 13 are flowcharts in accordance with aspects of the present invention.

<Explanation of reference numerals for the main parts of the drawings>

301: Digitizer

302: Keyboard

303: Pinyin input recognizer

304: Operating system and/or application

This application claims priority to Chinese Application No. (003797.01015), entitled "Handwritten Input For Asian Languages," filed on June 10, 2004 by Dong Li, Dong-Hui Zhang and Yong Zhang. The contents of that application are expressly incorporated herein by reference.

Aspects of the present invention relate to hardware and software products. In particular, aspects of the present invention relate to providing users with an improved process for entering information in Asian languages.

Computing systems exist in a number of languages. These languages include character-based representations and symbol-based representations of words. Although the Western 104-key keyboard is widely used around the world, users of symbol-based languages have needed a way to enter symbols using the limited input that keyboards provide. One way to enter symbol-based languages is to use a language-specific input method editor (IME by Microsoft Corporation).

Asian textual input is one of the most challenging computing problems that exist today. It has been a bottleneck for Asian-language computing. The Asian-language character set continues to grow with each revision of the Unicode standard. For example, Unicode 2.0 defines 20,902 CJK (Chinese, Japanese, Korean) characters. Unicode 3.0 includes 27,484 characters. Extension B adds a further 40,771 characters.

IMEs provide a conversion mechanism for converting English letters into Asian characters. In general, the encoding of Asian characters is based on their phonetics. This may involve letters or a combination of letters and numbers. It is often necessary to convert English punctuation into the punctuation of the Asian language. In addition, English text may be mixed with Asian text (and/or with symbols, phonetic letters/characters, and Asian ideographs (Chinese characters)), so the ability to switch quickly and easily between encoding methods is needed.

A number of problems are associated with conventional approaches.

a. Although handwriting input is more natural than keyboard input, keyboard input is the main input mechanism for Asian languages.

b. Handwriting input is generally fast for Chinese characters, but keyboard typing of Pinyin letters is slow.

c. Typical handwriting recognition input requires users to write Chinese characters (East Asian ideographs). Writing Chinese characters is complex because they consist of many strokes. In addition, current Chinese handwriting recognition input methods require users to write with discrete strokes (not cursive) in order to obtain a higher recognition rate (accuracy). Taken together, the complexity, the non-cursive writing, and the lower accuracy (based on the error correction rate) reduce handwriting recognition input speed.

An improved system is needed that allows users to enter text quickly and easily in Asian languages.

Aspects of the present invention address one or more of the problems described above, thereby providing a solution for text entry in Asian languages. Aspects of the present invention include the ability to enter information using a stylus.

These and other aspects are addressed in relation to the drawings and the associated description.

Aspects of the invention relate to providing the ability to enter text in Asian languages.

The following description is divided into sections to assist the reader. The section headings are: Characteristics of Ink; Terms; General-Purpose Computing Environment; Hardware Inputs; User Interfaces; and Handwritten Input User Interfaces.

Characteristics of Ink

As users of ink pens know, physical ink (the kind laid down on paper using a pen with an ink reservoir) can convey more information than a series of coordinates connected by line segments. For example, physical ink can reflect pen pressure (by the density of the ink), pen angle (by the shape of the line or curve segments and the behavior of the ink around discrete points), and the speed of the nib of the pen (by the straightness, the line width, and changes in line width over the course of a line or curve). Because of these additional properties, emotion, personality, emphasis and so forth can be conveyed far more immediately than with a uniform line width between points.

Electronic ink (or ink) relates to the capture and display of electronic information captured when a user uses a stylus-based input device. Electronic ink refers to a sequence of strokes, where each stroke consists of a sequence of points. The points may be represented using a variety of known techniques, including Cartesian coordinates (X, Y), polar coordinates (r, Θ), and other techniques known in the art. Electronic ink may include representations of properties of real ink, including pressure, angle, speed, color, stylus size, and ink opacity. Electronic ink may further include other properties, among them the order in which the ink was deposited on a page (a raster pattern of left to right, then down, for most Western languages), a timestamp (indicating when the ink was deposited), an indication of the author of the ink, and the originating device (at least one of an identification of the machine on which the ink was drawn or an identification of the pen used to deposit the ink).

Terms

Ink: A sequence or set of strokes with properties. A sequence of strokes may include strokes in an ordered form. The sequence may be ordered by the time captured, by where the strokes appear on a page, or, in collaborative situations, by the author of the ink. Other orders are possible. A set of strokes may include sequences of strokes, unordered strokes, or any combination thereof. Further, some properties may be unique to each stroke or to a point in the stroke (for example, pressure, speed, angle, and the like). These properties may be stored at the stroke or point level rather than at the ink level.

Ink object: A data structure that stores ink, with or without properties.

Stroke: A sequence or set of captured points. For example, when rendered, the sequence of points may be connected with lines. Alternatively, the stroke may be represented as a point and a vector in the direction of the next point. In short, a stroke is intended to encompass any representation of points or segments relating to ink, irrespective of the underlying representation of the points and/or of what connects them.

Point: Information defining a location in space. For example, points may be defined relative to a capture space (for example, points on a digitizer), a virtual ink space (the coordinates in a space into which captured ink is placed), and/or a display space (the points or pixels of a display device).

Document: Any electronic file that has a viewable representation and content. A document may include a web page, a word processing document, a note page or pad, a spreadsheet, a visual presentation, a database record, an image file, and combinations thereof.
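The ink, stroke and point definitions above can be sketched as a small data model. This is only an illustration under assumptions: the class names, field names (`pressure`, `timestamp_ms`, `author`, `color`) and the choice of Cartesian coordinates are hypothetical and are not prescribed by the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Point:
    """A location in some space (capture, virtual ink, or display space)."""
    x: float
    y: float
    # Optional per-point properties; the field names are illustrative.
    pressure: Optional[float] = None
    timestamp_ms: Optional[int] = None


@dataclass
class Stroke:
    """An ordered sequence of captured points."""
    points: List[Point] = field(default_factory=list)

    def segments(self) -> List[Tuple[Point, Point]]:
        """Consecutive point pairs, i.e. the lines drawn when rendering."""
        return list(zip(self.points, self.points[1:]))

    def start_time(self) -> int:
        """Timestamp of the first point, or 0 if none was recorded."""
        if self.points and self.points[0].timestamp_ms is not None:
            return self.points[0].timestamp_ms
        return 0


@dataclass
class Ink:
    """A collection of strokes plus ink-level properties."""
    strokes: List[Stroke] = field(default_factory=list)
    author: Optional[str] = None
    color: str = "black"

    def ordered_by_time(self) -> List[Stroke]:
        """Strokes sorted by capture time, one of the orderings described."""
        return sorted(self.strokes, key=Stroke.start_time)


# Two short strokes, stored out of capture order.
late = Stroke([Point(0, 0, pressure=0.4, timestamp_ms=50),
               Point(1, 1, timestamp_ms=60)])
early = Stroke([Point(2, 0, timestamp_ms=10),
                Point(2, 1, timestamp_ms=20),
                Point(2, 2, timestamp_ms=30)])
ink = Ink(strokes=[late, early], author="user-1")
```

Keeping pressure and time on `Point` while author and color sit on `Ink` mirrors the statement that some properties belong at the stroke or point level rather than at the ink level.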

General-Purpose Computing Environment

FIGS. 1 and 2 illustrate examples of suitable operating environments 100 and 201 in which the invention may be implemented. The operating environments 100 and 201 are only a few examples of suitable operating environments and are not intended to suggest any limitation as to the scope of use or functionality of the invention. Other well-known computing systems, environments, and/or configurations that may be suitable for use with the invention include, but are not limited to, personal computers, server computers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.

Aspects of the invention may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, algorithms, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.

Computing device systems 100 and 201 typically include at least some form of computer-readable media. Computer-readable media can be any available media that can be accessed by server 103 or system 201. By way of example, and not limitation, computer-readable media may comprise computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by server 103 or system 201. Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. The term "modulated data signal" means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, RF, infrared and other wireless media. Combinations of any of the above should also be included within the scope of computer-readable media.

With reference to FIG. 2, an exemplary system for implementing aspects of the invention includes a computing device, such as device 201. In its most basic configuration, device 201 typically includes a processing unit 204 and memory 203. Depending on the exact configuration and type of computing device, memory 203 may be volatile (such as RAM), nonvolatile (such as ROM, flash memory, and the like), or some combination of the two. Device 201 may also have mass storage devices (removable and/or non-removable) 205, 206, such as magnetic or optical disks or tape. Similarly, device 201 may have input devices 208 such as a mouse, stylus, keyboard, trackball, and the like, and/or output devices 207 such as a display. Other aspects of device 201 may include network connections 209 to other devices, computers, networks, servers, and the like, using wired or wireless media 210. All of these devices are well known in the art and need not be discussed at length here.

Hardware Inputs

Various inputs may exist for entering handwritten information into a system according to aspects of the present invention.

FIG. 3 shows digitizer 301 receiving handwritten input and sending it to input recognizer 303, which in turn sends the recognized input to operating system and/or application 304. The system may also include a keyboard 302 that receives user input, which is likewise sent to the input recognizer. Here, input recognizer 303 may be an IME alone and/or an IME with additional capabilities. For example, input recognizer 303 may include a handwriting recognition engine that recognizes handwritten characters. When the number of characters to be recognized is limited, recognition accuracy increases. Here, for example, if Pinyin is used, only 408 characters/combinations need to be recognized. The input may be English only, Simplified Chinese with English and Chinese characters, or Simplified Chinese with Chinese characters.

FIG. 4 shows a variation of FIG. 3. In FIG. 4, various types of digitizers (including active digitizer 301A and passive digitizer 301B) may be used. In addition, aspects of the present invention may use keyboards 302A having any number of keys N. Handwritten input may be recognized by handwriting input recognizer 401. The output of the handwriting recognizer is then recognized by IME recognizer 402. Output from keyboard 302A may also be recognized by IME recognizer 402.

The system may be used with or without a hardware keyboard. For instance, Pinyin handwriting input may be used with or without a keyboard. For example, a stylus or other pointing device may be used to draw letters or write words that can be recognized by a handwriting recognizer. Electronic ink having various strokes may serve as the input to the recognizer. The handwriting recognizer may then be coupled to an IME recognizer, which recognizes the output of the handwriting recognizer.

Handwriting recognizer 401 may be separate from, or coupled to, certain aspects of IME recognizer 402. For instance, handwriting recognizer 401 may recognize strokes or other input based on predefined recognition information. Alternatively, handwriting recognizer 401 may use part of the kernel conversion engine of IME recognizer 402.
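The two input paths of FIG. 4 can be sketched as function composition: ink passes through the handwriting recognizer and then the IME recognizer, while keyboard input goes to the IME recognizer directly. This is a toy sketch under assumptions. The lookup tables stand in for real recognition engines, and the function names, ink placeholders and candidate lists are hypothetical.

```python
from typing import Dict, List

# Toy lookup tables standing in for real recognizers (illustrative data only).
INK_TO_LETTERS: Dict[str, str] = {"<ink:hua>": "hua", "<ink:mao>": "mao"}
PINYIN_TO_HANZI: Dict[str, List[str]] = {
    "hua": ["花", "话", "画"],
    "mao": ["猫", "毛", "帽"],
}


def handwriting_recognizer(ink: str) -> str:
    """Stage 401: turn handwritten ink into phonetic letters."""
    return INK_TO_LETTERS.get(ink, "")


def ime_recognizer(letters: str) -> List[str]:
    """Stage 402: turn phonetic letters into candidate characters."""
    return PINYIN_TO_HANZI.get(letters, [])


def from_digitizer(ink: str) -> List[str]:
    """Digitizer path: ink -> handwriting recognizer -> IME recognizer."""
    return ime_recognizer(handwriting_recognizer(ink))


def from_keyboard(keys: str) -> List[str]:
    """Keyboard path: typed letters go straight to the IME recognizer."""
    return ime_recognizer(keys)
```

Because both paths end in the same `ime_recognizer`, handwritten and typed Pinyin yield the same candidate list, which is the point of reusing the keyboard IME's conversion engine.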

User Interfaces

Various user interfaces may be used with the combination of special keys and an IME. FIGS. 5 through 9 show various user interfaces for use with a Pinyin IME. They may be used with various keyboards.

FIG. 5 shows various regions that display information to help the user compose characters. A composition window is shown as region 1101. Composition window 1101 includes characters that have already been composed 1102 and characters that are being composed 1103. FIG. 5 also includes a candidate window 1104 showing candidates that match the phonetic sound of the characters at 1103. The user then selects the appropriate candidate, and the selected candidate replaces 1103 and is added to the composed characters 1102. Finally, FIG. 5 shows a status bar 1105.
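The interaction between the composition window and the candidate window can be sketched as a small state object. This is a minimal sketch under assumptions: the class name, method names and the lookup table are hypothetical, and the comments only map fields to the reference numerals of FIG. 5.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CompositionWindow:
    composed: str = ""                                   # already composed (1102)
    composing: str = ""                                  # being composed (1103)
    candidates: List[str] = field(default_factory=list)  # candidate window (1104)

    def enter_phonetic(self, text: str, lookup: Dict[str, List[str]]) -> None:
        """Show the phonetic text and the candidates that match its sound."""
        self.composing = text
        self.candidates = list(lookup.get(text, []))

    def select(self, index: int) -> None:
        """The chosen candidate replaces 1103 and is appended to 1102."""
        self.composed += self.candidates[index]
        self.composing = ""
        self.candidates = []


lookup = {"hua": ["花", "话", "画"]}  # illustrative data only
window = CompositionWindow(composed="猫")
window.enter_phonetic("hua", lookup)
window.select(2)
```

After the selection, `window.composed` holds the earlier text followed by the chosen character, and the composing area and candidate list are empty again.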

FIG. 6 shows a determined string 1201 and a status bar 1202. FIG. 6 shows the user interface before reconversion. Here, the characters at 1201 have been determined.

FIG. 7 shows a composition window 1301, a candidate list 1302, and a status bar 1303. After reconversion, the text string from the page is loaded back into composition window 1301, and candidate list 1302 is displayed. In FIG. 6, the user can enter text and let the system pick the appropriate characters. In FIG. 7, the user asks the system for another opportunity to correct the text to what the user intended.

FIG. 8 shows an end-user-defined phrase tool. Here, the user can enter preferred characters for phonetic inputs. These may be referred to as end-user-defined phrases. For instance, if the user is typing a technical document and one phrase is used much more often than others, the user may be given the ability to specify which characters a phonetic text should correspond to. This allows faster entry of characters.
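One way to picture end-user-defined phrases is as a lookup that is consulted before the system's own candidates. The sketch below is an assumption about how such a preference could be applied, not the tool's actual behavior; the function name and the sample entries are hypothetical.

```python
from typing import Dict, List


def ranked_candidates(phonetic: str,
                      user_phrases: Dict[str, List[str]],
                      system_candidates: Dict[str, List[str]]) -> List[str]:
    """Return user-defined phrases first, then the remaining system candidates."""
    preferred = user_phrases.get(phonetic, [])
    rest = [c for c in system_candidates.get(phonetic, []) if c not in preferred]
    return preferred + rest


# Illustrative data only: the system ranks 书局 first, the user prefers 数据.
system_candidates = {"shuju": ["书局", "数据"]}
user_phrases = {"shuju": ["数据"]}

result = ranked_candidates("shuju", user_phrases, system_candidates)
```

With the preferred phrase at the head of the list, accepting the first candidate is enough for the phrase the user types most often.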

Handwritten Input User Interfaces

The preceding sections describe keyboard input of the information to be converted. Additional inputs may also be used, including handwritten input and voice input. The following describes phonetic input for composing Asian languages using electronic ink.

East Asian languages are composed of CJK (Chinese, Japanese and Korean) characters, but their pronunciation is represented by various phonetic schemes. The phonetic schemes are composed of a limited set of phonetic letters. In Chinese, for instance, the phonetic scheme is called Pinyin. As noted above, the phonetic letters are the same as the letters found in English. There are 408 valid Pinyin syllables without tone. Unicode 2.0 has 20,902 CJK characters, but more than 80,000 characters are used in East Asian languages.

Conventional approaches have used stroke recognition of handwritten input. However, these approaches are limited by the complexity of the characters and by the difficulty of achieving satisfactory recognition accuracy for cursive writing, particularly in note-taking scenarios.

A Chinese keyboard IME converts Pinyin into Chinese characters using a statistical language model, as known in the art. Handwriting recognition as described here (also called character handwriting recognition) converts handwritten ink of CJK characters into text CJK characters. Certain aspects of the present invention combine handwriting recognition with a Chinese keyboard IME. These aspects combine the natural quality of handwriting input and recognition with the proven efficiency of a keyboard-based IME conversion engine. Because fewer strokes are needed to complete a word or phonetic sound, writing in Pinyin (using the equivalent English words or letters) is faster than writing complex Chinese characters. In other aspects, the writing method allows Pinyin input to be written in cursive while providing greater recognition accuracy, because the desired character is composed in steps (or phonetic parts) and the valid Pinyin vocabulary is limited (408 syllables). In short, direct character handwriting recognition is not as popular as keyboard-based IMEs because of problems with accuracy, ease of use and efficiency.

As known in the art, East Asian keyboard IMEs have been successful with language models and algorithms that convert phonetics (here, Chinese Pinyin) into CJK characters with good accuracy. The phonetic input of Pinyin is a limited input: the 26 English letters, with 408 valid combinations. Based on this limited vocabulary, a handwriting recognition system can recognize the input phonetics and produce usable results.
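One way a limited vocabulary can help is by snapping a noisy recognition result to the nearest valid syllable. The description does not say how the restriction is applied, so the sketch below is an assumption: it uses plain Levenshtein edit distance, and `VALID_SYLLABLES` is a tiny hypothetical subset standing in for the 408 valid syllables.

```python
from typing import List

# Tiny illustrative subset; a real list would hold all valid toneless syllables.
VALID_SYLLABLES: List[str] = ["hua", "mao", "ma", "hao", "zhang", "li"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def snap_to_valid_syllable(raw: str,
                           vocabulary: List[str] = VALID_SYLLABLES) -> str:
    """Return the vocabulary entry closest to the raw recognizer output."""
    return min(vocabulary, key=lambda s: edit_distance(raw, s))


# A recognizer that misreads the cursive "u" as "v" still lands on "hua".
corrected = snap_to_valid_syllable("hva")
```

With a vocabulary this small, a single misread letter usually leaves only one nearby valid syllable, whereas an open English vocabulary would offer many more competing words.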

By combining handwritten input of Pinyin, recognition of the handwritten input, and downstream conversion of Pinyin into Chinese characters, one or more of the following may be realized.

* Handwritten input of Pinyin is easier for users working with smaller user interfaces (for example, on handheld computing devices and cellular telephones).

* People may forget how to write complete, complex Chinese ideographs directly.

* In some cases, writing Pinyin (English letters) is easier than writing Chinese characters.

* Given the limited vocabulary, systems achieve a higher recognition rate for Pinyin strings than for complex Chinese characters.

* Cursive handwriting recognition technology is generally successful for Latin letters, but much less so for East Asian (EA) character handwriting.

* Pinyin-to-Chinese character conversion is successful in keyboard-based IMEs.

A Pinyin handwriting recognition engine may include one or more recognition components. First, it may include a standard English handwriting recognition engine that recognizes cursive English input. This recognition engine may or may not be restricted to a vocabulary set of valid Pinyin (for example, the 408 Pinyin syllables), which is small compared with the larger vocabulary of English words. Second, it may include a Pinyin-to-Chinese character conversion engine, such as that of a Chinese keyboard IME engine (for example, the MSPY IME by Microsoft Corporation). Alternatively, another phonetic-to-character conversion engine may be used in place of the Pinyin IME (for example, an engine that converts other inputs into any of Japanese, Korean, and Chinese).

In addition, handwriting recognition input (the ability to recognize ideographic Chinese characters composed of strokes) refers to the conventional handwriting approach for composing handwritten characters. Here, Pinyin (phonetic) handwriting input provides an input technique for entering text quickly (for example, in note-taking scenarios) by combining handwriting recognition with phonetic-to-Chinese character conversion.

FIG. 9 shows a user interface for use with handwriting input. Region 1601 displays the Chinese characters converted from Pinyin. Region 1602 displays a new candidate based on the input handwritten ink. Here, the candidate in region 1602 is the result of the handwriting recognition engine, whose recognition results are shown in region 1603 (holding the English phonetic Pinyin string, here "hua") and whose Chinese character candidate list is shown in region 1604. Region 1602 is filled with the first candidate from region 1604. Region 1605 is where the user can enter new handwritten information. Here, the user has entered a cursive English version of "mao". Subsequently, candidates for "mao" may appear in region 1603, and their Chinese equivalents may appear in region 1604.

With this system, recognition of the input in region 1605 may begin when the user lifts the stylus from the contact region, when the user navigates to another region, when the user taps a send button, when the focus changes, or after a delay following the entry of ink in region 1605. Other events may also trigger recognition of the ink in region 1605.
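The trigger conditions listed above can be sketched as a single predicate. The event names, the `should_recognize` function, and the 0.8-second idle timeout are illustrative assumptions; the text does not specify a delay value or an event model.

```python
from enum import Enum, auto


class InkEvent(Enum):
    PEN_MOVE = auto()
    PEN_UP = auto()          # stylus lifted from the contact region
    NAVIGATE_AWAY = auto()   # user navigates to another region
    SEND_TAPPED = auto()     # send button tapped
    FOCUS_CHANGED = auto()


# Events that start recognition immediately.
TRIGGERS = {InkEvent.PEN_UP, InkEvent.NAVIGATE_AWAY,
            InkEvent.SEND_TAPPED, InkEvent.FOCUS_CHANGED}

IDLE_TIMEOUT_S = 0.8  # assumed value; the text only says "a delay"


def should_recognize(event, seconds_since_last_ink, has_ink):
    """Return True when the ink in the input region should be recognized.

    `event` is the latest InkEvent, or None when only time has passed.
    """
    if not has_ink:
        return False  # nothing to recognize
    if event in TRIGGERS:
        return True
    return event is None and seconds_since_last_ink >= IDLE_TIMEOUT_S
```

Keeping the triggers in one set makes it straightforward to add the "other events" that the text leaves open.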

The input in region 1605 can take many forms. For example, it may consist of English letters (as shown by the ink word "mao" in FIG. 10) or of the Chinese character 中 of FIG. 11, which has four strokes ("zhong1", meaning within, among, in, middle, center, while (doing something), during, China, or Chinese).

Referring to FIG. 12, an exemplary process for recognizing phonetic handwriting is shown. First, the user begins entering phonetic input (Pinyin) with a pen. This input is collected as ink strokes in step 1801. The user may be shown a trace of the ink at or near the point where the stylus (or finger or other pointing implement) contacts the screen, or at the cursor location.

In step 1802, the collected stroke or strokes may be recognized into a raw Pinyin lattice 1803 by, for example, a Western language handwriting recognition engine. The point at which recognition starts may be defined as described above.

In step 1804, the raw Pinyin lattice is passed to a Pinyin parser, which attempts to generate valid Pinyin strings 1805. If one or more syllables are found, or if the result equals or exceeds the valid Pinyin length limit, the process proceeds to the IME engine as shown in step 1806. If no valid syllables are found, the process returns to step 1801.
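A minimal sketch of the parsing decision in step 1804 follows. It assumes the raw lattice has already been reduced to a single letter string, uses a tiny hypothetical syllable table, and treats the "valid Pinyin length limit" as the length of the longest syllable; none of these details are specified in the text.

```python
# Hypothetical subset of the valid Pinyin syllables.
VALID_SYLLABLES = {"zhong", "guo", "hua", "mao", "ni", "hao", "xi", "an", "xian"}
MAX_SYLLABLE_LEN = 6  # assumed limit; e.g. "zhuang" has six letters


def parse_pinyin(letters):
    """Return every way to split `letters` into valid syllables."""
    if not letters:
        return []
    results = []

    def walk(start, current):
        if start == len(letters):
            results.append(list(current))
            return
        last = min(len(letters), start + MAX_SYLLABLE_LEN)
        for end in range(start + 1, last + 1):
            piece = letters[start:end]
            if piece in VALID_SYLLABLES:
                current.append(piece)
                walk(end, current)
                current.pop()

    walk(0, [])
    return results


def next_step(letters, length_limit=MAX_SYLLABLE_LEN):
    """Mirror the branch in FIG. 12: convert (1806) or collect more ink (1801)."""
    if parse_pinyin(letters) or len(letters) >= length_limit:
        return "convert"
    return "collect_more_ink"
```

For example, `parse_pinyin("xian")` returns both `["xi", "an"]` and `["xian"]`, which is why the parser's output is naturally a set of alternatives rather than a single string.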

FIG. 13 shows an example that uses a language model decoder together with other steps from the process of FIG. 12. Continuing from step 1806, the process uses the valid Pinyin strings to build a word lattice based on a lexicon in step 1901, producing word lattice 1902. The word lattice 1902 is then sent to a language model decoder, and the best results from step 1903 are converted into Chinese characters 1904.
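The lattice construction and decoding described above can be sketched as follows. The lexicon, the log-probabilities, and the unigram-only scoring are all illustrative assumptions; a production IME would use a far larger lexicon and a richer language model.

```python
import math

# Hypothetical toy lexicon: Pinyin string -> candidate words.
LEXICON = {
    "zhong": ["中", "种"],
    "guo": ["国", "果"],
    "zhongguo": ["中国"],
}
# Hypothetical log-probabilities standing in for a language model.
UNIGRAM = {"中": -2.0, "种": -3.0, "国": -2.5, "果": -3.0, "中国": -1.0}


def build_word_lattice(syllables):
    """Return edges (start, end, word) over syllable positions."""
    edges = []
    for start in range(len(syllables)):
        for end in range(start + 1, len(syllables) + 1):
            key = "".join(syllables[start:end])
            for word in LEXICON.get(key, []):
                edges.append((start, end, word))
    return edges


def decode(syllables):
    """Return the best-scoring word sequence through the lattice."""
    n = len(syllables)
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    # Edges are generated in increasing start order, so best[start] is
    # final by the time any edge leaving `start` is relaxed.
    for start, end, word in build_word_lattice(syllables):
        score, words = best[start]
        if score == -math.inf:
            continue  # position not reachable from the beginning
        candidate = score + UNIGRAM[word]
        if candidate > best[end][0]:
            best[end] = (candidate, words + [word])
    return "".join(best[n][1])
```

In this toy model, `decode(["zhong", "guo"])` prefers the two-syllable word 中国 over any pairing of single characters because its score is higher.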

The following steps relate to the display and selection of candidates. These steps are optional in that all, some, or none of them may be used in practicing the invention. They are drawn in dashed boxes to emphasize their optional nature. In step 1905, the Chinese characters are displayed to the user. This step may or may not include resizing the composition window to display the content to the user. In step 1906, Pinyin alternates for the last converted word or character may also be shown. Step 1906 may or may not also include displaying character alternates for the last converted word or character. In step 1907, the composition string may be sent to the application upon selection, for example when the user gives an instruction to send the characters or when the user navigates away.

Referring to FIGS. 10 and 11, the system can distinguish between two input types. With cursive input as shown in FIG. 10, the user does not need to lift the pen before drawing the next stroke or writing the next letter. In contrast, in FIG. 11, a user writing an Asian ideographic character must lift the pen before the next stroke is started and recognized.

The following conditions, among the others mentioned above, may initiate the automatic conversion of handwritten input into Chinese characters.

* A timer event occurs, or

* The system is not in an ink input state

In these cases, the raw Pinyin lattice may be converted into valid Pinyin strings by the Pinyin parser.

The following describes when the process attempts to convert Pinyin strings into Chinese characters.

* Multiple valid syllables can be found, or

* The string equals or exceeds the maximum possible valid Pinyin length

In these cases, the converted Chinese characters may be inserted into the composition context, and both the in-line composition window and the in-line ink input window may then be adjusted to fit the new context.

The following describes when the process may send Chinese characters to the application.

* The user presses one of certain control buttons or keys, such as a "Send" button

* The composition window is full and the user cannot enter more ink

* A symbol (punctuation mark) such as "!" entered at the end of a sentence is encountered
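The three commit conditions above can be expressed as one check. The function name, its parameters, and the set of sentence-ending symbols are assumptions made for illustration; the text only gives "!" as an example.

```python
# Assumed set of sentence-ending symbols; the text names only "!".
SENTENCE_END = {"!", "?", ".", "。", "！", "？"}


def should_commit(send_pressed, window_full, last_symbol):
    """Return True when the composition string should go to the application.

    `last_symbol` is the most recently recognized symbol, or None.
    """
    return send_pressed or window_full or last_symbol in SENTENCE_END
```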

The various windows (the composition window, the ink input window, and the candidate windows) may or may not be refreshed after the context changes.

The results of the recognition processes may be displayed in black and white, or colors may be used to highlight various error correction actions. When colors are used, they can indicate the current selected word or character for which Pinyin alternates are shown in the Pinyin candidate window (for example, showing the current word or character, 1602, in blue while the remaining words or characters, 1601, are shown in black). The user can then see which word in region 1602 is being corrected and for which word the alternates in regions 1603 and 1604 are offered. Once the user has selected a candidate, or has corrected the proposed candidate to another candidate, the entire context except for fixed characters (see the paragraph below) may or may not be converted again. This is an attempt to correct the various words based on the context of the words.

Users may also select a suitable alternate to replace the current selected word or character, which may or may not be highlighted. In at least one aspect, the user's selection of an alternate may be marked as "fixed", or as previously selected or specified. In later conversions, fixed, previously selected, or specified words may remain unchanged while other words or characters are modified to fit the new context.
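The "fixed" behavior described above can be sketched as a re-conversion pass that skips any slot the user has chosen explicitly. The `Slot` structure, the function names, and the toy context-sensitive converter are hypothetical; they stand in for whatever the IME's conversion engine actually does.

```python
from dataclasses import dataclass


@dataclass
class Slot:
    pinyin: str
    text: str
    fixed: bool = False  # True once the user has picked this text


def choose_alternate(slots, index, replacement):
    """Apply a user-selected alternate and mark it as fixed."""
    slots[index].text = replacement
    slots[index].fixed = True


def reconvert(slots, convert):
    """Re-run conversion left to right, leaving fixed slots unchanged."""
    for i, slot in enumerate(slots):
        if not slot.fixed:
            left = slots[i - 1].text if i > 0 else None
            slot.text = convert(slot.pinyin, left)


def toy_convert(pinyin, left):
    """Toy converter: the choice for "mao" depends on the word to its left."""
    if pinyin == "mao":
        return "猫" if left == "花" else "毛"
    return {"hua": "画"}.get(pinyin, pinyin)


slots = [Slot("hua", "画"), Slot("mao", "毛")]
choose_alternate(slots, 0, "花")  # the user corrects the first word
reconvert(slots, toy_convert)     # the second word adapts; the first stays
```

After the correction, the first slot keeps the user's choice and the second slot is re-converted in light of it.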

Aspects of the invention can also be applied to Japanese, Korean, and Traditional Chinese. For example, instead of using the Pinyin IME, a developer may include a Japanese, Korean, or Traditional Chinese IME, or may add functions to the keys as described above.

IMEs from Microsoft Corporation may be used with aspects of the invention, but other IMEs may be used as well. For example, IMEs such as the Unicode IME from International Business Machines and VietIME (Cross-platform Vietnamese Input Method Editor) from Sourceforge.net may be included.

Aspects of the invention have been described in terms of illustrative embodiments. Numerous other embodiments, modifications, and variations within the scope and spirit of the appended claims will occur to persons of ordinary skill in the art from a review of this disclosure.

According to the invention, users can enter text in Asian languages quickly and easily.

Claims

A character input method comprising:

receiving input from a user, the input including ink;

recognizing the ink as phonetic input; and

converting the phonetic input into a character.

The method of claim 1, wherein the recognizing step recognizes the phonetic input as Pinyin.

The method of claim 1, further comprising displaying at least one alternate recognition result to the user.

The method of claim 3, wherein the displaying step displays words formed from English letters.

The method of claim 3, wherein the displaying step displays East Asian characters.

The method of claim 3, wherein the displaying step displays the current selection in a different color from unselected characters.

The method of claim 1, wherein the recognizing step includes using a Western language handwriting recognition engine.

The method of claim 1, wherein the recognizing step includes determining whether the recognized ink contains at least one valid string.

A character input system comprising:

means for receiving input from a user, the input including ink;

means for recognizing the ink as phonetic input; and

means for converting the phonetic input into a character.

The system of claim 9, wherein the recognizing means recognizes the phonetic input as Pinyin.

The system of claim 9, further comprising means for displaying at least one alternate recognition result to the user.

The system of claim 11, wherein the displaying means displays words formed from English letters.

The system of claim 11, wherein the displaying means displays East Asian characters.

The system of claim 11, wherein the displaying means displays the current selection in a different color from unselected characters.

The system of claim 9, wherein the recognizing means includes use of a Western language handwriting recognition engine.

The system of claim 9, wherein the recognizing means includes means for determining whether the recognized ink contains at least one valid string.