WO2022075929A1 - Method for visual word recognition and correction - Google Patents


Info

Publication number
WO2022075929A1
WO2022075929A1 (application PCT/TR2020/050925)
Authority
WO
WIPO (PCT)
Prior art keywords
word
words
correct
visual
model
Prior art date
Application number
PCT/TR2020/050925
Other languages
English (en)
Inventor
Adnan ÖNCEVARLIK
Original Assignee
Ünsped Gümrük Müşavirliği Ve Lojistik Hizmetler A.Ş.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ünsped Gümrük Müşavirliği Ve Lojistik Hizmetler A.Ş. filed Critical Ünsped Gümrük Müşavirliği Ve Lojistik Hizmetler A.Ş.
Priority to PCT/TR2020/050925 priority Critical patent/WO2022075929A1/fr
Publication of WO2022075929A1 publication Critical patent/WO2022075929A1/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/106 Display of layout of documents; Previewing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures

Definitions

  • The invention relates to a method that solves the problem of failing to detect correct words because of the special characters/letters found in the alphabets of some languages; it does so by means of a visual word recognition and correction method and allows the correct word to be proposed.
  • Levenshtein distance and Hamming distance are essentially methods of calculating how many letter changes are needed to turn one word into another. For example, for converting the word "agirlik" to "agirhk", such a method predicts that the correction can be performed with at least 3 letter changes. In particular, these methods are not successful at correcting Turkish words written without the characters specific to the Turkish alphabet. Especially for words containing 2 or more letter errors, text/character-based algorithms consume more resources and cannot achieve accurate results.
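For reference, the text/character-based edit distance that the invention contrasts itself with can be sketched with the standard dynamic-programming algorithm (an illustrative sketch, not code from the patent):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn string a into string b."""
    prev = list(range(len(b) + 1))          # distances for the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution (free if equal)
        prev = cur
    return prev[-1]

print(levenshtein("kitten", "sitting"))  # → 3
```

Note that such a distance treats every substitution as equally costly, so it has no notion of two letters looking alike, which is the gap the visual approach targets.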
  • the current invention concerns the method of visual word recognition and correction, to eliminate the disadvantages mentioned above and brings new advantages to the relevant technical field.
  • the aim of the invention is to present a method that proposes the corrected word that cannot be detected by visual detection of words with letter errors in the known state of the technique by using artificial neural networks.
  • Another goal of the invention is to provide a new method, especially used in correction of Turkish words.
  • Figure 1 is a representation of each of the words held as text transformed into a graphic with white dots on a black background.
  • Figure 2 is an example of the state of the word combinations created on the disk.
  • Figure 3 shows the class information that identifies which word an image belongs to during CNN model training.

DETAILED DESCRIPTION OF THE INVENTION
  • The invention in question relates to a method of visual word recognition and correction. Instead of the traditional, widely used text- and character-based checks such as Damerau-Levenshtein and Hamming distance, each word is converted into an image, and an artificial neural network model suggests the visual word that most closely resembles it.
  • The system offers a method that finds words that are perceived incorrectly because of some letters specific to the Turkish language (especially ğ, ı, ü, etc.) and suggests the right word. For example, consider the words "sağır" and "sagir": when "sagir" is written, other algorithms may also produce the word "sığır", but thanks to the visual recognition algorithm "sagir" can be recognized as "sağır". Instead of analyzing on the basis of letters, the method looks at the word directly as an image and suggests the correct word that most closely resembles it.
  • some letters, especially ğ, ı, ü, etc.
  • CNN: Convolutional Neural Network
  • UGM Vision refers to the invention.
  • Each of the words held as text is converted into a graphic with white dots on a black background (Figure 1).
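Such a word-to-image conversion could be sketched with Pillow; the library choice, image size, offsets and font are assumptions for illustration, not details specified by the patent:

```python
from PIL import Image, ImageDraw, ImageFont

def word_to_image(word: str, width: int = 128, height: int = 32) -> Image.Image:
    """Render a word as white glyphs (white dots) on a black background."""
    img = Image.new("L", (width, height), color=0)   # grayscale, black background
    draw = ImageDraw.Draw(img)
    # White text; the built-in bitmap font keeps the sketch dependency-free.
    draw.text((4, 8), word, fill=255, font=ImageFont.load_default())
    return img

img = word_to_image("yeni")
img.save("00020_yeni.png")  # filename pattern used for the training data
```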
  • A format like "nnnnn_wordspelling.png" is applied as the file name for the combinations generated for each word, and these are saved on disk as training data.
  • Figure 2 shows the state of the sample word combinations on the disk.
  • The number in front of each of these combined files indicates which word the image actually belongs to. For example, as can be seen in Figure 2, the word "yeni" begins with 00020_ and its combinations likewise begin with 00020_. In CNN model training, this is the class information that helps determine what the word is (Figure 3). In other words, the prefix 00020_ is used as the equivalent of the word "yeni". In this way, the correct states of the classes/words used in classification are determined by creating a unique prefix for each correct word.
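The prefix-to-class mapping described above can be sketched in plain Python (the helper names are illustrative, not from the patent):

```python
import os

def class_index_from_filename(path: str) -> int:
    """Extract the class index from a training file named like
    '00020_yeni.png', where the numeric prefix identifies the word."""
    name = os.path.basename(path)
    prefix = name.split("_", 1)[0]
    return int(prefix)

def build_class_table(words):
    """Assign each correct word a unique zero-padded prefix."""
    return {f"{i:05d}": w for i, w in enumerate(words)}

print(class_index_from_filename("data/00020_yeni.png"))  # → 20
```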
  • The training dataset for the CNN model is prepared. The graphic/image files created are given to the model for training. A very large number of word images is produced here; millions of word-image combinations make up the CNN model's training dataset.
  • X_train = np.reshape(X_train, (len(aFiles), imgHeight, imgWidth, 1))
  • y_train = np.array(y_train)
  • X_train is the dataset formed as a numerical matrix of each word image to be trained.
  • y_train contains the index of the result word to be used in classification.
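The reshaping above can be illustrated with NumPy; the image dimensions and sample count here are assumptions for the sketch:

```python
import numpy as np

imgHeight, imgWidth = 32, 128
n_images = 4

# Flat grayscale pixel data for each word image (random placeholders here).
flat = np.random.randint(0, 256, size=(n_images, imgHeight * imgWidth))

# Reshape into the (samples, height, width, channels) layout a CNN expects;
# the trailing 1 is the single grayscale channel.
X_train = np.reshape(flat, (n_images, imgHeight, imgWidth, 1))
y_train = np.array([0, 1, 2, 3])  # class index (word prefix) per image

print(X_train.shape)  # → (4, 32, 128, 1)
```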
  • A model is created and trained in the following way, using the CNN algorithm employed in standard visual classification.
  • The trained model and the weights created are then saved.
  • At prediction time, the text entered is first converted to a graphic as described above; the resulting graphic is converted into a matrix dataset and a prediction is made using the pre-trained model.
  • X_test = np.reshape(X_test, (1, imgHeight, imgWidth, 1))
  • y_predict = model.predict(X_test)
  • return getMostProbableWordSingle(y_predict[0], aWord, apply_lev)
  • aWord = "birlikte"
  • aSentence = input("Cumleyi Girin:")
  • X_test = np.reshape(X_test, (1, imgHeight, imgWidth, 1))
  • y_predict = model.predict(X_test)
  • aMeaning = aMeaning + "[ " + getMostProbableWordMulti(y_predict[0], aWord, 0) + " ] - "
  • print(" - ")
  • print("UGM Vision -> {}".format(aMeaning))
  • Among the results, the closest one is recommended to the user.
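The final selection step can be sketched in plain Python: given the model's probability vector, pick the class with the highest score and map it back to a word. The exact logic of getMostProbableWordSingle is not disclosed, so this is an assumed minimal version:

```python
def most_probable_word(probabilities, class_table):
    """Return the word whose class has the highest predicted
    probability, together with its confidence score."""
    best_index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_table[best_index], probabilities[best_index]

class_table = {0: "eski", 1: "yeni", 2: "birlikte"}   # illustrative classes
word, score = most_probable_word([0.05, 0.85, 0.10], class_table)
print(word, score)  # → yeni 0.85
```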
  • The invention is applicable to any language.
  • The visual word recognition and correction method consists of the following process steps: i. creating a class table for the correct states of each word; ii. training a classification model, using convolutional neural networks or similar visual classification algorithms/models, on a training set consisting of the correct spellings of words in the relevant language; iii. creating a graphic for each of the words that need to be corrected or checked; iv. predicting the generated visual word or text with the pre-trained artificial neural network classification model; v. suggesting to the user the correct word that yields the highest hit rate in the prediction.
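Steps i through v can be tied together in a skeleton like the following; every callable here is an illustrative stand-in, since the patent does not publish a reference implementation:

```python
def correct_word(word, render, model_predict, class_table):
    """End-to-end sketch of steps i-v: render the input word as an
    image, classify it with a pre-trained model, and return the
    most probable correct word."""
    image = render(word)                      # step iii: word -> graphic
    probabilities = model_predict(image)      # step iv: pre-trained CNN prediction
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_table[best]                  # step v: suggest the best word

# Toy stand-ins so the sketch runs without a trained model:
suggestion = correct_word(
    "sagir",
    render=lambda w: w,                       # identity "renderer"
    model_predict=lambda img: [0.1, 0.9],     # fake probability vector
    class_table={0: "sigir", 1: "sagir"},     # step i: class table
)
print(suggestion)  # → sagir
```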
  • The correction algorithm is adaptable to any visually writable language used around the world, past or present, and is not limited to Turkish; it can be adapted for any written language.
  • Another application of the invention is to convert each of the words into a graphical state using color combinations that ensure high contrast or distinctiveness in the graph creation process mentioned in the method.
  • the color combination is white dots on a black background.


Abstract

The invention relates to a method that solves the problem of the inability to detect correct words, caused by special characters/letters belonging to the alphabets used in some languages, by means of a visual word recognition and correction method, and allows the correct word to be proposed.
PCT/TR2020/050925 2020-10-07 2020-10-07 Procédé de reconnaissance et de correction visuelle de termes WO2022075929A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/TR2020/050925 WO2022075929A1 (fr) 2020-10-07 2020-10-07 Procédé de reconnaissance et de correction visuelle de termes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/TR2020/050925 WO2022075929A1 (fr) 2020-10-07 2020-10-07 Procédé de reconnaissance et de correction visuelle de termes

Publications (1)

Publication Number Publication Date
WO2022075929A1 2022-04-14

Family

ID=81125648

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/TR2020/050925 WO2022075929A1 (fr) 2020-10-07 2020-10-07 Procédé de reconnaissance et de correction visuelle de termes

Country Status (1)

Country Link
WO (1) WO2022075929A1 (fr)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170339169A1 (en) * 2016-05-23 2017-11-23 GreatHorn, Inc. Computer-implemented methods and systems for identifying visually similar text character strings
US20180137350A1 (en) * 2016-11-14 2018-05-17 Kodak Alaris Inc. System and method of character recognition using fully convolutional neural networks with attention
CN110765996A (zh) * 2019-10-21 2020-02-07 北京百度网讯科技有限公司 文本信息处理方法及装置 (Text information processing method and apparatus)



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20956863

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20956863

Country of ref document: EP

Kind code of ref document: A1