TWM607472U - Text section labeling system - Google Patents
Text section labeling system
- Publication number
- TWM607472U
- Authority
- TW
- Taiwan
- Prior art keywords
- text
- module
- text segment
- segment
- document
Landscapes
- Character Discrimination (AREA)
Abstract
A text section labeling system is connected to an input device. The input device receives a document to be recognized, which contains a plurality of text images. The text section labeling system includes a text image recognition module, a language processing module, a text segment relationship analysis module, a confidence conversion module, a label library, and a label output module. The text image recognition module is connected to the input device to receive the document to be recognized and identifies at least one text segment in it; the text segment contains the text images, and the language processing module converts the text images in the text segment into editable text. With the text section labeling system, each text segment can be assigned its corresponding label.
Description
The present utility model relates to a labeling system, and in particular to a text section labeling system.
At present, to make the entry of paper medical certificates and related documents more efficient, OCR (Optical Character Recognition) technology is used during data entry to convert the text images in the certificate or document automatically into editable text. After conversion, however, the editable text must still be entered manually into the corresponding fields of a database. For example, once "Medical Foundation XX Memorial Hospital" on a paper medical certificate has been converted into editable characters, it still has to be entered manually into the "Hospital Name" field of the database. A certain labor cost therefore remains, and the manual step adds opportunities for error.
How to fill the editable characters produced by OCR automatically into the corresponding database fields is therefore a problem worth the attention of those with ordinary knowledge in the field.
The purpose of the present utility model is to provide a text section labeling system that can automatically fill the editable characters produced by OCR into the corresponding fields of a database.
The text section labeling system of the present utility model is connected to an input device. The input device receives a document to be recognized, which contains a plurality of text images. The text section labeling system includes a text image recognition module, a language processing module, a text segment relationship analysis module, a confidence conversion module, a label library, and a label output module. The text image recognition module is connected to the input device to receive the document to be recognized and identifies at least one text segment in it; the text segment contains at least one of the text images, and the language processing module converts the text images in the text segment into editable text. The language processing module is connected to the text image recognition module; it measures at least one item of first association information between the text segment and the document to be recognized, and converts the editable text and the first association information into a first feature matrix.
The text segment relationship analysis module is connected to the language processing module; it measures second association information between each text segment and the other text segments, and uses the second association information to convert the first feature matrix into a second feature matrix. The confidence conversion module is connected to the text segment relationship analysis module and converts the second feature matrix into a third feature matrix representing confidence levels. The label library stores a plurality of labels. The label output module is connected to the confidence conversion module and the label library; it converts the third feature matrix into a one-dimensional matrix in which each element is the label code of one text segment, looks up the label corresponding to each label code in the label library, and assigns that label to the text segment.
In the text section labeling system described above, the first association information includes at least one of the following: the proportion of the document's area occupied by the text segment; the aspect ratio of the text segment; or the position of the text segment.
In the text section labeling system described above, the text image recognition module, the language processing module, the text segment relationship analysis module, the confidence conversion module, and the label output module each include at least one neural network model.
In the text section labeling system described above, the text segment relationship analysis module uses a graph neural network model to measure the second association information between each text segment and the other text segments.
To make the above features and advantages of the present utility model easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
S210~S290: flow chart steps
10: input device
20: database
100: text section labeling system
102: server side
110: text image recognition module
120: language processing module
130: text segment relationship analysis module
140: confidence conversion module
150: label library
160: label output module
80: document to be recognized
81: text segment
Various embodiments are described below with reference to the accompanying drawings, which are illustrative and do not limit the scope in any way, in which like reference numerals denote like elements, and in which:
FIG. 1 shows an embodiment of the text section labeling system of the present utility model.
FIG. 2A to FIG. 2D show one embodiment of the document to be recognized and how it changes over the course of processing.
FIG. 3 shows an embodiment of the text segment labeling method of the present utility model.
FIG. 4A is a schematic diagram of the first feature matrix.
FIG. 4B is a schematic diagram of the second feature matrix.
FIG. 4C is a schematic diagram of the third feature matrix.
FIG. 4D is a schematic diagram of the one-dimensional matrix.
The present creation is best understood by reference to the detailed description and the accompanying drawings. Various embodiments are discussed below with reference to the drawings. Those skilled in the art will readily appreciate, however, that the detailed description given here is for explanatory purposes only, because the methods and systems may extend beyond the described embodiments. For example, the teachings given and the needs of a particular application may lead to a variety of alternative and suitable ways of implementing any detail described here. Any method may therefore extend beyond the particular implementation choices described and shown in the following embodiments.
Certain terms are used in the specification and in the claims that follow to refer to particular elements. A person of ordinary skill in the art will appreciate that different manufacturers may refer to the same element by different names. This specification and the claims distinguish elements not by differences in name but by differences in function. The terms "comprising" and "including" used throughout the specification and the claims are open-ended and should be read as "including but not limited to". In addition, the terms "coupled" and "connected" cover any direct or indirect means of electrical connection. Thus, where a first device is described as coupled to a second device, the first device may be electrically connected to the second device directly, or indirectly through other devices or connection means.
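Refer to FIG. 1, which shows an embodiment of the text section labeling system of the present utility model. The text section labeling system 100 includes a text image recognition module 110, a language processing module 120, a text segment relationship analysis module 130, a confidence conversion module 140, a label library 150, and a label output module 160. The text section labeling system 100 further includes an input device 10, which is, for example, a scanner, a digital camera, or a smartphone with a camera function. Through the input device 10, a document to be recognized (see FIG. 2A) can be imported into the text section labeling system 100. In this embodiment, the text image recognition module 110, the language processing module 120, the text segment relationship analysis module 130, the confidence conversion module 140, the label library 150, and the label output module 160 are located on a server side 102, which consists, for example, of one or more servers.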
Refer also to FIG. 2A, which shows one embodiment of the document to be recognized; in this embodiment it is a medical expense receipt. As FIG. 2A shows, the document to be recognized 80 contains many pieces of text. Once the input device 10 has captured an image of the document 80, that text exists only as image data. In other words, the text on the document imported by the input device 10 into the text section labeling system 100 cannot be edited. Such text is referred to below as text images.
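Refer also to FIG. 3, which shows an embodiment of the text segment labeling method of the present utility model. First, step S210 is performed: the document to be recognized shown in FIG. 2A is imported. The details are described above and are not repeated here. Next, step S220 is performed: the text segments 81 in the document to be recognized 80 are identified. In FIG. 2B the text segments 81 are the regions outlined by dashed lines; they are identified, for example, by the text image recognition module 110. As FIG. 2B makes clear, a text segment 81 selects text images on the document 80, and in particular selects text that is grouped together as a single segment. Then step S230 is performed: the text image recognition module 110 converts the text images in the text segments 81 into editable characters. That is, the text images in the document image imported by the input device 10 cannot be edited, but the text image recognition module 110 can convert them into editable text, for example using OCR (Optical Character Recognition) technology. If OCR alone is used, however, recognition errors may occur when the character images on the document are blurred or soiled. In that case the technique disclosed in, for example, Taiwan patent application No. 107145984 can be used to correct the errors. The text image recognition module 110 may include neural network models such as a recurrent neural network (Recurrent Neural Network), a long short-term memory model (Long Short-Term Memory), or a convolutional neural network (Convolutional Neural Network).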
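Next, step S240 is performed: the language processing module 120 measures at least one item of first association information between the text segment 81 and the document to be recognized 80. Specifically, the first association information describes the relationship of the text segment 81 to the document 80, for example the proportion of the document's area that the text segment 81 occupies, the aspect ratio of the text segment 81, or the position (for example, the coordinates) of the text segment 81 within the document 80. Then step S250 is performed: the editable text in the text segment 81 and the first association information are converted into a first feature matrix. Refer also to FIG. 4A, a schematic diagram of the first feature matrix. As FIG. 4A shows, the first feature matrix is a two-dimensional N x F matrix, that is, a matrix with N rows and F columns. The number of rows N is the number of text segments 81 in the document 80, and F is the number of parameters associated with each text segment 81. As FIG. 4A also shows, the F parameters consist of text information and first association information; in this embodiment the elements up to the n-th column represent the text information. The text information is derived from the editable text of the text segment 81, for example as a vector produced by word embedding. In the first feature matrix the first association information is expressed numerically and appended after the text information; in this embodiment it occupies the elements from column n+1 onward. For example, if the text segment 81 occupies 10.53% of the area of the document 80, this is expressed as 0.1053. If the aspect ratio of the text segment 81 is 4:1, it is expressed as 0.2. And if the coordinates of the text segment 81 are (20, 31) and the whole document measures (1000, 800), the normalized coordinates are (0.02, 0.03875). The first association information can then be written as [0.1053, 0.2, 0.02, 0.03875].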
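The numeric encoding of the first association information can be sketched as follows. This is a minimal illustration and not the utility model's implementation: the function names, the bounding-box layout `(x, y, width, height)`, and the toy embedding values are assumptions. The aspect ratio is left out because the text gives only the encoded value (0.2 for 4:1) without a formula.

```python
def first_association(box, page_size):
    """Return [area_ratio, x_norm, y_norm] for one text segment.

    box is (x, y, width, height) in pixels (an assumed layout);
    page_size is (page_width, page_height).
    """
    x, y, w, h = box
    page_w, page_h = page_size
    area_ratio = (w * h) / (page_w * page_h)
    # Normalize the position by the page dimensions.
    return [area_ratio, x / page_w, y / page_h]


def feature_row(text_vector, association):
    """Append the association values after the text embedding (one row of N x F)."""
    return list(text_vector) + list(association)


# Position (20, 31) on a 1000 x 800 page, as in the example in the text.
# The 400 x 100 box size and the embedding values are made up.
assoc = first_association((20, 31, 400, 100), (1000, 800))
row = feature_row([0.12, -0.40, 0.33], assoc)
print(assoc)
```

Stacking one such row per text segment gives the N x F matrix, with the text embedding in the leading columns and the association values in the trailing ones.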
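Next, step S260 is performed: the text segment relationship analysis module 130 measures second association information between each text segment 81 and the other text segments 81. Refer also to FIG. 2C. If a line segment is drawn between every text segment 81 and every other text segment 81 (FIG. 2C shows only some of the line segments, for illustration), the number of line segments is N² (where N is the number of text segments 81), and the resulting figure is what mathematics calls a complete graph. The count N² treats every ordered pair of text segments, including a segment paired with itself, as one relation; counted as undirected edges between distinct segments, a complete graph has N(N-1)/2 edges. In other words, if the second association information is represented in the figure by the line segments between text segments 81, it is clear that there are N² items of second association information. For example, if the document 80 contains 20 text segments 81, there are 20² items of second association information, that is, 400. In this embodiment, because the relationships between text segments 81 (the second association information) can be represented by a complete graph, a graph neural network (Graph Neural Network) model is used to measure the second association information. That is, the text segment relationship analysis module 130 may include a graph neural network model. Through that model, text segments 81 can exchange important information with one another, so that the relationships between them can be expressed numerically.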
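The two ways of counting pairwise relations among N text segments can be checked with a short sketch. The function name is hypothetical; the point is only that counting ordered pairs (self-pairs included) gives the N² figure used in the text, whereas the undirected edges of a complete graph number N(N-1)/2.

```python
from itertools import combinations, product


def relation_counts(n_segments):
    """Return (ordered pairs including self-pairs, undirected edges)."""
    ids = range(n_segments)
    ordered_pairs = list(product(ids, repeat=2))    # (i, j) for all i, j
    undirected_edges = list(combinations(ids, 2))   # {i, j} with i < j
    return len(ordered_pairs), len(undirected_edges)


pairs, edges = relation_counts(20)
print(pairs, edges)  # 400 190
```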
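For example, the second association information between the "Health Insurance" text segment 81 and the "Status" text segment 81 to its left may be given a value indicating a high degree of association. More concretely, the numerical vector of "Status" contributes more vector information to the "Health Insurance" text segment 81: for example, the vector of the "Status" text segment 81 is appended after that of the "Health Insurance" text segment 81, or it is multiplied by a larger weight and then appended. By contrast, the association between the "Medical Expense Receipt" text segment 81 and the "Health Insurance" text segment 81 may be lower, so the vector of the "Medical Expense Receipt" text segment 81 is multiplied by a smaller weight before being appended after the "Health Insurance" text segment 81. After step S260, the first feature matrix is thus converted into the second feature matrix shown in FIG. 4B, a two-dimensional N x F2 matrix. Here N is the number of text segments 81 in the document 80, and F2 is the number of parameters associated with each text segment 81 once the second association information has been incorporated; F2 is, for example, F*N. Note that the above is only an example: which associations between text segments 81 are higher or lower is determined by the trained graph neural network model or another neural network model. In this embodiment a graph neural network (Graph Neural Network) model is used to measure the second association information, but a person of ordinary skill in the art may also use other neural network models, such as a convolutional neural network (Convolutional Neural Network, CNN) or a recurrent neural network (Recurrent Neural Network, RNN).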
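The weighted-append idea, and why it yields F2 = F*N columns, can be sketched in a few lines. This is an illustration under stated assumptions: the weights here are set by hand, whereas in the described system they would come from a trained model, and the function name and toy values are hypothetical.

```python
def second_feature_matrix(first_matrix, weights):
    """Build an N x (F*N) matrix from an N x F matrix.

    weights[i][j] is the assumed relevance of segment j to segment i.
    Row i is the concatenation of weights[i][j] * first_matrix[j] over all j.
    """
    n = len(first_matrix)
    result = []
    for i in range(n):
        row = []
        for j in range(n):
            row.extend(weights[i][j] * value for value in first_matrix[j])
        result.append(row)
    return result


# Toy input: N = 3 segments, F = 2 parameters each.
first = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
# Segments 0 and 1 are strongly related; segment 2 is weakly related to both.
weights = [[1.0, 0.9, 0.1],
           [0.9, 1.0, 0.1],
           [0.1, 0.1, 1.0]]
second = second_feature_matrix(first, weights)
print(len(second), len(second[0]))  # 3 6
```

Each output row has F*N entries because every segment receives a weighted copy of every segment's F parameters.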
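Next, step S270 is performed: the confidence conversion module 140 converts the second feature matrix, for example by means of the Softmax function, into a third feature matrix representing confidence levels. The third feature matrix is a two-dimensional N x C matrix, as shown in FIG. 4C, where the number of rows N is the number of text segments 81 in the document 80 and the number of columns C is the total number of labels. The labels in the label library 150 are introduced below.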
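A row-wise Softmax of the kind mentioned for step S270 can be sketched as follows. The sketch assumes that each text segment has already been reduced to C raw scores, one per label (for example by a learned layer); the text does not describe that reduction, and the function names and score values are hypothetical.

```python
import math


def softmax(scores):
    """Convert raw scores into values in (0, 1) that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_rows(score_matrix):
    """Apply softmax to each row of an N x C score matrix."""
    return [softmax(row) for row in score_matrix]


# Toy scores: N = 2 segments, C = 3 labels.
scores = [[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]]
conf = confidence_rows(scores)
print(conf)
```

Each row of the result can be read as per-label confidence for one text segment.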
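In this embodiment, the label library 150 stores a plurality of labels, which are used to mark the type of each text segment 81. For example, referring to FIG. 2D, the text segment 81 "Medical Foundation XXX Memorial Hospital" is labeled as title information, the numbers in the middle area of the document 80 are labeled as fees, and the warning text at the far right of the document 80 is labeled as non-important information. The labels may also be related hierarchically. For example, title information can be further divided into hospital name, receipt type, health insurance status, national ID number, and so on; fees can be further divided into drug fee, nursing fee, examination fee, pharmacy service fee, and so on. Referring back to FIG. 4C, each element of the third feature matrix is the confidence level for one label. For example, for the pharmacy service fee text segment 81, the element for the pharmacy service fee label may have the highest value and the element for the fee label the second highest.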
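The hierarchical relation among labels could be represented, for instance, as a parent map. The data structure and the function name below are assumptions for illustration; only the label names come from the text.

```python
# Parent of each label; top-level labels map to None.
LABEL_PARENT = {
    "title information": None,
    "fee": None,
    "non-important information": None,
    "hospital name": "title information",
    "receipt type": "title information",
    "health insurance status": "title information",
    "national ID number": "title information",
    "drug fee": "fee",
    "nursing fee": "fee",
    "examination fee": "fee",
    "pharmacy service fee": "fee",
}


def label_path(label):
    """Return the label followed by its ancestors, most specific first."""
    path = []
    while label is not None:
        path.append(label)
        label = LABEL_PARENT[label]
    return path


print(label_path("pharmacy service fee"))  # ['pharmacy service fee', 'fee']
```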
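Next, step S280 is performed: the label output module 160 converts the third feature matrix into a one-dimensional matrix (as shown in FIG. 4D), each element of which is the label code of one text segment. Then step S290 is performed: the label output module 160 looks up the label corresponding to each label code in the label library 150 and assigns that label to the text segment 81. Database processing software can then use the label of each text segment 81 to enter the correct data into the corresponding field of the database 20. With the text segment labeling method of this embodiment, once the user has photographed the document to be recognized, the computer can take over entirely and enter the relevant data into the corresponding database fields.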
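Steps S280 and S290 can be sketched as an argmax over each row followed by a dictionary lookup. Taking the highest-confidence column as the label code is an assumption (the text does not say how the one-dimensional matrix is derived), and the library contents, function names, and confidence values are illustrative.

```python
# Hypothetical label library: label code -> label.
LABEL_LIBRARY = {0: "title information", 1: "fee", 2: "non-important information"}


def label_codes(confidences):
    """Return, for each row, the column index with the highest confidence."""
    return [max(range(len(row)), key=row.__getitem__) for row in confidences]


def assign_labels(confidences, library):
    """Map each text segment's label code to its label."""
    return [library[code] for code in label_codes(confidences)]


# Toy N x C confidence matrix: 3 segments, 3 labels.
confidences = [[0.80, 0.15, 0.05],
               [0.10, 0.85, 0.05],
               [0.20, 0.10, 0.70]]
codes = label_codes(confidences)
labels = assign_labels(confidences, LABEL_LIBRARY)
print(codes)   # [0, 1, 2]
print(labels)  # ['title information', 'fee', 'non-important information']
```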
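In the embodiments above, the text image recognition module 110, the language processing module 120, the text segment relationship analysis module 130, the confidence conversion module 140, and the label output module 160 all include neural network models. When these models are trained, the samples can be divided into a training set and a test set: the models are first trained on the training set and then tested on the test set. In one embodiment, the training set contains about three times as many samples as the test set.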
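A split in which the training set is about three times the size of the test set could look like the following. The shuffling, the fixed seed, and the function name are assumptions; the text states only the approximate ratio.

```python
import random


def split_samples(samples, train_parts=3, test_parts=1, seed=0):
    """Shuffle the samples and split them train_parts : test_parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:cut], shuffled[cut:]


train_set, test_set = split_samples(range(400))
print(len(train_set), len(test_set))  # 300 100
```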
Although the present creation has been disclosed above by way of preferred embodiments, they are not intended to limit it. Anyone with ordinary knowledge in the relevant technical field may make minor changes and refinements without departing from the spirit and scope of the present creation; the scope of protection is therefore as defined by the appended claims.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109212199U TWM607472U (en) | 2020-09-16 | 2020-09-16 | Text section labeling system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW109212199U TWM607472U (en) | 2020-09-16 | 2020-09-16 | Text section labeling system |
Publications (1)
Publication Number | Publication Date |
---|---|
TWM607472U (en) | 2021-02-11 |
Family
ID=75783327
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW109212199U TWM607472U (en) | 2020-09-16 | 2020-09-16 | Text section labeling system |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWM607472U (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI807467B (en) * | 2021-11-02 | 2023-07-01 | 中國信託商業銀行股份有限公司 | Key-item detection model building method, business-oriented key-value identification system and method |
- 2020-09-16: TW application TW109212199U, patent TWM607472U (en), status: not active (IP right cessation)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4K | Annulment or lapse of a utility model due to non-payment of fees |