TWM576281U - Automatic text labeling system

Info

Publication number: TWM576281U
Application number: TW107216084U
Authority: TW
Inventors: 趙式隆; 林奕辰; 沈昇勳
Original assignee: 洽吧智能股份有限公司
Priority date: 2018-11-27
Filing date: 2018-11-27
Publication date: 2019-04-01

Abstract

The present utility model discloses an automatic text labeling system. The system includes an artificial intelligence language processing module, an artificial intelligence text classification module, a label library, and an artificial intelligence label output module. The artificial intelligence language processing module receives an input text and converts it into a quantified feature of at least one specific dimension. The artificial intelligence text classification module converts the quantified feature of at least one specific dimension into a feature matrix of at least one specific confidence level. The label library stores at least one label to be output. The artificial intelligence label output module converts the feature matrix of at least one specific confidence level into at least one text label and a corresponding quantified confidence level.

Description

Automatic text labeling system

The present utility model relates to an automatic text labeling system, and in particular to one that uses artificial intelligence to parse the text of a document and assign it likely labels, so that artificial intelligence can either speed up human text classification or replace humans in performing it.

In today's era of data technology, textual material of every kind keeps appearing: doctors' diagnosis certificates, diverse news reports, and vast numbers of original articles from self-published media. Faced with such rich and varied information, people urgently need automated tools that let them find the key information they need, accurately and quickly, in this ocean of data. How to parse text content quickly and precisely and assign it likely labels, so that artificial intelligence can speed up human classification, is therefore an important question for those of ordinary skill in the art.

The present utility model provides an automatic text labeling method and system that uses artificial intelligence to parse text content and assign it likely labels, so that artificial intelligence can either speed up human text classification or replace humans in performing it. The present utility model provides an automatic text labeling system. The system includes an artificial intelligence language processing module, an artificial intelligence text classification module, a label library, and an artificial intelligence label output module. The artificial intelligence language processing module receives an input text and converts it into a quantified feature of at least one specific dimension. The artificial intelligence text classification module converts the quantified feature of at least one specific dimension into a feature matrix of at least one specific confidence level. The label library stores at least one label to be output. The artificial intelligence label output module converts the feature matrix of at least one specific confidence level into at least one text label and a corresponding quantified confidence level.

The present utility model is best understood with reference to the detailed description set forth herein and the accompanying drawings. Various embodiments are discussed below with reference to the drawings.
However, those skilled in the art will readily understand that the detailed description given here with reference to the drawings is for explanatory purposes only, because the methods and systems may extend beyond the described embodiments. For example, the teachings given and the needs of a particular application may yield multiple alternative and suitable ways to implement the functionality of any detail described herein. Any method may therefore extend beyond the specific implementation choices described and illustrated in the following embodiments.

Certain terms are used in this specification and in the claims that follow to refer to specific elements. Those of ordinary skill in the art will understand that hardware manufacturers may use different names for the same element. This specification and the claims distinguish elements not by differences in name but by differences in function. The term "comprising", as used throughout the specification and the claims, is open-ended and should be interpreted as "including but not limited to". The term "coupled" covers any direct or indirect means of electrical connection. Thus, if a first device is described as coupled to a second device, the first device may be electrically connected to the second device directly, or indirectly through other devices or connection means.

Please refer to FIG. 1, which is a block diagram of an automatic text labeling system 100 according to the present utility model. As shown in FIG. 1, the automatic text labeling system 100 includes a server 102 and an image input device 104.
The server 102 includes an artificial intelligence language processing module 110, an artificial intelligence text classification module 120, a label library 130, and an artificial intelligence label output module 140. The image input device 104 is electrically connected to the server 102 and is, for example, a scanning device or a digital camera. After the scanning device or digital camera captures a document image, it applies optical character recognition (OCR) to digitize at least one character image in the scanned or photographed document image and convert it into an editable input text d.

The artificial intelligence language processing module 110 receives the input text d and converts it into a quantified feature of at least one specific dimension. In other words, module 110 turns the unstructured text d into numerical form. For example, module 110 uses word embedding to convert the input text d into a two-dimensional T×W feature matrix, where T is the length of the time axis, that is, the sequence length of the input text, and W is a user-set parameter of the text classification core whose size is positively correlated with the complexity of the labeling system. Note that in this embodiment the T×W feature matrix can also be regarded as a one-dimensional matrix arranged in time order.

The artificial intelligence text classification module 120 converts the quantified feature of at least one specific dimension into a feature matrix of at least one specific confidence level.
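The word-embedding step performed by module 110 can be illustrated with a minimal sketch. It is not the patented implementation: whitespace tokenization, the pseudo-random embedding table, the zero vector for unknown tokens, and the names `build_embedding_table` and `embed` are all assumptions made for illustration. The only property taken from the description is the output shape, T rows (sequence length) by W columns (the user-set width).

```python
import random

def build_embedding_table(vocab, width, seed=0):
    """Give each token a fixed pseudo-random vector of length `width` (W).

    A trained system would learn these vectors; random values are a stand-in.
    """
    rng = random.Random(seed)
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(width)] for tok in vocab}

def embed(text, table, width):
    """Map a text of T whitespace-separated tokens to a T x W matrix.

    Tokens missing from the table map to a zero vector (an assumption).
    """
    unknown = [0.0] * width
    return [list(table.get(tok, unknown)) for tok in text.lower().split()]

W = 4  # user-set embedding width
vocab = ["left", "foot", "contusion", "fracture"]
table = build_embedding_table(vocab, W)

matrix = embed("left foot contusion", table, W)
print(len(matrix), len(matrix[0]))  # 3 4  (T = 3 tokens, W = 4)
```

A larger W gives each token a richer representation at the cost of more parameters downstream, which matches the description's remark that W grows with system complexity.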
In other words, the artificial intelligence text classification module 120 receives the digitized text feature matrix and produces, for the given labels, the corresponding feature matrix of output confidence levels. For example, after module 120 receives the T×W feature matrix, it processes the matrix with a neural network and converts it into a one-dimensional matrix whose size equals the size of the label library's vocabulary. The neural network may be a recurrent neural network (RNN), a long short-term memory (LSTM) model, or a convolutional neural network (CNN). Note that these are only embodiments of the present utility model and do not limit it.

As shown in FIG. 1, the label library 130 stores at least one label to be output; that is, the labels for the core terms in the text are stored in the label library 130. The artificial intelligence label output module 140 converts the feature matrix of at least one specific confidence level into at least one text label and a corresponding quantified confidence level. In other words, module 140 converts the numerical confidence-level feature matrix into the corresponding labels. For example, the present utility model defines a loss function, which may be, but is not limited to, cross-entropy, focal loss, or mean squared error (MSE), to measure the gap between the confidence-level matrix output by the core and the ground truth. The larger the gap, the larger the value of the loss function.
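The classification and loss steps can be sketched as follows. This is a simplified stand-in, not the network the description names: instead of an RNN, LSTM, or CNN it uses mean pooling over time followed by one linear layer and a sigmoid per label, which is an assumption chosen to keep the example short. It does preserve the stated shapes (a T×W input and one confidence per label in the library) and uses binary cross-entropy as the loss. The single update in `sgd_step` is plain stochastic gradient descent, and all function names are hypothetical.

```python
import math

def mean_pool(matrix):
    """Collapse a T x W matrix to a length-W vector by averaging over time."""
    t = len(matrix)
    return [sum(row[j] for row in matrix) / t for j in range(len(matrix[0]))]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(matrix, weights, biases):
    """Return one confidence in (0, 1) per label in the library."""
    x = mean_pool(matrix)
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def bce_loss(conf, target):
    """Mean binary cross-entropy; grows as confidences move away from the truth."""
    eps = 1e-12  # guards against log(0)
    return -sum(t * math.log(c + eps) + (1 - t) * math.log(1 - c + eps)
                for c, t in zip(conf, target)) / len(conf)

def sgd_step(matrix, target, weights, biases, lr=0.5):
    """One in-place SGD update.

    For sigmoid outputs with this loss, dLoss/dz_k = (conf_k - target_k) / L.
    """
    x = mean_pool(matrix)
    conf = forward(matrix, weights, biases)
    n = len(conf)
    for k in range(n):
        g = (conf[k] - target[k]) / n
        for j in range(len(x)):
            weights[k][j] -= lr * g * x[j]
        biases[k] -= lr * g

# Toy input: T = 2 time steps, W = 2 features, 3 labels in the library.
features = [[1.0, 0.0], [0.0, -0.5]]
weights = [[0.0, 0.0] for _ in range(3)]
biases = [0.0, 0.0, 0.0]
target = [1.0, 0.0, 0.0]  # ground truth: only the first label applies

initial = forward(features, weights, biases)
before = bce_loss(initial, target)
sgd_step(features, target, weights, biases)
after = bce_loss(forward(features, weights, biases), target)
print(len(initial), before > after)  # 3 True
```

Independent sigmoids rather than a softmax are used here because the description's examples assign several labels to one text.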
In addition, the artificial intelligence label output module 140 uses deep learning optimization algorithms such as stochastic gradient descent (SGD), Adagrad, AdaDelta, Adam, or RMSProp to iteratively update the parameters of the core so that it predicts confidence levels accurately. Note that these algorithms are only embodiments of the present utility model and do not limit it; any algorithm that better optimizes the parameters is within its spirit and scope.

In one embodiment, after the artificial intelligence language processing module 110 receives the text of a diagnosis certificate, it extracts the various pieces of information in that text, builds a semantic representation, and converts the text into the T×W feature matrix. The artificial intelligence text classification module 120 can then compute the confidence level for each label, for example how confident the system is in the label "leg trauma". The artificial intelligence label output module 140 converts the numerical confidence-level matrix into the corresponding labels. For example, if the confidence threshold is set to 0.5, the text is assigned the "leg trauma" label from the label library 130 whenever that label's confidence level is 0.5 or higher.

From the above description, those skilled in the art will readily understand that in other embodiments, if the diagnosis text received by module 110 is "contusion of the left foot, fracture of the distal phalanx of the fourth toe of the left foot", module 140 outputs the labels: (1) lower limb contusion and (2) trunk crush injury.
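The thresholding rule described above (keep a label when its confidence is 0.5 or higher) can be written as a short sketch. The function name `select_labels`, the sort order, and the confidence values are illustrative assumptions; only the "at or above the threshold" comparison and the pairing of each label with its confidence come from the description.

```python
def select_labels(confidences, label_library, threshold=0.5):
    """Return (label, confidence) pairs at or above the threshold, highest first."""
    if len(confidences) != len(label_library):
        raise ValueError("expected one confidence per label in the library")
    picked = [(label, c) for label, c in zip(label_library, confidences)
              if c >= threshold]
    return sorted(picked, key=lambda pair: pair[1], reverse=True)

label_library = ["leg trauma", "hydronephrosis", "kidney and ureteral stones"]
confidences = [0.08, 0.91, 0.77]  # made-up values for illustration

print(select_labels(confidences, label_library))
# [('hydronephrosis', 0.91), ('kidney and ureteral stones', 0.77)]
```

Returning the confidence alongside each label mirrors the system's stated output of a text label together with its quantified confidence level.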
If the diagnosis text received by the artificial intelligence language processing module 110 is "laceration of the left upper limb (sutured with 3 stitches in outpatient surgery)", the artificial intelligence label output module 140 outputs the label: (1) open wound of hand, excluding fingers. Or, if the diagnosis text received by module 110 is "stone in the lower segment of the left ureter with obstructive hydronephrosis", module 140 outputs the labels: (1) hydronephrosis and (2) kidney and ureteral stones. Note that these example labels only illustrate the present utility model and do not limit it.

Similarly, in another embodiment, the method and system of the present utility model can move after-the-fact review of medical records forward to in-process reminders. While a doctor is writing a diagnosis certificate, the system can flag non-compliant content in real time, preventing non-standard medical records at the source. Based on natural language understanding and medical knowledge, the system can also automatically identify whether a doctor's diagnosis conforms to medical norms, adding an artificial intelligence safeguard to diagnosis and treatment.

Please refer to FIG. 2, which is a flowchart of an automatic text labeling method used with the automatic text labeling system of the present utility model. The method includes, but is not limited to, the following steps (note that the steps need not be performed in the order shown in FIG. 2, provided substantially the same result is obtained):
Step S200: Start.
Step S210: Receive an input text and convert it into a quantified feature of at least one specific dimension.
Step S220: Convert the quantified feature of at least one specific dimension into a feature matrix of at least one specific confidence level.
Step S230: Store at least one label to be output.
Step S240: Convert the feature matrix of at least one specific confidence level into at least one text label and a corresponding quantified confidence level.

How each element operates can be understood by reading the steps of FIG. 2 together with the elements of FIG. 1, so the details are not repeated here for the sake of brevity.

This embodiment proposes an automatic text labeling method and system. The method may be implemented by a computer program, which may be an application that, based on a diagnosis-certificate management system, automatically identifies disease names from diagnosis certificates. The computer system may be a terminal device that runs this program, such as a smartphone, a tablet computer, or a personal computer. In addition, each module of the automatic text labeling system 100, namely the artificial intelligence language processing module 110, the artificial intelligence text classification module 120, the label library 130, and the artificial intelligence label output module 140, may be implemented as a computer program or in hardware (for example, directly as a chip).

The advantage of the present utility model is that the automatic text labeling method and system can not only replace manual work and let artificial intelligence speed up manual text classification, but also let medical authorities and insurance companies apply natural language analysis to medical records and perform big-data analysis of diagnosis and treatment conditions and disease trends, thereby improving medical management and insurance services.

The above are only various embodiments of the present utility model and do not limit its patent scope. All simple modifications and equivalent structural changes made using the content of this specification and the drawings shall fall within the patent scope covered by the present utility model.
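The way steps S210 to S240 chain the four components together can be sketched as a small pipeline. The class name `AutoLabeler` and the keyword-matching stand-ins `toy_featurize` and `toy_classify` are hypothetical and exist only so the wiring can run; a real system would plug in trained models for modules 110 and 120.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Matrix = List[List[float]]

@dataclass
class AutoLabeler:
    featurize: Callable[[str], Matrix]             # module 110, step S210
    classify: Callable[[Matrix], Sequence[float]]  # module 120, step S220
    label_library: Sequence[str]                   # label library 130, step S230
    threshold: float = 0.5

    def run(self, text: str) -> List[Tuple[str, float]]:
        features = self.featurize(text)            # S210: text -> T x W matrix
        confidences = self.classify(features)      # S220: matrix -> one score per label
        if len(confidences) != len(self.label_library):
            raise ValueError("classifier output must match label library size")
        # S240: keep labels at or above the threshold, with their confidence
        return [(label, c) for label, c in zip(self.label_library, confidences)
                if c >= self.threshold]

# Toy stand-ins (assumptions): each column flags one keyword per token.
KEYWORDS = ["contusion", "stone"]

def toy_featurize(text: str) -> Matrix:
    return [[1.0 if kw in tok else 0.0 for kw in KEYWORDS]
            for tok in text.lower().split()]

def toy_classify(matrix: Matrix) -> List[float]:
    return [0.9 if any(row[k] for row in matrix) else 0.1
            for k in range(len(KEYWORDS))]

labeler = AutoLabeler(toy_featurize, toy_classify,
                      ["lower limb contusion", "kidney and ureteral stones"])
print(labeler.run("left foot contusion"))
# [('lower limb contusion', 0.9)]
```

Passing the two modules in as callables reflects the description's point that each module may be realized in different ways, in software or in hardware, without changing how the steps connect.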

100‧‧‧Automatic text labeling system

102‧‧‧Server

104‧‧‧Image input device

110‧‧‧Artificial intelligence language processing module

120‧‧‧Artificial intelligence text classification module

130‧‧‧Label library

140‧‧‧Artificial intelligence label output module

S200~S240‧‧‧Flowchart step symbols

Various embodiments are described below with reference to the accompanying drawings, which are illustrative and do not limit the scope in any way. Like reference numerals denote like components. FIG. 1 is a block diagram of an automatic text labeling system according to the present utility model. FIG. 2 is a flowchart of an automatic text labeling method used with the automatic text labeling system of the present utility model.

Claims

An automatic text labeling system, comprising: an image input device that captures a document image and digitizes at least one character image in the document image to convert it into an editable input text; and a server electrically connected to the image input device, the server comprising: an artificial intelligence language processing module for receiving the input text and converting the input text into a quantified feature of at least one specific dimension; an artificial intelligence text classification module, coupled to the artificial intelligence language processing module, for converting the quantified feature of at least one specific dimension into a feature matrix of at least one specific confidence level; a label library for storing at least one label to be output; and an artificial intelligence label output module, coupled to the artificial intelligence text classification module and the label library, for converting the feature matrix of at least one specific confidence level into at least one text label and a corresponding quantified confidence level.

The automatic text labeling system of claim 1, wherein the artificial intelligence language processing module is further used to convert unstructured text into numerical form.

The automatic text labeling system of claim 1, wherein the artificial intelligence text classification module is further used to receive the digitized text feature matrix and to produce, for the given labels, the corresponding feature matrix of output confidence levels.

The automatic text labeling system of claim 1, wherein the artificial intelligence text classification module is further used to convert the quantified feature into a one-dimensional matrix whose size equals the size of the label library's vocabulary.

The automatic text labeling system of claim 1, wherein the artificial intelligence label output module is further used to convert the numerical confidence-level feature matrix into the corresponding labels.

The automatic text labeling system of claim 1, wherein the image input device is a scanning device or a digital camera.