KR101627550B1 - The intelligent disclosure of public records management system based machine learning - Google Patents
- Publication number
- KR101627550B1 (application KR1020150086510A)
- Authority
- KR
- South Korea
- Prior art keywords
- public
- information
- disclosure
- data
- classification
- Prior art date
Classifications
- G06F17/211
- G06F15/18
- G06F17/30722
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
The present invention relates to a machine learning-based intelligent document disclosure management system, and more particularly, to a system that generates disclosure classification pattern data using an existing list of records whose disclosure classification has been completed, detailed criteria information for non-disclosure information, and electronic file content information, compares the data to be classified with the disclosure classification pattern data, and automatically recommends one of public, partially public, and non-public.
The Internet holds hundreds of millions of documents, and the number is growing exponentially as blogs and personal mini homepages become more popular. These documents contain a great deal of information, and various search and analysis systems are used to access it.
Most of the retrieval and analysis systems for accessing the information in the document increase the accessibility by dividing the documents into categories.
For example, a portal search system that provides Internet news classifies documents into categories such as politics, society, economy, and entertainment, thereby improving accessibility to the documents. Initially, this classification was done manually by people.
However, as the amount of information increases, there is a growing need for a document classifier capable of automatically classifying a large number of documents.
In particular, an administrative agency related to the present invention generates a large amount of administrative documents and stores the generated administrative documents in a document storage location.
In order to efficiently store a large amount of administrative documents, attribute information of each administrative document is stored and managed on the system.
At this time, part of the metadata of an administrative document, for example its disclosure classification, is set according to the judgment of the person in charge.
If one person handles a large volume of administrative documents, errors may occur in setting the disclosure classification.
For example, if an administrative document that should be kept non-public under the agency's detailed non-disclosure criteria is set incorrectly, confusion and inefficiency in the administrative process may result.
Accordingly, there is a need for a method for effectively managing the disclosure of administrative documents.
More specifically, public records disclosure means that a public institution makes information or records available for viewing, or provides copies or reproductions of them, in accordance with the Records Management Act or the Information Disclosure Act.
Records produced and held by public institutions are disclosed in principle. Where public and non-public information is mixed, the public information should be disclosed, excluding the information that is not to be disclosed.
In addition, public archives should periodically reclassify the non-public records they hold and disclose them once the reason for non-disclosure no longer applies.
The aim is thus to actively disclose the records produced and held by public institutions. However, each public institution must set the disclosure classification (public, partially public, or non-public) in consideration of the nature of its business, within the scope of Article 9 (1).
However, manual processing of mass-produced and managed record information is limited and requires a large amount of budget and resources.
Therefore, there is a need for a system that uses machine learning technology, with the existing list of classified records, the detailed criteria for non-disclosure information, and electronic file content information as learning data, to classify disclosure automatically through comparative analysis with the data to be classified.
A first object of the present invention, devised to solve the above problems, is to use machine learning technology, with the existing list of classified records, the detailed criteria for non-disclosure information, and electronic file content information as learning data, to recommend the disclosure classification automatically through comparative analysis with the data to be classified.
A second object of the present invention is to extract the matching legal provision information from the institution-specific legal provision information database when the recommended result is partial disclosure or non-disclosure, and to provide reason information for the partial disclosure or non-disclosure.
To achieve the above objects, the present invention provides the following solution.
That is, the basic learning data including the public record list in which the public classification has been completed is obtained from the record
A public
An unspecified information
By providing the subject data to be publicly classified by the public category management means and including the
The present invention has the following effects.
By using machine learning technology, with the existing list of classified records, the detailed criteria for non-disclosure information, and electronic file content information as learning data, the system can automatically recommend the disclosure classification through comparative analysis with the data to be classified. This can improve the quality of the disclosure service and greatly reduce the budget spent on outsourcing manual disclosure reclassification work.
In addition, when the recommendation result is partial disclosure or non-disclosure, the matching legal provision information is extracted from the institution-specific legal provision information database, so that the reason for the partial disclosure or non-disclosure can be presented.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an overall configuration diagram of a machine learning-based intelligent document disclosure management system according to an embodiment of the present invention.
FIG. 2 is a block diagram of an open classification management means of a machine learning-based intelligent document disclosure management system according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating disclosure classification pattern data of the intelligent document disclosure management system based on a machine learning according to an embodiment of the present invention, and FIG. 4 is a conceptual diagram of a learning layer.
The machine learning-based intelligent document disclosure management system according to an embodiment of the present invention includes:
A disclosure classification management means (100) that generates disclosure classification pattern data using at least one of basic learning data including a list of public records whose disclosure classification has been completed, detailed criteria information for non-disclosure information of each public institution, and electronic file content information obtained from the record production management system, compares the data to be classified with the disclosure classification pattern data, and automatically recommends one of public, partially public, and non-public.
The machine learning-based intelligent document disclosure management system according to another embodiment includes:
Acquires the basic learning data including the public disclosure list in which the public classification has been completed from the public
A public
An unspecified information
And a
The system of the present invention may further comprise weight setting means for extracting the disclosure classification pattern data stored in the learning data database, assigning weights to the disclosure classification pattern data, and providing the weights to the disclosure classification management means 100.
The open classification management means 100 includes an electronic file content
(500), and generates public classification pattern data for generating public classification pattern data using at least one or more pieces of electronic file contents information obtained from the recording material production management system (120), a target data to be publicly distinguished from the request terminal (300), compares the data with the open classification pattern data stored in the learning data database, and automatically recommends one of open, partial disclosure, And an automatic additional
In this case, the disclosure classification pattern data of the present invention includes at least one data field among a unit task name, a process name, a record title, a recommended disclosure classification, a retention period, and a reason for non-disclosure.
Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the drawings.
It should be understood, however, that the scope of the present invention is not limited to these embodiments, and all of the technical ideas that fall within the scope of the present invention are within the scope of the present invention.
FIG. 1 is an overall configuration diagram of a machine learning-based intelligent document disclosure management system according to an embodiment of the present invention.
As shown in FIG. 1, the machine learning-based intelligent document disclosure management system according to the present invention includes a public
Public records disclosure means that a public institution makes information or records available for viewing, provides copies or reproductions of them, or provides them through an information and communication network, in accordance with the Records Management Act and the Information Disclosure Act.
The public
In other words, the system of the present invention generates disclosure classification pattern data using at least one of basic learning data including a list of public records, detailed criteria information for non-disclosure information of each public institution, and electronic file content information.
At this time, the generated classification pattern data is updated and stored in the
Accordingly, if the subject data to be publicly classified is obtained from the
In the record
In addition, the
FIG. 2 is a block diagram of an open classification management means of a machine learning-based intelligent document disclosure management system according to an embodiment of the present invention.
As shown in FIG. 2, the open
The electronic file content
That is, public institutions currently produce and manage records through an electronic approval system and a work management system, and the system of the present invention is applied from the production and management stage of the records.
The public classification pattern
In addition, the open
At this time, the automatic additional
That is, the existing document disclosure list, the detailed criteria for non-disclosure information, and the contents of electronic files (for example, draft documents and attachments) are automatically converted into disclosure classification patterns, and the generated patterns are learned and applied.
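The additional learning step described above can be sketched as a feedback loop in which each recommendation is appended to the learning data. This is a minimal illustration only: `recommend_and_learn`, `exact_title_recommender`, and the choice to fall back to "non-public" for unknown titles are assumptions made here, not details given in the patent.

```python
from typing import Callable, List, Tuple

# One learning sample: (record title, disclosure label). Hypothetical layout.
Sample = Tuple[str, str]


def recommend_and_learn(
    title: str,
    learning_data: List[Sample],
    recommend: Callable[[str, List[Sample]], str],
) -> str:
    """Recommend a label for `title`, then add the result to the learning data."""
    label = recommend(title, learning_data)
    sample = (title, label)
    # Additional update: store the new result unless it is already present.
    if sample not in learning_data:
        learning_data.append(sample)
    return label


def exact_title_recommender(title: str, learning_data: List[Sample]) -> str:
    """Toy recommender (assumption): reuse the label of an identical title."""
    for known_title, label in learning_data:
        if known_title == title:
            return label
    # Assumed conservative fallback when nothing matches.
    return "non-public"
```

Any recommender with the same call signature could be passed in place of the toy one; the loop itself only handles the update of the stored learning data.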
Meanwhile, the disclosure classification management means 100 of the present invention may include an institution-specific legal provision information database, which stores legal provision information related to information disclosure for each public institution; a legal provision extraction unit, which extracts the matching legal provision information from that database when the recommendation result is partial disclosure or non-disclosure; and a reason display processing unit, which obtains the extracted legal provision information and generates reason display information for the partial disclosure or non-disclosure.
In other words, the institution-specific legal provision information database stores legal provision information related to information disclosure by each public institution, and when the recommendation result is partial disclosure or non-disclosure, the legal provision extraction unit extracts the matching legal provision information from it.
At this time, as shown in FIG. 4, the reason display processing unit acquires the extracted legal provision information and generates the reason display information about the partial disclosure or the non-disclosure.
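The interplay of the legal provision extraction unit and the reason display processing unit might look like the following sketch. The keyword-based matching, the function names, and the placeholder provision labels are all assumptions for illustration; the patent does not specify how provisions are matched, and no real statute text is reproduced here.

```python
from typing import Dict, List, Optional


def extract_provisions(text: str, provisions: Dict[str, str]) -> List[str]:
    """Return provisions whose trigger keyword occurs in the record text.

    `provisions` maps a keyword to a provision label (hypothetical schema).
    """
    lowered = text.lower()
    return [label for keyword, label in provisions.items() if keyword in lowered]


def reason_display(
    recommendation: str, text: str, provisions: Dict[str, str]
) -> Optional[str]:
    """Build reason display information for partial or non-disclosure results."""
    if recommendation == "public":
        return None  # No reason is needed for fully public records.
    matched = extract_provisions(text, provisions)
    if not matched:
        return f"{recommendation}: no matching provision found"
    return f"{recommendation}: " + "; ".join(matched)


# Placeholder entries, not actual legal provisions.
SAMPLE_PROVISIONS = {
    "resident number": "PROVISION-A (hypothetical)",
    "bid price": "PROVISION-B (hypothetical)",
}
```

For a record recommended as non-public whose text mentions a resident number, `reason_display` would return the recommendation followed by the placeholder label `PROVISION-A (hypothetical)`.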
In FIG. 4, the registration list items include a disclosure classification data field, whose value is displayed as one of public, partially public, and non-public.
In addition, the open classification management means 100 may further comprise a learning period setting unit for setting the information gathering period of the open classification pattern
As shown in FIG. 3, the public classification pattern
In particular, as shown in FIG. 4, the disclosure classification pattern data includes at least one data field among a unit task name, a process name, a record title, a recommended disclosure classification, a retention period, and a reason for non-disclosure.
The data fields for disclosure classification preferably number 38, and the learning data to be compared includes the existing list of classified records, the detailed criteria for non-disclosure information, and the electronic record content information.
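The six fields named above could be modeled as a small record type. The sketch below is an assumption about one possible layout: the class and attribute names are invented here, the remaining fields of the 38 are not listed in the text and are therefore omitted, and the example values are made up.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Disclosure(Enum):
    """The three possible recommendation outcomes."""
    PUBLIC = "public"
    PARTIAL = "partially public"
    NON_PUBLIC = "non-public"


@dataclass(frozen=True)
class PatternRecord:
    """One entry of disclosure classification pattern data (hypothetical schema)."""
    unit_task_name: str
    process_name: str
    record_title: str
    recommended_disclosure: Disclosure
    retention_period: Optional[str] = None
    non_disclosure_reason: Optional[str] = None


# Illustrative values only.
example = PatternRecord(
    unit_task_name="Personnel management",
    process_name="Disciplinary review",
    record_title="Minutes of the disciplinary committee",
    recommended_disclosure=Disclosure.NON_PUBLIC,
    retention_period="10 years",
    non_disclosure_reason="Contains personal information",
)
```

Making the record immutable (`frozen=True`) is a design choice for this sketch, so that stored patterns are not altered after they have been learned.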
The learning data described in the system of the present invention generally undergoes natural language processing. Here, natural language processing means mechanically analyzing human language and converting it into a form that a computer can understand.
For example, natural language processing can be performed through morphological analysis, part-of-speech tagging, phrase-level analysis, and syntactic analysis.
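As a rough stand-in for the preprocessing stage, the sketch below splits a record title into lowercase word tokens and drops a few function words. It is deliberately simpler than the morphological and syntactic analysis mentioned above, which for Korean text would normally require a dedicated analyzer; the stopword list and the function name `tokenize` are assumptions.

```python
import re
from typing import List

# Small illustrative stopword list (assumption, English only).
STOPWORDS = frozenset({"the", "of", "and", "for", "a", "an", "to", "in"})


def tokenize(text: str) -> List[str]:
    """Lowercase the text, split it into word tokens, and remove stopwords."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if w not in STOPWORDS]
```

The token lists produced this way are the kind of input that later comparison and weighting steps can operate on.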
Further, the system of the present invention operates on a machine learning basis, which means that machine learning is performed on learning data that has undergone natural language processing.
That is, the natural-language-processed learning data is used for training and generalization.
For example, from a series of learning data the system may learn the disclosure classification and the non-disclosure reason that correspond to the classification and title information of an administrative document.
To this end, the system of the present invention may include a machine learning unit. Since its operating principle is a well-known technique, a detailed description is omitted.
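Because the machine learning unit is left unspecified, the following sketch swaps in one common technique, a multinomial naive Bayes classifier with Laplace smoothing, purely to illustrate learning a disclosure label from tokenized titles. The function names and the tiny training set are assumptions and are not taken from the patent.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count labels and tokens. `examples` is an iterable of (tokens, label)."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, token_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log-probability for `tokens`."""
    label_counts, token_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_tokens = sum(token_counts[label].values())
        for tok in tokens:
            # Laplace smoothing keeps unseen tokens from zeroing the score.
            score += math.log(
                (token_counts[label][tok] + 1) / (total_tokens + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training titles, already tokenized.
TRAINING = [
    (["annual", "budget", "plan"], "public"),
    (["event", "schedule", "notice"], "public"),
    (["personnel", "disciplinary", "review"], "non-public"),
    (["personnel", "evaluation", "results"], "non-public"),
]
model = train(TRAINING)
```

With this toy data, a title tokenized as `["personnel", "review"]` is assigned "non-public", because both tokens occur only in the non-public examples.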
In addition, the system of the present invention generates and stores pattern data (sample data) as a result of learning by the machine learning unit. The sample data may include the natural language processing result of the learning data.
Then, when data to be classified is acquired, it undergoes natural language processing and is then compared and analyzed against the sample data.
Meanwhile, the system of the present invention computes the matching rate between the natural language processing result of the new data and the sample data. For example, the matching rate between fields of the new data, such as its title information, and the sample data can be compared.
The more similar the patterns of the new data and the sample data, the higher the matching rate.
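One simple way to realize a matching rate of this kind is the Jaccard similarity between token sets, sketched below. The patent does not say which measure is used, so the measure, the function names, and the sample layout are assumptions.

```python
from typing import Iterable, List, Tuple


def matching_rate(tokens_a: Iterable[str], tokens_b: Iterable[str]) -> float:
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(
    new_tokens: List[str], samples: List[Tuple[List[str], str]]
) -> Tuple[str, float]:
    """Return the label and matching rate of the most similar sample."""
    if not samples:
        raise ValueError("no sample data to compare against")
    best_tokens, best_label = max(
        samples, key=lambda s: matching_rate(new_tokens, s[0])
    )
    return best_label, matching_rate(new_tokens, best_tokens)
```

For example, comparing `["personnel", "review", "minutes"]` with a sample `["personnel", "disciplinary", "review"]` gives two shared tokens out of four distinct ones, a matching rate of 0.5.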
Meanwhile, the system of the present invention may further include weight setting means for extracting the disclosure classification pattern data stored in the learning data database, assigning weights to the disclosure classification pattern data, and providing the weights to the disclosure classification management means 100.
The weight can be calculated with a general weighting function or obtained from the degree of proximity to a specific word. The objects to be weighted may be the existing document disclosure list that serves as learning data, the non-disclosure criteria of each public institution, and the electronic file contents.
The purpose of weighting terms is to express the relative value of each index term according to its importance as a subject element of the concepts a document deals with.
That is, by assigning a value (weight) within a certain range to each term representing a concept, the same index term can be shown to differ in importance from document to document.
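TF-IDF is one widely used weighting function with exactly this property: the same term receives a different weight in each document. The sketch below uses it as an assumed example of a "general weighting function"; the patent does not name TF-IDF, and the function name and input layout are chosen here for illustration.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Term frequency in this document times inverse document frequency.
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights
```

Given three short documents in which "budget" appears twice in the first and once in the second, the term's weight is higher in the first document, which mirrors the idea that one index term can matter more in some documents than in others.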
As a result, through the above-described configuration and operation, the existing list of classified records, the detailed criteria for non-disclosure information, and the electronic file content information are used as learning data for machine learning, so that the disclosure classification can be recommended automatically through comparative analysis with the data to be classified. This can improve the quality of the disclosure service and greatly reduce the budget spent on outsourcing manual disclosure reclassification work.
Meanwhile, the method according to various embodiments of the present invention may be stored in a computer-readable recording medium. The computer-readable recording medium may be a ROM, a RAM, CDROMs, magnetic tapes, floppy disks, optical data storage devices, and the like, as well as carrier waves (e.g., transmission over the Internet).
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed embodiments, and that those skilled in the art may make various modifications without departing from the spirit and scope of the present invention.
100: Open classification management means
200: Record production management system
300: request terminal
400: Public record list database
500: Non-disclosure information detailed criteria database for each public institution
Claims (5)
A machine learning-based intelligent document disclosure management system comprising: a disclosure classification management means (100) that acquires basic learning data, including a public record list whose disclosure classification has been completed, from a public record list database (400); acquires detailed criteria information for non-disclosure information of each public institution from a non-disclosure information detailed criteria database (500) for each public institution; generates disclosure classification pattern data using at least one piece of the electronic file content information acquired from a record production management system (200); includes a learning data database (130) that stores the disclosure classification pattern data; acquires data to be classified from a request terminal (300); compares that data with the disclosure classification pattern data to automatically recommend one of public, partially public, and non-public to the request terminal; and at the same time additionally updates the learning data database with the recommended information;
A public record list database (400) storing basic learning data including a public record list whose disclosure classification has been completed;
A non-disclosure information detailed criteria database (500) for each public institution, storing detailed criteria information for non-disclosure information of each public institution;
And a request terminal (300) that provides the data to be classified to the disclosure classification management means and obtains the recommendation of public, partially public, or non-public provided by the disclosure classification management means.
Further comprising weight setting means for extracting the disclosure classification pattern data stored in the learning data database and providing the weight to the disclosure classification management means (100).
Wherein the disclosure classification management means (100) comprises:
An electronic file content information acquisition unit (110) for acquiring electronic file content information in association with the record production management system (200),
A disclosure classification pattern data generation unit (120) that acquires basic learning data, including a public record list whose disclosure classification has been completed, from the public record list database (400), acquires detailed criteria information for non-disclosure information of each public institution from the non-disclosure information detailed criteria database (500) for each public institution, and generates disclosure classification pattern data using at least one piece of the electronic file content information acquired from the record production management system,
A disclosure classification recommendation unit (140) that acquires the data to be classified from the request terminal (300), compares it with the disclosure classification pattern data stored in the learning data database, and automatically recommends one of public, partially public, and non-public,
And an automatic additional learning progress unit (150) that automatically acquires the recommended information and additionally updates the learning data database with it.
Wherein the disclosure classification pattern data includes at least one data field among a unit task name, a process name, a record title, a recommended disclosure classification, a retention period, and a reason for non-disclosure.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150086510A KR101627550B1 (en) | 2015-06-18 | 2015-06-18 | The intelligent disclosure of public records management system based machine learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150086510A KR101627550B1 (en) | 2015-06-18 | 2015-06-18 | The intelligent disclosure of public records management system based machine learning |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101627550B1 (en) | 2016-06-07 |
Family
ID=56193159
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150086510A KR101627550B1 (en) | 2015-06-18 | 2015-06-18 | The intelligent disclosure of public records management system based machine learning |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101627550B1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101887629B1 (en) * | 2018-02-14 | 2018-08-10 | 대신네트웍스 주식회사 | system for classifying and opening information based on natural language |
KR20180124529A (en) | 2017-05-12 | 2018-11-21 | 이세희 | Document digitalization system capable of minimizing private document and method of the same |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20120059935A (en) | 2010-12-01 | 2012-06-11 | 경북대학교 산학협력단 | Text classification device and classification method thereof |
KR20140049680A (en) * | 2012-10-18 | 2014-04-28 | 한국항공대학교산학협력단 | Sentiment classification system using rule-based multi agents |
KR20140069756A (en) * | 2012-11-29 | 2014-06-10 | 대한민국(국가기록원) | System of preserving twitter records |
KR20140080594A (en) * | 2012-12-12 | 2014-07-01 | 한국발명진흥회 | Method for evaluating patents using engine and evaluation server |
KR101500900B1 (en) * | 2014-04-28 | 2015-03-12 | 한양대학교 산학협력단 | Method and System for Classifying Text Using Classifier Produced by Learning Data |
-
2015
- 2015-06-18 KR KR1020150086510A patent/KR101627550B1/en active IP Right Grant
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021098648A1 (en) | Text recommendation method, apparatus and device, and medium | |
US7912816B2 (en) | Adaptive archive data management | |
US20170060894A1 (en) | Systems and methods for management of data platforms | |
TW556085B (en) | File classification management system and method used in operating system | |
US20150356123A1 (en) | Systems and methods for management of data platforms | |
US8954436B2 (en) | Monitoring content repositories, identifying misclassified content objects, and suggesting reclassification | |
EP3270303A1 (en) | An automated monitoring and archiving system and method | |
CN101490675A (en) | Methods and apparatus for reusing data access and presentation elements | |
US20130311517A1 (en) | Representing Incomplete and Uncertain Information in Graph Data | |
Dang et al. | Framework for retrieving relevant contents related to fashion from online social network data | |
CN106227788A (en) | Database query method based on Lucene | |
Gawriljuk et al. | A scalable approach to incrementally building knowledge graphs | |
Gerrard et al. | Digital preservation at Big Data scales: proposing a step-change in preservation system architectures | |
US10146881B2 (en) | Scalable processing of heterogeneous user-generated content | |
Ménard et al. | Faceted classification for museum artefacts: A methodology to support web site development of large cultural organizations | |
KR101627550B1 (en) | The intelligent disclosure of public records management system based machine learning | |
Topçu et al. | Data standardization in digital libraries: An ETD case in Turkey | |
EP3152678B1 (en) | Systems and methods for management of data platforms | |
CN117592450A (en) | Panoramic archive generation method and system based on employee information integration | |
Shepherd et al. | Are ISO 15489‐1: 2001 and ISAD (G) compatible? Part 1 | |
KR101752259B1 (en) | High value-added content management device and method and recording medium storing program for executing the same and recording medium storing program for executing the same | |
McGee et al. | Towards visual analytics of multilayer graphs for digital cultural heritage | |
KR20180006518A (en) | Automatically writing service system for sales material kit | |
Sattar Chaudhry | Assessment of taxonomy building tools | |
Timonin et al. | Research of filtration methods for reference social profile data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant | ||
FPAY | Annual fee payment | Payment date: 20190313; Year of fee payment: 4 |