CN112651710A - Data annotation platform - Google Patents
- Publication number
- CN112651710A
- Authority
- CN
- China
- Prior art keywords
- data
- annotation
- management
- task
- interface
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/103—Workflow collaboration or project management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
Abstract
The invention discloses a data annotation platform. Its user interface module provides a team management interface, which simplifies team management and allows annotation tasks to be completed by teams. A role authority management unit sets whether each team member has the authority to submit the final draft of an annotated task: members without that authority can save their completed work with the task saving function key, while members with it can submit through the final draft submission function key. A project setting unit supports creating projects, creating team projects, and selecting team members, which enables task assignment and standardized project team management and makes it easier for a team to collaborate on annotation tasks. The task submission function key in the annotation task management unit is available only to members who hold task submission authority, so that final submission stays under management control.
Description
Technical Field
The invention relates to a data annotation platform.
Background
With the rapid development of deep learning in recent years, major breakthroughs have been made in computer vision, natural speech processing, and related fields, and the demand for data annotation has grown sharply. Existing annotation tasks are completed by individual users working independently rather than in teams. How to overcome this shortcoming has become an important problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The invention overcomes the shortcomings of the prior art and provides a data annotation platform.
To achieve this purpose, the invention adopts the following technical solution:
A data annotation platform comprises a user management module, a data annotation management module, a data center module, a data acquisition unit, and a user interface module. The user management module is used for user authentication and authority management, the data annotation management module is used for data annotation management, the data center module is used for data storage, and the data acquisition unit is used for acquiring input data. The data acquisition unit is in data communication with the data annotation management module, the data center module, and the user interface module, respectively; the data annotation management module is in data communication with the user management module and the data center module; and the user interface module is in data communication with the user management module. The user interface module comprises an annotation workbench interface, a project task interface, a data source management interface, a team management interface, and a basic setting interface. The data annotation management module comprises a task team member management unit and a role authority management unit, which are displayed in the team management interface; the role authority management unit is used for setting whether a team member has the authority to submit the final draft of an annotated task. The data annotation management module further comprises a task saving function key and a final draft submission function key, which are displayed in the annotation workbench interface, and a project setting unit and an annotation task management unit, which are displayed in the project task interface. The final draft submission function key is grayed out or hidden for users without final draft submission authority. The project setting unit provides functions for creating projects, creating team projects, and selecting team members, and each project can be provided with a plurality of tasks. The annotation task management unit provides a task management list and a task submission function key for submitting tasks; the task submission function key is grayed out or hidden for users without task submission authority.
Preferably, a data set is provided in the data center module, and the data annotation management module includes a data set management unit that corresponds to the data set and is operated and displayed in the data source management interface.
Preferably, a corpus and a graph database are further provided in the data center module; the data annotation management module includes a data acquisition unit that is operated and displayed in the data source management interface, and the data acquisition unit provides an acquisition source list and a function key for adding an acquisition source.
Preferably, the data annotation management module includes an annotation scheme template management unit and an auxiliary tool setting unit, both operated and displayed in the basic setting interface; the annotation scheme template management unit provides a list of out-of-the-box annotation scheme templates and a function key for customizing an annotation scheme.
Preferably, the data annotation management module further includes a precedent reference list, an entity label list, a shortcut key list, a relationship label list, and an annotation status display unit, which are operated and displayed in the annotation workbench interface, and an annotation guide setting unit and an annotation scheme setting unit, which are operated and displayed in the project task interface; the annotation status display unit displays the precedent reference list, the entity label list, the shortcut key list, the relationship label list, and the annotation data in the annotation workbench interface.
Preferably, the auxiliary tool setting unit includes an entity extraction tool setting and an automatic word segmentation tool setting. The entity extraction tool setting determines whether magnetic labeling is performed on entities in the extracted text during data annotation, and the automatic word segmentation tool setting determines whether the annotated text is automatically segmented into words during data annotation.
Compared with the prior art, the invention has the beneficial effects that:
The user interface module of the data annotation platform provides a team management interface, which simplifies team management and allows annotation tasks to be completed by teams. The role authority management unit sets whether each team member has the authority to submit the final draft of an annotated task: members without that authority can save their completed work with the task saving function key, while members with it can submit through the final draft submission function key. The project setting unit supports creating projects, creating team projects, and selecting team members, which enables task assignment and standardized project team management and makes it easier for a team to collaborate on annotation tasks. The task submission function key in the annotation task management unit is available only to members who hold task submission authority, so that final submission stays under management control.
Drawings
Fig. 1 is a schematic structural diagram of the present disclosure.
Fig. 2 is a diagram of some of the key units and function keys of the data annotation management module as presented in the user interface module.
Detailed Description
The features of the present invention and other related features are further described in detail below by way of examples to facilitate understanding by those skilled in the art:
As shown in Figs. 1 and 2, a data annotation platform comprises a user management module, a data annotation management module, a data center module, a data acquisition unit, and a user interface module. The user management module is used for user authentication and authority management, the data annotation management module is used for data annotation management, the data center module is used for data storage, and the data acquisition unit is used for acquiring input data. The data acquisition unit is in data communication with the data annotation management module, the data center module, and the user interface module, respectively; the data annotation management module is in data communication with the user management module and the data center module; and the user interface module is in data communication with the user management module. The user interface module comprises an annotation workbench interface, a project task interface, a data source management interface, a team management interface, and a basic setting interface. The data annotation management module comprises a task team member management unit and a role authority management unit, which are operated and displayed in the team management interface; the role authority management unit is used for setting whether a team member has the authority to submit the final draft of an annotated task. The data annotation management module further comprises a task saving function key and a final draft submission function key, which are operated and displayed in the annotation workbench interface, and a project setting unit and an annotation task management unit, which are operated and displayed in the project task interface. The final draft submission function key is grayed out or hidden for users without final draft submission authority. The project setting unit provides functions for creating projects, creating team projects, and selecting team members, and each project can be provided with a plurality of tasks. The annotation task management unit provides a task management list and a task submission function key for submitting tasks; the task submission function key is grayed out or hidden for users without task submission authority.
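The permission rule described above, in which a function key is enabled, grayed out, or hidden depending on the member's authority, can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the names `Member`, `KeyState`, and `function_key_states`, and the `hide_when_denied` switch, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class KeyState(Enum):
    ENABLED = "enabled"
    GRAYED_OUT = "grayed_out"
    HIDDEN = "hidden"


@dataclass(frozen=True)
class Member:
    """A team member with the two authorities set by the role authority management unit."""
    name: str
    can_submit_final_draft: bool = False
    can_submit_task: bool = False


def function_key_states(member: Member, hide_when_denied: bool = False) -> Dict[str, KeyState]:
    """Return the display state of each function key for one member.

    Saving is always allowed. A submission key without the matching authority is
    grayed out by default, or hidden when hide_when_denied is True (an assumed option).
    """
    denied = KeyState.HIDDEN if hide_when_denied else KeyState.GRAYED_OUT
    return {
        "save_task": KeyState.ENABLED,
        "submit_final_draft": KeyState.ENABLED if member.can_submit_final_draft else denied,
        "submit_task": KeyState.ENABLED if member.can_submit_task else denied,
    }
```

In this sketch an annotator without authority still sees an enabled save key, which matches the statement that such members store their completed work through the task saving function key.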
As described above, the user interface module of the data annotation platform of the present disclosure provides a team management interface, which simplifies team management and allows annotation tasks to be completed by teams. The role authority management unit sets whether each team member has the authority to submit the final draft of an annotated task: members without that authority can save their completed work with the task saving function key, while members with it can submit through the final draft submission function key. The project setting unit supports creating projects, creating team projects, and selecting team members, which enables task assignment and standardized project team management and makes it easier for a team to collaborate on annotation tasks. The task submission function key in the annotation task management unit is available only to members who hold task submission authority, so that final submission stays under management control.
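The project setting functions named above (creating a project, selecting team members, holding several tasks per project, and assigning tasks) might be modeled roughly as below. The class and method names (`Project`, `Task`, `add_task`, `assign`) are hypothetical, and the rule that a task can only be assigned to a selected member is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    title: str
    assignee: Optional[str] = None  # no one is assigned until assign() is called


@dataclass
class Project:
    """A team project: selected members plus any number of annotation tasks."""
    name: str
    members: List[str] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)

    def add_task(self, title: str) -> Task:
        task = Task(title)
        self.tasks.append(task)
        return task

    def assign(self, task: Task, member: str) -> None:
        # Assumed rule: only members selected for the project can receive its tasks.
        if member not in self.members:
            raise ValueError(f"{member!r} is not a member of project {self.name!r}")
        task.assignee = member
```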
As described above, a data set is provided in the data center module, and the data annotation management module includes a data set management unit that corresponds to the data set and is operated and displayed in the data source management interface.
As described above, in specific implementation, a corpus and a graph database are further provided in the data center module; the data annotation management module includes a data acquisition unit that is operated and displayed in the data source management interface, and the data acquisition unit provides an acquisition source list and a function key for adding an acquisition source.
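As a rough sketch of the acquisition source list and the add-source function key, the unit can be thought of as a registry of named sources. The field names (`kind`, `location`), the example values in the comments, and the uniqueness check on source names are assumptions introduced here; the disclosure does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class AcquisitionSource:
    name: str
    kind: str       # illustrative values only, e.g. "file" or "database"
    location: str   # hypothetical path or address of the source


@dataclass
class DataAcquisitionUnit:
    """Holds the acquisition source list shown in the data source management interface."""
    sources: List[AcquisitionSource] = field(default_factory=list)

    def add_source(self, source: AcquisitionSource) -> None:
        # Backs the "add acquisition source" function key; names are kept unique (assumed).
        if any(existing.name == source.name for existing in self.sources):
            raise ValueError(f"acquisition source {source.name!r} already exists")
        self.sources.append(source)

    def list_sources(self) -> List[str]:
        return [source.name for source in self.sources]
```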
As described above, in specific implementation, the data annotation management module includes an annotation scheme template management unit and an auxiliary tool setting unit, both operated and displayed in the basic setting interface. The annotation scheme template management unit provides a list of out-of-the-box annotation scheme templates and a function key for customizing an annotation scheme, which makes the platform more convenient to use.
As described above, in specific implementation, the data annotation management module further includes a precedent reference list, an entity label list, a shortcut key list, a relationship label list, and an annotation status display unit, which are operated and displayed in the annotation workbench interface, and an annotation guide setting unit and an annotation scheme setting unit, which are operated and displayed in the project task interface. The annotation status display unit displays the precedent reference list, the entity label list, the shortcut key list, the relationship label list, and the annotation data in the annotation workbench interface.
As described above, before data annotation begins, the team first defines an annotation scheme: it can add an annotation scheme template, set the label set, and define a shortcut key for each label, which helps the annotation work proceed more smoothly.
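The preparation steps above (start from a template, set the label set, bind a shortcut key to each label) can be sketched as a small data structure. The names `AnnotationScheme`, `bind_shortcut`, and `from_template` are hypothetical, and the validation rules (labels must exist, one label per key) are assumptions chosen for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AnnotationScheme:
    name: str
    entity_labels: List[str] = field(default_factory=list)
    relation_labels: List[str] = field(default_factory=list)
    shortcuts: Dict[str, str] = field(default_factory=dict)  # shortcut key -> label

    def bind_shortcut(self, key: str, label: str) -> None:
        if label not in self.entity_labels and label not in self.relation_labels:
            raise ValueError(f"unknown label: {label!r}")
        if key in self.shortcuts:
            raise ValueError(f"shortcut {key!r} is already bound to {self.shortcuts[key]!r}")
        self.shortcuts[key] = label


def from_template(template: AnnotationScheme, name: str) -> AnnotationScheme:
    """Copy a template so a team can customize it without changing the original."""
    return AnnotationScheme(
        name=name,
        entity_labels=list(template.entity_labels),
        relation_labels=list(template.relation_labels),
        shortcuts=dict(template.shortcuts),
    )
```

Copying the lists and the dictionary keeps the out-of-the-box template unchanged when a team adds its own labels or shortcuts.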
As described above, in specific implementation, the auxiliary tool setting unit includes an entity extraction tool setting and an automatic word segmentation tool setting. The entity extraction tool setting determines whether magnetic labeling is performed on entities in the extracted text during data annotation, and the automatic word segmentation tool setting determines whether the annotated text is automatically segmented into words during data annotation. This allows AI-based automatic word segmentation and AI-based automatic annotation to take over part of the workload; manual quality control and correction are then performed, completing human-machine collaborative annotation.
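The human-machine workflow above (automatic pre-annotation controlled by two settings, followed by manual correction) can be illustrated with a toy pipeline. The regular-expression tokenizer and the lexicon lookup are simple stand-ins for the AI segmentation and entity extraction tools, which the disclosure does not specify; all names (`AuxiliaryToolSettings`, `Span`, `pre_annotate`, `review`) are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class AuxiliaryToolSettings:
    entity_extraction: bool = True
    auto_word_segmentation: bool = True


@dataclass
class Span:
    start: int
    end: int
    text: str
    label: str
    source: str = "auto"  # stays "auto" until a human confirms or corrects it


def segment(text: str) -> List[str]:
    # Stand-in for an automatic word segmentation tool: split on non-word characters.
    return re.findall(r"\w+", text)


def pre_annotate(text: str, lexicon: Dict[str, str],
                 settings: AuxiliaryToolSettings) -> Tuple[List[str], List[Span]]:
    """Run whichever auxiliary tools are switched on and return tokens and candidate spans."""
    tokens = segment(text) if settings.auto_word_segmentation else []
    spans: List[Span] = []
    if settings.entity_extraction:
        # Stand-in for an entity extraction tool: exact lookup of lexicon terms.
        for term, label in lexicon.items():
            for match in re.finditer(re.escape(term), text):
                spans.append(Span(match.start(), match.end(), match.group(), label))
    spans.sort(key=lambda span: span.start)
    return tokens, spans


def review(span: Span, accepted_label: str) -> Span:
    """Manual quality control: a human confirms or corrects the suggested label."""
    return Span(span.start, span.end, span.text, accepted_label, source="human")
```

Keeping a `source` field on each span is one way to tell machine suggestions apart from labels a person has already checked.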
As described above, in specific implementation, high-quality annotation of the data of a content operation platform can improve the level and efficiency of public services. In the e-commerce industry, annotating commodity data, searched commodity content, sentence context, and similar data allows an intelligent recommendation system to build an accurate user profile, recommend commodities that better match the user's interests, and effectively raise the conversion rate. In an adverse drug reaction early-warning system, annotating the text of drug indications provides a high-quality data set for training an adverse reaction prediction algorithm and improves the accuracy of the algorithm.
As described above, the present disclosure is directed to a data annotation platform, and all technical solutions that are the same as or similar to the present disclosure should be considered as falling within the scope of the present disclosure.
Claims (6)
1. A data annotation platform, characterized by comprising a user management module, a data annotation management module, a data center module, a data acquisition unit, and a user interface module, wherein the user management module is used for user authentication and authority management, the data annotation management module is used for data annotation management, the data center module is used for data storage, and the data acquisition unit is used for acquiring input data; the data acquisition unit is in data communication with the data annotation management module, the data center module, and the user interface module, respectively, the data annotation management module is in data communication with the user management module and the data center module, and the user interface module is in data communication with the user management module; the user interface module comprises an annotation workbench interface, a project task interface, a data source management interface, a team management interface, and a basic setting interface; the data annotation management module comprises a task team member management unit and a role authority management unit that are operated and displayed in the team management interface, the role authority management unit being used for setting whether a team member has the authority to submit the final draft of an annotated task; the data annotation management module further comprises a task saving function key and a final draft submission function key that are operated and displayed in the annotation workbench interface, and a project setting unit and an annotation task management unit that are operated and displayed in the project task interface; the final draft submission function key is grayed out or hidden for users without final draft submission authority; the project setting unit provides functions for creating projects, creating team projects, and selecting team members, and each project can be provided with a plurality of tasks; and the annotation task management unit provides a task management list and a task submission function key for submitting tasks, the task submission function key being grayed out or hidden for users without task submission authority.
2. The data annotation platform of claim 1, wherein the data center module comprises a data set, and the data annotation management module comprises a data set management unit corresponding to the data set and configured to operate and display in the data source management interface.
3. The data annotation platform according to claim 2, wherein a corpus database and a graph database are further disposed in the data center module, the data annotation management module includes a data acquisition unit for operating and displaying in the data source management interface, and the data acquisition unit is provided with a list of acquisition sources and function keys for adding acquisition sources.
4. The data annotation platform of any one of claims 1 to 3, wherein the data annotation management module comprises an annotation scheme template management unit and an auxiliary tool setting unit that are operated and displayed in the basic setting interface, and the annotation scheme template management unit is provided with a list of out-of-the-box annotation scheme templates and a function key for customizing an annotation scheme.
5. The data annotation platform of claim 4, wherein the data annotation management module further comprises a precedent reference list, an entity label list, a shortcut key list, a relationship label list, and an annotation status display unit for displaying in the annotation workbench interface, and an annotation guide setting unit and an annotation scheme setting unit for displaying in the project task interface, wherein the annotation status display unit displays the precedent reference list, the entity label list, the shortcut key list, the relationship label list, and the annotation data in the annotation workbench interface.
6. The data annotation platform of claim 4, wherein the auxiliary tool setting unit comprises an entity extraction tool setting and an automatic word segmentation tool setting, the entity extraction tool setting is a setting for determining whether to magnetically label the entity in the extracted text during data annotation, and the automatic word segmentation tool setting is a setting for determining whether to automatically segment the annotated text during data annotation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011520984.8A CN112651710A (en) | 2020-12-21 | 2020-12-21 | Data annotation platform |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112651710A (en) | 2021-04-13 |
Family
ID=75358691
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202011520984.8A | Pending | CN112651710A (en) | 2020-12-21 | 2020-12-21 | Data annotation platform |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112651710A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030187821A1 (en) * | 2002-03-29 | 2003-10-02 | Todd Cotton | Enterprise framework and applications supporting meta-data and data traceability requirements |
CN110717317A (en) * | 2019-09-12 | 2020-01-21 | 中国科学院自动化研究所 | On-line artificial Chinese text marking system |
CN111159494A (en) * | 2019-12-30 | 2020-05-15 | 北京航天云路有限公司 | Multi-user concurrent processing data labeling method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||