CN112651710A - Data annotation platform - Google Patents

Data annotation platform

Info

Publication number
CN112651710A
Authority
CN
China
Prior art keywords
data
annotation
management
task
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011520984.8A
Other languages
Chinese (zh)
Inventor
黄国骏
朱晓杰
黄永健
张青会
周石龙
王勇泽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Xuanyuan Network & Technology Co ltd
Original Assignee
Guangdong Xuanyuan Network & Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Xuanyuan Network & Technology Co ltd filed Critical Guangdong Xuanyuan Network & Technology Co ltd
Priority to CN202011520984.8A priority Critical patent/CN112651710A/en
Publication of CN112651710A publication Critical patent/CN112651710A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/10Office automation; Time management
    • G06Q10/103Workflow collaboration or project management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data annotation platform. A user interface module of the platform is provided with a team management interface, which facilitates team management and allows annotation tasks to be completed by teams. A role authority management unit sets whether a team member has the authority to submit the final draft of an annotated task: members without that authority can save a completed task through the task saving function key, and members with the authority can submit it through the final draft submission function key. A project setting unit is used for creating projects, creating team projects and selecting team members, so that task assignment is supported, standard project team management is provided, and a team can collaborate on annotation tasks more easily. A task submission function key in the annotation task management unit is made available according to whether a team member has task submission authority, so that only members with the authority can finally submit a task, which facilitates better management control.

Description

Data annotation platform
Technical Field
The invention relates to a data annotation platform.
Background
With the development of deep learning in recent years, major breakthroughs have been made in computer vision, natural language processing and other fields, and the demand for data annotation has grown greatly along with the rapid development of deep learning methods. Existing annotation tasks are all completed by individual users working independently, without a team mode. How to overcome this deficiency has therefore become an important problem to be solved urgently by those skilled in the art.
Disclosure of Invention
The invention overcomes the above deficiency of the prior art and provides a data annotation platform.
To achieve this purpose, the invention adopts the following technical solution:
A data annotation platform comprises a user management module, a data annotation management module, a data center module, a data acquisition unit and a user interface module. The user management module is used for user authentication and authority management, the data annotation management module is used for data annotation management, the data center module is used for data storage, and the data acquisition unit is used for acquiring input data. The data acquisition unit is in data communication with the data annotation management module, the data center module and the user interface module respectively; the data annotation management module is in data communication with the user management module and the data center module; and the user interface module is in data communication with the user management module. The user interface module comprises an annotation workbench interface, a project task interface, a data source management interface, a team management interface and a basic setting interface. The data annotation management module comprises a task team member management unit and a role authority management unit, both operated and displayed in the team management interface; the role authority management unit sets whether a team member has the authority to submit the final draft of an annotated task. The data annotation management module further comprises a task saving function key and a final draft submission function key, operated and displayed in the annotation workbench interface, and a project setting unit and an annotation task management unit, operated and displayed in the project task interface. For a user without final draft submission authority, the final draft submission function key is grayed out or hidden. The project setting unit provides the functions of creating projects, creating team projects and selecting team members, and each project can contain a plurality of tasks. The annotation task management unit is provided with a task management list and a task submission function key for submitting tasks; for a user without task submission authority, the task submission function key is grayed out or hidden.
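As a non-limiting illustration, the gating of the function keys described above might be sketched in Python as follows. The names `KeyState`, `TeamMember` and `workbench_keys`, and the `hide_when_denied` switch, are hypothetical and do not appear in the disclosure; the sketch only assumes that every member may save a task while final draft submission depends on an authority flag set through the role authority management unit.

```python
from dataclasses import dataclass
from enum import Enum


class KeyState(Enum):
    ENABLED = "enabled"
    GRAYED_OUT = "grayed_out"
    HIDDEN = "hidden"


@dataclass(frozen=True)
class TeamMember:
    name: str
    # Assumed to be set through the role authority management unit.
    can_submit_final_draft: bool = False


def workbench_keys(member: TeamMember, hide_when_denied: bool = False) -> dict:
    """Return the state of each function key shown in the annotation workbench."""
    # A key the member may not use is either grayed out or hidden.
    denied = KeyState.HIDDEN if hide_when_denied else KeyState.GRAYED_OUT
    return {
        # Every member may save a completed task.
        "save_task": KeyState.ENABLED,
        # Only members holding the authority may submit the final draft.
        "submit_final_draft": (
            KeyState.ENABLED if member.can_submit_final_draft else denied
        ),
    }
```

The same check could be applied to the task submission function key in the project task interface; whether a denied key is grayed out or hidden is left as a configuration choice in this sketch.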
Preferably, a data set is arranged in the data center module, and the data annotation management module includes a data set management unit that corresponds to the data set and is operated and displayed in the data source management interface.
Preferably, a corpus and a graph database are further arranged in the data center module, the data annotation management module comprises a data acquisition unit operated and displayed in the data source management interface, and the data acquisition unit is provided with an acquisition source list and a function key for adding an acquisition source.
Preferably, the data annotation management module comprises an annotation scheme template management unit and an auxiliary tool setting unit, both operated and displayed in the basic setting interface, and the annotation scheme template management unit is provided with a list of out-of-the-box annotation scheme templates and a function key for customizing an annotation scheme.
Preferably, the data annotation management module further comprises a precedent reference list, an entity label list, a shortcut key list, a relationship label list and an annotation status display unit, which are operated and displayed in the annotation workbench interface, and an annotation guide setting unit and an annotation scheme setting unit, which are operated and displayed in the project task interface, wherein the annotation status display unit displays the precedent reference list, the entity label list, the shortcut key list, the relationship label list and the annotation data in the annotation workbench interface.
Preferably, the auxiliary tool setting unit comprises an entity extraction tool setting and an automatic word segmentation tool setting; the entity extraction tool setting determines whether entities in the extracted text are magnetically labeled during data annotation, and the automatic word segmentation tool setting determines whether the annotated text is automatically segmented into words during data annotation.
Compared with the prior art, the invention has the beneficial effects that:
The user interface module of the data annotation platform is provided with a team management interface, which facilitates team management and allows annotation tasks to be completed by teams. The role authority management unit sets whether a team member has the authority to submit the final draft of an annotated task: members without that authority can save a completed task through the task saving function key, and members with the authority can submit it through the final draft submission function key. The project setting unit is used for creating projects, creating team projects and selecting team members, so that task assignment is supported, standard project team management is provided, and a team can collaborate on annotation tasks more easily. The task submission function key in the annotation task management unit is made available according to whether a team member has task submission authority, so that only members with the authority can finally submit a task, which facilitates better management control.
Drawings
FIG. 1 is a schematic structural diagram of the present disclosure.
FIG. 2 is a diagram of some key elements and function keys of the data annotation management module as shown in the user interface module.
Detailed Description
The features of the present invention and other related features are further described in detail below by way of examples to facilitate understanding by those skilled in the art:
As shown in FIGS. 1 and 2, a data annotation platform comprises a user management module, a data annotation management module, a data center module, a data acquisition unit and a user interface module. The user management module is used for user authentication and authority management, the data annotation management module is used for data annotation management, the data center module is used for data storage, and the data acquisition unit is used for acquiring input data. The data acquisition unit is in data communication with the data annotation management module, the data center module and the user interface module respectively; the data annotation management module is in data communication with the user management module and the data center module; and the user interface module is in data communication with the user management module. The user interface module comprises an annotation workbench interface, a project task interface, a data source management interface, a team management interface and a basic setting interface. The data annotation management module comprises a task team member management unit and a role authority management unit, both operated and displayed in the team management interface; the role authority management unit sets whether a team member has the authority to submit the final draft of an annotated task. The data annotation management module further comprises a task saving function key and a final draft submission function key, operated and displayed in the annotation workbench interface, and a project setting unit and an annotation task management unit, operated and displayed in the project task interface. For a user without final draft submission authority, the final draft submission function key is grayed out or hidden. The project setting unit provides the functions of creating projects, creating team projects and selecting team members, and each project can contain a plurality of tasks. The annotation task management unit is provided with a task management list and a task submission function key for submitting tasks; for a user without task submission authority, the task submission function key is grayed out or hidden.
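By way of example only, the relationship between projects, selected team members, the task management list and task submission authority might be modeled as in the following Python sketch. The class and attribute names (`Project`, `Task`, `submitters`) are hypothetical, and the sketch assumes, beyond what the disclosure states, that a task is assigned to exactly one selected member.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    assignee: str
    submitted: bool = False


@dataclass
class Project:
    name: str
    members: set = field(default_factory=set)     # team members selected for the project
    submitters: set = field(default_factory=set)  # members holding task submission authority
    tasks: list = field(default_factory=list)     # the task management list

    def add_task(self, title: str, assignee: str) -> Task:
        """Create a task in the project; each project may hold several tasks."""
        if assignee not in self.members:
            raise ValueError(f"{assignee} is not a member of {self.name}")
        task = Task(title, assignee)
        self.tasks.append(task)
        return task

    def submit(self, task: Task, member: str) -> None:
        """Submit a task; refused for members without submission authority."""
        if member not in self.submitters:
            raise PermissionError(f"{member} lacks task submission authority")
        task.submitted = True
```

In this sketch the server-side check in `submit` mirrors the grayed-out or hidden key in the interface, so that authority is enforced even if the key were somehow triggered.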
As described above, the user interface module of the annotation platform of the present disclosure is provided with a team management interface, which facilitates team management and allows annotation tasks to be completed by teams. The role authority management unit sets whether a team member has the authority to submit the final draft of an annotated task: members without that authority can save a completed task through the task saving function key, and members with the authority can submit it through the final draft submission function key. The project setting unit is used for creating projects, creating team projects and selecting team members, so that task assignment is supported, standard project team management is provided, and a team can collaborate on annotation tasks more easily. The task submission function key in the annotation task management unit is made available according to whether a team member has task submission authority, so that only members with the authority can finally submit a task, which facilitates better management control.
As described above, the data center module is provided with the data set, and the data annotation management module includes the data set management unit corresponding to the data set and used for operating and displaying in the data source management interface.
As described above, in specific implementation, the data center module is further provided with a corpus and a graph database, the data annotation management module includes a data acquisition unit operated and displayed in the data source management interface, and the data acquisition unit is provided with an acquisition source list and a function key for adding an acquisition source.
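As one possible illustration, the acquisition source list and the function key for adding an acquisition source might be backed by a structure such as the following Python sketch. The names `AcquisitionSource` and `DataAcquisitionUnit`, the `kind` categories and the rule that source names are unique are all assumptions made for the example, not features stated in the disclosure.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AcquisitionSource:
    name: str
    kind: str      # illustrative categories only, e.g. "file", "database", "web"
    location: str  # path, connection string or address of the source


@dataclass
class DataAcquisitionUnit:
    sources: list = field(default_factory=list)  # the acquisition source list

    def add_source(self, source: AcquisitionSource) -> None:
        """Handler behind the 'add acquisition source' function key."""
        if any(existing.name == source.name for existing in self.sources):
            raise ValueError(f"acquisition source {source.name!r} already exists")
        self.sources.append(source)

    def list_sources(self) -> list:
        """Names shown in the acquisition source list, in insertion order."""
        return [source.name for source in self.sources]
```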
As described above, in specific implementation, the data annotation management module includes an annotation scheme template management unit and an auxiliary tool setting unit, both operated and displayed in the basic setting interface, and the annotation scheme template management unit is provided with a list of out-of-the-box annotation scheme templates and a function key for customizing an annotation scheme, which makes the platform more convenient to use.
As described above, in specific implementation, the data annotation management module further includes a precedent reference list, an entity tag list, a shortcut key list, a relationship tag list, and an annotation status display unit for performing operation and display in the annotation workbench interface, and an annotation guide setting unit and an annotation scheme setting unit for performing operation and display in the project task interface, where the annotation status display unit displays the precedent reference list, the entity tag list, the shortcut key list, the relationship tag list, and the annotation data in the annotation workbench interface.
As described above, before data annotation a team first formulates an annotation scheme: it can add an annotation scheme template, set a label set and define a shortcut key for each label, which facilitates the annotation work.
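The preparation step above might, for instance, be represented as in the following Python sketch, in which a team copies a template, extends the label set and binds one shortcut key to each label. `AnnotationScheme`, `define_label` and `from_template` are hypothetical names, and the rule that a shortcut key or label may be defined only once is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class AnnotationScheme:
    name: str
    shortcuts: dict = field(default_factory=dict)  # shortcut key -> label

    def define_label(self, label: str, shortcut: str) -> None:
        """Add a label to the label set and bind its shortcut key."""
        if shortcut in self.shortcuts:
            raise ValueError(
                f"shortcut {shortcut!r} is already bound to {self.shortcuts[shortcut]!r}"
            )
        if label in self.shortcuts.values():
            raise ValueError(f"label {label!r} is already defined")
        self.shortcuts[shortcut] = label

    @property
    def label_set(self) -> set:
        return set(self.shortcuts.values())


def from_template(template: AnnotationScheme, name: str) -> AnnotationScheme:
    """Copy a template so a team can customize it without changing the original."""
    return AnnotationScheme(name=name, shortcuts=dict(template.shortcuts))
```

Copying the shortcut mapping, rather than sharing it, keeps an out-of-the-box template unchanged when a team customizes its own scheme.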
As described above, in specific implementation, the auxiliary tool setting unit comprises an entity extraction tool setting and an automatic word segmentation tool setting. The entity extraction tool setting determines whether entities in the extracted text are magnetically labeled during data annotation, and the automatic word segmentation tool setting determines whether the annotated text is automatically segmented into words during data annotation. This allows AI-based automatic word segmentation and automatic annotation to be performed during work, reducing part of the workload; manual quality inspection and correction are then carried out, completing human-machine collaborative annotation.
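A minimal Python sketch of this human-machine workflow is given below, under loudly stated assumptions: whitespace splitting stands in for the automatic word segmentation tool, a small lookup table (`lexicon`) stands in for the entity extraction tool, and every machine-produced entity is flagged for manual review. None of these stand-ins, nor the names `AuxiliaryToolSettings` and `pre_annotate`, come from the disclosure, which does not specify the underlying models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuxiliaryToolSettings:
    entity_extraction: bool = True
    auto_word_segmentation: bool = True


def pre_annotate(text: str, settings: AuxiliaryToolSettings, lexicon: dict) -> dict:
    """Produce machine pre-annotations that a human annotator then checks."""
    # Stand-in segmenter: whitespace split. A real tool for Chinese text
    # would use a dedicated word segmentation model instead.
    tokens = text.split() if settings.auto_word_segmentation else [text]

    entities = []
    if settings.entity_extraction:
        # Stand-in extractor: mark the first occurrence of each lexicon entry.
        for surface, label in lexicon.items():
            start = text.find(surface)
            if start != -1:
                entities.append({
                    "start": start,
                    "end": start + len(surface),
                    "label": label,
                    "needs_review": True,  # pending manual quality inspection
                })
    return {"tokens": tokens, "entities": entities}
```

With both settings switched off, the function returns the text as a single token and no entities, leaving the annotation entirely to the human annotator.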
As described above, in the specific implementation, high-quality labeling is performed on the data of the content operation platform, so that the public service level and efficiency can be improved. Data annotation is carried out on commodity data, searched commodity contents, sentence contexts and the like of the e-commerce industry, and an accurate user portrait can be established through an intelligent recommendation system, so that commodities which are more in line with interests of the user are recommended for the user, and the conversion rate is effectively improved. In the adverse drug reaction early warning system, the text content of the drug indications is labeled, so that a high-quality data set is provided for the training of an adverse reaction prediction algorithm, and the accuracy of the algorithm is improved.
As described above, the present disclosure is directed to a data annotation platform, and all technical solutions that are the same as or similar to the present disclosure should be considered as falling within the scope of the present disclosure.

Claims (6)

1. A data annotation platform, characterized by comprising a user management module, a data annotation management module, a data center module, a data acquisition unit and a user interface module, wherein the user management module is used for user authentication and authority management, the data annotation management module is used for data annotation management, the data center module is used for data storage, and the data acquisition unit is used for acquiring input data; the data acquisition unit is in data communication with the data annotation management module, the data center module and the user interface module respectively, the data annotation management module is in data communication with the user management module and the data center module, and the user interface module is in data communication with the user management module; the user interface module comprises an annotation workbench interface, a project task interface, a data source management interface, a team management interface and a basic setting interface; the data annotation management module comprises a task team member management unit and a role authority management unit which are operated and displayed in the team management interface, the role authority management unit being used for setting whether a team member has the authority to submit the final draft of an annotated task; the data annotation management module further comprises a task saving function key and a final draft submission function key which are operated and displayed in the annotation workbench interface, and a project setting unit and an annotation task management unit which are operated and displayed in the project task interface; the final draft submission function key of a user without final draft submission authority is grayed out or hidden; the project setting unit provides the functions of creating projects, creating team projects and selecting team members, and each project can be provided with a plurality of tasks; and the annotation task management unit is provided with a task management list and a task submission function key for submitting tasks, the task submission function key of a user without task submission authority being grayed out or hidden.
2. The data annotation platform of claim 1, wherein the data center module comprises a data set, and the data annotation management module comprises a data set management unit corresponding to the data set and configured to operate and display in the data source management interface.
3. The data annotation platform according to claim 2, wherein a corpus database and a graph database are further disposed in the data center module, the data annotation management module includes a data acquisition unit for operating and displaying in the data source management interface, and the data acquisition unit is provided with a list of acquisition sources and function keys for adding acquisition sources.
4. The data annotation platform of any one of claims 1 to 3, wherein the data annotation management module comprises an annotation scheme template management unit and an auxiliary tool setting unit for operating and displaying in the basic setting interface, and the annotation scheme template management unit is provided with an out-of-box annotation scheme template list and a function key with a custom annotation scheme.
5. The data annotation platform of claim 4, wherein the data annotation management module further comprises a precedent reference list, an entity label list, a shortcut key list, a relationship label list, and an annotation status display unit for displaying in the annotation workbench interface, and an annotation guide setting unit and an annotation scheme setting unit for displaying in the project task interface, wherein the annotation status display unit displays the precedent reference list, the entity label list, the shortcut key list, the relationship label list, and the annotation data in the annotation workbench interface.
6. The data annotation platform of claim 4, wherein the auxiliary tool setting unit comprises an entity extraction tool setting and an automatic word segmentation tool setting, the entity extraction tool setting is a setting for determining whether to magnetically label the entity in the extracted text during data annotation, and the automatic word segmentation tool setting is a setting for determining whether to automatically segment the annotated text during data annotation.
CN202011520984.8A 2020-12-21 2020-12-21 Data annotation platform Pending CN112651710A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011520984.8A CN112651710A (en) 2020-12-21 2020-12-21 Data annotation platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011520984.8A CN112651710A (en) 2020-12-21 2020-12-21 Data annotation platform

Publications (1)

Publication Number Publication Date
CN112651710A true CN112651710A (en) 2021-04-13

Family

ID=75358691

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011520984.8A Pending CN112651710A (en) 2020-12-21 2020-12-21 Data annotation platform

Country Status (1)

Country Link
CN (1) CN112651710A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030187821A1 (en) * 2002-03-29 2003-10-02 Todd Cotton Enterprise framework and applications supporting meta-data and data traceability requirements
CN110717317A (en) * 2019-09-12 2020-01-21 中国科学院自动化研究所 On-line artificial Chinese text marking system
CN111159494A (en) * 2019-12-30 2020-05-15 北京航天云路有限公司 Multi-user concurrent processing data labeling method

Similar Documents

Publication Publication Date Title
CN104899304B (en) Name entity recognition method and device
CN106777275B (en) Entity attribute and property value extracting method based on more granularity semantic chunks
CN113220836B (en) Training method and device for sequence annotation model, electronic equipment and storage medium
CN110705265A (en) Contract clause risk identification method and device
CN111506696A (en) Information extraction method and device based on small number of training samples
CN112100422A (en) Engineering drawing processing method, device, equipment and storage medium
CN112836018A (en) Method and device for processing emergency plan
CN112364145A (en) Work order processing method and device, electronic equipment and storage medium
CN116245177A (en) Geographic environment knowledge graph automatic construction method and system and readable storage medium
CN108491384A (en) A kind of auxiliary writing system of patent application document
CN112651710A (en) Data annotation platform
CN112182157A (en) Training method of online sequence labeling model, online labeling method and related equipment
CN115113919B (en) Software scale measurement intelligent informatization system based on BERT model and Web technology
CN116090560A (en) Knowledge graph establishment method, device and system based on teaching materials
CN115730603A (en) Information extraction method, device, equipment and storage medium based on artificial intelligence
CN112052652B (en) Automatic generation method and device for electronic courseware script
CN112488593B (en) Auxiliary bid evaluation system and method for bidding
CN114996494A (en) Image processing method, image processing device, electronic equipment and storage medium
CN111612437B (en) Audit operation guiding method and device
CN114860901A (en) Knowledge graph construction method based on ancient book information and question and answer system
CN110837735B (en) Intelligent data analysis and identification method and system
CN115248970A (en) Parameterized drawing frame construction method for AutoCAD
CN117332761B (en) PDF document intelligent identification marking system
CN113487698B (en) Form generation method and device based on two-channel neural network model
CN101350074A (en) Method for comparing and recording product number simultaneously

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination