CN110308931B

CN110308931B - Data processing method and related device

Info

Publication number: CN110308931B
Application number: CN201910537998.1A
Authority: CN
Inventors: 侯丽; 秦丽丽
Original assignee: Ping An Technology Shenzhen Co Ltd
Current assignee: Ping An Technology Shenzhen Co Ltd
Priority date: 2019-06-20
Filing date: 2019-06-20
Publication date: 2024-06-07
Anticipated expiration: 2039-06-20
Also published as: CN110308931A

Abstract

The application relates to the field of data processing and provides a data processing method and a related device. The method comprises: acquiring data of a project, including code data and annotation data; structuring the annotation data to obtain structured annotation data; classifying the structured annotation data to obtain configuration annotation data and M pieces of code annotation data; inputting the M pieces of code annotation data into a pre-trained framework model to obtain a framework of the project; attaching M marks to the M pieces of code annotation data and acquiring, from the code data, the M segments of module code of the project corresponding to the M marks; extracting database connection information of the project from the configuration annotation data; and, when the terminal receives a search instruction, displaying on the terminal the framework of the project, the M segments of module code of the project, or the database connection information of the project that matches the search keyword. The technical solution of the embodiments improves the efficiency with which a new member acquires information from the project.

Description

Data processing method and related device

Technical Field

The present application relates to the field of data processing, and in particular, to a data processing method and related apparatus.

Background

In general, a web page project or a terminal application project is divided into a plurality of modules, a different person is responsible for each module, and the persons responsible for the modules work separately and cooperate with each other.

Currently, if a new member joins a project after it has started, the new member needs to learn all the contents of the project from scratch. When the new member needs a certain piece of information about the project, a lot of time must be spent studying the project to find it, so the new member acquires information from the project inefficiently.

Disclosure of Invention

The embodiment of the application provides a data processing method and a related device, which are used for improving the efficiency of acquiring information from projects by new members.

The first aspect of the present application provides a method for data processing, comprising:

acquiring data of a project, wherein the data of the project comprises code data and annotation data;

structuring the annotation data to obtain structured annotation data;

extracting keywords from the structured annotation data, and classifying the structured annotation data according to the keywords to obtain configuration annotation data and M pieces of code annotation data, wherein the M pieces of code annotation data are in one-to-one correspondence with M segments of module code of the project, and M is a positive integer;

inputting the M pieces of code annotation data into a pre-trained framework model to obtain a framework of the project;

attaching M marks to the M pieces of code annotation data, and acquiring, from the code data, the M segments of module code of the project corresponding to the M marks;

extracting database connection information of the project from the configuration annotation data;

when a terminal receives a search instruction, obtaining a search keyword according to a search term carried by the search instruction; and

displaying, on the terminal, the framework of the project, the M segments of module code of the project, or the database connection information of the project that matches the search keyword.

A second aspect of the present application provides an apparatus for data processing, comprising:

an acquisition unit, configured to acquire data of a project, wherein the data of the project comprises code data and annotation data;

a structuring processing unit, configured to structure the annotation data to obtain structured annotation data;

a classification unit, configured to extract keywords from the structured annotation data and classify the structured annotation data according to the keywords to obtain configuration annotation data and M pieces of code annotation data, wherein the M pieces of code annotation data are in one-to-one correspondence with M segments of module code of the project, and M is a positive integer;

an input unit, configured to input the M pieces of code annotation data into a pre-trained framework model to obtain a framework of the project;

a marking unit, configured to attach M marks to the M pieces of code annotation data and acquire, from the code data, the M segments of module code of the project corresponding to the M marks;

an extracting unit, configured to extract database connection information of the project from the configuration annotation data;

the terminal, configured to receive a search instruction and obtain a search keyword according to the search term carried by the search instruction; and

a display unit, configured to display, on the terminal, the framework of the project, the M segments of module code of the project, or the database connection information of the project that matches the search keyword.

A third aspect of the present application provides an electronic device for data processing, the electronic device comprising a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and configured for execution by the processor, the programs comprising instructions for performing any of the embodiments described above.

A fourth aspect of the present application provides a computer readable storage medium storing a computer program for execution by a processor to implement any one of the embodiments described above.

It can be seen that, with the data processing method and related device provided by the application, the terminal acquires the data of the project and processes it to obtain the framework of the project, the M segments of module code of the project, and the database connection information of the project. When the user inputs a search term at the terminal, the terminal receives the search instruction and then displays the framework of the project, the M segments of module code of the project, or the database connection information of the project. Therefore, when a new member joins the project and needs a certain piece of information about it, the new member does not need to spend a great deal of time studying the project: the new member only inputs a search term at the terminal, and the terminal determines the required information according to the search term and displays it. This saves the time the new member spends learning the project and improves the efficiency with which the new member acquires information from it.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of a data processing method according to an embodiment of the present application;

FIG. 2 is a flowchart of another data processing method according to an embodiment of the present application;

FIG. 3 is a flowchart of another data processing method according to an embodiment of the present application;

FIG. 4 is a schematic diagram of a system architecture according to an embodiment of the present application;

FIG. 5 is a schematic diagram of project data according to an embodiment of the present application;

FIG. 6 is a schematic diagram of obtaining M segments of module code of a project from M pieces of code annotation data according to an embodiment of the present application;

FIG. 7 is a schematic diagram of an apparatus for data processing according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an electronic device in a hardware running environment according to an embodiment of the present application.

Detailed Description

The data processing method and the related device provided by the embodiment of the application are used for improving the efficiency of acquiring information from the project by the new member.

In order that those skilled in the art may better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of protection of the present application.

The terms "first," "second," "third," "fourth," and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.

The following describes embodiments of the present application in detail.

In the embodiment of the application, the terminal is provided with the data processing plug-in, wherein the data processing plug-in is independent software, and the terminal calls the data processing plug-in to acquire and process the data of the project.

Referring first to FIG. 1, FIG. 1 is a flowchart of a data processing method according to an embodiment of the present application. As shown in FIG. 1, the data processing method may include:

101. Acquire data of a project, wherein the data of the project comprises code data and annotation data.

The terminal is provided with a data processing plug-in, wherein the terminal can be a mobile phone, a tablet computer, a notebook computer, a palm computer, a mobile internet device or other types of terminals.

Optionally, the terminal invokes the data processing plug-in and acquires the data of the project through a version control system (SVN), where the version control system uses branch management so that multiple people can develop the same project together and share resources.

For a developer, when SVN is used for centralized code management, the latest version data of the project is first acquired from the server; the developer then develops on his or her own branch and, after completion, merges that branch into the main branch of the project on the server. Therefore, before the data of the project is acquired, permission authentication needs to be performed on the terminal to avoid leakage of the project data. The terminal may be authenticated as follows:

The identification information of the terminal is acquired, and the terminal sends a permission authentication message carrying the identification information to the server. When the server receives the permission authentication message sent by the terminal, it judges, according to the identification information of the terminal, whether the terminal has permission to acquire the data of the project. When the terminal receives an authentication-passed message sent by the server, it is determined that the terminal has permission to acquire the data of the project. When the terminal receives an authentication-failed message sent by the server, it is determined that the terminal does not have permission to acquire the data of the project, and the terminal generates a prompt interface or pop-up window indicating that it has no permission to acquire the data.
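The authentication exchange described above can be sketched as follows. This is a minimal illustration only: the `AuthServer` class, the message fields and the allow-list of terminal identifiers are hypothetical names chosen for the example, and the source does not specify how the server stores or checks permissions.

```python
from dataclasses import dataclass, field


@dataclass
class AuthServer:
    """Hypothetical server that keeps an allow-list of terminal identifiers."""
    authorized_ids: set = field(default_factory=set)

    def handle_auth_message(self, message: dict) -> dict:
        # The permission authentication message carries the terminal's
        # identification information; the server checks it against the list.
        terminal_id = message.get("terminal_id")
        passed = terminal_id in self.authorized_ids
        return {"type": "auth_passed" if passed else "auth_failed"}


def authenticate(terminal_id: str, server: AuthServer) -> bool:
    """Terminal side: send the message and interpret the server's reply."""
    reply = server.handle_auth_message(
        {"type": "auth_request", "terminal_id": terminal_id}
    )
    if reply["type"] == "auth_passed":
        return True
    # Stand-in for the "no permission" prompt interface or pop-up window.
    print("No permission to acquire the project data.")
    return False
```

In this sketch the network transport is replaced by a direct method call; a real deployment would send the message over whatever channel connects the terminal to the server.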

When the terminal obtains the data of the project from the server, the terminal needs to obtain the latest version data of the project from the server because the data of the project may be added, deleted or changed by a developer, and the method for obtaining the data of the project may be:

The terminal obtains, from the server, N update times of N pieces of version data of the project, where the N pieces of version data are in one-to-one correspondence with the N update times, N is a positive integer, and the server records the update time each time a developer uploads version data.

Comparing the N updating times with the current time to obtain N time differences, wherein the N updating times correspond to the N time differences one by one, determining the minimum time difference in the N time differences, and determining the updating time corresponding to the minimum time difference in the N updating times as the latest updating time.

And acquiring version data corresponding to the latest update time from the N pieces of version data of the project, namely the latest version data of the project.
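The selection of the latest version by minimum time difference can be sketched as follows. The function name `latest_version` and the dictionary layout (version label mapped to update time) are assumptions made for the example, and the sketch assumes no update time lies in the future.

```python
from datetime import datetime


def latest_version(versions: dict, now: datetime) -> str:
    """Return the label of the version whose update time is closest to `now`.

    `versions` maps a version label to its recorded update time.
    """
    # One time difference per update time (one-to-one correspondence).
    diffs = {label: now - updated for label, updated in versions.items()}
    # The smallest time difference identifies the latest update time.
    return min(diffs, key=diffs.get)
```

For example, with update times of 1 June, 10 June and 19 June 2019 and a current time of 20 June 2019, the version updated on 19 June has the smallest difference and is selected.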

The method for acquiring the data of the project can also be as follows:

A timing interval t is set, and the terminal acquires the data of the project from the server once every interval t, replacing the previously acquired data with the most recently acquired data, which ensures that the acquired data is the latest version data of the project.
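The timed acquisition can be sketched as a simple polling loop. The `fetch` callable, the `rounds` limit and the injectable `sleep` parameter are assumptions introduced so that the example is self-contained and terminates; the source only states that data is re-acquired every interval t.

```python
import time


def poll_project_data(fetch, interval_s: float, rounds: int, sleep=time.sleep):
    """Acquire project data `rounds` times, once every `interval_s` seconds.

    `fetch` is a hypothetical callable that returns the project data
    currently held by the server.
    """
    data = None
    for _ in range(rounds):
        # The newly acquired data replaces the data acquired previously.
        data = fetch()
        sleep(interval_s)
    return data
```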

The method for acquiring the data of the project can also be as follows:

When the data of the project is monitored to change, such as adding, deleting or changing, the terminal acquires the data of the project from the server, so that the acquired data can be ensured to be the latest version data of the project.

102. Perform structuring processing on the annotation data to obtain structured annotation data.

In a web page project or a terminal application project, a developer writes a corresponding code annotation after each segment of module code of the project and a corresponding configuration annotation at the database connection information, so the data of the project comprises the code data of the project and the annotation data of the project.

The terminal acquires the annotation data of the project from the data of the project. The annotation data is unstructured data, so the terminal needs to structure it to obtain structured annotation data. Unstructured data is data whose structure is irregular or incomplete, which has no predefined data model, and which is inconvenient to represent with a two-dimensional logical table of a database; it includes office documents in all formats, text, pictures, XML, HTML, various reports, images, audio, video and the like.

The terminal may structure the annotation data to obtain the structured annotation data as follows:

Word segmentation is performed on the annotation data to obtain K pieces of first annotation data, where K is a positive integer. K joint distribution probabilities of the K pieces of first annotation data are calculated, the K pieces of first annotation data being in one-to-one correspondence with the K joint distribution probabilities. The maximum of the K joint distribution probabilities is determined, the second annotation data corresponding to the maximum joint distribution probability is acquired from the K pieces of first annotation data, and stop words are removed from the second annotation data to obtain the structured annotation data. Stop words include frequently occurring conjunctions and prepositions. In natural language they play an important connective role and, together with the other words in the sequence, form grammatical sentences; in machine learning, however, they carry no useful information and act as noise, so stop words need to be removed from the annotation data.
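The structuring step can be sketched as follows, under two assumptions that the source does not state: the K pieces of first annotation data are read as K candidate segmentations of the same annotation, and the joint distribution probability is computed with a unigram (independence) model. The probability table and stop-word list are hypothetical values for illustration.

```python
import math

# Hypothetical unigram probabilities; a real system would estimate them
# from a corpus.
WORD_PROB = {"user": 0.02, "login": 0.01, "log": 0.005, "in": 0.03,
             "module": 0.01, "for": 0.04, "the": 0.06}
UNKNOWN_PROB = 1e-6  # assumed probability for words missing from the table
STOP_WORDS = {"for", "the", "in", "and", "of"}


def joint_log_prob(words):
    # Under the independence assumption, the joint probability is the
    # product of word probabilities; logs are summed to avoid underflow.
    return sum(math.log(WORD_PROB.get(w, UNKNOWN_PROB)) for w in words)


def structure_annotation(candidates):
    """Pick the most probable segmentation, then remove stop words."""
    best = max(candidates, key=joint_log_prob)
    return [w for w in best if w not in STOP_WORDS]
```

With the candidates `["the", "user", "login", "module"]` and `["the", "user", "log", "in", "module"]`, the first has the higher joint probability, and removing the stop word "the" leaves `["user", "login", "module"]`.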

103. Extract keywords from the structured annotation data, and classify the structured annotation data according to the keywords to obtain configuration annotation data and M pieces of code annotation data, wherein the M pieces of code annotation data are in one-to-one correspondence with M segments of module code of the project, and M is a positive integer.

In a webpage project or a terminal application project, a developer can divide the project into a plurality of modules for development when developing the project, each module is provided with a corresponding module code, each module code is provided with a corresponding code annotation, the code annotation is used for explaining the function of the module code, and the configuration annotation is used for explaining the database connection information.

When the project is divided into M modules, there are M module codes and M code annotation data corresponding to each other, and the method for classifying the structured annotation data by the terminal to obtain M code annotation data and configuration annotation data may be:

And extracting keywords in the structured annotation data, and classifying the structured annotation data according to the keywords to obtain configuration annotation data and M code annotation data, wherein the M code annotation data corresponds to M sections of module codes of the project one by one, and M is a positive integer.

Specifically, both code annotations and configuration annotations are written manually. Because a code annotation explains the function of the module code, its keywords may include functional keywords such as login, registration, update, deletion and addition. Similarly, because a configuration annotation explains the database connection information, its keywords may include database-related keywords such as database and connection. The keywords in the structured annotation data are therefore extracted and the structured annotation data is classified according to them: annotation data containing functional keywords is code annotation data, and annotation data containing database-related keywords is configuration annotation data.
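The keyword-based classification can be sketched as follows. The keyword sets reuse the examples given in the text; the function name and the decision to check database-related keywords first and to skip annotations matching neither set are assumptions of this sketch.

```python
FUNCTION_KEYWORDS = {"login", "registration", "update", "deletion", "addition"}
CONFIG_KEYWORDS = {"database", "connection"}


def classify_annotations(structured):
    """Split structured annotations (lists of words) into two groups.

    Returns (configuration annotations, code annotations).
    """
    config, code = [], []
    for words in structured:
        keywords = set(words)
        if keywords & CONFIG_KEYWORDS:
            config.append(words)
        elif keywords & FUNCTION_KEYWORDS:
            code.append(words)
        # Annotations matching neither keyword set are left unclassified.
    return config, code
```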

104. Input the M pieces of code annotation data into a pre-trained framework model to obtain a framework of the project.

Specifically, the M pieces of code annotation data are input into the pre-trained framework model to obtain the framework of the project. The framework model is obtained in advance by inputting a sample data set into a model to be trained, where the sample data set comprises a large amount of code annotation data and the corresponding frameworks.
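The source does not specify the model type or the form of the framework output. Purely as an illustration of training on pairs of code annotation data and frameworks, the sketch below assumes that the framework is a single label and uses a simple word-vote model; the function names and the sample labels are hypothetical, and a real framework model could be any trained classifier.

```python
from collections import Counter, defaultdict


def train_framework_model(samples):
    """Train on (annotation_words, framework_label) pairs.

    The "model" is a table counting how often each word was seen with
    each framework label in the sample data set.
    """
    word_votes = defaultdict(Counter)
    for words, label in samples:
        for w in words:
            word_votes[w][label] += 1
    return word_votes


def predict_framework(model, code_annotations):
    """Return the label with the most votes over all annotation words."""
    votes = Counter()
    for words in code_annotations:
        for w in words:
            votes.update(model.get(w, {}))
    return votes.most_common(1)[0][0] if votes else None
```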

105. Attach M marks to the M pieces of code annotation data, and acquire, from the code data, the M segments of module code of the project corresponding to the M marks.

Specifically, M header annotation symbols of the M pieces of code annotation data are acquired, where the M pieces of code annotation data are in one-to-one correspondence with the M header annotation symbols; when developing the project, the developer separates the code data from the annotation data by annotation symbols.

M marks are attached to the M header annotation symbols, where the M header annotation symbols are in one-to-one correspondence with the M marks, and the M segments of module code of the project corresponding to the M marks are acquired from the code data, where the M marks are in one-to-one correspondence with the M segments of module code of the project.
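The marking step can be sketched as follows. The sketch assumes, for illustration only, that the header annotation symbol is `//`, that every line starting with it opens a new module, and that the module code belonging to a mark is the code that follows the annotation until the next annotation; the source does not fix the symbol or the relative position of annotation and code.

```python
def extract_modules(source: str, annotator: str = "//"):
    """Map each mark (1..M) to (annotation text, module code)."""
    modules = {}
    mark = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(annotator):
            # A header annotation symbol was found: attach the next mark.
            mark += 1
            text = stripped[len(annotator):].strip()
            modules[mark] = {"annotation": text, "code": []}
        elif mark and stripped:
            # Non-empty code lines belong to the most recent mark.
            modules[mark]["code"].append(line)
    return {m: (v["annotation"], "\n".join(v["code"]))
            for m, v in modules.items()}
```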

106. Extract database connection information of the project from the configuration annotation data.

The configuration annotation explains the database connection information, so once the terminal has obtained the configuration annotation data, it can extract the database connection information of the project from it.
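The source does not describe the layout of a configuration annotation. As a sketch, assuming the annotation lists the connection settings as `key=value` pairs (the keys and the host name below are hypothetical), the extraction could use a regular expression:

```python
import re


def extract_connection_info(config_annotation: str) -> dict:
    """Collect key=value pairs from a configuration annotation."""
    # A key is a run of word characters; a value runs until whitespace,
    # a semicolon or a comma.
    pairs = re.findall(r"(\w+)\s*=\s*([^\s;,]+)", config_annotation)
    return dict(pairs)
```

Applied to `"database connection: host=db.example.com; port=3306; name=shop"`, this yields the host, port and database name as a dictionary of strings.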

107. When the terminal receives a search instruction, obtain a search keyword according to the search term carried by the search instruction.

When a user needs a certain piece of information about the project, the user inputs a search term at the terminal, and the terminal receives a search instruction carrying the search term input by the user.

Since the search term input by the user may be inaccurate, a fuzzy search is used, that is, a certain difference between the search term and the search keyword is allowed; the search can proceed as long as the preset search keyword range includes the search keyword.

The method for obtaining the search keywords by the terminal according to the search terms carried by the search instruction can be as follows:

A first keyword in the search term is extracted, and it is judged whether the first keyword belongs to a preset search keyword range, where the preset search keyword range includes framework, login, registration, database, connection and the like.

If the first keyword belongs to the preset search keyword range, the first keyword is determined as the search keyword.

If the first keyword does not belong to the preset search keyword range, Q second keywords matching the first keyword are acquired from a database, where Q is a positive integer and the database includes a synonym dictionary. For example, if the first keyword is "architecture" and the preset search keyword range does not include "architecture", the terminal acquires from the synonym dictionary the words matching "architecture", including "structure", "framework" and the like.

The second keyword, among the Q second keywords, that belongs to the preset search keyword range is determined as the search keyword. For example, the terminal acquires from the synonym dictionary the words matching "architecture", including "structure", "framework" and the like, and determines "framework" as the search keyword because "framework" belongs to the preset search keyword range.
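The keyword resolution can be sketched as follows. The preset range and the "architecture" example come from the text; the in-memory `SYNONYMS` dictionary stands in for the synonym dictionary stored in the database, and returning `None` when nothing matches is an assumption of this sketch.

```python
PRESET_KEYWORDS = {"framework", "login", "registration", "database", "connection"}

# Hypothetical stand-in for the synonym dictionary held in the database.
SYNONYMS = {"architecture": ["structure", "framework"]}


def resolve_search_keyword(first_keyword: str):
    """Map a first keyword to a search keyword in the preset range."""
    if first_keyword in PRESET_KEYWORDS:
        return first_keyword
    # Otherwise look through the Q second keywords (synonyms) and keep
    # the first one that belongs to the preset range.
    for candidate in SYNONYMS.get(first_keyword, []):
        if candidate in PRESET_KEYWORDS:
            return candidate
    return None
```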

108. Display, on the terminal, the framework of the project, the M segments of module code of the project, or the database connection information of the project that matches the search keyword.

After obtaining the search keyword, the terminal needs to determine the information the user needs according to the search keyword and display it to the user. The framework of the project, the M segments of module code of the project, or the database connection information of the project may be displayed on the terminal according to the search keyword as follows:

A first probability that the framework of the project is being searched for is calculated according to the search keyword, where the first probability may be calculated as:

$P(A \mid B) = P(B \mid A) \times P(A) / P(B)$, where $P(A \mid B)$ is the conditional probability that the framework of the project is being searched for given that the search keyword is input, i.e., the first probability; $P(B \mid A)$ is the conditional probability that the search keyword is the input given that the framework of the project is being searched for; $P(A)$ is the probability of searching for the framework of the project; and $P(B)$ is the probability of inputting the search keyword.

M second probabilities that the M segments of module code of the project are being searched for are calculated according to the search keyword, where the i-th segment of the M segments of module code corresponds to the i-th of the M second probabilities, i is a positive integer not greater than M, and the M second probabilities may be calculated as:

$P(i \mid B) = P(B \mid i) \times P(i) / P(B)$, where $P(i \mid B)$ is the conditional probability that the i-th segment of the M segments of module code of the project is being searched for given that the search keyword is input, i.e., the i-th second probability; $P(B \mid i)$ is the conditional probability that the search keyword is the input given that the i-th segment of module code is being searched for; $P(i)$ is the probability of searching for the i-th segment of module code; and $P(B)$ is the probability of inputting the search keyword.

A third probability that the database connection information of the project is being searched for is calculated according to the search keyword, where the third probability may be calculated as:

$P(C \mid B) = P(B \mid C) \times P(C) / P(B)$, where $P(C \mid B)$ is the conditional probability that the database connection information of the project is being searched for given that the search keyword is input, i.e., the third probability; $P(B \mid C)$ is the conditional probability that the search keyword is the input given that the database connection information of the project is being searched for; $P(C)$ is the probability of searching for the database connection information of the project; and $P(B)$ is the probability of inputting the search keyword.

The first probability, the M second probabilities and the third probability are compared. Within one search the probability of inputting the search keyword is the same, i.e., $P(B)$ is the same, so only $P(B \mid A) \times P(A)$, $P(B \mid i) \times P(i)$ and $P(B \mid C) \times P(C)$ need to be compared.

When the first probability is the largest, the information the user needs is the framework of the project, so the framework of the project is displayed on the terminal.

When the i-th of the M second probabilities is the largest, the information the user needs is the i-th segment of the M segments of module code of the project, so the i-th segment of module code is displayed on the terminal.

When the third probability is the largest, the information the user needs is the database connection information of the project, so the database connection information of the project is displayed on the terminal.
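The comparison can be sketched as follows. Because P(B) is the same for every candidate within one search, the sketch compares only the products of likelihood and prior. The candidate names and the numeric values are hypothetical; the source does not say how the likelihoods and priors are estimated.

```python
def best_match(likelihoods: dict, priors: dict) -> str:
    """Return the candidate X that maximizes P(B|X) * P(X).

    `likelihoods[X]` is P(B|X) for the input search keyword B and
    `priors[X]` is P(X). P(B) is common to all candidates and omitted.
    """
    scores = {name: likelihoods[name] * priors[name] for name in priors}
    return max(scores, key=scores.get)
```

With priors of 0.2, 0.3, 0.3 and 0.2 for the framework, two module code segments and the database connection information, and likelihoods of 0.05, 0.6, 0.1 and 0.05 respectively, the first module code segment has the largest product and would be displayed.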

Referring to FIG. 2, FIG. 2 is a flowchart of another data processing method according to another embodiment of the present application. As shown in FIG. 2, the data processing method may include:

201. The terminal sends the authority authentication message to the server.

The terminal may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a mobile internet device, or other types of terminals.

Before the data of the project is acquired, permission authentication needs to be performed on the terminal to avoid leakage of the project data; the terminal may be authenticated in the manner described for step 101.

202. After the permission authentication is passed, the terminal invokes the data processing plug-in and acquires the data of the project through a version control system.

The version control system uses branch management so that multiple people can develop the same project together and share resources.

Because the data of the project may be added to, deleted or changed by developers, the terminal needs to acquire the latest version data of the project from the server, which may be done by any of the methods described for step 101.

203. The terminal acquires annotation data from the data of the item.

204. The terminal performs structuring processing on the annotation data to obtain structured annotation data.

205. Classify the structured annotation data to obtain configuration annotation data and M pieces of code annotation data, wherein the M pieces of code annotation data are in one-to-one correspondence with M segments of module code of the project, and M is a positive integer.

206. Obtain the framework of the project according to the M pieces of code annotation data.

After the terminal obtains the M pieces of code annotation data and the configuration annotation data, the framework of the project and the M segments of module code of the project can be obtained according to the M pieces of code annotation data.

The framework of the project may be obtained according to the M pieces of code annotation data as follows:

The M pieces of code annotation data are input into a pre-trained framework model to obtain the framework of the project. The framework model is obtained in advance by inputting a sample data set into a model to be trained, where the sample data set comprises a large amount of code annotation data and the corresponding frameworks.

207. Obtain the M segments of module code of the project according to the M pieces of code annotation data.

The M segments of module code of the project may be obtained according to the M pieces of code annotation data as follows:

M header annotation symbols of the M pieces of code annotation data are acquired, where the M pieces of code annotation data are in one-to-one correspondence with the M header annotation symbols; when developing the project, the developer separates the code data from the annotation data by annotation symbols.

208. Obtain the database connection information of the project according to the configuration annotation data.

209. When the terminal receives a search instruction, perform a fuzzy search according to the search instruction.

When a user needs a certain piece of information about the project, the user inputs a search term at the terminal, and the terminal receives a search instruction carrying the search term input by the user. Since the search term input by the user may be inaccurate, a fuzzy search is used, that is, a certain difference between the search term and the search keyword is allowed; the search can proceed as long as the preset search keyword range includes the search keyword.

When the fuzzy search is performed, the search keyword needs to be determined, which may be done as follows:

If the first keyword does not belong to the preset search keyword range, Q second keywords matching the first keyword are acquired from a database, where Q is a positive integer and the database includes a synonym dictionary. For example, if the first keyword is "architecture" and the preset search keyword range does not include "architecture", the terminal acquires from the synonym dictionary the words matching "architecture", including "structure", "framework" and the like.

The second keyword, among the Q second keywords, that belongs to the preset search keyword range is determined as the search keyword. For example, the terminal acquires from the synonym dictionary the words matching "architecture", including "structure", "framework" and the like, and determines "framework" as the search keyword because "framework" belongs to the preset search keyword range.

210. The frame of the project, the M-segment module codes of the project or the database connection information of the project is displayed on the terminal.

A first probability of searching the frame of the item is calculated according to the search keyword.

M second probabilities of searching the M-segment module codes of the item are respectively calculated according to the search keyword, wherein the ith segment of code in the M-segment module codes of the item corresponds to the ith second probability in the M second probabilities, and i is a positive integer not more than M.

A third probability of searching the database connection information of the item is calculated according to the search keyword.

The first probability, the M second probabilities and the third probability are compared, and the information corresponding to the largest probability is displayed on the terminal: the frame of the item when the first probability is the largest, the ith segment of code when the ith second probability is the largest, and the database connection information of the item when the third probability is the largest.

Referring to fig. 3, fig. 3 is a flowchart of another data processing method according to another embodiment of the present application, where the item is an e-commerce website item. As shown in fig. 3, another data processing method provided in another embodiment of the present application may include:

301. The terminal sends the authority authentication message to the server.

Before acquiring the data of the e-commerce website item, authority authentication is required to be performed on the terminal, so that the data leakage of the e-commerce website item is avoided, wherein the authority authentication on the terminal can be performed by the following steps:

Identification information of the terminal is acquired, and the terminal then sends a permission authentication message carrying the identification information to the server. On receiving the permission authentication message, the server judges, according to the identification information of the terminal, whether the terminal has permission to acquire the data of the e-commerce website project. When the terminal receives a permission authentication pass message sent by the server, it is determined that the terminal has permission to acquire the data of the project; when the terminal receives a permission authentication failure message sent by the server, it is determined that the terminal does not have permission to acquire the data of the project, and the terminal generates a prompt interface or pop-up window indicating that it has no permission to acquire the data.

302. After the authority authentication is passed, the terminal calls a data processing plug-in and obtains the data of the e-commerce website project through a version control system.
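The exchange between the terminal and the server can be illustrated with a small in-process simulation. The class and function names below (`AuthMessage`, `Server`, `request_permission`) are hypothetical, and the server's decision rule (membership in a set of authorized identifiers) is an assumption; the application does not specify how the server makes its judgment or how messages are transported.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class AuthMessage:
    """Permission authentication message carrying the terminal identification."""
    terminal_id: str


class Server:
    def __init__(self, authorized_ids: Iterable[str]) -> None:
        # Assumption: the server keeps a set of terminal identifiers with permission.
        self._authorized_ids = set(authorized_ids)

    def authenticate(self, message: AuthMessage) -> bool:
        """Return True for a pass message, False for a failure message."""
        return message.terminal_id in self._authorized_ids


def request_permission(server: Server, terminal_id: str) -> str:
    """Terminal side: send the message and react to the server's reply."""
    if server.authenticate(AuthMessage(terminal_id)):
        return "permission granted"
    # On failure the terminal would show a prompt interface or pop-up window.
    return "no permission to acquire data"
```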

When the terminal obtains the data of the e-commerce website project from the server, the terminal needs to obtain the latest version data of the project from the server because the project data may be added, deleted or changed by a developer.
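One way to pick the latest version, following the comparison of update times recited in claim 3, is to compute the time difference between each of the N update times and the current time and keep the version with the smallest difference. The sketch below is illustrative only: `latest_version` and the version names are hypothetical, and it assumes every update time is not later than the current time.

```python
from datetime import datetime
from typing import Dict


def latest_version(update_times: Dict[str, datetime], now: datetime) -> str:
    """Return the name of the version whose update time is closest to `now`."""
    # N time differences, one per version (assumes update time <= now).
    differences = {name: now - updated for name, updated in update_times.items()}
    # The minimum time difference identifies the latest update time.
    return min(differences, key=differences.get)


versions = {
    "v1": datetime(2019, 5, 1, 9, 0),
    "v2": datetime(2019, 6, 10, 18, 30),
    "v3": datetime(2019, 6, 2, 12, 0),
}
newest = latest_version(versions, now=datetime(2019, 6, 20, 8, 0))
```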

303. The terminal acquires annotation data from the data of the e-commerce website project.

In the e-commerce website project, when a developer develops, the developer can make corresponding code annotation after each section of module code of the project and make corresponding configuration annotation at the database connection information, so that the data of the project comprise the code data of the project and the annotation data of the project.

For example, the e-commerce website project is divided into 9 modules, which are respectively a background module, a commodity module, a sales module, an order module, an inventory module, a content module, a client module, a system module and a report module, and each module corresponds to a section of module code.

The terminal acquires annotation data of the item from the data of the item, and the annotation data is unstructured data, so the terminal needs to perform structuring processing on the annotation data to obtain structured annotation data.

304. The terminal performs structuring processing on the annotation data to obtain structured annotation data.

Word segmentation is performed on the annotation data to obtain K pieces of first annotation data, wherein K is a positive integer. K joint distribution probabilities of the K pieces of first annotation data are calculated, wherein the K pieces of first annotation data are in one-to-one correspondence with the K joint distribution probabilities. The maximum of the K joint distribution probabilities is determined, the second annotation data corresponding to the maximum joint distribution probability is acquired from the K pieces of first annotation data, and stop word removal is performed on the second annotation data to obtain the structured annotation data.
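The structuring step can be sketched as follows, under two stated assumptions that are not confirmed by the application: the K pieces of first annotation data are read as K candidate segmentations of the same annotation text, and the joint distribution probability of a segmentation is approximated by the product of per-word probabilities (a unigram model). The vocabulary `WORD_PROB`, the stop-word list and all function names are hypothetical toy values.

```python
from math import prod
from typing import Dict, Iterator, List, Set

# Hypothetical per-word probabilities and stop words (toy values).
WORD_PROB: Dict[str, float] = {
    "the": 0.2, "order": 0.1, "module": 0.1, "or": 0.05, "der": 0.001,
}
STOP_WORDS: Set[str] = {"the"}


def segmentations(text: str, vocab: Dict[str, float]) -> Iterator[List[str]]:
    """Yield every split of `text` into words that all appear in `vocab`."""
    if not text:
        yield []
        return
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in vocab:
            for rest in segmentations(text[end:], vocab):
                yield [word] + rest


def joint_probability(words: List[str], vocab: Dict[str, float]) -> float:
    """Unigram approximation of the joint distribution probability."""
    return prod(vocab[w] for w in words)


def structure_annotation(text: str, vocab: Dict[str, float],
                         stop_words: Set[str]) -> List[str]:
    candidates = list(segmentations(text, vocab))  # the K candidates
    if not candidates:
        return []
    # Keep the candidate with the maximum joint probability, then drop stop words.
    best = max(candidates, key=lambda ws: joint_probability(ws, vocab))
    return [w for w in best if w not in stop_words]
```

For the toy input "theordermodule" there are two candidate segmentations; the one with the higher product of word probabilities is kept before stop words are removed.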

305. The structured annotation data is classified to obtain 9 pieces of code annotation data and configuration annotation data, wherein the 9 pieces of code annotation data are in one-to-one correspondence with 9 pieces of module code of the item.

In the e-commerce website project, a developer divides the project into 9 modules for development when developing, each module is provided with a corresponding module code, each module code is provided with a corresponding code annotation, the code annotations are used for explaining the functions of the module codes, and the configuration annotations are used for explaining the database connection information.

When the project is divided into 9 modules, there are 9 corresponding segments of module code and 9 pieces of code annotation data. The method by which the terminal classifies the structured annotation data to obtain the 9 pieces of code annotation data and the configuration annotation data can be as follows:

Keywords in the structured annotation data are extracted, and the structured annotation data is classified according to the keywords to obtain the configuration annotation data and 9 pieces of code annotation data, wherein the 9 pieces of code annotation data are in one-to-one correspondence with the 9 segments of module code of the project.

Specifically, both the code annotations and the configuration annotations are written manually. Because a code annotation explains the function of a module code, the keywords in the code annotations can include functional keywords such as background, commodity, sales, order, inventory, content, client, system and report. Likewise, because the configuration annotation explains the database connection information, the keywords in the configuration annotation can include keywords related to the database, such as database and connection. The keywords in the structured annotation data are therefore extracted and the structured annotation data is classified according to them: annotation data containing functional keywords is code annotation data, and annotation data containing database-related keywords is configuration annotation data.
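A minimal sketch of this keyword-based classification is given below. The keyword sets repeat the examples from the text; the function name `classify_annotations`, the representation of each annotation as a list of tokens, and the rule that database-related keywords take precedence when both kinds appear are assumptions made for illustration.

```python
from typing import List, Set, Tuple

FUNCTION_KEYWORDS: Set[str] = {
    "background", "commodity", "sales", "order", "inventory",
    "content", "client", "system", "report",
}
CONFIG_KEYWORDS: Set[str] = {"database", "connection"}


def classify_annotations(
    annotations: List[List[str]],
) -> Tuple[List[List[str]], List[List[str]]]:
    """Split structured annotations into (configuration, code) annotation data."""
    config_data: List[List[str]] = []
    code_data: List[List[str]] = []
    for tokens in annotations:
        token_set = set(tokens)
        # Assumption: database-related keywords take precedence.
        if token_set & CONFIG_KEYWORDS:
            config_data.append(tokens)
        elif token_set & FUNCTION_KEYWORDS:
            code_data.append(tokens)
    return config_data, code_data


config, code = classify_annotations([
    ["order", "creation", "logic"],
    ["database", "connection", "settings"],
    ["inventory", "check"],
])
```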

306. The frame of the project is obtained according to the 9 pieces of code annotation data.

After the 9 pieces of code annotation data and the configuration annotation data are obtained by the terminal, the frame of the project and the 9 pieces of module codes of the project can be obtained according to the 9 pieces of code annotation data.

The method for obtaining the frame of the project according to the 9 pieces of code annotation data can be as follows:

Inputting 9 pieces of code annotation data into a pre-trained frame model to obtain a frame of the project, wherein when the frame model is pre-trained, a sample data set is input into the model to be trained to obtain the frame model, and the sample data set comprises a large amount of code annotation data and a corresponding frame thereof.

307. The 9 segments of module code of the item are obtained according to the 9 pieces of code annotation data.

The method for obtaining the 9-segment module codes of the item according to the 9-segment code annotation data can be as follows:

9 header annotators of the 9 pieces of code annotation data are obtained, wherein the 9 pieces of code annotation data are in one-to-one correspondence with the 9 header annotators; when developing a project, a developer separates the code data from the annotation data by means of annotators.

9 labels are marked on the 9 header annotators, wherein the 9 header annotators are in one-to-one correspondence with the 9 labels, and the 9 segments of module code of the item corresponding to the 9 labels are obtained from the code data, wherein the 9 labels are in one-to-one correspondence with the 9 segments of module code of the item.

308. Database connection information of the e-commerce website project is obtained according to the configuration annotation data.
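The labelling of header annotators and the retrieval of the corresponding module code can be sketched as follows. The sketch assumes, purely for illustration, that each code annotation is a single line beginning with the annotator `//` and that a module's code runs from its header annotator to the next one; the application does not fix a particular annotator or source language, and `split_modules` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def split_modules(source: str, annotator: str = "//") -> Dict[int, Tuple[str, str]]:
    """Map each 1-based label to (code annotation, module code)."""
    collected: Dict[int, Tuple[str, List[str]]] = {}
    label = 0
    for line in source.splitlines():
        if line.startswith(annotator):
            # A header annotator opens a new module and receives the next label.
            label += 1
            collected[label] = (line[len(annotator):].strip(), [])
        elif label:
            # Code lines belong to the most recently labelled module.
            collected[label][1].append(line)
    return {k: (ann, "\n".join(code).strip()) for k, (ann, code) in collected.items()}


sample = (
    "// order module\n"
    "function createOrder() {}\n"
    "// inventory module\n"
    "function checkStock() {}\n"
)
modules = split_modules(sample)
```

Here `modules[1]` holds the annotation and code of the order module and `modules[2]` those of the inventory module, so each label leads back to exactly one segment of module code.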

309. When the terminal receives the search instruction, a fuzzy search is performed according to the search term "architecture information" carried by the search instruction.

When a user needs to know the frame of the e-commerce website project, the user inputs "architecture information" at the terminal, and the terminal receives a search instruction carrying the search term "architecture information". Because the search term input by the user may be inaccurate, a fuzzy search is adopted: a certain difference between the search term and the search keyword is allowed, and the search can proceed as long as the preset search keyword range includes the search keyword.

The first keyword "architecture" is extracted from "architecture information", and it is judged that "architecture" does not belong to the preset search keyword range, wherein the preset search keyword range includes "frame", "background", "commodity", "sales", "order", "inventory", "content", "client", "system", "report", "database", "connection" and the like.

Since the first keyword is "architecture" and the preset search keyword range does not include "architecture", the terminal obtains words matched with "architecture" from the synonym dictionary library, including "structure", "frame" and the like.

Since "frame" belongs to the preset search keyword range, "frame" is determined as the search keyword.

310. The frame of the e-commerce website project is displayed on the terminal.

After the terminal obtains the search keyword "frame", a first probability of searching the frame of the item is calculated according to the search keyword "frame".

9 second probabilities of searching the 9 segments of module code of the item are respectively calculated according to the search keyword "frame", wherein the ith segment of code in the 9 segments of module code of the item corresponds to the ith second probability in the 9 second probabilities, and i is a positive integer not more than 9.

A third probability of searching the database connection information of the item is calculated according to the search keyword "frame".

The first probability is the largest, which indicates that the information the user needs to know is the frame of the item, so the frame of the item is displayed on the terminal.
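The comparison of the first probability, the second probabilities and the third probability can be sketched as a simple maximum selection. How the probabilities themselves are computed is not described in the application, so the function below takes them as given; `select_display`, the returned labels and the numeric values are hypothetical, and ties are resolved in favour of the earlier candidate as an arbitrary assumption.

```python
from typing import List


def select_display(first_probability: float,
                   second_probabilities: List[float],
                   third_probability: float) -> str:
    """Return a label naming the information to display on the terminal."""
    candidates = [("frame", first_probability)]
    # The ith second probability corresponds to the ith segment of module code.
    candidates += [
        (f"module code {i}", p)
        for i, p in enumerate(second_probabilities, start=1)
    ]
    candidates.append(("database connection information", third_probability))
    # max() keeps the first candidate among equal probabilities.
    return max(candidates, key=lambda item: item[1])[0]


# Hypothetical probabilities for the search keyword "frame" and 9 modules.
choice = select_display(0.70, [0.02] * 9, 0.12)
```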

Referring to fig. 4, fig. 4 is a schematic diagram of a system architecture according to an embodiment of the present application. As shown in fig. 4, the system architecture provided in the embodiment of the present application includes a terminal and a server, where the terminal and the server establish a communication connection, and a user may obtain project data through the terminal.

Referring to fig. 5, fig. 5 is a schematic diagram of project data according to an embodiment of the present application. As shown in fig. 5, the project data includes code data and comment data, the comment data includes configuration comment data and M pieces of code comment data, and M is a positive integer.

Referring to fig. 6, fig. 6 is a schematic diagram of obtaining the M-segment module codes of an item according to M pieces of code annotation data according to an embodiment of the present application. As shown in fig. 6, M header annotators of the M pieces of code annotation data are obtained, wherein the M pieces of code annotation data are in one-to-one correspondence with the M header annotators; M labels are marked on the M header annotators, wherein the M header annotators are in one-to-one correspondence with the M labels; and the M-segment module codes of the item corresponding to the M labels are obtained from the code data.

Referring to fig. 7, fig. 7 is a schematic diagram of an apparatus for data processing according to another embodiment of the present application. As shown in fig. 7, an apparatus for data processing according to another embodiment of the present application may include:

an acquisition unit 701, configured to acquire data of an item, wherein the data of the item includes code data and annotation data;

a structuring processing unit 702, configured to perform structuring processing on the annotation data to obtain structured annotation data;

a classifying unit 703, configured to extract keywords in the structured annotation data and classify the structured annotation data according to the keywords to obtain configuration annotation data and M pieces of code annotation data, wherein the M pieces of code annotation data are in one-to-one correspondence with M-segment module codes of the item, and M is a positive integer;

an input unit 704, configured to input the M pieces of code annotation data into a pre-trained frame model to obtain a frame of the item;

a marking unit 705, configured to mark M labels on the M pieces of code annotation data, and obtain the M-segment module codes of the item corresponding to the M labels from the code data;

an extracting unit 706, configured to extract database connection information of the item from the configuration annotation data;

a searching unit 707, configured to obtain a search keyword according to a search term carried by a search instruction when the terminal receives the search instruction; and

a display unit 708, configured to display, on the terminal, the frame of the item, the M-segment module codes of the item, or the database connection information of the item that matches the search keyword.

The specific implementation of the data processing device of the present application can be found in each embodiment of the data processing method, and will not be described herein.

Referring to fig. 8, fig. 8 is a schematic diagram of an electronic device structure of a hardware running environment according to an embodiment of the present application. As shown in fig. 8, an electronic device of a hardware running environment according to an embodiment of the present application may include:

a processor 801, such as a CPU.

a memory 802, which may be a high-speed RAM memory or a stable (non-volatile) memory, such as a disk memory.

a communication interface 803 for enabling connection and communication between the processor 801 and the memory 802.

It will be appreciated by those skilled in the art that the structure of the data processing electronic device shown in fig. 8 does not constitute a limitation of the data processing electronic device and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.

As shown in fig. 8, the memory 802 may include an operating system, a network communication module, and a program for data processing. The operating system is a program that manages and controls the hardware and software resources of the data processing electronic device and supports the running of the data processing program and of other software or programs. The network communication module is used to enable communication between components within the memory 802 and with other hardware and software in the data processing electronic device.

In the electronic device for data processing shown in fig. 8, a processor 801 is configured to execute a program for data processing stored in a memory 802, implementing the steps of:

The specific implementation of the electronic device for data processing of the present application can be found in each embodiment of the above-mentioned data processing method, and will not be described herein.

Another embodiment of the present application provides a computer-readable storage medium storing a computer program that is executed by a processor to implement the steps of:

The embodiment of the computer readable storage medium of the present application can be referred to in the embodiments of the data processing method, and will not be described herein.

It should also be noted that, for simplicity of description, the foregoing method embodiments are all illustrated as a series of acts, but it should be understood and appreciated by those skilled in the art that the present application is not limited by the order of acts, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present application. In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.

The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims

1. A method of data processing, comprising:

carrying out structuring processing on the annotation data to obtain structured annotation data; comprising the following steps:

Word segmentation is carried out on the annotation data to obtain K pieces of first annotation data, wherein K is a positive integer; calculating K joint distribution probabilities of the K pieces of first annotation data, wherein the K pieces of first annotation data are in one-to-one correspondence with the K joint distribution probabilities; determining the maximum joint distribution probability of the K joint distribution probabilities; acquiring second annotation data corresponding to the maximum joint distribution probability from the K pieces of first annotation data; performing stop word removal processing on the second annotation data to obtain the structured annotation data;

When a terminal receives a search instruction, obtaining a search keyword according to a search term carried by the search instruction, comprising:

extracting a first keyword in the search term; judging whether the first keyword belongs to a preset search keyword range or not; if the first keyword belongs to the preset search keyword range, determining that the first keyword is the search keyword; if the first keyword does not belong to the preset search keyword range, Q second keywords matched with the first keyword are obtained from a database, wherein Q is a positive integer; determining keywords belonging to the preset search keyword range in the Q second keywords as the search keywords;

Displaying, on the terminal, a frame of the item, an M-segment module code of the item, or database connection information of the item, which matches the search keyword, including:

Calculating a first probability of searching a framework of the item according to the search keyword; respectively calculating M second probabilities of M section module codes of the item according to the search keyword, wherein an ith section code in the M section module codes of the item corresponds to an ith second probability in the M second probabilities, and i is a positive integer not more than M; calculating a third probability of searching the database connection information of the item according to the search keyword; comparing the first probability, the M second probabilities, and the third probability; displaying a frame of the item on the terminal when the first probability is maximum; displaying an ith section of code in M sections of module codes of the item on the terminal when the ith second probability in the M second probabilities is maximum; and when the third probability is maximum, displaying database connection information of the item on the terminal.

2. The method of claim 1, further comprising, prior to the acquiring the data of the item:

acquiring the identification information of the terminal;

transmitting a permission authentication message carrying the identification information of the terminal to a server, wherein the permission authentication message is used for indicating the server to judge whether the terminal has permission to acquire the data of the item;

and when receiving the permission authentication passing message sent by the server, determining that the terminal has permission to acquire the data of the item.

3. The method of claim 2, wherein the acquiring data for an item comprises:

invoking a data processing plug-in to acquire N updating times of N version data of the project from the server, wherein the N version data of the project corresponds to the N updating times one by one, and N is a positive integer;

comparing the N updating times with the current time to obtain N time differences, wherein the N updating times are in one-to-one correspondence with the N time differences;

Determining a minimum time difference of the N time differences;

determining the update time corresponding to the minimum time difference in the N update times as the latest update time;

And acquiring data of the item corresponding to the latest update time from the N pieces of version data of the item.

4. The method of claim 1, wherein marking the M pieces of code annotation data comprises:

M head annotators of the M pieces of code annotation data are obtained, wherein the M pieces of code annotation data are in one-to-one correspondence with the M head annotators;

and marking M marks on the M header annotators, wherein the M header annotators are in one-to-one correspondence with the M marks.

5. An apparatus for data processing, the apparatus comprising:

The structuring processing unit is used for carrying out structuring processing on the annotation data to obtain structured annotation data; comprising the following steps:

The searching unit is used for obtaining a searching keyword according to a searching word carried by a searching instruction when the terminal receives the searching instruction, and comprises the following steps:

A display unit for displaying, on the terminal, a frame of the item, an M-segment module code of the item, or database connection information of the item, which matches the search keyword, including:

6. An electronic device for data processing, characterized in that the electronic device comprises a processor, a memory, a communication interface and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1 to 4.

7. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any one of claims 1 to 4.